Deploying Jekyll with Bitbucket Pipelines

The technology behind this blog is in permanent flux. It's my primary playground to try out new stuff. As you might know, this blog is generated by Jekyll, a static site generator which takes my posts (written as Markdown files) and generates the HTML pages you're looking at right now. To be more specific: the source files are stored on Bitbucket.org and a server at Hetzner serves the HTML files. Whenever changes were pushed to Bitbucket, they would trigger the Jekyll setup on the server to publish everything straight away.

Some time ago, Bitbucket.org introduced Pipelines, a feature which gives you the ability to run build and deployment scripts on Bitbucket itself. Curious about how much of a continuous deployment pipeline I could create, I decided to give it a try and move Jekyll from running on my server to letting Bitbucket take care of it. This post details the process, what I came up with, and some general thoughts on this feature set of Bitbucket.


Building Jekyll

To use the Pipelines feature, you have to add a bitbucket-pipelines.yml file to the root directory of the repository. For development I recommend using Bitbucket's Validator for Pipelines, which checks at least the syntactical correctness of a bitbucket-pipelines.yml.

I quickly got it to install the dependencies and generate a Jekyll blog. To get things started, here is the basic build setup:

# bitbucket-pipelines.yml
image: ruby:2.5

pipelines:
  default:
    - step:
        name: Generate the blog with Jekyll
        caches:
          - bundler
        script:
          - bundle install --path vendor/bundle
          - bundle exec jekyll build --destination public
definitions:
  caches:
    bundler: vendor/bundle

In general, a Pipeline is nothing more than a script that spins up a Docker container (specified by the image parameter) and executes the steps you've given it within this container. This all happens in isolation, and you can see the output of each step in the Pipelines section of your Bitbucket repository.

For this example I use version 2.5 of the Ruby Docker image to install the dependencies and run jekyll build afterwards. As specified by the --destination parameter, this creates the HTML pages in the directory public/.
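If you want to debug the build before pushing, you can roughly reproduce the step locally with plain Docker. This is just a sketch of the idea, assuming Docker is installed and you run it from the repository root; Bitbucket's actual runner additionally handles cloning, caching and artifacts:

# Approximate the build step locally in the same Ruby image.
docker run --rm -v "$(pwd)":/build -w /build ruby:2.5 \
  sh -c "bundle install --path vendor/bundle && bundle exec jekyll build --destination public"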

A particular Pipelines-related feature in this configuration is the cache definition. By defining a path as a cache, Bitbucket will store its contents after execution and make them available at the same location when executing the pipeline again. Using the --path parameter on the bundle install command, we ensure that Bundler saves all dependencies in that directory. This is particularly helpful to reduce the time a build takes. Since every second of execution time is billed, this helps to keep the bill low.

Persisting the Build Result

So far we've built the Jekyll blog, but the result is gone as soon as the pipeline process finishes. Let's make use of another Bitbucket feature: Artifacts. Bitbucket will store everything matching the defined artifact paths and make it available to any following steps defined in the bitbucket-pipelines.yml configuration. Unlike caches, artifacts only exist in the context of a single Pipeline execution.

image: ruby:2.5

pipelines:
  default:
    - step:
        name: Generate the blog with Jekyll
        caches:
          - bundler
        script:
          - bundle install --path vendor/bundle
          - bundle exec jekyll build --destination public
          - mkdir dist
          - tar -czvf dist/package-${BITBUCKET_BUILD_NUMBER}.tar.gz -C public .
        artifacts:
          - dist/**

What's happening here? First, the directory dist is created where we want to store the artifact. We also tell Bitbucket specifically to persist everything that's in this directory (this is done recursively by the artifacts: dist/** definition). To make things a bit faster, we pack everything in the public directory into a tar archive with the filename package-${BITBUCKET_BUILD_NUMBER}.tar.gz, e.g. package-47.tar.gz if that was the 47th execution of the build.


Preparing the Deployment

So we have the generated files from Jekyll, stored somewhere in Bitbucket. Time to make those public and deploy them to the server. There is an additional area in your Bitbucket project, called Deployments. Here you will be able to see the result of any deployment, but the configuration also happens within the bitbucket-pipelines.yml file. Deployment steps are identified by the key deployment:

image: ruby:2.5

pipelines:
  default:
    - step:
        name: Generate the blog with Jekyll
        # Omitted the stuff from above for the sake of brevity
    - step:
        name: Deploy to Web
        image: alpine
        deployment: production
        trigger: manual
        script:
          - mkdir upload
          - tar -xf dist/package-${BITBUCKET_BUILD_NUMBER}.tar.gz -C upload

We're reversing the actions performed in the build phase: we create the directory upload/ and extract the artifact into it.

The trigger: manual option means that we have to click Run on the deployment step in the Bitbucket frontend. If you want it to happen automatically, remove this line.

Also, since we no longer need Jekyll and Ruby, we use a slimmer Docker image: Alpine.

Deploy All The Things!

This section will heavily depend on your particular server setup. As a common ground, you should have SSH access to the server and be able to write into the webroot directory.
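A quick way to verify both prerequisites from your own machine is a small write test. This is just a sanity-check sketch; my-user, my-host and the webroot path are placeholders for your own setup:

# Check SSH access and write permission on the webroot.
ssh my-user@my-host "touch /var/www/html/.write-test && rm /var/www/html/.write-test && echo webroot is writable"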

To get these files onto the server, we will use an SSH connection. Please make sure you enabled SSH communication from the Bitbucket Pipeline by following these instructions: Use SSH keys in Bitbucket Pipelines.
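In short, Bitbucket provides (or lets you generate) a key pair for the Pipeline, and the public key has to end up in the deployment user's authorized_keys on the server. A rough sketch of that last part, assuming you've copied the public key from the Bitbucket settings page:

# On the server, as the deployment user: add the Pipelines public key.
mkdir -p ~/.ssh && chmod 700 ~/.ssh
echo "ssh-rsa AAAA...rest-of-the-key... pipelines" >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys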

Once we can use the SSH connection in our pipeline, we will upload the files using rsync into a fresh directory next to the webroot (in this example /var/www/html-${BITBUCKET_BUILD_NUMBER}). Only after the transfer is complete do we replace the webroot (/var/www/html) with our new folder.

To use SSH and rsync they must first be installed in the container. To get the configuration below to work, the my-user and my-host placeholders must be replaced with the actual information.

image: ruby:2.5

pipelines:
  default:
    - step:
        name: Generate the blog with Jekyll
        caches:
          - bundler
        script:
          - bundle install --path vendor/bundle
          - bundle exec jekyll build --destination public
          - mkdir dist
          - tar -czvf dist/package-${BITBUCKET_BUILD_NUMBER}.tar.gz -C public .
        artifacts:
          - dist/**
    - step:
        name: Deploy to Web
        image: alpine
        deployment: production
        trigger: manual
        script:
          - mkdir upload
          - tar -xf dist/package-${BITBUCKET_BUILD_NUMBER}.tar.gz -C upload
          - apk update && apk add openssh rsync
          - rsync -a -e "ssh -o StrictHostKeyChecking=no" --delete upload/ my-user@my-host:/var/www/html-${BITBUCKET_BUILD_NUMBER}
          - ssh -o StrictHostKeyChecking=no my-user@my-host "rm -r /var/www/html"
          - ssh -o StrictHostKeyChecking=no my-user@my-host "mv '/var/www/html-${BITBUCKET_BUILD_NUMBER}' '/var/www/html'"
definitions:
  caches:
    bundler: vendor/bundle

Tada! Now once you click Run on the Deploy to Web step of your build, it will push the HTML pages to the webserver and make them available to the public.


Closing Thoughts

My process to reach the final setup spanned multiple weeks, tweaking a bit here and there and discovering features late in the process. For example, I only learned about the artifacts feature while writing this article; originally I used the Downloads feature of a Bitbucket repository to save the results between build steps. So here are a few points on how I currently use this setup and on the deployment process in general.

  • My config is still very straightforward. It will build on every branch on every push, and it requires the manual click on "Run" to execute the deployment. For myself I switched it to only run on the master branch, but then with an automatic deployment (see the sketch after this list), so I can publish a blog post without having to log into Bitbucket.
  • As far as I can see, any step will always check out the Git repository. That seems unnecessary for the deployment step, where I don't use it, and for larger repositories it adds time to every run.
  • It's important to look at the execution time a single build takes and how often it's executed. After all, the Pipelines are billed per second and the free tier comes with 50 minutes per month.
  • A comparison with Jenkins or Bamboo isn't completely fair, of course. But for a lot of small projects I'm involved with, the scope of these Pipelines is enough. I still have to look into the GitLab CI/CD approach.
  • At the same time the discrepancy between Bitbucket.org and the Bitbucket Server software bothers me. I understand the historical background, but it makes it hard to find the right resources for either product.
  • Atlassian should bring the secured flag for environment variables to Bamboo as well. Bamboo currently relies on naming the env vars something like MYVAR_PASSWORD to secure them.
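For reference, the branch-based variant mentioned above looks roughly like this. It's a sketch of my current approach rather than the full file: a branches section replaces default, and dropping trigger: manual makes the deployment run automatically on every push to master:

pipelines:
  branches:
    master:
      - step:
          name: Generate the blog with Jekyll
          # same build step as above
      - step:
          name: Deploy to Web
          image: alpine
          deployment: production
          # no "trigger: manual" -> deploys automatically
          script:
            - mkdir upload
            - tar -xf dist/package-${BITBUCKET_BUILD_NUMBER}.tar.gz -C upload
            # ... rsync and ssh commands as above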