This blog is based on Jekyll: you write posts in Markdown and Jekyll generates static HTML pages from them. Upload those HTML files to a web host and the blog is ready to read.

A few days ago, a friend of mine mentioned that spring.io also uses Jekyll. The posts are stored in a git repository, and whenever a push happens, Jekyll runs and the new version is published. Cool thing. I wanted the same.

The easiest solution would be GitHub Pages, which supports Jekyll. But there are two reasons I didn't want to use it:

  • GitHub Pages does not support Jekyll plugins.
  • More importantly: I finally had a reason to play with Amazon Web Services.

Storage

The first thing we need is a place to store the content: the Simple Storage Service (S3). Create a new bucket, enable static website hosting, and allow anonymous read access with this bucket policy:

{
    "Version": "2008-10-17",
    "Id": "Policy1419299145082",
    "Statement": [
        {
            "Sid": "Stmt1419299140783",
            "Effect": "Allow",
            "Principal": {
                "AWS": "*"
            },
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::<your-bucket-name>/*"
        }
    ]
}

The files are now accessible on the web through the bucket's static website endpoint. This is the minimal working solution, but there is a lot of room for improvement.
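If you prefer the command line over the web console, the same setup can be sketched with the AWS CLI (assuming it is installed and configured, and the policy above is saved as policy.json):

# create the bucket and enable static website hosting
aws s3 mb s3://<your-bucket-name>
aws s3 website s3://<your-bucket-name> --index-document index.html --error-document 404.html

# attach the public-read policy shown above
aws s3api put-bucket-policy --bucket <your-bucket-name> --policy file://policy.json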

Performance

Once the blog is generated, it consists entirely of static files. This makes it a perfect fit for a Content Delivery Network (CDN): copies of the files are distributed to servers around the world, and every request is routed to the nearest server for maximum performance.

Amazon CloudFront is such a CDN. Create a new web distribution and set the origin domain name to the website endpoint (not just the bucket name) of the S3 bucket. In your domain's DNS, add a CNAME record pointing to the domain name of the CloudFront distribution, and add that subdomain to the CNAMEs attribute of the distribution. After a few minutes, the distribution is ready and accessible through the configured subdomain.
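As an illustration, the DNS record could look like this in a zone file (domain and distribution name are made up):

; route the blog subdomain to the CloudFront distribution
blog.example.com.    300    IN    CNAME    d111111abcdef8.cloudfront.net.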

Autodeploy

Instead of running Jekyll on my machine and then uploading the files, I want this to be done automatically. The idea is to add a webhook on the github repository which calls a server that does the necessary steps to publish the changes.

So we need a server. In the Amazon world, this is called Elastic Compute Cloud (EC2). Create a new instance with your favorite flavor of Linux. If needed, install git, Ruby, Jekyll, Apache, PHP, and the AWS CLI on it.
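On Amazon Linux, for example, the installation could look roughly like this (package names differ between distributions; a sketch, not a recipe):

# web server with PHP for the webhook endpoint
sudo yum install -y httpd php git
sudo service httpd start

# Jekyll is a Ruby gem
sudo yum install -y ruby ruby-devel gcc
sudo gem install jekyll

# AWS command line tools
sudo pip install awscli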

To publish the files whenever a push happens, I wrote a little PHP script (https://github.com/nidi3/jekyll-hook). Install it on the EC2 server. In the blog's GitHub repository, add a webhook pointing to the URL http://<ec2-server-ip>/publish-github.php with content type application/x-www-form-urlencoded. The script does three things:

  • Use git to clone or pull the GitHub repository.
  • Call Jekyll to create the HTML files.
  • Use the AWS command line tools to synchronize the generated files with the S3 bucket.
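Boiled down to shell commands, the script's work looks roughly like this (repository URL and bucket name are placeholders):

# clone on the first run, pull on subsequent runs
git clone https://github.com/<user>/<blog-repo>.git blog || (cd blog && git pull)

# generate the static HTML files into _site
(cd blog && jekyll build)

# upload new and changed files, delete removed ones
aws s3 sync blog/_site s3://<your-bucket-name> --delete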

Now, a few seconds after a push to the GitHub repository, the changes are available on my site.

Server Control

Everything works nicely. But now we have a server running all the time that is really only needed once in a while, to publish new pages. So I wrote another PHP script that starts an EC2 server, sends a publish request to it, and stops it again. Naturally, this script needs to run on a server itself, but any cheap server that supports PHP will do. Just change the webhook URL to http://<cheap-server-ip>/master-publish-github.php?start&stop.
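With the AWS CLI, the start/publish/stop cycle of such a wrapper looks roughly like this (the actual script is PHP; instance id and addresses are placeholders):

# boot the EC2 instance and wait until it is running
aws ec2 start-instances --instance-ids i-0123456789abcdef0
aws ec2 wait instance-running --instance-ids i-0123456789abcdef0

# forward the webhook payload to the publish script
curl --data-urlencode payload@payload.json http://<ec2-server-ip>/publish-github.php

# shut the instance down again
aws ec2 stop-instances --instance-ids i-0123456789abcdef0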

This is probably overkill for just a personal blog. But it was interesting to play with Amazon Web Services and to see how far I could push things.

Part II of how this blog is done.