Julian Bonilla

Octopress on Amazon S3

Octopress is a static site generator based on Jekyll. You write your posts in Markdown and Octopress generates the HTML for your site. This simplifies your hosting requirements. No application server or database required.

Octopress has great documentation. The docs got me 95% of the way there, but left some unanswered questions. This guide answers the questions I had on migrating and hosting Octopress. I refer to the official docs where it makes sense, to avoid repetition.

There are lots of options for hosting your blog: GitHub Pages, Amazon S3, Heroku (or other PaaS), or a virtual private server. I evaluated all the hosting options and decided on Amazon S3. GitHub Pages doesn’t support redirects at the moment. I originally overlooked S3, thinking they didn’t support redirects either, but in fact they do! Heroku is a good (and free) option, but I didn’t want to run a Rack or WSGI process to handle redirects. My goal was simplicity, so that ruled out managing a VPS.

Git & Ruby

I’m using Homebrew for package management.

Install Git (Git Documentation)
brew install git

Octopress requires Ruby. I followed the Octopress guide for installing Ruby with RVM. I’m not a Rubyist, and haven’t really compared rbenv to rvm. They both seem perfectly good for this exercise.

Fork Octopress

I’m hosting my fork of Octopress on Bitbucket. GitHub offers 5 private repositories and unlimited collaborators with their micro plan. I prefer to save those slots for projects with collaborators. Bitbucket’s model is reversed: 5 collaborators and unlimited private repositories.

There are a couple of advantages to keeping your fork private. Many of the plugins, such as Google Analytics, require setting an API key in the Octopress _config.yml. Octopress also supports a published flag, which allows you to keep work-in-progress posts that don’t show up when the site is generated. Neither of these belongs in a public repository.

You can fork the Octopress repository hosted on GitHub directly to Bitbucket, using the import repository feature on bitbucket.org. Then clone your fork.

Clone your Octopress fork (Git Documentation)
git clone git@bitbucket.org:YOUR_USERNAME/octopress.git

At this point Bitbucket is the origin remote.

Git remotes (Git Documentation)
git remote -v
origin    ssh://git@bitbucket.org/YOUR_USERNAME/octopress.git (fetch)
origin    ssh://git@bitbucket.org/YOUR_USERNAME/octopress.git (push)

Now is a good time to add a remote to the official Octopress repository, so you can get updates.

Add Octopress remote (Git Documentation)
git remote add octopress https://github.com/imathis/octopress
git remote -v
octopress https://github.com/imathis/octopress.git (fetch)
octopress https://github.com/imathis/octopress.git (push)
origin    ssh://git@bitbucket.org/YOUR_USERNAME/octopress.git (fetch)
origin    ssh://git@bitbucket.org/YOUR_USERNAME/octopress.git (push)

My setup looks like this:

Next up is configuring Octopress. I just followed the configuration steps and had an empty blog running in no time. Now I needed to migrate my WordPress blogs to Octopress.

WordPress migration

Two utilities make the WordPress migration easy: exitwp and wp2oct-links. exitwp converts an entire WordPress site to Markdown. wp2oct-links generates the rewrite rules for permanently redirecting old URLs to new ones. Both are Python utilities. I highly recommend The Hitchhiker’s Guide to Python for getting set up with Python.

WordPress has an exporter in the admin panel. Use that to export any posts and pages you want to migrate. The exporter generates an XML file which we’ll feed to exitwp. Then copy the result of exitwp to your Octopress directory.

WordPress to Markdown (exitwp documentation)
git clone https://github.com/thomasf/exitwp
cp wordpress-export.xml exitwp/wordpress-xml/
cd exitwp
virtualenv --distribute venv
source venv/bin/activate
pip install --upgrade -r pip_requirements.txt
python exitwp.py

I should note that preserving your URL scheme in Octopress removes the need for redirects. The WordPress URL scheme is /year/month/day/post-title/. The default Octopress URL scheme is /blog/year/month/day/post-title/. Simply configuring the Octopress URL scheme to match WordPress will save you the trouble of dealing with redirects.
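To make the difference between the two URL schemes concrete, here is a minimal Python sketch that maps a WordPress-style path onto the default Octopress path. The function name `octopress_path` and its `prefix` parameter are illustrative only and belong to neither tool. The sketch also assumes the post slug is identical on both sides, which may not hold for every post.

```python
def octopress_path(wordpress_path, prefix="/blog"):
    """Map /year/month/day/post-title/ to <prefix>/year/month/day/post-title/."""
    # Drop empty segments so the result has exactly one leading and
    # one trailing slash, however the input was written.
    parts = [p for p in wordpress_path.split("/") if p]
    return prefix.rstrip("/") + "/" + "/".join(parts) + "/"


# Default Octopress scheme: the old and new paths differ, so a redirect is needed.
print(octopress_path("/2012/10/31/post-title/"))

# With no prefix the paths match, which is the "no redirects" case.
print(octopress_path("/2012/10/31/post-title/", prefix=""))
```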

Otherwise use wp2oct-links to generate the rewrite rules, and set those aside for later. We’ll use those rules when setting up Amazon S3.

Rewrite rules (wp2oct-links documentation)
git clone https://github.com/Dotnetwill/wp2oct-links
cd wp2oct-links
virtualenv --distribute venv
source venv/bin/activate
pip install --upgrade -r requirements.txt
python wp2oct-link.py /path/to/wordpress-export.xml

Amazon S3

At this point all your posts and pages have been transferred to Octopress. Maybe you’ve even written a new post. Now it’s time to publish the site. Amazon S3 offers static website hosting and redirects. The Amazon S3 Developer Guide has a step-by-step tutorial on hosting websites on Amazon S3.

Every object stored in Amazon S3 is contained in a bucket. Buckets partition the namespace of objects stored in Amazon S3 at the top level. Within a bucket, you can use any names for your objects, but bucket names must be unique across all of Amazon S3.

You can map a CNAME on your domain to an S3 bucket. My bucket name is www.julianbonilla.com. The bucket is reachable on S3 at www.julianbonilla.com.s3-website-us-east-1.amazonaws.com.
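The website hostname follows a pattern you can build from the bucket name and the region. The Python sketch below derives it. The helper name `website_endpoint` is made up for illustration, and the pattern is inferred from the hostname above. Some regions use a slightly different separator, so treat the result as something to verify against the AWS documentation.

```python
def website_endpoint(bucket, region="us-east-1"):
    """Return the S3 static-website hostname for a bucket in a region."""
    # Pattern inferred from the us-east-1 hostname shown above; it is an
    # assumption that other regions follow the same dash-separated form.
    return "{0}.s3-website-{1}.amazonaws.com".format(bucket, region)


print(website_endpoint("www.julianbonilla.com"))
```

Because the bucket name doubles as the hostname you will later point a CNAME at, choose it to match the domain you intend to serve.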

The AWS console has a tool for uploading your site to S3. There are also plenty of graphical and command line clients. I settled on S3tools for transferring my files.

Install S3Tools (S3tools Documentation)
brew install s3cmd
s3cmd --configure

Now generate your site and upload to S3.

Sync with S3 (S3tools Documentation)
cd your_octopress_directory
rake generate
s3cmd sync --acl-public --reduced-redundancy public/* s3://www.julianbonilla.com/

Each S3 object has an access control list. Objects are private by default; the --acl-public flag makes your site public (readable by anyone). Amazon offers two storage tiers: standard and reduced redundancy storage (RRS). RRS is cheaper than standard storage ($0.093/GB/month vs. $0.125/GB/month), but it does not replicate objects as many times. I use the --reduced-redundancy flag since I can regenerate the site from source.
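For a sense of scale, here is a small Python sketch of the storage arithmetic using the two per-GB prices quoted above. The 50 MB site size is a hypothetical figure, and the calculation covers storage only, not request or transfer charges.

```python
STANDARD_PER_GB = 0.125  # USD per GB per month, as quoted above
RRS_PER_GB = 0.093       # USD per GB per month, as quoted above


def monthly_cost(size_mb, price_per_gb):
    """Storage cost in USD per month for a site of size_mb megabytes."""
    return (size_mb / 1024.0) * price_per_gb


site_mb = 50  # hypothetical size of the generated public/ directory
standard = monthly_cost(site_mb, STANDARD_PER_GB)
rrs = monthly_cost(site_mb, RRS_PER_GB)
saving_percent = 100 * (1 - rrs / standard)

print("standard: $%.4f  RRS: $%.4f  saving: %.1f%%"
      % (standard, rrs, saving_percent))
```

The relative saving depends only on the two prices, not on the size of the site. For a small blog the absolute amounts are tiny either way.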

Redirects

The Amazon Developer Guide has a detailed explanation of configuring a web page redirect. For each of the rewrite rules generated with wp2oct-links, I created a zero-byte object with a matching key in S3. You can script this or create the objects directly in the AWS console. Then set the x-amz-website-redirect-location metadata on each object to point to the new object in your bucket. Amazon S3 will respond with a 301 redirect.

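Create redirect objects
cd your_octopress_directory/public

# Replace path1 path2 path3 with the old URL paths from wp2oct-links
for path in path1 path2 path3; do
  # Create the directory and an empty index.html for each old path
  mkdir -p "$path"
  touch "${path}/index.html"
done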
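The same step can be sketched in Python when the rules are easier to handle as old-path to new-path pairs. The function name `create_redirect_placeholders` and the shape of the `rules` dictionary are assumptions for illustration, not the output format of wp2oct-links. The returned mapping lists, for each object key, the value you would then set as its x-amz-website-redirect-location metadata.

```python
import os
import tempfile


def create_redirect_placeholders(public_dir, rules):
    """Create an empty index.html under public_dir for each old URL path.

    rules maps an old path to its new location, for example
    {"/2012/10/31/old-title/": "/blog/2012/10/31/old-title/"}.
    Returns {object_key: redirect_target}.
    """
    plan = {}
    for old_path, new_path in rules.items():
        key = old_path.strip("/") + "/index.html"
        target = os.path.join(public_dir, *key.split("/"))
        os.makedirs(os.path.dirname(target), exist_ok=True)
        # Append mode behaves like touch: it creates the file if it is
        # missing and leaves an existing file untouched.
        open(target, "a").close()
        plan[key] = new_path
    return plan


if __name__ == "__main__":
    # Demo in a throwaway directory rather than a real public/ folder.
    with tempfile.TemporaryDirectory() as demo_dir:
        demo_rules = {"/2012/10/31/old-title/": "/blog/2012/10/31/old-title/"}
        for key, location in create_redirect_placeholders(demo_dir, demo_rules).items():
            print(key, "->", location)
```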

The last step is pointing DNS to your S3 bucket. Make sure everything is working on s3-website-us-east-1.amazonaws.com. Then create a CNAME that matches your bucket name. I use Hover to manage my DNS.

Flip the switch
CNAME www.julianbonilla.com -> www.julianbonilla.com.s3-website-us-east-1.amazonaws.com.

Notes

Here are some useful tips I collected.

# Create a new post
rake new_post["title"]

# Create a new page
rake new_page[super-awesome]
rake new_page[super-awesome/page.html]

# Generate blog
rake generate

# Preview blog
rake preview

# Push to origin on Bitbucket
git push origin master

# Pull from octopress on GitHub
git pull octopress master

# Deploy to Amazon S3
s3cmd sync --acl-public --reduced-redundancy public/* s3://www.example.com/

Goodbye Joyent, Hello Amazon

Joyent finally pulled the plug on TextDrive customers. I’ve been a longtime customer of Joyent. I have to say, the original TextDrive had awesome customer service. I had never interacted with Joyent until earlier this year when, during a routine WordPress upgrade, I discovered they had stopped upgrading PHP on my server.

I was stuck with an outdated WordPress instance and no way to apply security patches. I asked them to update PHP on my box, and got an email pointing me to a convoluted wiki about migrating to their new platform.

In August I got another email saying they were sunsetting the service on October 31, 2012. Then they changed their minds and decided to bring back TextDrive as a new company. By this time I had decided I was moving on. So in keeping with tradition, here’s my guide on migrating from WordPress to Octopress and hosting on Amazon S3.

Emacs Shell Scrolling

I always have several bash shells running in Emacs (M-x shell). The default scrolling behavior drives me crazy.

If comint-scroll-show-maximum-output is non-nil, then arrival of output when point is at the end tries to scroll the last line of text to the bottom line of the window, showing as much useful text as possible. (This mimics the scrolling behavior of most terminals.) The default is t (true).

I don’t like chasing the output of shell commands to the bottom of the window. You can add this to your .emacs to keep the window from scrolling.

;; Don't scroll to bottom for shell output
(setq comint-scroll-show-maximum-output nil)

Finding Python Site-packages

Quick way to locate the Python site-packages directory:

python -c "from distutils.sysconfig import get_python_lib; print(get_python_lib())"

Most likely here on Mac OS X:

/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages
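On newer Python versions the same lookup can be done with the standard-library sysconfig and site modules instead of distutils. This is a sketch of one alternative, not the only way to do it, and the printed paths depend on your installation.

```python
import site
import sysconfig

# Directory where pure-Python third-party packages are installed
# for the interpreter running this script.
purelib = sysconfig.get_paths()["purelib"]
print(purelib)

# site.getsitepackages() lists the global site-packages directories.
# Guard the call because some older virtualenv setups did not provide it.
if hasattr(site, "getsitepackages"):
    for path in site.getsitepackages():
        print(path)
```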

Really Big Numbers

I always forget the names of really big numbers. What do you call 1,000,000,000,000,000? It’s one quadrillion. What about 1,000,000,000,000,000,000,000,000,000,000,000,000? One undecillion.

Check out this cool directive for the format function in Common Lisp.

(format nil "~r" 1606938044258990275541962092)

Which produces this string:

one octillion six hundred six septillion nine hundred thirty-eight sextillion forty-four quintillion two hundred fifty-eight quadrillion nine hundred ninety trillion two hundred seventy-five billion five hundred forty-one million nine hundred sixty-two thousand ninety-two.
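For comparison, here is a rough Python sketch of the same idea. It is not how Common Lisp implements ~r. It is a small hand-rolled function, with the made-up names `under_thousand` and `cardinal`, that handles non-negative integers up to the undecillions using the short-scale names.

```python
ONES = ("zero one two three four five six seven eight nine ten eleven twelve "
        "thirteen fourteen fifteen sixteen seventeen eighteen nineteen").split()
TENS = "_ _ twenty thirty forty fifty sixty seventy eighty ninety".split()
SCALES = ["", "thousand", "million", "billion", "trillion", "quadrillion",
          "quintillion", "sextillion", "septillion", "octillion", "nonillion",
          "decillion", "undecillion"]


def under_thousand(n):
    """Spell out an integer in the range 1..999."""
    words = []
    if n >= 100:
        words += [ONES[n // 100], "hundred"]
        n %= 100
    if n >= 20:
        tens = TENS[n // 10]
        words.append(tens + "-" + ONES[n % 10] if n % 10 else tens)
    elif n > 0:
        words.append(ONES[n])
    return " ".join(words)


def cardinal(n):
    """Spell out a non-negative integer, three digits at a time."""
    if n == 0:
        return "zero"
    if n < 0 or n >= 1000 ** len(SCALES):
        raise ValueError("out of range for this sketch")
    groups = []
    scale = 0
    while n:
        n, chunk = divmod(n, 1000)
        if chunk:  # skip groups that are all zeros
            groups.append((under_thousand(chunk) + " " + SCALES[scale]).strip())
        scale += 1
    return " ".join(reversed(groups))


print(cardinal(10 ** 15))
print(cardinal(1606938044258990275541962092))
```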