Exit, WordPress

I moved this blog out of WordPress. It’s now a static site generated with Hugo. Here’s what I did.

The old setup

Previously, this blog was hosted on DreamHost (both the domain registration and hosting). Due to the remarkable quirk of me being a cheapskate while I was at uni, I bought the domain myself but one of Paris or Jon (I can’t remember who now) did the hosting, because they had hosting. DreamHost was the domain registrar and provided the box that the A record pointed to.

DreamHost makes this easy…too easy. To host content for a domain on DreamHost, if you have purchased hosting in some form, all you have to do is enter the domain name, and be the first person to do that. Then you can set it all up. The person hosting gets all the options, down to and including the DNS settings, but can’t renew or cancel the domain registration (unless they registered the domain too). If someone else has a hosting account (say, the person who registered the domain) and they want to take over hosting, one of two things has to happen:

  1. The person who is currently hosting has to remove the hosting in their account, or,
  2. The person who registered the domain files a ticket with DreamHost support to get the hosting cut over to their account.

So under the old setup, I registered the domain, and one of my friends kindly and graciously provided the hosting, which is the more expensive part, and for that I’m really grateful.

The hosting itself was a typical WordPress setup. The server ran Linux, the web server was Apache with PHP 5, and MySQL was the database backend.

Benefits of this old setup include:

  • Standard! So freaking standard that helpful articles just ooze out of the internets.
  • WordPress is a dynamic blogging platform, so you can get pretty complicated with content generated out of the database.
  • WordPress has themes, and auto resizes photos, and and and…
  • WordPress supports plugins. I have various bits of math floating around and having a plugin that renders LaTeX in actual symbols is really nice!
  • Comments. I no longer consider this a benefit, but it was nice for the first 5 years.
  • WordPress has mobile apps that you can use to edit your blog. I only did this a couple of times but it was a cute touch.

Problems with the old setup

Problem #1: Comment spam. A mere 4 months after my first blog post, I wrote a blog post about spammers.

Until I migrated off WordPress yesterday, I used Akismet to do automated spam filtering. This requires setting up a WordPress account and getting an API key, then you put the API key into the Akismet WordPress plugin and you’re set. Every now and again you review some spam comments but the volume of spam is greatly reduced.

But nobody really commented much on my blog. For that reason, and also because there exist people I don’t want to hear from, I disabled the comment form on posts and pages. Sadly this wasn’t enough. For some reason I don’t care to understand (because the old way is dead now), comments would still appear in the moderation queue.

Problem #2: Security vulnerabilities. Old PHP had ‘em. Old WordPress had ‘em. WordPress plugins get ‘em. The database password was crap. The attack surface of the old WordPress blog was pretty big, but the value of the target was small. (My blog isn’t that interesting.) This is kind of the deal with any dynamically-generated content, compared with static sites.

I know that there were vulnerabilities and they were abused. I had shell access to the server, and a few times found various PHP files that DreamHost automatically blocked either by setting the file perms to 0000, or moving them into “.INFECTED” files. Just how badly pwnd my old blog was in the end, I’ll never be sure. But it was pwnd.

Security means security updates. It was a chore signing in to click the “Update everything” button. It was more of a chore doing the recommended file and database backup before WordPress updates. It’s not a large burden, but it’s a burden, and my feelings on automating and getting rid of toil like this have blossomed now that I work for Google.

Problem #3: The hosting situation. Because I’m a Googler now, I don’t want to be a cheapskate. I have the privilege of having a good income. I’m sure my friends get good feelings from being generous. Again, thank you guys. It was greatly appreciated. However! I can host things myself now.

The new setup

The new setup works like this.

  • Domain registrar: DreamHost (still).
  • Hosting option: Redirect (HTTP 301 redirect) joshdeprez.com to www.joshdeprez.com, on my own DreamHost account.
  • Custom DNS option: www (in the joshdeprez.com zone) is a CNAME for c.storage.googleapis.com. So the files are hosted by Google Cloud Storage.
  • The files for the site are generated with Hugo.
  • The source code for the site is written in Markdown, which Hugo converts into HTML according to various Go HTML templates and layout files.

The benefits of this approach:

  • It’s my own DreamHost account, and Google Cloud account.
  • It’s static, so the content is highly cacheable and can be served from Google’s CDN.
  • It’s static, so there’s no database or PHP.
  • It’s static, so there’s no comments at all. I could use a plugin like Disqus later on if I decide I really want comments (I really do not want comments).
  • It’s static, so entire classes of web vulnerabilities don’t happen.
  • It’s static, so I compose content offline, use git to version control the whole thing, and upload when I’m happy with it.

The drawbacks:

  • The 301 redirect to www is an annoying but necessary part of using a CNAME to have the site transparently hosted on a different domain. Why can’t I point the A record at Google Cloud Storage? Because GCS uses DNS load balancing, and the IP address (the A record target) would change depending on location, load, the phase of the moon, etc.

Erm. That’s about it for drawbacks, actually.

Migration

Migration was a bit of a pain, but I prevailed.

There exists a WordPress to Hugo Exporter. It is a WordPress plugin. You push the button and then you download a zip file containing static content and all your pages and posts helpfully converted to Markdown for you. I used this.

What I was unable to do was run it on my live blog. I tried and it failed. When you push the Export button, it scans the site, building an archive in /tmp on the server. The content for my old blog was over 1 GB, but /tmp didn’t have that much space, so it crashed and failed.

I solved this problem by:

  1. Running up Debian 8 in a VM at home, giving it stacks of disk and RAM;
  2. Installing the standard LAMP stack from the Debian repos;
  3. Packing all the WordPress files from my live site, and a full database dump, into a tarball that I downloaded;
  4. Unpacking the tarball in my Debian VM;
  5. Reconfiguring WordPress in the VM enough to get it working;
  6. Running the exporter locally.

The exported files worked pretty well in Hugo, but I wanted to make it really sing. So began the editing process.

Most important was the look and feel. There is a Twenty Fourteen theme for Hugo, which is an adaptation of the WordPress theme of the same name. It’s the theme I used on my old blog, and it’s kind of nice, so I kept it for the new blog.

It was easy enough to implement the theme (git clone the theme into the themes directory). Some adjustments later (such as the summary view), and it was pretty good.

Most of the editing effort was reorganising the content. I wanted to migrate off the wp-content/uploads structure that was unhelpfully preserved by the exporter. So I went through all the posts one by one and found the bits that were actually used, and moved them into one of two places:

  1. For “galleries” of photos that I dumped at the end of blog posts, I uploaded them to Google Photos and pasted the share-link.
  2. For photos interspersed with text in blog posts, I copied them into /static/$postnumber and then rewrote the autogenerated <figure> tags with the figure shortcodes provided by the theme.

Rewriting <figure> to shortcodes and fixing the paths got really boring, really fast. So I gave up and wrote a short Go program to do it for me.
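For a flavour of what such a program can look like, here is a minimal sketch. It is not the program I actually ran. It assumes the exported posts contain WordPress-style <figure> elements wrapping an <img> and an optional <figcaption>, that the images have been copied to /static/$postnumber (which Hugo serves at /$postnumber/), and that the theme's figure shortcode takes src and caption parameters like Hugo's built-in one. The function name rewriteFigures and the sample HTML are made up for the example.

```go
package main

import (
	"fmt"
	"path"
	"regexp"
	"strings"
)

var (
	// (?s) lets . match newlines, so multi-line figures are handled.
	figureRE  = regexp.MustCompile(`(?s)<figure\b[^>]*>(.*?)</figure>`)
	imgSrcRE  = regexp.MustCompile(`<img\b[^>]*\bsrc="([^"]*)"`)
	captionRE = regexp.MustCompile(`(?s)<figcaption\b[^>]*>(.*?)</figcaption>`)
)

// rewriteFigures replaces each <figure> element in body with a figure
// shortcode whose src points at /<postDir>/<image file name>.
// Figures with no <img> inside are left untouched for manual editing.
func rewriteFigures(body, postDir string) string {
	return figureRE.ReplaceAllStringFunc(body, func(fig string) string {
		src := imgSrcRE.FindStringSubmatch(fig)
		if src == nil {
			return fig
		}
		// Keep only the file name and re-root it under the post's directory.
		newSrc := path.Join("/", postDir, path.Base(src[1]))
		out := fmt.Sprintf(`{{< figure src="%s"`, newSrc)
		if c := captionRE.FindStringSubmatch(fig); c != nil {
			caption := strings.TrimSpace(c[1])
			// Avoid breaking the quoted shortcode parameter.
			caption = strings.ReplaceAll(caption, `"`, "&quot;")
			if caption != "" {
				out += fmt.Sprintf(` caption="%s"`, caption)
			}
		}
		return out + ` >}}`
	})
}

func main() {
	// Hypothetical input resembling what the exporter produces.
	post := `Some text.
<figure id="attachment_7" class="wp-caption"><a href="/wp-content/uploads/2014/05/cat.jpg"><img src="/wp-content/uploads/2014/05/cat-300x225.jpg" alt="" /></a><figcaption class="wp-caption-text">A cat</figcaption></figure>
More text.`
	fmt.Println(rewriteFigures(post, "42"))
}
```

Regular expressions are a blunt tool for HTML, but the exporter's output is regular enough that a sketch like this covers the common case. Anything it doesn't recognise is left alone to be fixed by hand.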

Much tweaking later (deleting autogenerated HTML weirdness, replacing formatting with Markdown equivalents, tidying filenames, fixing brokenness in the theme layouts…), I was happy with the output from Hugo (what you see on this site now!) so I:

  • filed a DreamHost ticket to move hosting to my account,
  • gsutil -m rsync-ed the site up to the GCS bucket,
  • set the hosting options with the redirect and custom CNAME,
  • SSH’d into the old host and deleted all the old site files and dropped the database (local backup just in case!) and…

Everything worked!

Time for dinner!