Evolving My Website

The revolutionary approach to design is more romantic, but usually the evolutionary approach is more practical. That's the approach I'm taking for my little bloglet thing here, for the most part, and it's working out pretty well. Building gradually from the baseline to a more fully featured system is both easier and more motivating, as you can see the results of your work as you go.

Unfortunately the main problem with the evolutionary approach is that sometimes it's impossible. This shouldn't be too surprising; evolving organisms have complicated spaghetti-code DNA, and are fine with growing through trial-and-error over millions of years, whereas software projects are expected to be well-organized and easily-grokkable, and ought to function well soon and perpetually.

The evolutionary approach mandates that you start out with a structure that's quick to set up and works well for a small program. When the program outgrows its small ad-hoc structure, you generally have to do a massive refactor (a revolution) to get it into a new order. In my experience, refactors rarely happen because of a mistake in the original structure. They usually happen just because earlier the program had to be one way and now it has to be another way.

That is the reason for the recent refactor of my current website. In the interest of slapping together a place to post things with one weekend of hacking, I went for a quick and thoughtless structure so I could spend most of the time on the build script and the templating engine. It contained two main directories: one for the final published site, public_html, and one for the source files, public_html/src. The build script, make.pl, just copied all the built files directly down into public_html. To work on the website, I logged in remotely and worked over ssh.

Eventually I added a git repository, because I was getting uncomfortable working without version control. I have vim configured to save undo history between sessions, but that isn't enough. It just feels really good to write commit messages.

This week, my ssh sessions started getting hiccuppy, so I wanted to be able to write posts and program on my laptop. Furthermore, my build script was getting a little rickety, and updating the site by copying files to public_html felt kinda wrong, because I thought the updates should be as atomic as possible. It was time for a reorganization. After some thought, this is the structure I came up with.

website/ is the main git repository, which is mirrored between my laptop and home.
website/src/ contains all the source files.
website/build/ is the first place the full website is built into. I have a link from /var/www/~lewis/ to here on my laptop, so that I can test the site locally with the same URL structure as it has at home.
website/tool/ contains Perl modules and stuff.
website/public_html_0/ and website/public_html_1/ alternately contain the real website data, copied verbatim from website/build.
public_html is a symbolic link pointing to either website/public_html_0 or website/public_html_1. This way, I can have one copy of the website being built while the symlink is pointing to the old copy. As soon as the new copy is finished, I have this symlink updated to point to the new copy. As a bonus, if something goes horribly wrong I can manually link this back to the old copy while I put out the fire.
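
The alternating scheme boils down to "build into whichever copy the symlink is not pointing at". Here is a minimal sketch of that choice in Python (the real build script is make.pl, so this is only an illustration). The function name inactive_slot and the fallback to public_html_0 when no link exists yet are assumptions, not part of the actual script.

```python
import os

# Directory names taken from the layout described above.
SLOTS = ("public_html_0", "public_html_1")

def inactive_slot(link_path):
    """Return the name of the copy that the public_html symlink is NOT using.

    Hypothetical helper: if the link does not exist yet, fall back to the
    first slot (an assumption for the very first deployment).
    """
    if not os.path.islink(link_path):
        return SLOTS[0]
    # readlink returns the link's stored target; keep only its last component.
    current = os.path.basename(os.path.normpath(os.readlink(link_path)))
    return SLOTS[1] if current == SLOTS[0] else SLOTS[0]
```

The build would then copy website/build into the returned directory, and only afterwards repoint the symlink.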

There were two extra points to consider in this plan. The first was that I had a bunch of random files I've uploaded to share in various places, and I didn't want to change any of their URLs. So I stuffed them into a subdirectory of website/ and had make.pl create symlinks to them in website/build/. The other was that changing a symlink in place is not guaranteed to be an atomic operation on UNIX systems, according to this article. The same article provides a solution though, which is to create the new symlink somewhere else and then move it onto the old one (thus replacing it). If you overwrite the link in place, you run the risk of a guest randomly getting a 404 for requesting a page at exactly the wrong time.
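
The create-then-move trick can be sketched in a few lines of Python. This is an illustration rather than the code in make.pl: the function name swap_symlink and the ".tmp" suffix are made up, and it assumes a POSIX system where renaming onto an existing symlink replaces the link itself in one step.

```python
import os

def swap_symlink(link_path, new_target):
    """Point link_path at new_target without a moment where the link is missing."""
    # Hypothetical temp name; it lives next to the real link so that the
    # rename below stays on one filesystem.
    tmp_path = link_path + ".tmp"
    if os.path.lexists(tmp_path):
        os.remove(tmp_path)  # clear a leftover from an interrupted run
    os.symlink(new_target, tmp_path)
    # os.replace wraps rename(2), which overwrites the old symlink itself
    # rather than following it.
    os.replace(tmp_path, link_path)
```

Calling swap_symlink("public_html", "website/public_html_1") would then flip the live site to the second copy, and calling it again with the other directory flips it back.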

Finally, once I implemented this structure, I decided that I would really like to be able to publish updates to my site directly from my laptop. I thought this would be difficult in various respects, but it turned out that pushing git over ssh and setting up git hooks that run when I push were both really easy. I learned how to do this from a blog post and a Stack Overflow question, though I didn't make a bare repository like most people were suggesting, because I want the files I push to actually be there after I push (which requires calling "git checkout -f" in the post-receive hook).
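
For the curious, a post-receive hook along those lines could look roughly like this. It is a sketch under several assumptions, not the hook actually in use: it is written in Python instead of the usual shell, the helper name checkout_command is made up, it assumes the repository's .git directory sits directly inside the working tree, and it assumes the remote has been configured to accept pushes to its checked-out branch (git's receive.denyCurrentBranch setting).

```python
#!/usr/bin/env python3
import os
import subprocess

def checkout_command(git_dir):
    """Build the git command that forces the working tree to match HEAD.

    Assumes a non-bare repository, so the working tree is the parent of
    the .git directory.
    """
    git_dir = os.path.abspath(git_dir)
    work_tree = os.path.dirname(git_dir)
    return ["git", "--git-dir=" + git_dir, "--work-tree=" + work_tree,
            "checkout", "-f"]

# git exports GIT_DIR when it runs a hook, so only act in that case.
if __name__ == "__main__" and "GIT_DIR" in os.environ:
    cmd = checkout_command(os.environ["GIT_DIR"])
    work_tree = cmd[2].split("=", 1)[1]
    # Run from the working tree so relative paths behave as expected.
    subprocess.run(cmd, cwd=work_tree, check=True)
```

Saved as .git/hooks/post-receive and marked executable, something like this would refresh the files on the server after every push. A real setup would presumably go on to run the build script from there.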