Cloud computing has been around for a while, but only recently have we, the general populace, had access to it.
Amazon offer one such manifestation: an environment where it's possible to set up any number of virtual dedicated servers and use them for hosting, in our case,
Drupal sites! I wanted to share some of my experiences (both good and bad) using the Amazon cloud, so you can make a better decision about whether it's right for your
Drupal sites.
This is attractive compared to paying for a shared hosting account. One can never quite be sure what else is running on the system you're sharing, leading to potential performance woes. The Amazon EC2 (Elastic Compute Cloud), as they call it, appears more attractive than smaller companies offering VPSs (Virtual Private Servers) because of the sheer scale of Amazon. It is unlikely to disappear tomorrow and is backed up by the excellent S3 (Simple Storage Service) for backing up data.
Having said that, Amazon solutions can be costly. At the time of writing, Amazon's most modest offering, a reasonably specced dedicated machine with 1.8GB of memory, goes for 11 cents ($0.11) per hour. This doesn't sound like much, but there are 168 hours in a week, and based on the 4.3-week month, that's 722 hours per month, or about $80 per month. Writing from the UK, with the exchange rates as they are at the time of writing, this is about £55.
Still, the alternative for us was to purchase a new machine and buy some rack space (which is billable in advance, often for several months). Compared to Amazon, who bill for usage at the end of each calendar month, with no initial hardware cost, the choice seemed clear.
Starting out, it's striking how little documentation there is. Concepts like elastic IPs, keypairs and elastic block stores are very alien, even to the average techie, and whilst there is introductory material, it feels incomplete. Since the Amazon system is fairly new, and is quite pioneering in its approach, this is understandable, but doesn't make the task any easier.
One of the biggest surprises is that, at the time of writing, Amazon's own web interface does not allow the management of EU-based instances (virtual machines), despite allowing control over US-based ones. If there's one thing that really winds people up on this side of the Atlantic, it's that Americans (and often Canadians) are given preferential treatment with this type of thing. Nevertheless, we were soon able to locate the excellent
ElasticFox firefox extension, which allows management of European instances, and we had our very own fresh copy of Ubuntu installed and running at the touch of a button.
This is incredibly powerful stuff, especially because in theory, you can launch as many virtual servers as you like with the click of a button. In practice, Amazon impose sensible limits (although you can apply for more if you genuinely need them) and after all, you're paying per hour for all these machines. We found
Alestic's site very valuable indeed. It allows you to quickly find the right Amazon Machine Image (AMI) to use when starting your system, rather than poring over a huge list. A lot of these systems come pre-installed with all the things you're likely to need. There weren't any with specific Drupal installs, but since this is only a 5-minute job (download, unzip), it wasn't too much of an issue.
When you turn off (terminate) an instance, or the power fails at Amazon (very rare but can happen), the instance disappears completely, and so does its local storage. This is a concept alien to those unaccustomed to a VPS environment. After all, if I turn off my laptop right now, the data will be saved to the hard disk and when I start it up again, it's all there. This is not so with Amazon's hosting. The virtual machine powers down and all data stored locally, including program setting and even the operating system itself, is gone forever.
With this in mind, it's clear that a backup solution is needed. Luckily, there are some decent tools in the
Amazon EC2 AMI tools package, which is pre-installed on many of Alestic's images. The idea is that you regularly take an image of the entire machine and copy it to S3 where it can be stored safely, and permanently.
Writing a simple script to do this proved more difficult, however. Firstly, it wasn't clear from the documentation that we needed to explicitly state that we wanted to back up the data to an EU S3 bucket. Without this option, the files were sent to a US bucket, taking a very long time indeed and costing $0.17 per gigabyte (a machine image is usually at least 1GB, if not more). Secondly, the bundling process, as Amazon calls it, is less than reliable. Sometimes it would just bail for no apparent reason. Sometimes it would bundle the image and fail during the upload process, again, for no apparent reason. I personally still don't trust the automated backup script I wrote because of these shortcomings, so I find myself checking manually a lot of the time, which diminishes the value of an automated system.
The next concept that was slightly alien was the Elastic Block Store (EBS), which is a system whereby it's possible to create virtual hard disks and mount them to your instances. This is much better than storing files on the instance itself, because if the instance dies, your data are safe. It's possible to take what Amazon calls snapshots of the volumes, enabling a simple backup system, and again, this process can be automated, but you will need to know your way around a shell script, since this is not a point-and-click affair.
The EBS makes it easy to split data into different volumes (database volume, websites volume, miscellaneous volume, etc). Initially we wanted to run MySQL and Apache on the same low-traffic system to see how good Amazon really was, but we always wanted the ability to migrate MySQL to a dedicated machine at a later date. It's a doddle with Amazon: you can simply unmount the database EBS volume from one machine and mount it to another.
We have used the EBS to store some of our more persistant configuration settings too, such as Apache configuration, Apache log files and configurations for the awesome
Nagios monitoring system.
To administer these Amazon systems, shell access is needed at a minimum. Unlike other systems where it's possible to simply connect on port 22, Amazon uses a keypair system. Each instance must be created with a specific keypair, and then a key must be downloaded and used with the terminal application (such as
PuTTY) before it will allow you to connect. Terminal is nice and all, but sometimes it's useful to do more with the system, like have multiple terminals open or use a GUI tool (like
MySQL Administrator). For this, we set up
NX, which is similar to VNC in that it provides an interface to the remote machine's desktop that you can use exactly as though it were your own desktop. We found a
Google Groups article by Eric Hammond very useful in setting up NX, and thought it was preferrable to VNC because of the default encryption method and insistance on avoiding the root user.
Performance-wise, our Drupal machine has been running Apache 2, MySQL 5, and a host of monitoring software (Munin, Nagios and AWStats) so that we can keep an eye on things, and it has been running for around a month so far with no outages, crashes or other problems at all. The learning curve is pretty steep and the documentation is fairly sparse, but there is a very active community out there on the AWS forums and places like Google Groups. Overall we are very impressed with Amazon as a Drupal hosting environment and although not entirely convinced at the current time, will be looking towards moving more and more sites over there in the future.