Friday 13 February 2009

Drupal on Amazon web hosting

Cloud computing has been around for a while, but only recently have we, the general populace, had access to it. Amazon offer one such manifestation: an environment where it's possible to set up any number of virtual dedicated servers and use them for hosting, in our case, Drupal sites! I wanted to share some of my experiences (both good and bad) using the Amazon cloud, so you can make a better decision about whether it's right for your Drupal sites.

This is attractive compared to paying for a shared hosting account, where you can never quite be sure what else is running on the system you're sharing, leading to potential performance woes. The Amazon EC2 (Elastic Compute Cloud), as they call it, also appears more attractive than smaller companies offering VPSs (Virtual Private Servers) because of the sheer scale of Amazon. It is unlikely to disappear tomorrow and is complemented by the excellent S3 (Simple Storage Service) for backing up data.

Having said that, Amazon solutions can be costly. At the time of writing, Amazon's most modest offering, a reasonably specced virtual machine with 1.7GB of memory, goes for 11 cents ($0.11) per hour. This doesn't sound like much, but there are 168 hours in a week, and at 4.3 weeks to the month that's about 722 hours per month, or about $80 per month. Writing from the UK, with exchange rates as they currently are, this is about £55.
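The arithmetic above can be checked with a few lines of Python. This is only a sketch: the $0.11 hourly rate and the 4.3-week month come from the paragraph above, while the exchange rate of 1.45 dollars to the pound is an assumption chosen to match the rough £55 figure, not a quoted rate.

```python
HOURS_PER_WEEK = 24 * 7    # 168 hours
WEEKS_PER_MONTH = 4.3      # the rough figure used above


def monthly_cost(hourly_rate_usd, usd_per_gbp):
    """Return (hours, cost in USD, cost in GBP) for one always-on instance."""
    hours = HOURS_PER_WEEK * WEEKS_PER_MONTH
    usd = hourly_rate_usd * hours
    gbp = usd / usd_per_gbp
    return hours, usd, gbp


# usd_per_gbp=1.45 is an assumed exchange rate, for illustration only.
hours, usd, gbp = monthly_cost(0.11, usd_per_gbp=1.45)
print("%.0f hours, $%.2f, about £%.0f" % (hours, usd, gbp))
```

Because billing is per instance-hour, the same function scales linearly: two always-on instances cost twice as much.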

Still, the alternative for us was to purchase a new machine and buy some rack space (which is billable in advance, often for several months). Amazon, by contrast, bill for usage at the end of each calendar month and there is no initial hardware cost, so the choice seemed clear.

Starting out, it's striking how little documentation there is. Concepts like elastic IPs, keypairs and elastic block stores are very alien, even to the average techie, and whilst there is introductory material, it feels incomplete. Since the Amazon system is fairly new, and is quite pioneering in its approach, this is understandable, but doesn't make the task any easier.

One of the biggest surprises is that, at the time of writing, Amazon's own web interface does not allow the management of EU-based instances (virtual machines), despite allowing control over US-based ones. If there's one thing that really winds people up on this side of the Atlantic, it's that Americans (and often Canadians) are given preferential treatment with this type of thing. Nevertheless, we were soon able to locate the excellent ElasticFox Firefox extension, which allows management of European instances, and we had our very own fresh copy of Ubuntu installed and running at the touch of a button.

This is incredibly powerful stuff, especially because in theory, you can launch as many virtual servers as you like with the click of a button. In practice, Amazon impose sensible limits (although you can apply for more if you genuinely need them) and after all, you're paying per hour for all these machines. We found Alestic's site very valuable indeed. It allows you to quickly find the right Amazon Machine Image (AMI) to use when starting your system, rather than poring over a huge list. A lot of these systems come pre-installed with all the things you're likely to need. There weren't any with specific Drupal installs, but since this is only a 5-minute job (download, unzip), it wasn't too much of an issue.

When you turn off (terminate) an instance, or the power fails at Amazon (very rare but can happen), the instance disappears completely, and so does its local storage. This is a concept alien to those unaccustomed to a VPS environment. After all, if I turn off my laptop right now, the data will be saved to the hard disk and when I start it up again, it's all there. This is not so with Amazon's hosting. The virtual machine powers down and all data stored locally, including program settings and even the operating system itself, is gone forever.

With this in mind, it's clear that a backup solution is needed. Luckily, there are some decent tools in the Amazon EC2 AMI tools package, which is pre-installed on many of Alestic's images. The idea is that you regularly take an image of the entire machine and copy it to S3 where it can be stored safely, and permanently.

Writing a simple script to do this proved more difficult, however. Firstly, it wasn't clear from the documentation that we needed to explicitly state that we wanted to back up the data to an EU S3 bucket. Without this option, the files were sent to a US bucket, taking a very long time indeed and costing $0.17 per gigabyte (a machine image is usually at least 1GB, if not more). Secondly, the bundling process, as Amazon calls it, is less than reliable. Sometimes it would just bail for no apparent reason. Sometimes it would bundle the image and fail during the upload process, again, for no apparent reason. I personally still don't trust the automated backup script I wrote because of these shortcomings, so I find myself checking manually a lot of the time, which diminishes the value of an automated system.
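A backup script along these lines might look like the following Python sketch. It is not the script we actually run. It assumes the `ec2-bundle-vol` and `ec2-upload-bundle` commands from the AMI tools are on the path, and every file name, bucket name and credential shown is a placeholder. The two points it illustrates are the explicit `--location EU` option and a simple retry loop to cope with the bundling or upload step bailing out.

```python
import subprocess
import time


def bundle_command(private_key, cert, account_id, prefix, dest="/mnt"):
    """Build the ec2-bundle-vol call that images the running machine."""
    return ["ec2-bundle-vol", "-k", private_key, "-c", cert,
            "-u", account_id, "-p", prefix, "-d", dest]


def upload_command(bucket, manifest, access_key, secret_key, location="EU"):
    """Build the ec2-upload-bundle call; --location keeps the bundle in an EU bucket."""
    return ["ec2-upload-bundle", "-b", bucket, "-m", manifest,
            "-a", access_key, "-s", secret_key, "--location", location]


def run_with_retries(cmd, attempts=3, delay=60,
                     runner=subprocess.call, sleep=time.sleep):
    """Run cmd until it exits with status 0; return the attempt that succeeded."""
    for attempt in range(1, attempts + 1):
        if runner(cmd) == 0:
            return attempt
        if attempt < attempts:
            sleep(delay)  # wait before trying again
    raise RuntimeError("gave up after %d attempts: %s" % (attempts, cmd[0]))
```

A nightly job would call `run_with_retries(bundle_command(...))` followed by `run_with_retries(upload_command(...))`, and send an email if either raises. The `runner` and `sleep` parameters exist only so the retry logic can be exercised without touching Amazon.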

The next concept that was slightly alien was the Elastic Block Store (EBS), which is a system whereby it's possible to create virtual hard disks, attach them to your instances and mount them like any other disk. This is much better than storing files on the instance itself, because if the instance dies, your data are safe. It's possible to take what Amazon calls snapshots of the volumes, enabling a simple backup system, and again, this process can be automated, but you will need to know your way around a shell script, since this is not a point-and-click affair.
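As a rough illustration of automating snapshots, the sketch below builds one `ec2-create-snapshot` call (from the EC2 API tools) per volume. The volume IDs are made up, and the `eu-west-1` region name is an assumption for an EU setup. The code only prints the commands; a scheduled job would pass each one to `subprocess.call`.

```python
def snapshot_commands(volumes, region="eu-west-1"):
    """Return one ec2-create-snapshot call per named EBS volume."""
    return [["ec2-create-snapshot", "--region", region, volume_id]
            for _name, volume_id in sorted(volumes.items())]


# Hypothetical volume IDs, for illustration only.
VOLUMES = {"database": "vol-11111111", "websites": "vol-22222222"}

for cmd in snapshot_commands(VOLUMES):
    print(" ".join(cmd))
```

For a database volume it is worth flushing or briefly locking the tables before the snapshot is taken, so the copy is consistent.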

The EBS makes it easy to split data into different volumes (database volume, websites volume, miscellaneous volume, etc). Initially we wanted to run MySQL and Apache on the same low-traffic system to see how good Amazon really was, but we always wanted the ability to migrate MySQL to a dedicated machine at a later date. It's a doddle with Amazon: you can simply unmount the database EBS volume from one machine and mount it to another.
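The move described above boils down to a short sequence of commands, sketched here as data so the order is explicit. The volume ID, instance ID, device name and mount point are all hypothetical, and the MySQL init script path is an assumption about an Ubuntu setup. `ec2-detach-volume` and `ec2-attach-volume` come from the EC2 API tools.

```python
def migration_steps(volume_id, new_instance, device="/dev/sdh",
                    mount_point="/vol/db", region="eu-west-1"):
    """List (where to run it, command) pairs for moving an EBS volume."""
    return [
        ("old host", ["/etc/init.d/mysql", "stop"]),
        ("old host", ["umount", mount_point]),
        ("workstation", ["ec2-detach-volume", "--region", region, volume_id]),
        ("workstation", ["ec2-attach-volume", "--region", region, volume_id,
                         "-i", new_instance, "-d", device]),
        ("new host", ["mount", device, mount_point]),
    ]


for where, cmd in migration_steps("vol-11111111", "i-22222222"):
    print("%-12s %s" % (where, " ".join(cmd)))
```

One caveat: a volume can only be attached to an instance in the same availability zone, so the new database machine needs to be launched in the zone where the volume lives.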

We have used the EBS to store some of our more persistent configuration settings too, such as Apache configuration, Apache log files and configurations for the awesome Nagios monitoring system.

To administer these Amazon systems, shell access is needed at a minimum. Unlike other systems where it's possible to simply connect on port 22 with a password, Amazon uses a keypair system. Each instance must be created with a specific keypair, and then a key must be downloaded and used with the terminal application (such as PuTTY) before it will allow you to connect. Terminal is nice and all, but sometimes it's useful to do more with the system, like have multiple terminals open or use a GUI tool (like MySQL Administrator). For this, we set up NX, which is similar to VNC in that it provides an interface to the remote machine's desktop that you can use exactly as though it were your own desktop. We found a Google Groups article by Eric Hammond very useful in setting up NX, and thought it was preferable to VNC because of the default encryption method and insistence on avoiding the root user.
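For anyone connecting with OpenSSH rather than PuTTY, the key-based login described above can be sketched as follows. The key file name, the host name and the `root` login are all assumptions for illustration. The permission check is there because OpenSSH refuses to use a private key file that other users can read.

```python
import os
import stat


def key_is_private(path):
    """True if the key file has no permissions for group or others."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & 0o077 == 0


def ssh_command(key_path, host, user="root"):
    """Build an ssh call that authenticates with the downloaded keypair file."""
    return ["ssh", "-i", key_path, "%s@%s" % (user, host)]


# Hypothetical key file and host name.
print(" ".join(ssh_command("drupal-keypair.pem", "ec2-host.example.com")))
```

If `key_is_private` returns False, running `chmod 600` on the key file fixes it.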

Performance-wise, our Drupal machine has been running Apache 2, MySQL 5, and a host of monitoring software (Munin, Nagios and AWStats) so that we can keep an eye on things, and it has been running for around a month so far with no outages, crashes or other problems at all. The learning curve is pretty steep and the documentation is fairly sparse, but there is a very active community out there on the AWS forums and places like Google Groups. Overall we are very impressed with Amazon as a Drupal hosting environment and although not entirely convinced at the current time, will be looking towards moving more and more sites over there in the future.
