
Tuesday, 30 June 2009

Drupal for Publishers, London, 30th June 2009

I've just returned from the Drupal for Publishers event held at Sun in London. 100 attendees, including a good mix of Drupalites and potential Drupal users, were presented with a series of talks on a range of issues relating to using Drupal to create websites (and other solutions) for newspapers and magazines.

Sun were great hosts and despite the event being free to attend, a lavish array of sandwiches, fruit and cake was laid on, washed down with a selection of tea, coffee and fruit juice. The venue was great too, with a very large, clear projector, and a nice cool room in spite of the scorching weather in the City.

For me in particular, the most useful talk of the day was Stewart Robinson's on some of the development and management techniques being used by the team redeveloping the Economist in Drupal. It was good to discover that even on large-scale projects, the usual Drupal problems can rear their ugly heads (everything, including views, lives in the database, so testing changes and then making them live can be problematic). It was also good to hear about some of the proposed solutions, such as putting everything in code (views, content types, CCK field definitions, etc.), which is then kept under version control, and using a contributed module to generate the database structure or records from the code.

For a lot of smaller Drupal developments, corners are cut to save time or money, but the Economist's approach obviously has to be extremely comprehensive, including full unit testing and browser testing. These terms are often thrown about in a very abstract way, but Stewart presented some very tangible methods of implementing them (SimpleTest and Selenium, in this case). Insights like these are usually found only occasionally, while scouring the personal blogs of Drupal developers.

The other talks were entertaining but most were of less value to me personally, serving as an introduction to Drupal and what its capabilities are in the online publishing world. No doubt a large proportion of the audience would have found this relevant, and I think it was good to provide a little bit that everyone could take away with them.

The penultimate talk was on the development of a solution for IPC Media. Although it was a Drupal solution, it showcased how, in this case, Drupal was being used for its administration and data entry interface, completely ignoring the front-end side of it. Although it was interesting to learn that this kind of thing can be achieved, the audience was then shown custom modules that could provide a graphical image uploader, selector and cropper, but told that these modules were not available to the wider community.

Unfortunately, I found this was like having to watch a man guzzle a nice cold beer after I've just crossed a desert on foot. These are certainly modules that would be very useful additions to the Drupal community, and it seems odd to bring to light the existence of such modules at an open source event and then snatch them away again behind lock and key.

Overall, I'd like to thank and commend Sun for hosting such a useful event and the individual speakers and organisers for their hard work. I think smaller events like this, as well as the twice-yearly DrupalCon, are key to the continuity and expansion of the Drupal community.

Wednesday, 10 June 2009

Drupal 5: Automatically assign a role on user profile edit

Using Drupal 5, I recently had cause to create a system whereby when a user updates checkboxes in his or her profile, roles would automatically be assigned or unassigned. No problem, I thought, I would just use hook_user() to achieve this. According to the API, I would need the two $ops insert and update.

Writing the one for update was easy. The $account contained all the user's profile fields (the ones starting with profile_) and the roles could be assigned based on these (1 for add role, 0 for delete role).

I ran into a problem with insert. The $account contained the keys for the profile fields, but the values were all blank! I still don't really know why this is. The solution, as cumbersome as it might be, is to wait until the new user has a uid, then call user_load() on the user, at which point the profile fields will have their proper values. Then, exactly the same method can be used as in the update case.

As a footnote, we don't actively develop in Drupal 5 any more; all of our development occurs in Drupal 6, but we still support Drupal 5 sites. Here is the finished code in case anyone finds it useful:



/**
 * Implementation of hook_user().
 */
function mymodule_autorole_user($op, &$edit, &$account, $category = NULL) {
  // On insert, $account has the profile field keys but blank values, so both
  // operations pass the uid and let the helper reload the user.
  if ($op == 'insert' || $op == 'update') {
    mymodule_autorole_apply_roles($account->uid);
  }
}

/**
 * Reloads a user and updates the user's roles from the profile fields.
 *
 * @param $uid
 *   The uid of the user whose roles should be updated.
 */
function mymodule_autorole_apply_roles($uid) {
  $account = user_load(array('uid' => $uid));

  // Filter out the profile fields from the account information.
  $profile_fields = array();

  foreach ($account as $fieldname => $field) {
    // Split up the field name by the underscore character. The field names we
    // are looking for are named like profile_something, but they could be
    // profile_something_something, so join back together after the split.
    $pieces = explode('_', $fieldname);

    if (array_shift($pieces) == 'profile') {
      // The matching role is named after the field, plus ' club member'.
      $profile_fields[implode('_', $pieces) . ' club member'] = $field;
    }
  }

  $roles = user_roles();

  foreach ($profile_fields as $field => $value) {
    // Find the role, if there is one, corresponding to this profile field.
    foreach ($roles as $rid => $role) {
      if ($role == $field) {
        if ($value) {
          // The checkbox was checked, or the textfield had something in it.
          $account->roles[$rid] = $role;
        }
        else {
          // The checkbox was unchecked, or the textfield was empty.
          unset($account->roles[$rid]);
        }
      }
    }
  }

  // Update the user with the new role assignments. The authenticated user
  // role is implicit and is not stored in {users_roles}.
  db_query("DELETE FROM {users_roles} WHERE uid = %d", $account->uid);

  foreach ($account->roles as $rid => $role) {
    if ($rid != DRUPAL_AUTHENTICATED_RID) {
      db_query("INSERT INTO {users_roles} (uid, rid) VALUES (%d, %d)", $account->uid, $rid);
    }
  }
}

Wednesday, 20 May 2009

Drupal 5: Problems with unserialize in bootstrap.inc

Today I had a problem where Drupal 5 kept reporting a problem with unserialize in bootstrap.inc on line 428.

On closer inspection, bootstrap.inc, which is in the includes directory in the root of Drupal 5, contains a number of functions that are used when Drupal 'boots up'. The function in question here was variable_init(), and this is where all variables are drawn from the database. These variables are in serialized form, allowing Drupal to store anything, from objects to arrays, in string format.

If the serialized item got corrupted somehow, it wouldn't be able to unserialize properly in variable_init(), leading to this error. My problem was that I was not able to see which variables were causing the problem; only that the problem existed. With over 200 variables on the site, manually checking each one for valid serialization was not a viable option!

My solution was to change the core bootstrap.inc to print the names of the offending variables, thereby enabling me to find them in the database and fix them. Here's the original snippet from line 427 of bootstrap.inc:


while ($variable = db_fetch_object($result)) {
  $variables[$variable->name] = unserialize($variable->value);
}

Here's what I changed it to, temporarily:


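while ($variable = db_fetch_object($result)) {
  // Temporary debugging aid: print the name of any variable that fails to
  // unserialize. A variable storing boolean FALSE is reported as well.
  if (($variables[$variable->name] = unserialize($variable->value)) === FALSE) {
    print $variable->name . "\n";
  }
}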

Once I had my variable names, I was able to find them in the database, using phpMyAdmin, and edit them. I found that I had something like s:5:" in there. This had been truncated, and should have been something like s:5:"hello"; (including the trailing semicolon). The first letter indicates that the variable is a string. The number indicates how many bytes the string has (the same as the number of characters for plain ASCII), and the value of the string is enclosed in double quotes.

Afterwards, I just changed bootstrap.inc back to the way it was before, and my unserialize problems vanished!
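
For anyone who would rather not touch bootstrap.inc at all, the same check can be sketched as a standalone script. This is only an illustration: the function name find_corrupt_variables() and the sample rows are made up, and in practice the name/value pairs would have to be read from the {variable} table first.

```php
<?php
/**
 * Returns the names of variables whose serialized value cannot be decoded.
 *
 * @param array $rows
 *   Hypothetical input: variable name => serialized string, as it might be
 *   read from the {variable} table.
 */
function find_corrupt_variables(array $rows) {
  $corrupt = array();
  foreach ($rows as $name => $value) {
    // unserialize() returns FALSE on failure, but a stored boolean FALSE
    // ('b:0;') also decodes to FALSE, so that case is excluded explicitly.
    if ($value !== 'b:0;' && @unserialize($value) === FALSE) {
      $corrupt[] = $name;
    }
  }
  return $corrupt;
}

// Made-up sample data: one valid string, one truncated string, one FALSE.
$rows = array(
  'site_name' => 's:5:"hello";',
  'site_slogan' => 's:5:"',
  'cache' => 'b:0;',
);

$corrupt = find_corrupt_variables($rows);
print implode("\n", $corrupt) . "\n";
```

Unlike the temporary core change, this version does not flag variables that legitimately store FALSE.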

Friday, 13 February 2009

Drupal on Amazon web hosting

Cloud computing has been around for a while, but only recently have we, the general populace, had access to it. Amazon offer one such manifestation: an environment where it's possible to set up any number of virtual dedicated servers and use them for hosting, in our case, Drupal sites! I wanted to share some of my experiences (both good and bad) using the Amazon cloud, so you can make a better decision about whether it's right for your Drupal sites.

This is attractive compared to paying for a shared hosting account. One can never quite be sure what else is running on the system you're sharing, leading to potential performance woes. The Amazon EC2 (Elastic Compute Cloud), as they call it, appears more attractive than smaller companies offering VPSs (Virtual Private Servers) because of the sheer scale of Amazon. It is unlikely to disappear tomorrow and is backed up by the excellent S3 (Simple Storage Service) for backing up data.

Having said that, Amazon solutions can be costly. At the time of writing, Amazon's most modest offering, a reasonably specced dedicated machine with 1.8GB of memory, goes for 11 cents ($0.11) per hour. This doesn't sound like much, but there are 168 hours in a week, and based on the 4.3-week month, that's 722 hours per month, or about $80 per month. Writing from the UK, with the exchange rates as they are at the time of writing, this is about £55.
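
The sums above are easy to redo when the hourly rate changes. This is just the same arithmetic written out, using the figures quoted in the paragraph; the variable names are my own.

```php
<?php
// Figures quoted in the paragraph above.
$rate_per_hour = 0.11;      // US dollars per instance-hour
$hours_per_week = 24 * 7;   // 168
$weeks_per_month = 4.3;

$hours_per_month = $hours_per_week * $weeks_per_month;
$cost_per_month = $hours_per_month * $rate_per_hour;

// Prints: 722.4 hours, $79.46 per month
printf('%.1f hours, $%.2f per month' . "\n", $hours_per_month, $cost_per_month);
```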

Still, the alternative for us was to purchase a new machine and buy some rack space (which is billable in advance, often for several months). Compared to Amazon, who bill for usage at the end of each calendar month, with no initial hardware cost, the choice seemed clear.

Starting out, it's striking how little documentation there is. Concepts like elastic IPs, keypairs and elastic block stores are very alien, even to the average techie, and whilst there is introductory material, it feels incomplete. Since the Amazon system is fairly new, and is quite pioneering in its approach, this is understandable, but doesn't make the task any easier.

One of the biggest surprises is that, at the time of writing, Amazon's own web interface does not allow the management of EU-based instances (virtual machines), despite allowing control over US-based ones. If there's one thing that really winds people up on this side of the Atlantic, it's that Americans (and often Canadians) are given preferential treatment with this type of thing. Nevertheless, we were soon able to locate the excellent ElasticFox Firefox extension, which allows management of European instances, and we had our very own fresh copy of Ubuntu installed and running at the touch of a button.

This is incredibly powerful stuff, especially because in theory, you can launch as many virtual servers as you like with the click of a button. In practice, Amazon impose sensible limits (although you can apply for more if you genuinely need them) and after all, you're paying per hour for all these machines. We found Alestic's site very valuable indeed. It allows you to quickly find the right Amazon Machine Image (AMI) to use when starting your system, rather than poring over a huge list. A lot of these systems come pre-installed with all the things you're likely to need. There weren't any with specific Drupal installs, but since this is only a 5-minute job (download, unzip), it wasn't too much of an issue.

When you turn off (terminate) an instance, or the power fails at Amazon (very rare but can happen), the instance disappears completely, and so does its local storage. This is a concept alien to those unaccustomed to a VPS environment. After all, if I turn off my laptop right now, the data will be saved to the hard disk and when I start it up again, it's all there. This is not so with Amazon's hosting. The virtual machine powers down and all data stored locally, including program settings and even the operating system itself, is gone forever.

With this in mind, it's clear that a backup solution is needed. Luckily, there are some decent tools in the Amazon EC2 AMI tools package, which is pre-installed on many of Alestic's images. The idea is that you regularly take an image of the entire machine and copy it to S3 where it can be stored safely, and permanently.

Writing a simple script to do this proved more difficult, however. Firstly, it wasn't clear from the documentation that we needed to explicitly state that we wanted to back up the data to an EU S3 bucket. Without this option, the files were sent to a US bucket, taking a very long time indeed and costing $0.17 per gigabyte (a machine image is usually at least 1GB, if not more). Secondly, the bundling process, as Amazon calls it, is less than reliable. Sometimes it would just bail for no apparent reason. Sometimes it would bundle the image and fail during the upload process, again, for no apparent reason. I personally still don't trust the automated backup script I wrote because of these shortcomings, so I find myself checking manually a lot of the time, which diminishes the value of an automated system.

The next concept that was slightly alien was the Elastic Block Store (EBS), which is a system whereby it's possible to create virtual hard disks and mount them to your instances. This is much better than storing files on the instance itself, because if the instance dies, your data are safe. It's possible to take what Amazon calls snapshots of the volumes, enabling a simple backup system, and again, this process can be automated, but you will need to know your way around a shell script, since this is not a point-and-click affair.

The EBS makes it easy to split data into different volumes (database volume, websites volume, miscellaneous volume, etc). Initially we wanted to run MySQL and Apache on the same low-traffic system to see how good Amazon really was, but we always wanted the ability to migrate MySQL to a dedicated machine at a later date. It's a doddle with Amazon: you can simply unmount the database EBS volume from one machine and mount it to another.

We have used the EBS to store some of our more persistent configuration settings too, such as Apache configuration, Apache log files and configurations for the awesome Nagios monitoring system.

To administer these Amazon systems, shell access is needed at a minimum. Unlike other systems where it's possible to simply connect on port 22, Amazon uses a keypair system. Each instance must be created with a specific keypair, and then a key must be downloaded and used with the terminal application (such as PuTTY) before it will allow you to connect. Terminal is nice and all, but sometimes it's useful to do more with the system, like have multiple terminals open or use a GUI tool (like MySQL Administrator). For this, we set up NX, which is similar to VNC in that it provides an interface to the remote machine's desktop that you can use exactly as though it were your own desktop. We found a Google Groups article by Eric Hammond very useful in setting up NX, and thought it was preferable to VNC because of the default encryption method and insistence on avoiding the root user.

Performance-wise, our Drupal machine has been running Apache 2, MySQL 5, and a host of monitoring software (Munin, Nagios and AWStats) so that we can keep an eye on things, and it has been running for around a month so far with no outages, crashes or other problems at all. The learning curve is pretty steep and the documentation is fairly sparse, but there is a very active community out there on the AWS forums and places like Google Groups. Overall we are very impressed with Amazon as a Drupal hosting environment and although not entirely convinced at the current time, will be looking towards moving more and more sites over there in the future.

Tuesday, 9 December 2008

Drupal 6: Core hacking for the simplest of things

I was recently faced with a problem concerning changing something in Drupal's core user module. This wasn't even a big change; it was as simple as changing the line of text that appears underneath the username field on the registration form. You know, the one that helps the user to choose an appropriate username.

The core has no option for changing this text, so the most obvious and worst thing possible is to open up user.module in the core and edit the string. This is bad because the first rule of Drupal club is do not hack the core. Actually, the second rule of Drupal club is... you get the idea. It would mean that when Drupal is upgraded, the change will disappear.

After ruling out the above method, I was faced with a number of opportunities to accomplish what I needed, but they all came with downsides.

Override the core


This involves making a copy of the core user module inside my site's modules folder. Drupal will then use the copy instead of the one in the core and changes can be made safely to this copy, because it is not going to be upgraded when a new version of Drupal is released, unlike the core itself.

My main concern here is that technically it's a core hack in disguise. In my case, I would be making a copy of Drupal 6.6's user module and then changing it. But, what happens when Drupal gets upgraded to 6.7? The core user module will be 6.7, but the site won't be using it. Any new or changed functionality in the user module will be missing from the site, and worse, this could introduce incompatibilities.

Suppose something has changed in 6.7's user module, and the other core modules expect this change to be in place? This would break Drupal, because it's not expecting to encounter the 6.6 user module here. How likely is this to happen? I'd say very unlikely in an upgrade from 6.6 to 6.7, but still possible. More than that, it's far more likely to happen once 6.7 becomes 6.8, 6.9 and so forth.


Locale


Drupal's translation system makes it possible to provide a translation string for the text underneath the username field, as though it were being translated into French or Italian, but this feels wrong. It's not a ‘real’ translation because we just want to change the string itself, not change it into another language.

String Overrides


There exists a module called String Overrides. It provides a way of changing any text that is passed through the t() function, by saving these translations in the variables table. This seemed like an ideal solution at first, but on closer examination, there are a few drawbacks.

Firstly, since the overrides are stored in the database, we lose version control, which is never a good thing. Secondly, the more modules you add to a site, the more overhead there is, both in terms of memory and in terms of processing times.

Custom module


It would be possible to write a custom module specifically for the site in question. The module would mimic the functionality of the string overrides module, but store the overrides in a file instead of the database, thereby providing version control. The only slight downside to this is that it requires a lot more time than the other approaches. It seems like extreme overkill to have to write an entire module simply to change one string.
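
A lighter file-based variant may be worth a look, although it is not what we did here. As far as I can tell, Drupal 6's t() reads its custom strings from a variable named locale_custom_strings_ followed by the language code, and variables can be set from settings.php through $conf, which keeps the override in a file under version control. In the sketch below, the array key is my assumption of the exact English source string in user.module (it must match character for character, so check it against your copy), and the replacement text is made up.

```php
<?php
// Fragment for the site's settings.php (a sketch, not tested on this site).
$conf['locale_custom_strings_en'] = array(
  // Key: the source string exactly as passed to t(); assumed, not verified.
  'Spaces are allowed; punctuation is not allowed except for periods, hyphens, and underscores.' =>
    // Value: the hypothetical replacement text.
    'Please use your first name and surname, for example Joe Bloggs.',
);
```

This avoids both the database storage and the extra module, at the cost of having to keep the key in step with the core string if it ever changes.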

To conclude


In the end we decided to go with the core override method. I am not entirely happy with this, but I am sure I would not have been entirely happy with any of the solutions I've presented here. I can acknowledge and understand that there are many compromises to be made on the way to a completely custom Drupal site, but there is always room for improving Drupal to accommodate many of the more common requests. The questions are: is this one of the more common requests? and: what is the best way forward here?

Wednesday, 12 November 2008

Drupal: Pitfalls when converting CCK field modules from Drupal 5 to Drupal 6

I recently needed to convert a CCK field module from Drupal 5 to Drupal 6. There is a lot to take in, but I was faced with a particular problem:

warning: array_shift() [function.array-shift]: The argument should be an array in D:\wamp\www\drupal6\includes\form.inc on line 1320.

Oh dear. It's always difficult to debug this kind of thing because the problem lies in the code that called the function on line 1320 of form.inc, rather than there being a problem with form.inc. I used debug_backtrace() to see what parameters the previous functions in the call stack were using, and noticed that only the first two parameters in _form_set_value() were populated; the other two were NULL. This was the immediate source of the error. The following line was failing because $parents was NULL:

$parent = array_shift($parents);

Obviously you can't array_shift() NULL. I then went back further in the backtrace, to the form_set_value() function (note the function is not preceded by an underscore, like the last one). In this function, the second parameter ($value) was NULL. This was causing the NULL in the _form_set_value() function.
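
For anyone who has not used debug_backtrace() this way, here is a minimal sketch in plain PHP, outside Drupal. The three functions are hypothetical stand-ins for the real call stack; the helper lists each function in the stack along with the types of the arguments it received, which is enough to spot an unexpected NULL.

```php
<?php
/**
 * Summarises the current call stack as "function(type, type, ...)" strings.
 */
function describe_callers() {
  $lines = array();
  foreach (debug_backtrace() as $frame) {
    // 'args' holds the arguments each function in the stack was called with.
    $args = isset($frame['args']) ? $frame['args'] : array();
    $types = array_map('gettype', $args);
    $lines[] = $frame['function'] . '(' . implode(', ', $types) . ')';
  }
  return $lines;
}

// Hypothetical stand-in for the function that receives the bad parameter.
function inner_example($value, $parents) {
  return describe_callers();
}

// Hypothetical stand-in for the caller that passes NULL by mistake.
function outer_example($value) {
  return inner_example($value, NULL);
}

$trace = outer_example('x');
print implode("\n", $trace) . "\n";
```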

The solution
It turns out that I was using this code to handle the form widget's processing in my CCK field module:

/**
 * Process the mcimage element.
 */
function mcimage_mcimage_process($element, $edit, $form_state, $form) {
  $field_name = $element['#field_name'];
  $field = $form['#field_info'][$field_name];
  $field_key = $element['#columns'][0];
  $value = isset($element['#value'][$field_key]) ? $element['#value'][$field_key] : '';

  $element[$field_key] = array(
    '#type' => 'hidden',
    '#default_value' => $value,
    // The following values were set by the content module and need
    // to be passed down to the nested element.
    '#title' => $element['#title'],
    '#description' => $element['#description'],
    '#required' => $element['#required'],
    '#field_name' => $element['#field_name'],
    '#type_name' => $element['#type_name'],
    '#delta' => $element['#delta'],
    '#columns' => $element['#columns'],
  );
}

My mistake? The function does not return anything! It was as simple as adding a return statement at the end of the function, so that the form element could be processed correctly:

/**
 * Process the mcimage element.
 */
function mcimage_mcimage_process($element, $edit, $form_state, $form) {
  $field_name = $element['#field_name'];
  $field = $form['#field_info'][$field_name];
  $field_key = $element['#columns'][0];
  $value = isset($element['#value'][$field_key]) ? $element['#value'][$field_key] : '';

  $element[$field_key] = array(
    '#type' => 'hidden',
    '#default_value' => $value,
    // The following values were set by the content module and need
    // to be passed down to the nested element.
    '#title' => $element['#title'],
    '#description' => $element['#description'],
    '#required' => $element['#required'],
    '#field_name' => $element['#field_name'],
    '#type_name' => $element['#type_name'],
    '#delta' => $element['#delta'],
    '#columns' => $element['#columns'],
  );

  return $element;
}

Thursday, 6 November 2008

Drupal 5: user_save and profile fields

I was recently required to import a large number of users into a Drupal 5 site, so I wrote a simple import module to take rows from a CSV file and pass them to user_save(). In addition to the basic user information in the {users} table, I needed to create several profile fields too. This was incredibly complicated, but probably shouldn't have been.

The first thing I noticed is that the documentation for user_save() is not exactly stellar.

$account The $user object for the user to modify or add. If $user->uid is omitted, a new user will be added.

Fine, but is this where I should be putting my new user's information, or perhaps I should use the next parameter?

$array An array of fields and values to save. For example array('name' => 'My name'); Setting a field to NULL deletes it from the data column.

Ok, fine. Maybe I should use this one for adding my new user data. This doesn't mention anything about the profile fields though, or explain what I should be doing with the $account parameter. Maybe there's something else?

$category (optional) The category for storing profile information in.

What? Category? Now I'm really confused. There's nothing obvious in user_save() that suggests how the profile fields get saved, or even where to put the profile fields. My only real clue is the call to user_module_invoke() towards the end of the function. This calls hook_user() in all the active modules on the site, and the one I'm interested in is the profile module, so my next stop was profile_user(). In turn, this calls profile_save_profile() with the details from the original call to user_save().

It was at this stage that I noticed that $category must refer to the various groups that profile information can be put in. For example, you can create a category for personal information, and one for notification preferences, and doing so will split the fields onto different tabs when the user edits his or her profile. Unfortunately, $category is a string, not an array, so for each call to profile_save_profile(), only one category can be changed.

Because profile_save_profile() is only called once per user_save(), it appears that when creating a user, it is only possible to create profile fields in one group! This causes a problem for me because I needed to import lots of profile fields in several groups.

My solution was to temporarily move all the profile fields into a single group. Once I had done that, I could populate $array with the information destined for the {users} table and the profile fields (this was not documented anywhere). It turns out I could just use NULL for $account (again, this was not documented).
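
To make that a little more concrete, here is a rough sketch of how one CSV row might be turned into the $array argument. The column order and the profile field names (profile_firstname, profile_newsletter) are hypothetical, not the ones from my import, and the user_save() call is shown only as a comment because it needs a running Drupal 5 site.

```php
<?php
/**
 * Builds the $array argument for user_save() from one CSV line.
 *
 * The column order and profile field names are hypothetical examples.
 */
function build_user_edit_array($csv_line) {
  list($name, $mail, $firstname, $newsletter) = str_getcsv($csv_line, ',', '"', '\\');
  return array(
    // Destined for the {users} table.
    'name' => $name,
    'mail' => $mail,
    'status' => 1,
    // Profile fields, all temporarily moved into a single category.
    'profile_firstname' => $firstname,
    'profile_newsletter' => (int) $newsletter,
  );
}

$edit = build_user_edit_array('jbloggs,jbloggs@example.com,Joe,1');
print_r($edit);

// Inside Drupal 5 the array would then be handed over roughly like this,
// where $category names the single category holding the profile fields:
//   user_save(NULL, $edit, $category);
```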

Surely this is not the ideal way of creating new users programmatically. My solution does work, but it is annoying and time-consuming. Is there another way to create users with profile fields in multiple categories?

Tuesday, 4 November 2008

Drupal 5: Don't rely on the node's path attribute

I recently noticed that some of the nodes on a site I was working on were not linking properly from the teaser to the full node view. It turns out that the hyperlink tag's href was empty. Checking the template for the node teaser, I found that it was using this:


print l($node->title, $node->path);


This appears to work fine for nodes with a path alias set up (via the path or pathauto modules), but not for other nodes, because the $node->path part was empty. I believe the following code should have been used, to always start with the base URL for the node, and let the l() function choose the most appropriate alias:


print l($node->title, 'node/' . $node->nid);

Tuesday, 28 October 2008

Drupal: Broken RSS feeds

In Drupal 5, the path www.example.com/rss.xml will produce an RSS feed of the most recent nodes (the number of items can be customised in the RSS publishing options; the default is ten), and other core modules provide feeds, such as taxonomy, via www.example.com/taxonomy/term/1/0/feed. I had a problem recently where this functionality appeared to stop working. RSS feeds would not be displayed, and a ‘page not found’ (404) error would appear instead.

At first, I suspected a module might have been to blame. After all, the RSS functionality worked on a simple site, but not the one I was working on, which had been customised with lots of modules. I disabled all the modules that could have possibly affected the RSS feeds, but this did not help at all.

It turns out that URL aliasing was to blame. I had inserted some rows manually into the {url_alias} table, but I had put the values into the wrong columns. The two columns in question were src and dst. The dst column contains the URL the user goes to, and the src column contains the one that it is an alias for, i.e. the ‘real’ URL. I had put the values into these columns the wrong way around, so when requesting the RSS feed's URL, Drupal was redirecting to a gibberish URL that did not exist!

Friday, 24 October 2008

Drupal: Allowing users to edit roles

In Drupal 5, I needed some way of giving users with a specific role permission to set the roles of others. It turns out that by default, the administrator is the only one who can assign roles to users. I also found the user_selectable_roles module by Bacteria Man, which allows users to assign themselves roles, but this was not quite what I wanted.

After some digging around in the core, specifically user.module, I found that when the user edit page is displayed, the system checks whether the user (the user who is seeing the edit page, not the user being edited) has permission to administer access control, and if so, grants them the ability to edit roles.

It turns out that in order to edit roles, the user must be given privileges to edit the access rights of all the roles, which is not really the same thing! In my opinion, there should be sufficient granularity between assigning roles to users and deciding what permissions those roles have. Perhaps this is something for a future Drupal release? It might even exist in Drupal 6 or Drupal 7, but not having played with them yet, I don't know.

Wednesday, 22 October 2008

Drupal: Search indexing manually imported nodes

Recently I was required to import about 7000 nodes into Drupal 5. I was not entirely sure how to achieve this, but I decided to use SQL scripting. As it turns out, a more proper method would have been to use a PHP script to build nodes and use node_save() to accomplish the import.

That aside, I ran into a problem where none of the manually imported nodes were appearing in search results. Nodes created within Drupal worked fine, appearing in search results as expected. My manually imported nodes worked fine on the site itself: by navigating to node/1234 it was possible to view the node as expected, and they worked fine with Views, but there was nothing in the search results.

My first stop was to tell Drupal to regenerate the search index, by visiting admin/settings/search and using the functionality there. The correct number of nodes was reported on this page, but after clicking the reindex button and running cron.php, it reported that 97% of the nodes had been indexed. This was far too short a time for this to happen, and a quick search showed that the manually imported nodes were still not appearing in search results.

I began to speculate that because the manually imported nodes were last modified (the changed column in the database) before the last time that the indexer was run, they weren't being ‘seen’ by the indexer, since it only pays attention to things that have changed since last time.

I did try backing up the {node} table, setting the changed timestamp for all nodes to something after the last time the indexer was run, then reindexing everything again, but this seemed to make no difference. If I didn't know better, I would have said that Drupal was taking one look at the sheer amount of stuff it was going to have to index, and freaking out!

In the end, my solution was to create a simple module that took advantage of the _node_index_node() function from Drupal 6. This seemed to work with no compatibility issues (I was using Drupal 5 for this), and my simple module checked to see if the node's nid existed in the {search_dataset} table (as a sid, not a nid), and if it didn't, called _node_index_node() to forcibly index it. It was a very slow process but it appears to have done the trick.

I suppose the question I am left with after this project is: why didn't Drupal reindex these nodes? Was my suspicion about the indexer not ‘seeing’ them correct? Should one always import nodes using PHP and Drupal's node_save()? While my method works, it is cumbersome and I would always seek the most efficient way of doing this, so please leave a comment if you can elaborate on this!

Download the completed module (2K).