Configuring Drupal's Database

Submitted by dave on Tue, 2006-05-23 22:49.

One of the challenges facing we Drupal users is configuring a site just the way we want it, then applying that configuration to another site, and another and so on. This problem applies to those of us developing a single site for a customer (where we have our own development copies, and the installed live copy), and also to those making entire distributions based on Drupal (where untold numbers of the site will be installed). Lately Drupal developers have referred to such a site configuration as an 'install profile'.

This sounds like a simple task, so why is it a challenge? In large part because the configuration that goes into any site is partly on the filesystem, and partly in the database. The filesystem part is easy; simply make sure each site installation has the same files. As development goes on, source code control systems help keep files in sync across multiple sites. With a database, its easy enough to start each site with the same data. But as development goes on, when the database needs changing, you have a problem on your hands. I'll cite an example which may make this clear.

Example: Adding a Block

Let's say Alice and Bob are collaborating on a site (or install profile, if you prefer), and Charlie is using the site. Alice and Bob share their work through a source code control system. Periodically they send Charlie updates. Let's talk specifically about one update...

Charlie requested a new feature: he wants some content to appear, on some pages, just above the footer. Alice and Bob implement the feature by adding a block to show the content, and adding a new region to their theme where the block will appear.

First, Alice adds the region to the theme. This requires some modification to a template.php file. Alice checks the new code in, and Bob checks out that update. This is an example of filesystem configuration, and easily enough the two sites are in sync.

Next, Bob configures a block to appear in the new region. His site now has the complete feature Charlie requested, but the block configuration is in his database. He configured the block using the administrator pages. Does Alice now have to do the same thing? Does Charlie have to also repeat those steps when he updates his site?

Those of you familiar with Drupal's update system may point out a solution: Bob could write a hook_update function that adds the block. This way when Alice or Charlie run the update script, they will have the new block. Bob could do this, but he'd really have to know the ins and outs of the block database schema. And when he makes other changes he has to be just as knowledgable. Other changes that would require hook_update code include adding menu items, vocabularies and terms, views and nodes (and many other examples). Is it really realistic for anyone to write hook_update code to accomplish all this? Especially considering that many of the database IDs may not be the same from site to site. (For example a new vocabulary added by Bob may have an vid already used on Charlie's site).

Solution

Hopefully that example makes the problem clear. What follows is how I'm trying to address the problem in my own work. To use the term 'solution' to describe this is overkill. It's just my latest approach. It's not perfect, but it works well for me.

(Why it's not perfect will become clear when we get into details, but I'll tell you right now: (a) it requires a patch to Drupal core and (b) the work I've done applies to mysql only.)

Before getting into the nitty-gritty, I'll quickly describe the big picture.

Install Profile

Each site has what I will call the "profile" and any number of "installations". The profile is the configuration, without the content. Or to be more proper - it's the configuration plus the content that applies to every installation. Because as we just described, the line between what is configuration and what is content is very blurry in Drupal's database.

Approach

The goal is be able to update the profile, and propagate those updates to each installation, without damaging the content already present on the installation.

To accomplish this, we'll reserve a chunk of database IDs for the profile. Only the profile can write to these IDs, each installation writes outside the reserved chunk. For example, node IDs 1 through 1000 belong to the profile. When Charlie adds his first node, its nid will be 1001. The initial profile may only have a few nodes, not nearly 1000, but that is the point. Later, new nodes can be added to the profile, and slipped into each installation, without overwriting the installation's existing nodes. And of course what we do for nodes we can do for blocks, vocabularies, terms, and other items in the Drupal database.

So the install profile consists of (a) Drupal core plus add-on modules plus themes - in other words, lots of files; and, (b) a database dump. This database dump is similar to Drupal's database.mysql, but includes some nodes, vocabularies, etc. The database dump is applied when first installing the site, and may be applied again later to include additions to the profile.

Details

So how can we tell Drupal to reserve the first 1000 ids for the profile? Unfortunately it requires some patches. But this won't be to bad.

How does Drupal assign it's ids? Two ways, it turns out. Some modules rely on the sequences table, while others rely on the database via autoincrement. We have to address both of these approaches.

First let's discuss the autoincrement approach (because its easier). Code like this is all it takes (at least with MySQL - I'm no postgres expert):

ALTER TABLE access AUTO_INCREMENT = 1000;

Of course you'll need that line for each table that uses autoincrement. Attached to this post is a script which takes care of it automagically.

Next, the tables that use the sequences table. You might think this would be enough to do the trick:

UPDATE sequences SET id=1000 WHERE id<1000;

And it almost is, but there is a drawback. Let's say your profile has not defined any nodes. There would be no node_nid entry in the sequences table. So in developement sites, nid's would start at 1. Later we might add a node to the profile, and it would collide with nodes that may have been added to other sites. So to be very careful, we must ensure that all ids start at 1001. To do this we must patch code.

Specifically, in db_next_id(), we replace the line:

$id = db_result(db_query("SELECT id FROM {sequences} WHERE name = '%s'", $name)) + 1;

With:

$id = db_result(db_query("SELECT max(id) FROM {sequences} WHERE name = '%s' OR name = '_sequence_start'", $name)) + 1;

And we alter the database:

INSERT INTO sequences (name, id) VALUES ('_sequence_start', 1000);

Now all ids in our profile start with 1, while all ids in our installed sites start with 1001. But there's still a problem. Some tables in Drupals database have no ids, and in fact have no unique keys at all. For reasons that should be clear later (if not now), all tables used in our install profile must have a unique key.

So our final patch is to apply unique keys of some sort to all tables that do not have them. How we do this varies from table to table. I've attached a file illustrating how I do this for Drupal and the modules I use. Your mileage may vary. Here's just one example:

ALTER TABLE filters ADD PRIMARY KEY (format, module, delta);

Whew! We're done with the nasty details. What follows is a more detailed description of the steps involved.

Step 0: Install and Patch

Start with a vanilla installation of Drupal. Install the files and create a database as you normally would.

Apply the patch described above to database.mysql.inc.

Alter your database tables to ensure that each table has a primary key.

Step 1: Configure the Profile

Create the first user and start configuring the profile. Apply any settings that will apply to all installions (development and production) of the site. For example, you might want to set file handling to 'private', create a vocabularies, populate the primary links, enable modules and themes, select a default theme, create an 'access denied' page, enable a view, etc. The list goes on and on and on.

Remember that the settings you apply and content you add to the profile will apply to every site, so choose carefully.

Finally run this script to dump the database settings. You may want to check the results into source code control, or somehow distribute them along with the Drupal files and modules. The script takes a single parameter which is the path to a settings.php file. It includes that file to learn the database name and login parameters. It dumps the result to stdout. You might run it like so:

php drupal_base_db path/to/sites/mysite/settings.php > /path/to/save/dump.mysql

Step 2: Install the Site

First get all the files installed and create an empty database.

Instead of populating the database using Drupal's database.4.1.mysql, use the result of step 1 above.

Your installed site should now be identical to the profile site. However, any new content you add will have id's above 1000.

Step 3: Update the Site

Now, you can return to your profile site and make changes. Just run the script as you did in step 1 to generate a database dump.

Apply that database dump to your installed site, and the changes are there!

One More Thing

Also worth mentioning is that the dump generated by my script uses INSERT IGNORE to populate the database. The effect of this is that if data on the site has been modified, the modified data will remain intact. An alternative would be to use REPLACE. In that case, modified data would be overwritten by the data from the install profile.

This, by the way, is why each table in the install profile must have a primary key. Our database dump may be applied many times over the course of developing a site. And each time it is applied it would create duplicate data, were it not for the primary keys.

AttachmentSize
database.mysql.inc.patch.txt860 bytes
drupal_base_db.txt6.29 KB
primary_keys.sql.txt1.2 KB

login or register to post comments


I've updated my code for

Submitted by dave on Fri, 2007-10-05 19:22.

I've updated my code for Drupal 5.x, which I write about here:
http://www.dave-cohen.com/node/1779

And the code is available.


login or register to post comments


Attachment links

Submitted by briwood on Fri, 2008-05-23 00:51.

Hi Dave,

Thanks for writing this up. I'm facing the challenge of setting up dev/qa/prod environments here at Berkeley. I notice your attachment links here are dead. I'll look at the stuff in CVS. I assume the CVS code is more recent than these attachments.

Brian Wood


login or register to post comments