Drupal 7 User IDs - Not Consecutive

Putting together a new Drupal site, I had a puzzling moment when I noticed a new user's ID was number 15. This was mysterious because the site had only 4 users on it. And just moments before I had created a user whose ID was 13. ID number 14 went missing, and I couldn't find it anywhere in the database.

I asked around, and eventually found a thread with the answer: Provide a sequences API. It's long so I'll summarize... In Drupal 5, there was a database table devoted to generating ID sequences. In Drupal 6, they got rid of that in favor of auto-incrementing IDs. In Drupal 7, they brought back the sequence API. Drupal uses this new code for the user IDs, while node IDs (and many other IDs) are still auto-increment. (Although possibly not for long - don't ask me why they dislike auto-increment so much). [Edit - removed italics around "they".]

To get to the point, some tables behave as I would have expected, i.e. node IDs are continuous (for now). But users share their IDs with other parts of the system. In Drupal core, the same set of ID numbers is shared between users, actions and batch jobs. Third-party modules may introduce even more things that take from this shared space. The case that bothers me most is the batch jobs. Every batch job performed by Drupal will increment the ID sequence. Even batch jobs that complete in a single page load take an ID, although they are never stored in the database!
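To make the gap concrete, here is a toy model in plain PHP. It is only a sketch, not Drupal code: the SharedSequence class and the starting value of 12 are made up for illustration, and in core the shared counter is advanced by db_next_id().

```php
<?php
// Toy model of a single counter shared by users, actions and batch jobs.
// SharedSequence is a hypothetical class; Drupal 7 core advances its
// {sequences} counter through db_next_id().
class SharedSequence {
  private $value;

  public function __construct($start = 0) {
    $this->value = $start;
  }

  // Every caller takes the next number, whatever it is creating.
  public function nextId() {
    return ++$this->value;
  }
}

// Assume (for illustration) the counter currently stands at 12.
$sequence = new SharedSequence(12);

$first_uid = $sequence->nextId();   // A new user takes 13.
$batch_id = $sequence->nextId();    // A batch job takes 14, stored or not.
$second_uid = $sequence->nextId();  // The next new user gets 15.

printf("users: %d, %d; batch: %d\n", $first_uid, $second_uid, $batch_id);
```

With those assumptions, the two users end up with IDs 13 and 15, and 14 never shows up in the users table.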

I prefer consecutive user IDs. And I expected them to be consecutive, because I have experience with Drupal 6.x. To the Drupal core developers, this makes me a "client who thinks they know more then [sic] they actually do." Sounds like we can't turn to the Drupal core developers for help on this one.

Those developers who claim to think to know exactly how much they know :-) they haven't taken away all our options. At least not yet. It turns out the user_save() function is OK with us specifying a UID. [Edit - It looks like the patch above makes this possible, so I guess they gave us an option after all!] So we can override the default behavior of Drupal core. Here's how.

The code below implements several Drupal hooks. You'll need to replace 'custom' everywhere you see it with the name of your own module or profile.

The code below has changed since I first posted here. The code below may have problems if you've customized the user creation behavior in other ways. It should work better than my previous code on most Drupal instances.

First, in our module/profile .install file we create a table like Drupal core's {sequences} table. Our table is named custom_uid_sequences and will be used only when generating new UIDs. To use this code, add it to your .install file and run Drupal's update.php script.

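<?php
/**
 * Implements hook_schema().
 *
 * Adds a table just like core's {sequences} table.  We will use it when creating users.
 */
function custom_schema() {
  // system_schema() lives in system.install, which may not be loaded yet.
  module_load_include('install', 'system');
  $system_schema = system_schema();

  $custom_uid_sequence = $system_schema['sequences'];
  return array(
    'custom_uid_sequences' => $custom_uid_sequence,
  );
}

/**
 * Installs our custom schema, including the table for user IDs.
 */
function custom_update_7001() {
  // Creates every table returned by custom_schema(); here that is only
  // custom_uid_sequences.
  drupal_install_schema('custom');
}
?>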

After we've created this table, we can tell Drupal to use it in place of {sequences} when creating users, and to go back to the regular table once the user creation is complete. These hooks belong in your custom .module or .profile file.

<?php
/**
 * Implements hook_entity_presave().
 *
 * Before saving a user, we use a database connection that prefixes the
 * {sequences} table with 'custom_uid_'.  This is to make user IDs come
 * from a consecutive sequence of numbers, rather than skipping numbers
 * when batch jobs are performed or other Drupal behavior uses the
 * {sequences} table.  This approach should work OK as long as the user
 * create process has not been customized to start batch jobs, create
 * actions, or do anything else that relies on {sequences}.
 */
function custom_entity_presave($entity, $type) {
  if ($type == 'user' && empty($entity->uid)) {
    // We're saving a new user.
    // We assume the active connection key is 'default'.
    $active_key = 'default';

    // Create a new database connection, identical to the active one except
    // adding a prefix for the sequences table.  getConnectionInfo() returns
    // an array keyed by target, so we copy the 'default' target.
    $db_info = Database::getConnectionInfo($active_key);
    $db_data = $db_info['default'];
    $db_data['prefix']['sequences'] = 'custom_uid_';

    // The second argument is the target (e.g. 'default' vs. 'slave'), not the key.
    Database::addConnectionInfo('custom_uid', 'default', $db_data);

    // setActiveConnection() returns the previously active key; keep it so
    // custom_entity_insert() can restore it.
    $GLOBALS['_custom_db_active_key'] = Database::setActiveConnection('custom_uid');
    // Until custom_entity_insert() is called, all Drupal sequences will come
    // from the custom_uid_sequences table.  Hopefully it will be used only to
    // create the new user.
  }
}

/**
 * Implements hook_entity_insert().
 *
 * If custom_entity_presave() changed the active DB connection, we restore its previous setting here.
 */
function custom_entity_insert($entity, $type) {
  if ($type == 'user' && isset($GLOBALS['_custom_db_active_key'])) {
    // During user insert, we used custom_uid_sequences.  Now we return to the normal sequences table.
    Database::setActiveConnection($GLOBALS['_custom_db_active_key']);
    unset($GLOBALS['_custom_db_active_key']);
  }
}
?>

The preceding code may not work for you. It may not work with future versions of Drupal. It comes with no warranty expressed or implied. Still, I hope you like it and find it helpful.
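If the per-table prefix trick in those hooks seems like magic, the following plain-PHP sketch may help. It is a simplified stand-in, not Drupal's implementation: custom_sketch_prefix_tables() is a hypothetical helper, and the two prefix arrays are assumptions that mimic what the 'prefix' setting looks like before and during the user save.

```php
<?php
// Simplified stand-in for how {table} placeholders get expanded.
// custom_sketch_prefix_tables() is a hypothetical helper for illustration;
// Drupal 7 does the real work inside its database connection class.
function custom_sketch_prefix_tables($sql, array $prefixes) {
  return preg_replace_callback('/\{(\w+)\}/', function ($matches) use ($prefixes) {
    $table = $matches[1];
    // A table-specific prefix wins; otherwise fall back to the default one.
    $prefix = isset($prefixes[$table]) ? $prefixes[$table] : $prefixes['default'];
    return $prefix . $table;
  }, $sql);
}

// Assumed prefix settings: the normal connection, and the temporary
// 'custom_uid' connection that is active while a new user is saved.
$normal = array('default' => '');
$during_user_save = array('default' => '', 'sequences' => 'custom_uid_');

$sql = 'INSERT INTO {sequences} () VALUES ()';
$normal_sql = custom_sketch_prefix_tables($sql, $normal);
$user_save_sql = custom_sketch_prefix_tables($sql, $during_user_save);
$users_sql = custom_sketch_prefix_tables('SELECT MAX(uid) FROM {users}', $during_user_save);

echo $normal_sql, "\n", $user_save_sql, "\n", $users_sql, "\n";
```

Only queries against {sequences} are redirected to custom_uid_sequences; every other table, {users} included, resolves exactly as before.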

If you feel like I do, that it is worth a little extra effort to keep numerical IDs consecutive, please chime in on the issues above and/or leave a comment here.

Comments

Seeing as nobody else has replied, I figured I would.

You talk about the core maintainers as if they are some secret club plotting the contortion of your beloved toy. Core maintainers are just people who contribute in the core issue queues, simple as that - you can be a "core maintainer" by fixing spelling mistakes (http://drupal.org/node/1742958).

In reality *this* decision was initially meant as a cleanup of existing code but in the interest of solving other problems (e.g. http://drupal.org/node/204411#comment-1185980) it ended up being extended to cover users too.

What I suggest you do is quit the complaining, accept that not everything is built for *your* use case, and release your change as a contrib module - I'm sure there are lots of people who could use it.

Dave Cohen's picture

> You talk about the core maintainers as if they are some secret club plotting the contortion of your beloved toy. Core maintainers are just people who contribute in the core issue queues...

I'm aware of the process. In fact, I've tried and I've tried to get changes into core. I've even succeeded, a time or two.

> What I suggest you do is quit the complaining, accept that not everything is built for *your* use case...

Glad you said that, because it brings me to my big complaint about Drupal. I got into Drupal for its flexibility (wasn't looking for a toy to love). Its APIs (seemed to me at the time) made it completely customizable. So that even if it was not built for my use case, it could be made to fit my use case. Or my client's use case.

Personally, I think this should be Drupal's primary goal: remain flexible enough to be the right choice for any website use case. Your view that it should be limited seems to be winning lately.

What exactly is Drupal's use case, if not mine?

Hi David,

I'm wondering what you're trying to accomplish here. Reading through your post I see a lot of anger at a topic...that...doesn't seem worth it. Also, many statements I'd expect from a political handout and not a technical blog posting. For example:

"they brought back the sequence API" -- 'they'

"don't ask me why they dislike auto-increment so much" -- Perhaps you should know why they dislike auto_increment in this case? Did you read the issue and try to figure out why this is done?

'To the Drupal core developers, this makes me a "client who thinks they know more then [sic] they actually do." Sounds like we can't turn to the Drupal core developers for help on this one' -- This is just childish. If you read the issue, this is brought up once to explain a case you clearly don't fit. You're concerned about batch jobs being in the same 'ID space' as users, which Crell was as well and was discussed in that thread. The phenomenon of a client having a little knowledge/info and using it to cause trouble or block something because they don't understand the full story is very widely known and trying to use it as an argument against core developers in this issue is confusing.

'"Those developers who claim to think to know exactly how much they know"
Those developers who claim to think to know exactly how much they know :-) they haven't taken away all our options. At least not yet. ' -- LOL. Seriously, have you worked in politics before? The creeping menace of the core developers taking away your hooks....and...your choices.

So, let's recap. You:

* Use a bunch of divisive language and statements for no apparent reason
* Get offended by a simple statement that is very widely known and _should_ be known to you too
* At no point give a technical reason as to why you don't like this or what problems it could cause.
* Didn't continue your issue when you asked about this, just saw the original thread, got pissed off and closed it.
* Are giving the impression that the core developers are a cabal who are trying to take away your choices and introduce new systems that will make your sites worse (although you never explain how in this case) so they have something to joke about in their midnight meetings....

So, I'd like to ask: What are you trying to accomplish here? I feel like this blog posting shouldn't exist and you should have kept your original issue open to get a response/debate why you feel like this is a bad idea.

Dave Cohen's picture

I'll start with a couple thoughts, for the record...

I'm quite grateful to Drupal contributors. Over the years, I've built a lot with Drupal, and I know the collective knowledge makes a better product than I'd be able to produce on my own.

My intent was to be tongue in cheek, while also expressing some dissatisfaction. I can tell from the comments that the tongue-in-cheekness was lost along the way. One smiley was clearly not enough.

[Disclaimer] I have a history of trying to get changes in Drupal core. I've seen my proposed patches go down in flames when a few people objected with, essentially, "I don't imagine I'll ever do that with my site." I strongly feel that Drupal should remain flexible. It should be a goal to give complete control to a site administrator. I feel like Drupal has made some changes for the worse over the years. And while I didn't intend for this post to cover any of that, my history affects the way I see some things.

And now some specific replies...

> I'm wondering what you're trying to accomplish here.

Mostly, to share a snippet of code that I think others will be interested in. In the original thread, Crell points out that some Drupalers prefer consecutive IDs. I'm one. In the original thread, folks like me are SOL. In my post, I offer a snippet that may help some users.

> Get offended by a simple statement that is very widely known and _should_ be known to you too

I'm not sure what should be known to me. Is the line about clients who think they know a common expression, or should I know that IDs are not consecutive?

That's known to me now, after spending valuable time researching. When my site had 4 users, but ID as high as 15, I was puzzled and actually thought something was wrong that needed research. I'm not a Drupal newbie. I know some history about the move to auto-increments in D6. I was not aware of the move away from them in D7. I literally thought, "oh maybe all entity IDs come from the same ID space". That's not the case. User IDs currently behave differently from all other IDs. I should know this??? How exactly am I supposed to know that? Reading every core issue?

I've had good conversations in person with Crell, David Strauss and others on that thread. I like them personally and defer to their expertise almost always. Still, when they refer to me (albeit not me specifically) as a client who thinks he knows more than he actually does, I take some umbrage. I think it's reasonable to think that if user IDs are consecutive in D6, they would also be in D7. Do you honestly think I'm the only one out there who thinks that?

> Didn't continue your issue when you asked about this, just saw the original thread, got pissed off and closed it.

I was seeking an answer and when I found it, I thought I'd save everyone some time and close it. I don't get the impression, from the original issue, that there is any room for further debate. It's been decided. If you think there will be debate, I invite you to open that issue again, I will post there for sure.

The original thread is long. I've tried to understand it, and I come away not understanding why the users table is no longer auto-increment, and also not understanding why other tables, in the near future, should not be auto-increment. That thread clearly shows the intention to make that change in the next release of Drupal, but doesn't explain the problem that would solve. Even if all entity IDs came from the same number space, I would argue that batch jobs, not ever saved to database, should not take the same numbers.

I recognize that I'm not a typical Drupal user. I'm stubborn about my development environment and like to do things the way I think is best. I use, for example, http://drupal.org/project/site_update. And using that, I pay close attention to the database IDs.

I started using Drupal years ago because it was flexible, if you were willing to write a little code. Over the years, it's become a lot less flexible for coders. Sure, it's become possible to set up complex sites by filling out forms in your web browser, and that's great. But it's getting harder and harder to do simple things with a snippet of code, and that's a shame. Here I am sharing a snippet of code that does something I like. (I will be surprised if the snippet works in Drupal 8, but for now I'm using it and sharing it.)

I'm also blogging some of my dissatisfaction with some recent changes to Drupal. I'm OK with you disagreeing with me here and/or on drupal.org. I hope you're OK with me blogging my opinions.

Hi David,

"My intent was to be tongue in cheek, while also expressing some dissatisfaction. I can tell from the comments that the tongue-in-cheeckness was lost along the way. One smiley was clearly not enough."

That's always tough. I've read your posts before and the italics here were confusing. The emphasis could have been humor, but was interpreted by myself and others otherwise. This post makes more sense and fits better with the other posts of yours I've read when assuming it's humor.

"I'm not sure what should be known to me. Is the line about clients who think they know a common expression, or should I know that IDs are not consecutive?"

The line about clients; I wouldn't expect anyone to be following the sequences API closely :). My concern was that you seemed genuinely upset about the 'clients who think they know more than they do' line applying to you, when I don't believe it does at all and I think it was targeted at a certain type of client. (Who would see the ID offsets and not research, but just assume something was broken. You had the correct reaction to seeing something different like that....so I really don't believe this in any way was meant to be about you)

"If you think there will be debate, I invite you to open that issue again, I will post there for sure."

I likely am going to re-open this issue, as I have another problem with this change.

"The original thread is long. I've tried to understand it, and I come away not understanding why the users table is no longer auto-increment, and also not understanding why other tables, in the near future, should not be auto-increment. That thread clearly shows the intention to make that change in the next release of Drupal, but doesn't explain the problem that would solve. Even if all entity IDs came from the same number space, I would argue that batch jobs, not ever saved to database, should not take the same numbers."

From reading that thread, I don't believe every table is going to transition to this new sequences API. The idea is this: We needed a sequences API for a few things, so one was written. The users table has an issue with auto_increment (during install we get into a really weird situation where 0 has to be inserted and it's a pain...the logic has broken on several releases...it's just brittle). Thus, this API was used for the users table. There are two ways to do a sequences API: each table has its own "sequence", or there is one shared "sequence", basically making this a unique number generator. In essence this sequences API is removing the concept of incrementing numbers in the PK of a table using it. Everything just gets an identifying number that may or may not be related to the number of the item before.

"Even if all entity IDs came from the same number space, I would argue that batch jobs, not ever saved to database, should not take the same numbers."

But why? I feel like this conversation would be very different if you had led with, "This change will prevent me from doing A, B, C".

"Over the years, it's become a lot less flexible for coders."

I'd definitely argue the opposite. What flexibility has gone away from your perspective? I am interested in this view as I've heard it before, but I have not been able to talk to someone to find out details on what has gone away.

"I'm also blogging some of my dissatisfaction with some recent changes to Drupal. I'm OK with you disagreeing with me here and/or on drupal.org. I hope you're OK with me blogging my opinions."

Of course. I've read your posts before, I was surprised by this one. That the humor didn't translate explains a lot. What remains is that I feel you do have a legitimate gripe with consecutive uids going away and the impact of that on a site admin, but that your point was done a disservice by its representation.

Dave Cohen's picture

Thanks for coming back for another round of comments. I didn't want to leave the impression I was bad-mouthing a few core contributors behind their backs. Cause that's not what I'm up to.

> My concern was that you seemed genuinely upset about the 'clients who think they know more than they do' line applying to you, when I don't believe it does at all and I think it was targeted at a certain type of client.

I get that, and understand there's a certain type of client out there. In this case, I squarely fit the description, so intended or not, those comments are about me. I suspect we all think we know more than we know at times. And here's the irony... I think those guys posting that line were at that exact moment thinking they know more than they do. I think over time there will be more and more support issues and complaints about the UIDs.

> From reading that thread, I don't believe every table is going to transition to this new sequences API.

I believe chx is gonna work on it.

> The users table has an issue with auto_increment (during install we get into a really weird situation where 0 has to be inserted and it's a pain...the logic has broken on several releases...it's just brittle).

So, the UID number is not important, except when it is. :)

"Even if all entity IDs came from the same number space, I would argue that batch jobs, not ever saved to database, should not take the same numbers."

I'm building a site where most nodes when created trigger a batch job. I can imagine hundreds, maybe thousands, of batch jobs per user created. You might consider it just aesthetic, but I don't want these huge gaps between user IDs or any other IDs.

On the other hand, I'd actually like to see Drupal use GUIDs, instead of numeric IDs. But as long as we're using numbers, those mysterious gaps in the IDs just feel broken to me.

Dave Cohen's picture

"Over the years, it's become a lot less flexible for coders."

> I'd definitely argue the opposite. What flexibility has gone away from your perspective? I am interested in this view as I've heard it before, but I have not been able to talk to someone to find out details on what has gone away.

This is off-topic and deserves a thread of its own. Still I have one in mind I'll try to quickly describe.

Let's say my node has an image field. Now, show me a snippet of code that returns the URL of the image.

In old drupal, I would figure this out by viewing the node, and viewing source in my browser. I'd find the markup where the image was displayed, and key off the tag or ID or something in the source. I'd grep for that thing in the code, find a theme function or what not. That function would have the answer, or maybe at least tell me what module to look at.

But in modern drupal, everything rendered comes from some array built somehow, somewhere. Data is stored via Field APIs that are really hard to grok. It's like they're trying to set records for deepest nested data structures and levels of indirection in code.

And the worst part about modern drupal is when you ask, "how can I learn the URL of an image?" The answer is not "here's how..." but instead "you're doing it wrong!" "You don't need the URL, because a formatter handles the field for you," they say as they walk away. Well, in my case, I'm trying to make a call into Facebook's API and it needs a URL (not really an unusual thing to ask for).

When it comes to open source software like Drupal, I don't like the "you're doing it wrong" answer. I would rather hear, "it should be able to do that, and if it can't let's change it." And that, my friend, is no longer the Drupal way.

Dave Cohen's picture

I finally figured out a snippet to do this, below. Just change the 'field_images' to whatever field name you set up.

<?php
// Given a node, find an image URL.
if ($images = field_get_items('node', $node, 'field_images')) {
  // Load the file entity for the first image in the field.
  if ($file = file_load($images[0]['fid'])) {
    // Turn the stream wrapper URI (e.g. public://...) into a web URL.
    if ($url = file_create_url($file->uri)) {
      $pic_path = $url; // This is the URL for the image.
    }
  }
}
?>

Although I said this on my blog, in my opinion ;) forming an opinion on a rather challenging technical issue is not an opinion. This belongs to the core queue and the core queue alone. See the comment on my blog for more.

> As for the other rant, Let's say my node has an image field. Now, show me a snippet of code that returns the URL of the image.

Certainly. Write a little formatter module which merely emits that URL. If you look into image_field_formatter_view it's fairly trivial to see which parts need copying to have the URI and then use #theme link. Then, you can run field_view_field and render the results. You could obviously do it in a lot of less clean ways -- call file_load_multiple with the file ids stored in the field, call file_create_url on the results. So... what's your problem?