views:

112

answers:

3

I have this database that I'm designing.

It needs to contain a couple dozen tables with records that we provide (a bunch of defaults) as well as records that the user can add. In order to keep the user from shooting himself in the foot, it's necessary to keep him from modifying the default records.

There are lots of ways to facilitate this, but I like the idea of giving protected records negative integer indexes, while reserving 0 as an invalid record id and giving user records positive integer indexes.

CREATE TABLE t1 (
    ixt1  integer AUTOINCREMENT,
    d1    double,
    CONSTRAINT pk_ixt1 PRIMARY KEY (ixt1),
    CONSTRAINT ch_zero CHECK (ixt1 <> 0)
);

-2 | 171.3 <- canned record
-1 | 100.0 <- canned record
 1 | 666.6 <- user record

Reasons this seems good:

  • it doesn't use significantly more space

  • it's easy to understand

  • it doesn't require lots of additional tables to implement

  • "select * from table" gets all the pertinent records, with no additional indirection

  • the canned records can grow in the negative direction, and the user records can grow in the positive direction

However, I'm relatively new to database design. And after using this solution for a little while, I'm starting to worry that using negative indexes might be bad, because

  • Negative indexes might not be supported consistently among different DBMSs, making it difficult to write code that is database-agnostic

  • It might be just too easy to screw stuff up by inserting something at recid 0

  • It might make it hard to use tools (like db grids, perhaps) that expect integer indexes with nonnegative values.

And maybe there are some other really obvious reasons that would make this a Very Bad Idea.

So what's the definitive answer? Are negative integer indexes evil?

+6  A: 

I worked on a very large billing system. We had a very similar problem... needing to mark some records as being "special". The customers had tens of millions of rows of existing data in their databases for the affected table, and it was deemed unacceptable to migrate all of that data to a new structure (i.e. adding a column).

The decision was taken to do exactly what you suggest.

The trouble with that is, you require every bit of business logic to know about (and remember) the special meaning of negative indices and correctly treat it. That's quite error prone (speaking from experience).

Unless you have unusual circumstances that very strongly speak in favor of this non-traditional approach, I suggest you stick with a more traditional extra column. It's what most developers are used to and therefore less likely to cause errors. I wish we would have bitten the bullet and added the extra column.

Eric J.
+1 for "all business logic has to know about the special meaning".
Bob Jarvis
I really appreciate that this is from experience. But it seems to me that adding an extra column would be equivalent, in terms of difficulty. If you had added an extra column instead of using negative indexes, how would your code have been different?
Matthew Lowe
I think the single largest issue is that developers (both now, and the guys in 5 years maintaining this when you are at a better job) expect special meaning to be modeled as a column rather than applying special interpretation to the numeric value of the index. Others have also pointed out that this design cannot be extended if you need additional "special" meanings in the future, so people will have to check the sign of the index AND check a column if there are any new states.
Eric J.
So using this method would make the software less maintainable. I buy this argument too, particularly because you've had to deal with the alternative.
Matthew Lowe
+10  A: 

The most important flaw in this is the "Intelligent Key" problem.

Negative integers work fine as a key. In all databases.

No tool requires positive integer index values.

It's relatively easy to screw this up because the index has a "rule" which isn't obvious and no one will remember after you've won the lottery and left.

Further, when you invent a third status code ('pre-canned' vs. 'customer-specific canned' vs. 'the other canned invented by a product line' vs. 'the old canned before version 3') you're doomed.

The issue with "Intelligent keys" is that you're asking the key to do two unrelated jobs.

  1. It's the unique identifier for a record. That's what an key is supposed to be.

  2. You're also asking it to provide status, control and authorization to change properties. Oops. This is fraught with danger. You can't expand the meaning because it's a single bit buried in a key.

Just add a column with "owned by". If it's owned by "magical super user", then it's not shown to users. Use a VIEW to assure this, if you can't trust your application developers to enforce it.

If it is owned by "magical super user", then it's the default data, and whatever rules apply to that ownership.

S.Lott
+1 for making the key do two jobs. A key is a key, and that's all it should need to be. Make rows special by adding other columns.
Parrots
Bad idea. Negative integer indexes are NOT evil, but your design is not obvious. "Easy to understand", possibly. But immediately obvious from the schema, no.Much better to have a boolean/bit column called is_canned or is_protected.
Brian Pan
+1 for highlighting possibility of 3rd status
MikeD
Anything declared "boolean" in the database is going to morph into multiple values. Boolean should never be looked at as a persistent, descriptive attribute; it's a condition based on an attribute, but not a first-class attribute.
S.Lott
So the essential argument here is that making the key do two things is a bad idea, both because it's not extensible and because it's unexpected. I buy that; it's why I'm trying like heck to use integer keys rather than UUIDs. But it does make me a little sad to just give up and glom on that extra column. Using negative and positive keys is so succinct and subdivides the table so nicely. It seems like it ought to be an elegant solution for /something/. =)
Matthew Lowe
"succinct" == code smell. Be very careful of succinct when the implementation is buried inside another object's value. It's okay when you're Don Knuth writing an algorithm that can be proven to be minimal. But for your problem (which is ill-constrained) you can't show that this "canned"/"user" dichotomy is an enduring feature of the problem.
S.Lott
+1  A: 

It's your data, but I don't think this is a good idea. An 'index' value like this should be meaningless - don't use sign or number range or whatever to mean 'something' or 'something else'. I think that in the long run you'd be much better off having a 'record type' column that indicates clearly what kind of record you're looking at. In my experience this is a much better approach.

Good luck.

Bob Jarvis