views:

422

answers:

5

Here's another one I've been thinking about lately. We have concluded in earlier discussions : 'natural primary keys are bad, artificial primary keys are good.' Working with Hibernate earlier I have seen that Hibernate default creates one sequence for all tables. At first I was puzzled by this, why would you do this. But later I saw the advantage that it makes linking parents and children fool proof. Because no tables have the same primary key value, accidentally linking a parent with a table that is not a child gives no results.

Does anyone see any downsides to this approach. I only see one : you cannot have more than 999999999999999999999999999 records in your database.

+3  A: 

Depending on how sequences are implemented in the database, always hitting the same sequence can be better or worse. When only a few or only one thread request new values, there will be no locking issues. But a bad implementation could cause congestion.

Another problem is rolling back transactions: Sequences don't get rolled back (because someone else might have requested a higher value already), so you can have large gaps which will eat your number space much more quickly than you might expect. OTOH, it will take some time to eat 2 or 4 billion IDs (if you "only" use 32 bit (signed) ints), so it's rarely an issue in practice.

Lastly, you can't easily reset the sequence if you have to. But if you need to have a restarting sequence (say, number of records since midnight), you can tell Hibernate to create/use a second sequence.

A major advantage is that you can uniquely identify objects anywhere in the DB just by the ID. That means you can severely cut down the log information you write in the production system and still find something if you only have the ID.

Aaron Digulla
Ref: "2 or 4 billion IDs (if you use 32 bit (signed) ints)", the maximum value for an Oracle sequence is actually 10^27, and an Oracle NUMBER type will store integers up to 38 digits of precision. Should keep us going for a while.
David Aldridge
I means "even if you only used 32bit ints" for an ID. Most people today will use 64bit (LONG or BIGNUM) or more or even UUIDs.
Aaron Digulla
Indeed. My comment was really to point out that Oracle number data types are rather different to other systems. Reference docs here: http://download.oracle.com/docs/cd/B19306_01/server.102/b14200/sql_elements001.htm#SQLRF00222
David Aldridge
Running out of numbers shouldn't be an issue if you use NUMBER. Running at a trillion transactions/sec, you are good for 31.6 million millenia.
Mark Harrison
+8  A: 

There could be performance issues with all code getting values from a single sequence - see this Ask Tom thread.

Tony Andrews
+1 Really useful link thanks
carpenteri
In our case when we increased the cache of seqeunces significantly the performance of single sequence and multiple sequence was similar .
Edwards
Ahh you might also want to scroll down on that thread and look at Mr Mark Bobaks reply it mirrors my experiences
Edwards
@Edwards thanks for pointing that nugget out, very interesting indeed.
David
+1  A: 

There are a couple of disadvantages of using a single sequence:-

  • reduced concurrency. Handing out the next sequence value involves synchronisation. In practice, I do not think this is likely to be a big problem
  • Oracle has special code when maintaining btree indexes to detect monotonically increasing values and balance the tree approriately
  • The CBO might have a better time estimating range queries on the index (if you ever did this) if most values were filled in

An advantage might be that you can determine the order of inserts amongst different tables.

WW
+2  A: 

I prefer having one sequence per table. This comes from one general observation: Some tables ("master tables") have a relatively small row count and have to be kept "forever". For example, the customer table in an ERP.

In other tables ("transaction tables"), many rows are generated perpetually, but after some time, those rows can be archived (or simply deleted). The most extreme example is a tracing table used for debugging purposes; it might grow by hundreds of rows per second, but each row is obsolete after a few days.

Small IDs in the master tables make it easier when working directly on the database, e.g. for debugging purposes.

select * from orders where customerid=415

vs

select * from orders where customerid=89461836571

But this is only a minor issue. The bigger issue is cycling. If you use one sequence for all tables, you simply cannot let it restart. With one sequence per table, you can restart the sequences for the transaction tables when you have archived or deleted the old data. Master tables hardly ever have that problem, since they grow much slower.

I see little value in having only one sequence for all tables. The arguments told so far do not convince me.

ammoQ
+1  A: 

Certainly there are pros and cons to the one-sequence versus one-sequence-per-table approach. Personally I find the ability to assign a truly unique identifier to a row, making each id column a uuid, to be enough of a benefit to outweigh any disadvantages. As Aaron D. succinctly writes:

you can uniquely identify objects anywhere in the DB just by the ID

And, for most applications, due to the way Hibernate3 batches IMPORT statements, this will not be a performance bottleneck unless massive amounts of records are vying for the same db resource (SELECT hibernate_sequence.nextval FROM dual).

Also, this sequence mapping is not supported in the latest release (1.2) of Grails. Though it was supported in Grails 1.1 (!). It now requires subclassing one of the Hibernate dialect classes as a workaround.

For those using Grails/GORM, have a look at this JIRA entry:

Oracle Sequence mappings ignored

mikesalera
Workaround (and Java code) are summarized here --> http://stackoverflow.com/questions/2564491/does-overloading-grails-static-mapping-property-to-bolt-on-database-objects-vio
mikesalera