views:

85

answers:

6

Problem: When I use an auto-incrementing primary key in my database, this happens all the time:

I want to store an Order with 10 Items. The ordered Items belong to the Order. So I store the order, ask the database for the last inserted id (which is dangerous when it comes to concurrency, right?), and then store the 10 Items with the foreign key (order_id).

So I always have to do:

INSERT ...

last_inserted_id = db.lastInsertId();

INSERT ... INSERT ... INSERT ...

and I believe this prevents me from using transactions in almost all INSERT cases where I need a foreign key.

So... here are some solutions, and I don't know if they're really good:

A) Don't use auto_increment keys! Use a key table instead? The key table would have two fields: table_name and next_key. Every time I need a key for a table before inserting a new record, I first ask for next_key via a special static KeyGenerator class method, which does a SELECT and an UPDATE, if possible in one transaction (would that work?). Of course I would do this for every affected table. Then I can INSERT my entire object graph in one transaction without playing ping-pong with the database, because I already know all the keys in advance.
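Such a key table can be sketched with Python's stdlib sqlite3 as a stand-in for MySQL (the table and function names here are made up for illustration):

```python
import sqlite3

def next_key(conn, table_name):
    """Atomically read and advance the counter for one table.
    BEGIN IMMEDIATE takes the write lock up front, so the
    SELECT + UPDATE pair cannot interleave with another caller."""
    conn.execute("BEGIN IMMEDIATE")
    try:
        (key,) = conn.execute(
            "SELECT next_key FROM key_table WHERE table_name = ?",
            (table_name,)).fetchone()
        conn.execute(
            "UPDATE key_table SET next_key = next_key + 1"
            " WHERE table_name = ?", (table_name,))
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    return key

# isolation_level=None -> autocommit; transactions are managed explicitly
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE key_table (table_name TEXT PRIMARY KEY,"
             " next_key INTEGER NOT NULL)")
conn.execute("INSERT INTO key_table VALUES ('orders', 1), ('items', 1)")

print(next_key(conn, "orders"))  # 1
print(next_key(conn, "orders"))  # 2
print(next_key(conn, "items"))   # 1
```

So yes, the SELECT + UPDATE can be made safe inside one short transaction; the cost is an extra round trip per key (or per block of keys, if you reserve several at once).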

B) Use a GUID / UUID algorithm for keys? These are supposed to be globally unique, and they're LARGE. I mean ... L_A_R_G_E. So a lot of memory goes into these gigantic keys, indexing them is harder, and data retrieval will be slower for the database - at least I'd guess so - integer keys are much faster to handle. On the other hand, they also provide some security: visitors can no longer iterate over all orders, users, or pictures by simply incrementing an id parameter.
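The big upside of UUIDs for this question is that all keys can be generated client-side before any INSERT runs, so the whole object graph goes in as one transaction. A minimal sketch with Python's stdlib uuid and sqlite3 (hypothetical schema):

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE items (id TEXT PRIMARY KEY,"
             " order_id TEXT REFERENCES orders(id))")

# All keys exist before the first INSERT: no last-insert-id needed.
order_id = str(uuid.uuid4())
items = [(str(uuid.uuid4()), order_id) for _ in range(10)]

with conn:  # one transaction: order + items commit or roll back together
    conn.execute("INSERT INTO orders VALUES (?)", (order_id,))
    conn.executemany("INSERT INTO items VALUES (?, ?)", items)

count = conn.execute("SELECT COUNT(*) FROM items WHERE order_id = ?",
                     (order_id,)).fetchone()[0]
print(count)  # 10
```

On size: a UUID is 16 bytes in binary form (36 characters as text), versus 4 or 8 bytes for an integer key, which is where the index-size concern comes from.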

C) Stick with auto-incremented keys? If so, what about transactions like the one described in the example above? How can I solve that? Maybe by inserting a ghost row first and then doing a transaction with one UPDATE + n INSERTs?

D) What else?

+1  A: 

SQL Server supports SCOPE_IDENTITY() (Transact-SQL), which should take care of both your transaction issue and your concurrency issue.

I would say stick with auto_increment.

astander
+1  A: 

(Assuming you are using MySQL)

"ask the database for the last inserted id (which is dangerous when it comes to concurrency, right?)"

If you use MySQL's last_insert_id() function, you only see what happened in your own session, so this is safe. You mention this:

db.last_insert_id()

I don't know what framework or language that is, but I would assume it uses MySQL's last_insert_id() under the covers (if not, it is a pretty useless database abstraction framework).
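The per-session contract of MySQL's last_insert_id() can be illustrated with SQLite's analogous per-connection last_insert_rowid(), using two connections as two "sessions" (stdlib sqlite3, throwaway schema):

```python
import os
import sqlite3
import tempfile

# Two separate connections ("sessions") to the same database file.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
a = sqlite3.connect(path)
b = sqlite3.connect(path)
a.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)")
a.commit()

a.execute("INSERT INTO t (v) VALUES ('from session a')")  # gets id 1
a.commit()
b.execute("INSERT INTO t (v) VALUES ('from session b')")  # gets id 2
b.commit()

# last_insert_rowid() is scoped per connection: session a still sees
# its own last id (1), unaffected by b's later insert (2).
id_a = a.execute("SELECT last_insert_rowid()").fetchone()[0]
id_b = b.execute("SELECT last_insert_rowid()").fetchone()[0]
print(id_a, id_b)  # 1 2
```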

" I believe this prevents me from using transactions in almost all INSERT cases w"

I don't see why. Please explain.

Roland Bouman
+2  A: 

Which database are you using?

Yes, inserting a record and then selecting again to find the auto-generated key is typically bad, especially if you use a naive select max(id) from table query. This is because when two threads are creating records concurrently, max(id) may not actually return the id your own thread just used.

One way to avoid this is to create a sequence in the database. From your code you select sequence.NextValue then use that value to then execute your inserts (or you can craft a more complex SQL statement that does this selection and the inserts in one go). Sequences are atomic / thread-safe.

In MySQL you can ask for the last inserted id from the execution results which I believe will always give you the correct answer.

Paolo
I guess only Oracle supports these Sequences? I use MySQL - either MyISAM or InnoDB. Probably gonna use InnoDB because of transactions.
openfrog
Lots of DBs support sequences, but given you're on MySQL I would simply use the last_insert_id() as Roland describes.
Paolo
+3  A: 

When storing orders, you need transactions to prevent situations where only half your products are added to the database.

Depending on your database and your connector, the value returned by the last-insert-id function might be transaction-independent. For instance, with MySQL, mysql_insert_id returns the identifier for the last query from that particular client (without being affected by what other clients are doing concurrently).

Victor Nicollet
+1 for "you *need* transactions"
Stefan Steinegger
ok, so when I stick with mysql_insert_id, CAN I still use transactions to insert the Order + 100 Items in one big step, where Items get the FK of the Order as their FK to the order (assuming InnoDB engine)?
openfrog
@openfrog: Yes, just remember to call `mysql_insert_id` immediately after you insert the new Order.
Aaron Digulla
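The whole pattern can be sketched end-to-end with Python's stdlib sqlite3 standing in for MySQL/InnoDB (cursor.lastrowid plays the role of mysql_insert_id; the schema is made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY AUTOINCREMENT,"
             " customer TEXT)")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY AUTOINCREMENT,"
             " order_id INTEGER REFERENCES orders(id), sku TEXT)")

with conn:  # one transaction: the Order and all Items commit or roll back together
    cur = conn.execute("INSERT INTO orders (customer) VALUES ('openfrog')")
    order_id = cur.lastrowid  # read immediately after the Order INSERT
    conn.executemany(
        "INSERT INTO items (order_id, sku) VALUES (?, ?)",
        [(order_id, "sku-%d" % n) for n in range(10)])

n_items = conn.execute("SELECT COUNT(*) FROM items WHERE order_id = ?",
                       (order_id,)).fetchone()[0]
print(order_id, n_items)  # 1 10
```

The last-insert-id lookup happens inside the transaction and is scoped to this connection, so concurrent inserts by other sessions don't disturb it.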
+1  A: 

D) Sequence: may not be available in your DBMS, but if it is, it solves your problem elegantly.

For Postgresql, have a look at Sequence Functions

Steve De Caux
+1  A: 

There is no final and general answer to this question.

Auto-incrementing columns are easy to use when you add new records, but using them as foreign keys within the same transaction is not so straightforward: you need database-specific commands to get the newly created key. This approach is common for certain databases, for instance SQL Server.

Sequences seem harder to use, because you need to get a key before you insert a row, but in the end they are easier to use as foreign keys. This approach is common for certain databases, for instance Oracle.

When you use Hibernate or NHibernate, auto-incrementing keys are discouraged, because some optimizations are no longer possible. Using a hi-lo algorithm, which uses an additional table, is recommended instead.
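The idea of hi-lo is that one round trip to the database reserves a whole block of keys, which are then handed out locally. A rough sketch, not NHibernate's actual implementation, using Python's stdlib sqlite3 with an invented hilo table:

```python
import sqlite3

class HiLoGenerator:
    """Hi-lo sketch: one database trip buys a block of lo_size keys
    (hi * lo_size .. hi * lo_size + lo_size - 1), handed out locally."""
    def __init__(self, conn, lo_size=100):
        self.conn, self.lo_size = conn, lo_size
        self.hi, self.lo = None, lo_size  # force a fetch on first use

    def _fetch_hi(self):
        # Short exclusive transaction: read and advance the hi value.
        self.conn.execute("BEGIN IMMEDIATE")
        (hi,) = self.conn.execute("SELECT next_hi FROM hilo").fetchone()
        self.conn.execute("UPDATE hilo SET next_hi = next_hi + 1")
        self.conn.execute("COMMIT")
        return hi

    def next_key(self):
        if self.lo >= self.lo_size:          # block exhausted:
            self.hi, self.lo = self._fetch_hi(), 0  # reserve a new one
        key = self.hi * self.lo_size + self.lo
        self.lo += 1
        return key

conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE hilo (next_hi INTEGER NOT NULL)")
conn.execute("INSERT INTO hilo VALUES (1)")

gen = HiLoGenerator(conn, lo_size=100)
print([gen.next_key() for _ in range(3)])  # [100, 101, 102]
```

Because keys are known before any row is inserted, the ORM can batch an entire object graph into one transaction, which is exactly the optimization identity columns forbid.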

GUIDs are strong, for instance, when sharing data between different databases or systems, in disconnected scenarios, for import/export, etc. In many databases most tables contain only a few hundred records, so memory and performance are not such an issue. When using NHibernate, you get a GUID generator that produces sequential GUIDs, because some databases perform better when keys are sequential.

Stefan Steinegger