I recently came across this question: "Generating a primary key in a clustered environment of 5 app servers (OAS version 10) without using the database".

Usually we generate the PK from a DB sequence, or by storing the values in a database table and using a stored procedure to generate the next PK value. However, the current requirement is to generate primary keys for my application without referencing the database, using JDK 1.4.

I need expert help to arrive at better ways to handle this.

Thanks,

+1  A: 

If you cannot use the database at all, a GUID/UUID is the only reliable way to go. However, if you can use the database occasionally, try the HiLo algorithm.
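
For anyone unfamiliar with HiLo, here is a minimal sketch of the idea, not a definitive implementation. `HiSource` and `HiLoGenerator` are hypothetical names; in a real deployment `nextHi()` would be the one occasional database call (for example a sequence read) that reserves a whole block of ids, and everything else happens in memory. It sticks to language features available in JDK 1.4.

```java
/** Hypothetical source of "hi" values; in practice one small database call per block. */
interface HiSource {
    long nextHi();
}

/** Hands out ids from a reserved block and asks for a new "hi" only when the block is used up. */
class HiLoGenerator {
    private final HiSource source;
    private final int blockSize;
    private long hi;
    private int lo;

    HiLoGenerator(HiSource source, int blockSize) {
        this.source = source;
        this.blockSize = blockSize;
        this.lo = blockSize; // forces a fetch of "hi" on first use
    }

    /** Returns hi * blockSize + lo; synchronized so threads in one JVM never share an id. */
    synchronized long nextId() {
        if (lo >= blockSize) {
            hi = source.nextHi(); // the only step that needs the shared store
            lo = 0;
        }
        return hi * blockSize + lo++;
    }
}
```

Each app server keeps its own `HiLoGenerator`; as long as the shared source never returns the same "hi" twice, the blocks cannot overlap.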

Anton Gogolev
As I mentioned to Randolpho, the UUID class is available from JDK 5 onward; however, I'm looking for a solution for JDK 1.4.
Vicky
A: 

If it fits your application, you can use a larger string key coupled with a UUID() function or a SHA-1 hash of random data.

For sequential ints, I'll leave that to another poster.
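
To illustrate the SHA-1 variant, here is a rough sketch under my own assumptions: the class name `Sha1KeyGenerator`, the 32-byte seed size, and the hex encoding are all illustrative choices. It uses only `SecureRandom` and `MessageDigest`, both of which exist in JDK 1.4.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;

class Sha1KeyGenerator {
    private static final SecureRandom RANDOM = new SecureRandom();

    /** Hashes 32 random bytes with SHA-1 and returns the digest as a 40-character hex string. */
    static String nextKey() {
        byte[] seed = new byte[32]; // seed size is an arbitrary choice for this sketch
        RANDOM.nextBytes(seed);
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            return toHex(sha1.digest(seed));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available: " + e.getMessage());
        }
    }

    /** Lower-case hex encoding, two characters per byte. */
    static String toHex(byte[] bytes) {
        StringBuffer sb = new StringBuffer(bytes.length * 2);
        for (int i = 0; i < bytes.length; i++) {
            int b = bytes[i] & 0xff;
            if (b < 0x10) {
                sb.append('0');
            }
            sb.append(Integer.toHexString(b));
        }
        return sb.toString();
    }
}
```

The key would be stored in a 40-character string column. The uniqueness comes from the random input; the hash only gives it a fixed length.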

gahooa
Can you please elaborate? UUID is part of JDK 5, but I was looking for a solution for JDK 1.4. I don't understand how you are planning to leverage SHA1.
Vicky
+2  A: 

Use a UUID as your primary key and generate it client-side.

Edit:
Since your comment I felt I should expand on why this is a good way to do things.

Although sequential primary keys are the most common in databases, using a randomly generated primary key is frequently the best choice for distributed databases or (particularly) databases that support a "disconnected" user interface, i.e. a UI where the user is not connected to the database at all times.

UUIDs are the best form of randomly generated key since they are, for all practical purposes, unique; the likelihood of the same UUID being generated twice is so low as to be negligible. UUIDs are also ubiquitous; nearly every platform has support for generating them built in, and for those that don't there's almost always a third-party library to take up the slack.
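
Where no built-in generator exists (JDK 1.4, for instance), a random (version 4) UUID is simple enough to build by hand. The following is a sketch rather than a vetted library; `RandomUuid` is a hypothetical class name, and it relies only on `java.security.SecureRandom`.

```java
import java.security.SecureRandom;

class RandomUuid {
    private static final SecureRandom RANDOM = new SecureRandom();
    private static final char[] HEX = "0123456789abcdef".toCharArray();

    /** Returns a random UUID in the usual 8-4-4-4-12 hex form. */
    static String next() {
        byte[] b = new byte[16];
        RANDOM.nextBytes(b);
        b[6] = (byte) ((b[6] & 0x0f) | 0x40); // set the version field to 4 (random)
        b[8] = (byte) ((b[8] & 0x3f) | 0x80); // set the variant bits to binary 10
        StringBuffer sb = new StringBuffer(36);
        for (int i = 0; i < 16; i++) {
            if (i == 4 || i == 6 || i == 8 || i == 10) {
                sb.append('-');
            }
            sb.append(HEX[(b[i] >> 4) & 0x0f]);
            sb.append(HEX[b[i] & 0x0f]);
        }
        return sb.toString();
    }
}
```

Calling `RandomUuid.next()` on any of the app servers yields a 36-character key with no coordination between them.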

The biggest benefit to using a randomly generated primary key is that you can build many complex data relationships (with primary and foreign keys) on the client side and (when you're ready to save, for example) simply dump everything to the database in a single bulk insert without having to rely on post-insert steps to obtain the key for later relationship inserts.

On the con side, UUIDs are 16 bytes rather than a standard 4-byte int -- 4 times the space. Is that really an issue these days? I'd say not, but I know some who would argue otherwise. The only real performance concern when it comes to UUIDs is indexing, specifically clustered indexing. I'm going to wander into the SQL Server world, since I don't develop against Oracle all that often and that's my current comfort zone, and talk about the fact that SQL Server will by default create a clustered index on the primary key columns of a table. This works fairly well in the auto-increment int world, and provides good performance for key-based lookups. Any DBA worth his salt, however, will cluster differently, but folks who don't pay attention to that clustering and who also use UUIDs (GUIDs in the Microsoft world) tend to get some nasty slowdowns on insert-heavy databases, because the clustered index has to be maintained on every insert and if it's clustered against a UUID, which could put the new key in the middle of the clustered sequence, a lot of data could potentially need to be rearranged to maintain the clustered index. This may or may not be an issue in the Oracle world -- I just don't know if Oracle PKs are clustered by default like they are in SQL Server.

If that run-on sentence was too hard to follow, just remember this: if you use a UUID as your primary key, do not cluster on that key!

Randolpho
It appears this works with JDK 1.5; my requirement is JDK 1.4.
Vicky
@apx1sharma: Ahh, I totally missed that constraint in your question. Here are two third-party libraries that should do the trick: [JUG](http://jug.safehaus.org/) and [Johann Burkard's UUID library](http://johannburkard.de/software/uuid/)
Randolpho
This was a very good explanation. Thank you very much.
Vicky
A: 

You should consider using IDs in the form of UUIDs. Java 5 has a class for representing them (and must also have a factory to generate them). With this factory class as a starting point, you can backport the code to your antiquated Java 1.4 to get the identifiers you require.

Riduidel
+1  A: 

You may find it helpful to look up UUID generation.

In the simple case, one program running one thread on each machine, you can do something such as

MAC address + time in nanoseconds since 1970.
djna
+1  A: 

Take a look at these strategies used by Hibernate (section 5.1.5 in the link). You will surely find them useful. It explains several methods and their pros and cons, and also states whether they are safe in a clustered environment.

Best of all, there is available code that already implements it for you :)

Sebastian
These approaches need the DB to be involved, but I'm looking for an approach that doesn't involve the database.
Vicky
You could choose the UUID algorithm and reuse it (it is explained near the same section): "The UUID contains: IP address, startup time of the JVM that is accurate to a quarter second, system time and a counter value that is unique within the JVM. It is not possible to obtain a MAC address or memory address from Java code, so this is the best option without using JNI."
Sebastian
A: 

You can generate a key based on a combination of the following three things:

  1. The IP address or MAC address of the machine
  2. The current time
  3. An incremental counter on each instance (to ensure the same key is not generated twice on one machine, since two keys created in quick succession may get the same timestamp because of the underlying clock precision)
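
The three parts above could be combined roughly like this. It is a sketch with made-up names (`HostTimeCounterKey`, `forLocalHost`) and an arbitrary hex layout, and it assumes one generator instance per host: two JVMs on the same machine would need an extra component (the Hibernate algorithm quoted earlier adds the JVM startup time for that reason). Only JDK 1.4 APIs are used.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class HostTimeCounterKey {
    private final String hostPart;
    private int counter = 0;

    /** Takes the raw address bytes (4 for IPv4) and stores them as hex. */
    HostTimeCounterKey(byte[] address) {
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i < address.length; i++) {
            int b = address[i] & 0xff;
            if (b < 0x10) {
                sb.append('0');
            }
            sb.append(Integer.toHexString(b));
        }
        hostPart = sb.toString();
    }

    /** Convenience factory that uses this machine's own address. */
    static HostTimeCounterKey forLocalHost() {
        try {
            return new HostTimeCounterKey(InetAddress.getLocalHost().getAddress());
        } catch (UnknownHostException e) {
            throw new IllegalStateException("cannot resolve local host: " + e.getMessage());
        }
    }

    /** Key layout: address, then milliseconds since 1970, then the counter, all in hex. */
    synchronized String next() {
        return hostPart + "-" + Long.toHexString(System.currentTimeMillis())
                + "-" + Integer.toHexString(counter++);
    }
}
```

Because the counter increases on every call, two keys from the same instance differ even when the clock has not ticked between them.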
Gopi
Yep. I think we can store a file on each host for the point #3 you suggested. Access to that file can be synchronized to prevent concurrent access, so that each thread gets a number once and only once, and the number is incremented before the method returns.
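
A rough sketch of such a file-backed counter, assuming a single JVM per counter file. `FileCounter` is a hypothetical name and storing the value as one 8-byte long is an arbitrary choice. `synchronized` only protects threads inside one JVM; if several processes shared the file, a file lock (for example via `FileChannel.lock()`, available since JDK 1.4) would be needed as well.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

class FileCounter {
    private final File file;

    FileCounter(File file) {
        this.file = file;
    }

    /** Returns the stored value and persists value + 1 before returning. */
    synchronized long next() {
        try {
            // "rwd" creates the file if needed and writes content changes through to disk
            RandomAccessFile raf = new RandomAccessFile(file, "rwd");
            try {
                long current = raf.length() >= 8 ? raf.readLong() : 0L;
                raf.seek(0);
                raf.writeLong(current + 1);
                return current;
            } finally {
                raf.close();
            }
        } catch (IOException e) {
            throw new IllegalStateException("cannot update counter file: " + e.getMessage());
        }
    }
}
```

Opening the file on every call is slow but keeps the sketch simple, and the counter survives a restart of the app server.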
Vicky
A: 

By using a Statement object, you can call the statement.getGeneratedKeys() method to retrieve the auto-generated key(s) produced by the execution of that Statement.

Java doc

Saifuddin
A: 

Here is how it's done in MongoDB: http://www.mongodb.org/display/DOCS/Object+IDs

They include a timestamp.

But you can also install Oracle Express and select from sequences; you can even select in bulk:

SQL> select mysequence.nextval from dual connect by level <= 20;

NEXTVAL

     1
     2
     3
     4
     5
    ..  
    20

Why are you not allowed to use the database? Money (Oracle Express is free) or a single point of failure? Or do you want to support databases other than Oracle in the future?

TTT