views:

87

answers:

5

I am designing a laboratory database. Several products, samples, etc are identified by a composite number with multiple parts which indicate different values such as: origin, date, type, id today, etc. Examples of composite numbers might include a driver's license number (X44-555-3434), lot number (XBR-A26-500-2).

How should composite numbers be stored in a database? Should they be stored as a string or should each component of the composite number be stored (or derived) separately?

NOTE: Use Oracle if the question cannot be answered generally.

+3  A: 

It depends on what you eventually want to query. If you want to look for parts of the string separately, save it separately. If you want to look for the whole code (with or without the dashes, that can be dealt with), store it as a single string.

Sinan Taifour
+1  A: 

Also, if you decide to store them in separate columns, it would be a good idea to make a unique key out of them (unique key containing all the columns).

Jon
+1  A: 

I would store each coded piece of the composite number separately, possibly as the actual value the code is meant to represent (i.e. instead of storing "XBR" store "January 12, 2007" as a timestamp). But that would depend on whether you expect to look up items by coded composite numbers more often, or by their actual semantics.

Then I would keep the mapping between the codes and the actual values somewhere in the database as well. So I'd have a few small tables that store two columns each, one representing codes and one representing, for example, timestamps.

jprete
+8  A: 

In my experience, if there are elements of a string that have meaning, it's best to put them in their own field. The gymnastics we go through trying to tease out the meaning are complex and error-prone; also it is easier to do data validation when we deal with each field explicitly. It's easy to construct the composite string, easy to search on the constructed string (albeit sometimes difficult to index). Fine-grained storage has always worked best for me.

Carl Manaster
+1, spoken like a true data analyst ;-)
DCookie
+1, I've seen some pieces of code that did not adhere to this store-the-pieces principle, and it was always regretted.
Rob van Wijk
+1  A: 

Is the derivation of these numbers likely to change and how would expect that to impact the application ?

For example, years ago car registration plates in Britain started with three letters (which I think indicated a region), followed by three digits, followed by another letter which indicated the year of registration.

Eventually they ran out of letters for the suffix, and switched the ordering round so rather than having "ABC 123 A", you may have "A 123 ABC", and it would have been possible to have two cars registered at the same time, one with "ABC 123 A" and the other "A 123 ABC". Oh, and then they 'sped' up the use of the single letter year indicator (as lots of people were waiting until the new letter came into effect before buying a car), so it not longer indicated a year.

If you were just interested in registration plates, then you would have been best off storing the registration plate as one value. However the application responsible for issuing the numbers probably needed them broken down into their components.

Gary