views:

143

answers:

13

NOTICE: Appericiate all the answers, thanks but we already have a sequence.. and can't use the UNIQUE contraints because some items need to have duplicates.. I need to handle this using PLSQL somehow, so based on some criteria (using an if statement) i need to ensure there is no duplicates for that.. And just to confirm, these identifiers are very customized with different types of string.. And for each set of string, we have a number taht counts up only for that string (STR-STR-####) and we have like hundreds of these STR-STR combinations and for each combination we have a #### that counts up... And on top of these some STR-STR combinations are allowed to have duplicates.. So we CAN'T use UNIQUE CONTRAINTS, not can we use PRIMARY KEY as it's not a simple number and on top of that we do have a PRIMARY KEY assigned to the each item.. These Identifiers are for the users and not for the database management.

When a user creates an item, based on some creteria, we give a certain number to the item.. In the save function, a function is called to prepare the first initial string, then the table where the numbers are kept is scanned and the next available 4 digit number for that particular string is assigned.

Now there is some issues with this, there is about 1000 lines of codes between getting the next number and the commit to the database. The issue is when 2 people create an item with the same criteria within seconds of each other, sometimes the same number is issued.

What i've done is that just before the commit, i check the database for the number and if it exist, i call the function to get the next available number again...

Even though this code decreased the chances of a duplictation, if i save 2 items exactly at the same time, i still get a duplicate number..

Anyone know any way to avoid duplicates and drop the chances of a duplicate number to a 0?

EDIT1: We already have a primary key.. This identifier is a special string, and very customized, therefore it woudn't be possible to just use a number that counts up from 0

EDIT2: There is also some cases where we need duplicates.. this is very minor (maybe about 10 different item tracks use duplicates) so before i do my checks for duplicates before the commit with an if statment, so if the item saves dosn't belong to one of the systems that is supposed to have duplicates then, i skip the check...

EDIT 3: We are using PL/SQL here

EDIT 4: I guess this problem was very specific and i didn't quite convey it so.. Although there were many answers, noone of them really captured my problem.. Regardless,i solved the problem and added below as my answer..

+1  A: 

Use a UNIQUE INDEX on that column, or a PRIMARY KEY!

Benoit
+1 Can't agree more
m.edmondson
we already have a primary key... This identifier is very customized, and it's determined based on many factors... I would say we have about 200 different tracks that are counting up (200 different strings+####) The strings are for the users to distinguish their items based on certain creteria.. Therefore can't use a primary key for this as it would just gimme a single number.. that counts up... I'm looking into the Unique Index though
Wonder
Can't use unique index as some of the number need to be duplicates.. See EDIT2
Wonder
+5  A: 

Look up the command CREATE SEQUENCE. Oracle can handle the unique number generation for you. (Virtually) Every database has a way of handling this issue although they differ somewhat in how it's implemented.

[In response to the new requirements edited into the question]

You can still use a SEQUENCE to produce the counting-up part, then combine that with the prefix and save it back to the database. And, in those rare cases in which you need a duplicate ID, don't get a number from the SEQUENCE, just use the one you already have. But SEQUENCE will solve your problem of creating a "well" from which you can draw a guaranteed unique number when needed

Larry Lustig
like i mentioned in the edit.. This isn't a single number that just counts up.. Our identifiers contain different characters plus a number that counts up.
Wonder
See edited answer, above.
Larry Lustig
the counting up number doesn't count up system wide but only for that 'STR-STR-####' combination.. Do you mean that i should create a sequence for each 'STR-STR-####' combination and grab the number for each one of these STR-STR sequences.. That will mean we'll have like 3-400 sequences in our database which will be growing fast...
Wonder
Yes, you use a separate SEQUENCE for each sequence you need. However, if you need to create new sequences based on changes to the _data_ in the system (eg, each category added to the categories table requires a new sequence) you might reconsider your design. Doing the unique-enforcement in code will be somewhat complex and probably require table-locking at some point. If you MUST do it that way, the appropriate place for the code is the INSERT trigger for the table.
Larry Lustig
+1  A: 

Now there is some issues with this, there is about 1000 lines of codes between getting the next number and the commit to the database. The issue is when 2 people create an item with the same criteria within seconds of each other, sometimes the same number is issued.

This is worrying. You should really be using a stored procedure for this and wrap it all in a transaction. You got nothing guaranteeing that any two records will have a different number and will cause problems. But I agree - you need to set a column as a unique identifer and use primary keys - it is a relational database I assume!

m.edmondson
it's actually like 350 lines :) lol
Wonder
lol any amount of lines is still going to be a delay of some sort...
m.edmondson
A: 

Create a PRIMARY KEY column or a UNIQUE KEY column that is auto incremented (either with an AUTOINCREMENT keyword or via SEQUENCE (as Oracle does)). That way, if 2 people add 2 exact data row at the exact same time, the database will add 2 exact values but each with their own unique ID.

The Elite Gentleman
A: 

Yuor code should definitely be replaced with a SEQUENCE. Oracle manages this concurrency very well.

Also, this sequence should be saved in a column that is constrained as UNIQUE.

Randy
+1  A: 

Hi Wonder,

You might need some sort of serialization process to go through this. One path I would recommend is leaving this field blank during the initial insert, wait for the procedure to commit and then have another process (a recurring job for example) fill this column.

This other process would fill all rows according to your business rules. This job would be the only one able to touch this column and you would therefore avoid any concurrency problem.

You could set the job to run every X seconds, this would imply a small delay during which the column would be empty in this setup. Or you could let the initial session launch the update process after the commit with some sort of serialization (a preemptive lock on a row so that no two update process can run at the same time).

Vincent Malgrat
Hi Vincent, thanks for the reply but after the saving, email notifications go out and the number has to be included in the email notifiations, thefore even a few seconds of delay will effect the email notifications
Wonder
@Wonder: the naming process could send the email once it has attributed the correct id :)
Vincent Malgrat
+5  A: 

It sounds like you've denormalised 3 pieces of data into 1 field. This is what your string field currently contains:

  • StringField char (12): STR-STR-####

This is what you really should have (example field names only; it would help to give these meaningful names, but I don't know what your data represents):

  • Str1 char (3): STR
  • Str2 char (3): STR
  • ID int: ####

You can now use a sequence in the ID field.

When you want to get your original string back, you concatenate the contents of the Str, Str2 and ID fields (with dividing hyphens).

In short: Your database design is broken, and you're now paying the price for it. My advice would be to fix the design by normalising your identity field into 3 separate fields. Alternatively, you'll find that you spend hours recreating built-in functionality of the database and end up with a solution that is buggy and has horrible problems with race conditions.


Alternatively, if PL/SQL allows all of this functionality:

Create a table that contains the following fields:

  • Str1 char (3)
  • Str2 char (3)
  • CurrentID int

Then:

  • For each possible combination of STR-STR identifiers, add an entry into the database, with the "CurrentID" value set to 0.
  • Write a stored procedure for retrieving the next ID. It will lock the relevant row based on the STR-STR pair passed in, get the value in CurrentID, increase the value, unlock the row and return the increased value.
  • Call the procedure whenever you need to generate a new ID.

You'll have no concurrency problems, as each attempt to get an ID will have to wait for any other attempts to finish. Each STR-STR pair will have its own counter.

Ant
the second idea is not bad at all... So when i call the stored procedure, it locks that row by default but the changes that i've done to that table doesn't get reflected on it until the calling procedure finishes running, correct? So this might create some errors? How can i commit only changes to that table and not commit the changes i've made in the calling procedure... (we can't commit until the very end because sometimes if an error occurs, we need to roll back all the changes.. Although it will be acceptable not to rollback the changes to the CurredntID table)
Wonder
If you lock a row then any other procedure attempting to read the row waits until the lock is released. So, if Joe and Bob try to get a new ID at the same time: - Joe locks the row; - Bob tries to read the row but can't, so waits; - Joe reads the row; - Joe increases the ID; - Joe unlocks the row; - Bob locks the row; - Bob reads the row; - Bob increases the ID. This will no doubt cause a bottleneck if you have a lot of simultaneous users, but you've really painted yourself into a corner with the current design.
Ant
"If you lock a row then any other procedure attempting to read the row waits until the lock is released" I don't think that is correct, Oracle doesn't have read locks. If the readers are trying to talk their own lock (`select for update`) then they would wait.
Shannon Severance
Yep, you're right. You'd need to use SELECT FOR UPDATE (which is what we're trying to do anyway - select the value and update it), and I think other readers attempting to use SELECT FOR UPDATE would need to wait for the lock to be released before they could continue. I could be wrong; MSSQL and MySQL are my DBs of choice, not Oracle.
Ant
A: 

You could add a table to hold all the identifiers that have been used and need to be unique like this:

create table unique_identifiers (id varchar2(12) primary key);

Then whenever in your 350-line logic you generate a 'STR-STR=####' value that needs to be unique, insert it into that table. If the insert fails, get another number and try again.

Tony Andrews
+1  A: 

This could be done using a function-based unique index. First you'd need to add a column to the table to specify whether the identifier needs to be unique for each row:

alter table ... add (identifier_must_be_unique varchar2(1)
    check (identifier_must_be_unique='Y'));

Now create a unique FBI that only holds the identifiers that need to be unique:

create unique index xxx on yyy 
(case when identifier_must_be_unique='Y' then identifier end);

Finally, in your identifier generation logic, whenever you need the identifier to be unique, set identifier_must_be_unique='Y'; otherwise leave it as null). The FBI will then implement the conditional constraint.

Tony Andrews
A: 

You have a broken data model. But, i'm assuming fixing it isn't a viable option. If you biggest problem is the duplicate keys when you run two quick selects against the table that contains the sequence values (you should use sequences here... but you know that)... then you can try to use the "select ... for update" which will lock the row for the session accessing it. Be careful, this logic may introduce latency into the application.

erbsock
A: 

I may be missing something obvious, but I'd suggest something similar to what someone else suggested earlier. Create a unique constraint on the column such that when you try to persist the record, it will fail if some other process has already used that custom generated ID. You then can trap that exception in the PL/SQL and implement some sort of retry logic until you are able to successfully insert a unique generated ID into the database.

Dave LeJeune
A: 

Well noone of the answers really solved the problem, mostly because i didn't convey the whole situtation properly..

But basically i put my check just before the commit into a loop and updated and committed and checked again within the same loop.. and if still existed, the loop ran again, in which case the check happened.. It's kinda hard to explain but the chances of a duplicate is extremly low (the loop has a limit of 100)

Wonder
This is the "close your eyes and hope for the best" approach to concurrency issues...
Ant
the only way that this would actually produce a duplicate is when the loop hits the limit of 100 because there is like 20 ppl creating an item for that particular STR-STR combo (which is almost impossible) then our support team would get the error email, and correct the duplicate by assigning a new time but this is very highly unlikely
Wonder
+1  A: 

I almost hate to suggest this, but since the approach that you settled on was pretty lousy then the one I'm going to show here is pretty good in comparison. Forgive the sarcasm, but when you have this many users telling you that your approach in general is wrong you ought to take that into consideration.

Somewhere you are probably running a SQL like this:

SELECT MAX(DENORMALIZED_FIELD)
INTO   BADLY_NAMED_VARIABLE
FROM   POORLY_ORGANIZED_TABLE
WHERE  DENORMALIZED_FIELD LIKE 'ABC-XYZ%';

Then later you're probably split the variable using SUBSTR, parse the second part into a NUMBER, increment it, and build a new variable with the new code.

What you can do is add the FOR UPDATE clause to the SELECT statement, placing a lock on the records in question. Sure, you're not actually updating them, but by your definition you have to serialize the operation. This is a non-performant, unscalable, and dirty way to get what you want, but it should work. Check the Oracle docs to see all the implications.

Adam Hawkes
thanks man, no worries about the sarcasm, it's forgiven, i'm on my coop term (finished 2nd year) and i've been dealing with databases for about 5 months now.. I will use this though..
Wonder
Also one question i have is, once we lock the table where I select the max(denormalized_field), and the second request by a user is made, what happens when it tries to select from there? Is it going to give it the MAX(Denormalized_Field) - 1 because it doesn't see the actual MAX row which is locked. Sorry still fairly new to the concept of locking as well
Wonder
Oh, man, a student! Now I feel bad. I hope that once you've graduated you'll end up in a place where schemas like this get refactored into something reasonable. To address your concern, the second user should be waiting on the lock imposed by the `FOR UPDATE`. This is one of those cases where Oracle will wait on a read! That said, you really ought to give this a shot first to confirm. I've not recreated your entire example!
Adam Hawkes
lol it's all good,no need to feel bad..I love my job, i get so involved.. even though i've never thought that i would be working on databases.. Regardless, it would be great if you could answer my previous question, and also I was wondering why you would 'hate to suggest' this
Wonder
I hate to suggest this because, as demonstrated in the other answers, there are a lot of things wrong here in the data model. Fixing these things would make this particular problem simpler / non-existant. Also, locking rows is something to be avoided. Serialization like this is bad for scalability and performance (generally speaking).
Adam Hawkes
I will test it first, database is shitty but we do still have dev schemas :) Anyways I think this should work if oracle will wait on a read and not throw and error
Wonder