views:

107

answers:

7

I have a multithreaded process which inserts several records into a single table. The inserts are performed in a stored procedure, with the sequence being generated INTO a variable, and that variable is later used inside of an INSERT.

Given that I'm not doing mysequence.nextval inside the INSERT itself, it makes me think that it is possible for two concurrent processes to grab a sequence in one order, then do the inserts in the reverse order. If this is the case, then the sequence numbers will not reflect the true order of insertion.

I also record the sysdate in a DATE column for each of my inserts, but I've noticed that often times the dates for two records match and I need to sort by the sequence number to break the tie. But given the previous issue, this doesn't seem to guarantee the actual insert order.

How can I determine the absolute order of insertion into the database?

+5  A: 

DATE datatypes only go to seconds, whereas TIMESTAMP goes to milliseconds. Would that address the problem?

According to Oracle's docs:

TIMESTAMP: Year, month, and day values of date, as well as hour, minute, and second values of time, where fractional_seconds_precision is the number of digits in the fractional part of the SECOND datetime field. Accepted values of fractional_seconds_precision are 0 to 9. The default is 6. The default format is determined explicitly by the NLS_DATE_FORMAT parameter or implicitly by the NLS_TERRITORY parameter. The sizes varies from 7 to 11 bytes, depending on the precision. This datatype contains the datetime fields YEAR, MONTH, DAY, HOUR, MINUTE, and SECOND. It contains fractional seconds but does not have a time zone.

Whereas date does not:

DATE: Valid date range from January 1, 4712 BC to December 31, 9999 AD. The default format is determined explicitly by the NLS_DATE_FORMAT parameter or implicitly by the NLS_TERRITORY parameter. The size is fixed at 7 bytes. This datatype contains the datetime fields YEAR, MONTH, DAY, HOUR, MINUTE, and SECOND. It does not have fractional seconds or a time zone.

Of course, having said that, I am not sure why it matters when the records were written, but that is a way that might solve your problem.

MJB
I probably should have worded it a little differently. I guess what I mean is, how can I determine in which order the insert statements were executed, rather than the order of the physical writes to the database files. I think going down to milliseconds will disambiguate things for me.
RenderIn
Actually, TIMESTAMP can go down to the nanosecond. It depends on the operating system.
APC
A: 

There are several effects going on. Todays computers can execute so many operations per second that the timers can't keep up. Also, getting the current time is a somewhat expensive operation so you have gaps that can last several milliseconds until the value changes. That's why you get the same sysdate for different rows.

Now to solve your insert problem. Calling nextval on a sequence is guaranteed to remove this value from the sequence. If two threads call nextval several times, you can get interleaved numbers (i.e. thread 1 will see 1 3 4 7 and thread 2 will see 2 5 6 8) but you can be sure that each thread will get different numbers.

So even if you don't use the result of nextval immediately, you should be safe. As for the "absolute" insert order in the database, this might be hard to tell. For example, a DB could keep the rows in a cache before writing them to disk. The rows could be reordered to optimize disk access. But as long as you assign the results from nextval to your rows in the order in which you insert them, this shouldn't matter and they should always appear to be inserted in order.

Aaron Digulla
A: 

While there may be some concept of insertion order in to a database, there is certainly no concept of retrieval order. Any rows that come back from the database will come back in whatever order the DB sees fit to return them in, and this may or may not have ANYTHING to do with the order they were inserted in to the database. Also, the order that rows are inserted in to the DB may have little to nothing related to how they are physically stored on disk.

Relying upon any order from a DB query without the use of an ORDER BY clause is folly. If you wish to be certain of any order, you need to maintain that relationship at a formal level (sequences, timestamps, whatever) in your logic when creating the records for insertion.

Will Hartung
Sorry, I may not have been clear enough in my question. I do have an ORDER BY on the date column, but since several records have the same date value I was also ordering by the sequence ID.
RenderIn
A: 

If the transactions are separate, you can determine this from the ora_rowscn pseudo-column for the table.

[Edit] Some more detail, and I'll delete my answer if this is not of use - unless you created the table with the non-default "rowdependencies" clause, you'll have other rows from the block tagged with the scn, so this may be misleading. If you really want this information without an application change you'll have to rebuild the table with this clause.

dpbradley
+1  A: 

Sequence should be thread safe:

create table ORDERTEST (
    ORDERID number not null ,
    COLA   varchar2(10) ,
    INSERTDATE date default sysdate,
    constraint ORDERTEST_pk primary key (orderid)
) ;

create sequence ORDERTEST_seq start with 1 nocycle nocache ;

insert into ORDERTEST (ORDERID, COLA, INSERTDATE)
                select ORDERTEST_SEQ.NEXTVAL , substr(OBJECT_NAME,1,10), sysdate 
                  from USER_OBJECTS 
                 where rownum <= 5; --just to limit results

select * 
  from ORDERTEST
 order by ORDERID desc ;

 ORDERID                COLA       INSERTDATE                
---------------------- ---------- ------------------------- 
5                      C_COBJ#    16-JUL-10 12.15.36        
4                      UNDO$      16-JUL-10 12.15.36        
3                      CON$       16-JUL-10 12.15.36        
2                      I_USER1    16-JUL-10 12.15.36        
1                      ICOL$      16-JUL-10 12.15.36  

now in a different session:

insert into ORDERTEST (ORDERID, COLA, INSERTDATE)
            select ORDERTEST_SEQ.NEXTVAL , substr(OBJECT_NAME,1,10), sysdate 
              from USER_OBJECTS 
             where rownum <= 5; --just to limit results

select * 
  from ORDERTEST
 order by ORDERID desc ;

 5 rows inserted
ORDERID                COLA       INSERTDATE                
---------------------- ---------- ------------------------- 
10                     C_COBJ#    16-JUL-10 12.17.23        
9                      UNDO$      16-JUL-10 12.17.23        
8                      CON$       16-JUL-10 12.17.23        
7                      I_USER1    16-JUL-10 12.17.23        
6                      ICOL$      16-JUL-10 12.17.23     

The Oralce Sequence is thread safe: http://download.oracle.com/docs/cd/B19306_01/server.102/b14231/views.htm#ADMIN020 "If two users are accessing the same sequence concurrently, then the sequence numbers each user receives might have gaps because sequence numbers are also being generated by the other user." the numbers may not be 1,2,3,4,5 (as in my example --> if you fear this you can up the cache)

this can also help, although they do not site their source: http://forums.oracle.com/forums/thread.jspa?threadID=910428 "the sequence is incremented immediately and permanently, whether you commit or roll back the transaction. Concurrent access of NextVal on a sequence will always return separate values to each caller."

If your fear is the inserts will be out of order and you need the sequence value use the returning clause:

declare 
x number ;
begin 
insert into ORDERTEST (ORDERID, COLA, INSERTDATE)
                values( ORDERTEST_SEQ.NEXTVAL , 'abcd', sysdate)
                returning orderid into x;

dbms_output.put_line(x);
end;

--11

then you know it got inserted right then and there.

tanging
A: 

Given your description of the issue you're trying to resolve, I would think that the sequences would be fine. If you have two processes that call the the same stored procedure and the second (chronological) one finishes first for some reason is that actually relevant? I would think the order in which the procedures were called (which will be reflected by the sequence (unless you're using RAC)) would be more meaningful than the order in which they were written to the database.

If you're really worried about the sequence the rows were inserted in, then you need to look at when the commits were issued, not when the inserts statements were issued. Otherwise you have the following scenario as a possibility:

  1. Transaction 1 is started
  2. Transaction 2 is started
  3. Transaction 3 is started
  4. Transaction 2 inserts
  5. Transaction 1 inserts
  6. Transaction 3 inserts
  7. Transaction 3 commits
  8. Transaction 1 commits
  9. Transaction 2 commits

In this case Transaction 1 was started first, Transaction 2 inserted first, and Transaction 3 committed first. The sequence number gives you a good idea of the order in which the transactions were started. A timestamp field will give you an idea of when the inserts were issued. The only reliable way to get an order for the commits is to serialize writes to the table, which is generally a bad idea (it removes scalability).

Allan
A: 

You should (a) add the timestamp to each record, and (b) move the sequence NEXTVAL to the INSERT statement.

That way, when you query the table, you can ORDER BY timestamp, id, which will effectively be the order in which the rows were actually inserted.

Jeffrey Kemp