I am using a Hibernate SequenceGenerator to auto-generate unique values for my primary key column. The sample code is given below.

import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.SequenceGenerator;
import javax.persistence.Table;

@Entity
@Table(name = "REQUEST")
@SequenceGenerator(name = "REQ_SEQ", sequenceName = "REQUEST_SEQ")
public class Request {

    /**
     * Unique id for this request.
     */
    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "REQ_SEQ")
    @Column(name = "REQ_ID")
    private long requestId;

    // ...and so on
}

Everything works fine except that the generated values have gaps. For example, it inserts values from 5000 to 5015 (16 inserts), and then the 17th insert produces the value 5100. Then it works fine for a few subsequent inserts, and then the problem recurs. I don't have any problem as long as the generated values are unique, but I am curious to know what could be causing this. FYI, I am using Oracle.

+9  A: 

Oracle sequences work that way: they guarantee uniqueness, but they do not guarantee consecutive values, as that would hamper parallelism.

What they do internally is more or less this: when you request the next value in the sequence, Oracle precomputes a chunk of values (in your case 5000-5099), puts it in a sequence cache, and then sets seq.nextval = 5100 on disk. But if, due to activity, the DB has to discard your chunk of values from the cache, then the next time seq.nextval is accessed it will take another chunk, 5100-5199. That is, Oracle does not even try to recover sequence values that were placed in the cache but never handed out.

The point of this is that the sequence cache is a memory structure that is faster and more parallelizable than the sequence itself, which is an on-disk structure. Since we want to scale up, we want to avoid going to disk as much as possible.

You can control the chunk size for a given sequence using the `CACHE` clause in your sequence DDL:

CREATE SEQUENCE seq2
CACHE 50;
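
If you want to check the chunk size of an existing sequence, the data dictionary exposes it. For example (using the REQUEST_SEQ name from the question; LAST_NUMBER is the value stored on disk, 5100 in the example above):

-- Cache size and on-disk "next value" for the question's sequence
SELECT sequence_name, cache_size, last_number
FROM user_sequences
WHERE sequence_name = 'REQUEST_SEQ';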
gpeche
What would be the optimal chunk size? Would a lower chunk size affect performance? Thanks for the info.
chedine
@gpeche Shouldn't DDL commands be left to JPA, to stay independent of the DB?
stacker
There is no optimal `CACHE` size. Smaller `CACHE` values will affect performance as Oracle has to work with disk/buffer cache more frequently. But the real hit is in parallelism, as getting sequence values from the cache is much more parallelizable than getting them directly from the sequence. The default value for `CACHE` is 20, which tends to be *too low*. You would want it at least in the hundreds for a reasonably busy sequence. But, as always, **measure first**. Also, see http://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:6393918681776/thread.jspa?threadID=552069
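
For completeness, the cache can be changed on an existing sequence without dropping and recreating it, e.g. (again assuming the REQUEST_SEQ name from the question):

-- Raise the cache from the default of 20; measure before and after
ALTER SEQUENCE REQUEST_SEQ CACHE 100;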
gpeche
@stacker I think it actually depends on your application. If production environments are going to have DBAs available, I think you should have them do the tuning with ad hoc scripts; this is the case for many enterprise applications. But if the app is going to be deployed "as is", then it should ship with sensible defaults, and those fit better as JPA annotations, to make installation as automated and easy as possible.
gpeche
@gpeche +1 good point about DBAs
stacker
I just found that the (inserted) sequence values are unique, but I do notice a strange behavior: rows inserted yesterday have values from 5600-5650, whereas rows inserted today have values from 5500-5550. How come?
chedine
Are you working on RAC? If you inserted yesterday from node A (sequence cache 5600-5650) and today from node B (sequence cache 5500-5550), that would explain it. Oracle has an `ORDER` clause that forces the sequence to be ordered, with a big hit to parallelization, so the default is `NOORDER`. On single-machine DBs the sequence should be ordered regardless of this setting.
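
If you need ordered values across RAC nodes and can live with the parallelism hit, the clause can be set on the existing sequence, e.g. (assuming the REQUEST_SEQ name from the question):

-- Force cross-instance ordering; costs parallelism on RAC
ALTER SEQUENCE REQUEST_SEQ ORDER;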
gpeche