Hi, I run a small site and use PostgreSQL 8.2.17 (the only version available at my host) to store data. In the last few months the database system on my server crashed 3 times, and every time it happened 31 IDs from a serial field (the primary key) in one of the tables went missing. 93 IDs are missing now. The table:

CREATE TABLE "REGISTRY"
(
  "ID" serial NOT NULL,
  "strUID" character varying(11),
  "strXml" text,
  "intStatus" integer,
  "strUIDOrg" character varying(11),
)

It is very important for me that all the ID values are there. What can I do to solve this problem?

Sorry for my bad English.

A: 

Are you missing 93 records or do you have 3 "holes" of 31 missing numbers?

A sequence is not transaction-safe; it will never roll back. Therefore it is not a mechanism for generating a sequence of numbers without holes.

From the manual:

Important: To avoid blocking concurrent transactions that obtain numbers from the same sequence, a nextval operation is never rolled back; that is, once a value has been fetched it is considered used, even if the transaction that did the nextval later aborts. This means that aborted transactions might leave unused "holes" in the sequence of assigned values. setval operations are never rolled back, either.
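
For illustration, here is a minimal sketch of the behaviour described above (the sequence name demo_seq is made up for the example):

CREATE SEQUENCE demo_seq;

BEGIN;
SELECT nextval('demo_seq');   -- returns 1
ROLLBACK;                     -- the transaction is aborted

SELECT nextval('demo_seq');   -- returns 2, not 1: the aborted fetch left a hole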

Frank Heikens
I have 3 "holes" of 31 missing numbers.I use transactions only for updates. Inserts are made one at a time - when user submits a form so it can't be a transaction related problem.
Kayo
Well, some process did use the sequence to get a new number, and now you have some holes. But what is the real problem? An id doesn't have any meaning; it's just a pointer to a unique record.
Frank Heikens
The ID is used for statistics and payments. Technically I have no problems (everything works fine), but it is a very big problem for my boss and our client :(
Kayo
You can't use a sequence for this. You could upgrade to version 8.4 and use ROW_NUMBER() in your SELECT statements: http://www.postgresql.org/docs/current/static/functions-window.html Another option would be to lock the table, get MAX(id) and use that for your next id, but this might be slow.
Frank Heikens
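
A sketch of the two alternatives Frank mentions (the ROW_NUMBER() query needs PostgreSQL 8.4 or later; column names come from the table in the question, and 'ABCDEFGHIJK' is just a placeholder value):

-- gapless position computed at query time, unaffected by holes in "ID"
SELECT "ID", ROW_NUMBER() OVER (ORDER BY "ID") AS position
FROM "REGISTRY";

-- lock-and-MAX alternative: serializes inserts, so it can be slow
BEGIN;
LOCK TABLE "REGISTRY" IN EXCLUSIVE MODE;
INSERT INTO "REGISTRY" ("ID", "strUID")
VALUES ((SELECT COALESCE(MAX("ID"), 0) + 1 FROM "REGISTRY"), 'ABCDEFGHIJK');
COMMIT;
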
From src/backend/commands/sequence.c in the PostgreSQL source: "/* We don't want to log each fetching of a value from a sequence, so we pre-log a few fetches in advance. In the event of crash we can lose as much as we pre-logged. */ #define SEQ_LOG_VALS 32"
Matthew Wood
@Matthew: That's a parameter in the sequence definition, the CACHE: http://www.postgresql.org/docs/current/static/sql-createsequence.html
Frank Heikens
@Frank: No, that's different. This has to do with the WAL logs: making sure that after recovering from a crash you don't generate a number again, while at the same time not having to log every single generated number, which would hurt performance. There's another comment later in that source file that goes into more detail. The fact that it's 31 each time points to this as the reason (32 are fetched, and one gets used by the operation that triggered the fetch).
Matthew Wood
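
To see this pre-logging from SQL, sequences in this PostgreSQL era expose a log_cnt column, which is roughly how many more values can be handed out before another WAL record must be written. The sequence name below assumes the default <table>_<column>_seq naming for the serial "ID" column:

SELECT last_value, log_cnt, is_called FROM "REGISTRY_ID_seq";
SELECT nextval('"REGISTRY_ID_seq"');
SELECT last_value, log_cnt FROM "REGISTRY_ID_seq";   -- log_cnt counts down toward 0, then jumps back up to ~32
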
+1  A: 

You cannot expect a serial column to not have holes.

You can implement a gapless key by sacrificing concurrency, like this:

-- a single-row table holding the last assigned id
create table registry_last_id (value int not null);
insert into registry_last_id values (-1);

-- each call bumps the counter and returns the new value;
-- the updated row stays locked until the calling transaction ends
create function next_registry_id() returns int language sql volatile
as $$
    update registry_last_id set value=value+1 returning value
$$;

create table registry (
    id int primary key default next_registry_id()
    -- , other columns ...
);

But any transaction which tries to insert something into the registry table will block until the other inserting transaction finishes and writes its data to disk. Since each commit has to wait for a disk rotation to flush the WAL, this limits you to no more than about 125 inserting transactions per second on a 7500rpm disk drive (7500/60 = 125).

Also, any delete from the registry table will create a gap.

This solution is based on the article Gapless Sequences for Primary Keys by A. Elein Mustain, which is somewhat outdated.
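
A hypothetical usage sketch of the scheme above (the payload column is made up, since the real registry columns are elided):

BEGIN;
INSERT INTO registry (payload) VALUES ('first row');   -- gets id 0; the row in registry_last_id stays locked
-- any concurrent transaction calling next_registry_id() waits here ...
COMMIT;                                                 -- ... until this commit

SELECT id FROM registry ORDER BY id;   -- 0, 1, 2, ... with no gaps (until a row is deleted)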

Tometzky
A: 

Thanks to the answers from Matthew Wood and Frank Heikens I think I have a solution.

Instead of using a serial field I have to create my own sequence and set its CACHE parameter to 1. This way Postgres will not cache values and each one will be taken directly from the sequence :)

Thanks for all your help :)
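
For completeness, a sketch of what this would look like (but note the comments below: CACHE 1 is already the default, so this does not prevent the ~31 values lost after a crash):

CREATE SEQUENCE registry_id_seq CACHE 1;

-- the column would then use this sequence explicitly instead of serial, e.g.
-- "ID" integer NOT NULL DEFAULT nextval('registry_id_seq')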

Kayo
No, this would not make your key gapless. It would not change anything from what you had, as "cache" is 1 by default, also for serial columns. Any aborted transaction, whether because of an illegal value, an error, a crash, etc., will use up a value which will not be put into your table.
Tometzky
Take a look at my second comment to Frank. This setting is hard-coded into the source code and compiled in. It is not related to the CACHE setting. You will still lose 31 numbers on each crash.
Matthew Wood