views:

253

answers:

2

I've been working on text parsing task when I found strange Postgres behavior. My original code exposing strange error was written in Java with JDBC connectivity for PostgreSQL (v8.3.3 and v8.4.2 tested), here is my original post: Is it an error of PostgreSQL SQL engine and how to avoid (workaround) it?. I've just ported my Java code given there to pure plpgsql and it gives the same errors (same behavior as described in original post).

Simplified code has now nothing to do with parsing - it just generates pseudo-random (but repeatable) words and inserts them after normalizing (table spb_word holds unique words and ids, they are referenced by id in final table spb_obj_word and table spb_word4obj works as input buffer).

Here are my tables (c&p from OP):

create sequence spb_word_seq;

create table spb_word (
  id bigint not null primary key default nextval('spb_word_seq'),
  word varchar(410) not null unique
);

create sequence spb_obj_word_seq;

create table spb_obj_word (
  id int not null primary key default nextval('spb_obj_word_seq'),
  doc_id int not null,
  idx int not null,
  word_id bigint not null references spb_word (id),
  constraint spb_ak_obj_word unique (doc_id, word_id, idx)
);

create sequence spb_word4obj_seq;

create table spb_word4obj (
  id int not null primary key default nextval('spb_word4obj_seq'),
  doc_id int not null,
  idx int not null,
  word varchar(410) not null,
  word_id bigint null references spb_word (id),
  constraint spb_ak_word4obj unique (doc_id, word_id, idx),
  constraint spb_ak_word4obj2 unique (doc_id, word, idx)
);

and code ported to plpgsql from original Java code:

create sequence spb_wordnum_seq;

create or replace function spb_getWord() returns text as $$
declare
  rn int;
  letters varchar(255) :=   'ąćęłńóśźżjklmnopqrstuvwxyz';
                          --'abcdefghijklmnopqrstuvwxyz';
  llen int := length(letters);
  res text := '';
  wordnum int;
begin
  select nextval('spb_wordnum_seq') into wordnum;

  rn := 3 * (wordnum + llen * llen * llen);
  rn := (rn + llen) / (rn % llen + 1);
  rn := rn % (rn / 2 + 10);

  loop
    res := res || substring(letters, rn % llen, 1);
    rn := floor(rn / llen);
    exit when rn = 0;
  end loop;

  --raise notice 'word for wordnum=% is %', wordnum, res;

  return res;
end;
$$ language plpgsql;



create or replace function spb_runme() returns void as $$
begin
  perform setval('spb_wordnum_seq', 1, false);
  truncate table spb_word4obj, spb_word, spb_obj_word;

  for j in 0 .. 50000-1 loop

    if j % 100 = 0 then raise notice 'j = %', j; end if;

    delete from spb_word4obj where doc_id = j;

    for i in 0 .. 20 - 1 loop
      insert into spb_word4obj (word, idx, doc_id) values (spb_getWord(), i, j);         
    end loop;

    update spb_word4obj set word_id = w.id from spb_word w 
    where w.word = spb_word4obj.word and doc_id = j;

    insert into spb_word (word) 
    select distinct word from spb_word4obj 
    where word_id is null and doc_id = j;

    update spb_word4obj set word_id = w.id 
    from spb_word w 
    where w.word = spb_word4obj.word and 
    word_id is null and doc_id = j;

    insert into spb_obj_word (word_id, idx, doc_id) 
    select word_id, idx, doc_id from spb_word4obj where doc_id = j;
  end loop;
end;
$$ language plpgsql;

To run this simply execute select spb_runme() as SQL statement.

Here is first example of error:

NOTICE:  j = 8200
ERROR:  duplicate key value violates unique constraint "spb_word_word_key"
CONTEXT:  SQL statement "insert into spb_word (word) select distinct word from spb_word4obj where word_id is null and doc_id =  $1 "
PL/pgSQL function "spb_runme" line 18 at SQL statement

and second one:

NOTICE:  j = 500
ERROR:  null value in column "word_id" violates not-null constraint
CONTEXT:  SQL statement "insert into spb_obj_word (word_id, idx, doc_id) select word_id, idx, doc_id from spb_word4obj where doc_id =  $1 "
PL/pgSQL function "spb_runme" line 27 at SQL statement

These errors occur in unpredictable manner - every time in different iteration (j) and with different words causing error.

When polish national characters (ąćęłńóśźż) are eliminated from generated words (line letters varchar(255) := 'ąćęłńóśźżjklmnopqrstuvwxyz'; becomes letters varchar(255) := 'abcdefghijklmnopqrstuvwxyz';) there is no error! My DB is created with UTF-8 encoding, so there should be no problem with non-ascii chars, but apparently it is very important!

Now my question: what is wrong with my code? Or is it something serious wrong with PostgreSQL? How to work around this error?

BTW: If it is an error in PostgreSQL engine, then how this DB can be trustworthy? Should I move to one of free alternatives (e.g. MySQL)?


UPDATE: extra explanations (mainly for OMG Ponies)

If I remove unnecessary delete - I still have the same errors.

Function spb_getWord() must generate words with duplicates - it simulates text parsing and dividing it to words - and some words get repeated - that is normal and rest of my code is dealing with duplicates. Because of possible duplicates generated by spb_getWord() I insert words to buffer table spb_word4obj and then I update word_id in this table for already processed words from spb_word. So now - if row in spb_word4obj has word_id not null - then it is a duplicate, so I will not insert this word into spb_word. But - as OMG Ponies mentioned, I get error duplicate key value violates unique constraint which means my code that handles duplicates properly fails. I.e. my code fails because of internal Postgres error - proper code is somehow badly executed by Postgres and fails.

After inserting new words (duplicates recognized and marked not to be inserted) into spb_word my code finally inserts normalized words into spb_obj_word - replacing word body with reference to unduplicated entry in spb_word, but this again fails sometimes because of Postgres internal error. Again I think my code is proper but it fail because there is problem in Postgres SQL engine itself.

Adding or removing polish national letters from generated words by spb_getWord only assures me that it is strange Postgres error - all unique/duplicates considerations remain the same but allowing/disallowing some letters from words leads to errors or eliminates them. So it is not the case of error in my code - unproper handling of duplicates.

Second thing that assures me that it is not error in my code is unpredictable moment of errors that are detected. Every run of my code does the same sequence of words so it should always break in the same place with the same value causing the error. But it is not - it is quite random moment that it fails.

+2  A: 

NOTICE: j = 8200
ERROR: duplicate key value violates unique constraint "spb_word_word_key"
CONTEXT: SQL statement "insert into spb_word (word) select distinct word from spb_word4obj where word_id is null and doc_id = $1 "
PL/pgSQL function "spb_runme" line 18 at SQL statement

...is telling you that your spb_getWord() is generating values that already exist in the SPB_WORD table. You need to update the function to check if the word already exists before exiting the function - if it does, re-generate until it hits one that doesn't.

I think your spb_runme() needs to resemble:

create or replace function spb_runme() returns void as $$
DECLARE
  v_word VARCHAR(410);

begin
  perform setval('spb_wordnum_seq', 1, false);
  truncate table spb_word4obj, spb_word, spb_obj_word;

  for j in 0 .. 50000-1 loop

    if j % 100 = 0 then raise notice 'j = %', j; end if;

    for i in 0 .. 20 - 1 loop
      v_word := spb_getWord();
      INSERT INTO spb_word (word) VALUES (v_word);

      INSERT INTO spb_word4obj 
        (word, idx, doc_id, word_id)
        SELECT w.word, i, j, w.id
          FROM SPB_WORD w 
         WHERE w.word = v_word;

    end loop;

    INSERT INTO spb_obj_word (word_id, idx, doc_id) 
    SELECT w4o.word_id, w4o.idx, w4o.doc_id 
      FROM SPB_WORD4OBJ w4o 
     WHERE w40.doc_id = j;

  end loop;
end;

Using this would allow you to change the word_id to not support NULLs. When dealing with foreign keys, populate the table the foreign key references first - start with the parent, and then tackle its children.

The other change I made was to store the spb_getWord() in a variable (v_word), because calling the function multiple times means you'd get a different value every time.

Last thing - I removed the delete statement. You already truncated the table, there's nothing in there to delete. Certainly nothing associated to a value of j.

OMG Ponies
please look at update to my question - I give explanations why it is not a problem with duplicates in `spb_getWord()` but rather strange error of Postgres.
WildWezyr
I've changed my code to do it one by one with subsequent words - without buffer table `spb_word4obj` - now it looks similar to your proposition. But... it still fails in different loop iteration every time. It seems that checking for duplicate passes (no duplicate) but it is wrong and then code fails when inserting word into `spb_word` because of duplicated record.
WildWezyr
please look at my own answer to this question it shows simplest code i was able to get for exposing this error. i'm still hitting unique key constrait and i still think it is postgres error (not my fault).
WildWezyr
+1  A: 

I've managed to simplify test code - now it uses one table. Simplified problem was posted on pgsql-bugs mailing list: http://archives.postgresql.org/pgsql-bugs/2010-01/msg00182.php. It is confirmed to occur on other machines (not only mine).

Here is this simplified version of main test function (it needs one table spb_word, sequences spb_wordnum_seq and spb_word_seq and one function spb_getWord given in my question).

create or replace function spb_runmeSimple2(cnt int) returns void as $$
declare
  w varchar(410);
  wordId int;
begin
  perform setval('spb_wordnum_seq', 1, false);
  truncate table spb_word cascade;

  for i in 1 .. cnt loop

    if i % 100 = 0 then raise notice 'i = %', i; end if;

    select spb_getWord() into w;
    select id into wordId from spb_word where word = w;
    if wordId is null then 
      insert into spb_word (word) values (w);
    end if;

  end loop;
end;
$$ language plpgsql;

Now error occurs (but in unpredictable manner) while executing select spb_runmeSimple2(10000000).

Here is workaround: change database collation from polish to standard 'C'. With 'C' collation there is no error. But without polish collation polish words are sorted incorrectly (with respect to polish national characters), so problem should be fixed in Postgres itself.

WildWezyr
`Flaw`: `i` will increment even if a word already exists. The larger the number of words you want generated, the more likely the counter will increase without actually generating unique words. `Solution`: Check that the word is unique *before* leaving the word generation function.
OMG Ponies
With the Polish collation, try: `SELECT id INTO wordId FROM SPB_WORD WHERE word LIKE w;`
OMG Ponies
@OMG Ponies: 1) there is no flaw with loop control variable `i` - it is meant to increment with every generated word (duplicate or not) and it does correctly increment. 2) Substituting equality (`=`) with `like` helps. This is strange because like and equality are equivalent when there is not percent char (`%`) involved. But this workaround will not help when there are words generated with percent char in it. BTW: do you know why using `like` helps - what difference it makes that is important and eliminates errors?
WildWezyr
@WildWezyr: `i` will increment if a word already exists. That means your counter will run up to your limit without producing an equivalent number of unique words if the word generation produces any duplicates. IE: loop 100 times, and get less 100 actual word records.
OMG Ponies
I can't remember the details to why `LIKE` works vs equality, didn't know Polish had `%` as a valid character.
OMG Ponies
@OMG Ponies: 1) "`i` will increment if a word already exists" - thats perfectly correct - i want to count iterations of the loop, not number of words inserted into spb_word. 2) imagine i want to insert two words `wódka` and `40%` - this is just a joke to show use case of percent char in words ;-).
WildWezyr
That's cool if it is expected behavior. I see your point about allowing `%`, dunno how you'd escape the situation using `LIKE`, sorry.
OMG Ponies