tags:

views:

1697

answers:

2

I get the following error when inserting data from mysql into postgres.

Do I have to manually remove all null characters from my input data? Is there a way to get postgres to do this for me?

ERROR: invalid byte sequence for encoding "UTF8": 0x00
+3  A: 

PostgreSQL doesn't support storing NULL (\0x00) characters in text fields (this is obviously different from the database NULL value, which is fully supported).

If you need to store the NULL character, you must use a bytea field - which should store anything you want, but won't support text operations on it.

Given that PostgreSQL doesn't support it in text values, there's no good way to get it to remove it. You could import your data into bytea and later convert it to text using a special function (in perl or something, maybe?), but it's likely going to be easier to do that in preprocessing before you load it.

Magnus Hagander
+1  A: 

You can first insert data into blob field and then copy to text field with the folloing function

CREATE OR REPLACE FUNCTION blob2text() RETURNS void AS $$
Declare
    ref record;
    i integer;
Begin
    FOR ref IN SELECT id, blob_field FROM table LOOP

          --  find 0x00 and replace with space    
      i := position(E'\\000'::bytea in ref.blob_field);
      WHILE i > 0 LOOP
     ref.bob_field := set_byte(ref.blob_field, i-1, 20);
     i := position(E'\\000'::bytea in ref.blobl_field);
      END LOOP

    UPDATE table SET field = encode(ref.blob_field, 'escape') WHERE id = ref.id;
    END LOOP;

End; $$ LANGUAGE plpgsql;

SELECT blob2text();

Raido