Hi,

I have a table called _sample_table_delme_data_files which contains some duplicates. I want to copy its records, without duplicates, into data_files:

INSERT INTO data_files (SELECT distinct * FROM _sample_table_delme_data_files);
ERROR:  could not identify an ordering operator for type box3d
HINT:  Use an explicit ordering operator or modify the query.

The problem is that PostgreSQL cannot compare (or order) box3d values. How do I supply such an ordering operator so that only the distinct rows end up in my destination table?

Thanks in advance,

Adam

+2  A: 

The box3d data type has no ordering operator, which DISTINCT needs. You would have to create the operator yourself, or ask the PostGIS project; somebody may already have solved this problem.

Frank Heikens
+2  A: 

If you don't add the operator, you could try translating the box3d data to text using its output function, something like:

INSERT INTO data_files (SELECT distinct othercols,box3dout(box3dcol) FROM _sample_table_delme_data_files);

Edit: the next step is to cast it back to box3d:

INSERT INTO data_files SELECT othercols, box3din(b) FROM (SELECT distinct othercols,box3dout(box3dcol) AS b FROM _sample_table_delme_data_files) AS sub;

(I don't have box3d on my system so it's untested.)
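
A variation on the same idea, as a sketch only: since PostgreSQL 8.3 every type can be cast explicitly to and from text through its I/O functions, so the round trip can be written with casts instead of calling the I/O functions by name. Here othercols and box3dcol are the same placeholders as above, sub is an arbitrary alias, and the sketch assumes that the text form of a box reproduces the value exactly when cast back. It has not been run against a PostGIS installation.

```sql
-- Deduplicate on the text form of the box, then cast it back to box3d.
INSERT INTO data_files
SELECT othercols, b::box3d
FROM (
    SELECT DISTINCT othercols, box3dcol::text AS b
    FROM _sample_table_delme_data_files
) AS sub;
```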

Edmund
+1 This is nice, but it changes the type of the box3d column.
Adam Matan
Yup -- I should have said, the next step is to cast it back to box3d...
Edmund
A: 

Finally, this was solved by a colleague.

Let's see how many rows the source table holds, duplicates included:

SELECT COUNT(*) FROM _sample_table_delme_data_files ;
 count                                                               
-------                                                              
 12728                                                               
(1 row)

Now, we shall add another column to the source table to help us differentiate similar rows:

ALTER TABLE _sample_table_delme_data_files ADD COLUMN id2 serial;

We can now see the dups:

SELECT id, id2 FROM _sample_table_delme_data_files ORDER BY id LIMIT 10;
   id   | id2                                                                           
--------+------                                                                         
 198748 | 6449                                                                          
 198748 |   85                                                                          
 198801 |  166                                                                          
 198801 | 6530                                                                          
 198829 |   87                                                                          
 198829 | 6451                                                                          
 198926 |   88                                                                          
 198926 | 6452                                                                          
 199062 | 6532                                                                          
 199062 |  168                                                                          
(10 rows)       

And remove them:

DELETE FROM _sample_table_delme_data_files 
    WHERE id2 IN (SELECT max(id2) FROM _sample_table_delme_data_files 
                         GROUP BY id 
                               HAVING COUNT(*)>1);
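
The DELETE above removes one row per duplicated id (the one with the highest id2), which is enough when no id occurs more than twice. A sketch for the general case, assuming the same id2 helper column, keeps the row with the lowest id2 for every id and deletes all the others:

```sql
-- Keep exactly one row per id (the one with the smallest id2).
-- id2 is a serial column, hence NOT NULL, so NOT IN is safe here.
DELETE FROM _sample_table_delme_data_files
    WHERE id2 NOT IN (SELECT min(id2)
                          FROM _sample_table_delme_data_files
                          GROUP BY id);
```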

Let's check that it worked:

SELECT id FROM _sample_table_delme_data_files GROUP BY id HAVING COUNT(*)>1;
 id
----
(0 rows)

Remove the auxiliary column:

ALTER TABLE _sample_table_delme_data_files DROP COLUMN id2;
ALTER TABLE

Insert the remaining rows into the destination table:

INSERT INTO data_files (SELECT * FROM _sample_table_delme_data_files);
INSERT 0 6364
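
Since only the id column decides what counts as a duplicate here, the same result can probably be had without modifying the source table. A minimal sketch using PostgreSQL's DISTINCT ON, which needs an ordering operator for id only and not for the box3d column. Which of the rows sharing an id is kept is arbitrary, which is fine as long as they are true duplicates:

```sql
-- One row per id; ORDER BY must start with the DISTINCT ON expression.
INSERT INTO data_files
SELECT DISTINCT ON (id) *
FROM _sample_table_delme_data_files
ORDER BY id;
```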
Adam Matan
So actually you just needed to distinctify on the 'id' column rather than all columns (including the geometrical one)? That would have made a big difference to the answers if you'd explained it in the question.
araqnid
You're right, that's why I've marked @Edmund's answer as the correct one; I hope this can be helpful, too.
Adam Matan