I have a data structure that looks like this:
Model Place
primary key "id"
foreign key "parent" -> Place
foreign key "neighbor" -> Place (symmetric)
foreign key "belongtos" -> Place (asymmetric)
a bunch of scalar fields ...
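In SQL terms the layout is roughly the following. This is only a sketch: the table name `place`, the column name `parent_id`, the table name `place_neighbor`, and the integer key types are assumptions for illustration; only `place_belongtos` and its two columns are taken from the actual import files.

```sql
-- Sketch of the assumed schema; names other than place_belongtos,
-- from_place_id and to_place_id are hypothetical.
CREATE TABLE place (
    id        integer PRIMARY KEY,
    parent_id integer REFERENCES place (id)
    -- ... a bunch of scalar fields ...
);

-- Symmetric relation (assumed to mirror place_belongtos).
CREATE TABLE place_neighbor (
    from_place_id integer NOT NULL REFERENCES place (id),
    to_place_id   integer NOT NULL REFERENCES place (id)
);

-- Asymmetric relation; this is the table the INSERT files target.
CREATE TABLE place_belongtos (
    from_place_id integer NOT NULL REFERENCES place (id),
    to_place_id   integer NOT NULL REFERENCES place (id)
);
```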
I have over 5 million rows in the model table, and I need to insert ~50 million rows into each of the two foreign key tables. I have SQL files that look like this:
INSERT INTO place_belongtos (from_place_id, to_place_id) VALUES (123, 456);
and they are about 7 GB each. The problem is that when I run psql < belongtos.sql, it takes about 12 hours to import ~4 million rows on my AMD Turion64x2 CPU. The OS is Gentoo ~amd64, and PostgreSQL is version 8.4, compiled locally. The data directory is a bind mount located on my second extended partition (ext4), which I believe is not the bottleneck.
I suspect it takes so long to insert the foreign key relations because PostgreSQL checks the key constraints for each row, which probably adds unnecessary overhead, as I know for sure that the data is valid. Is there a way to speed up the import, e.g. by temporarily disabling the constraint check?
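To make the question concrete, here is the kind of thing I have in mind. It is an untested sketch, not something I know to be safe or effective. It assumes the foreign keys on `place_belongtos` are enforced through triggers that `ALTER TABLE ... DISABLE TRIGGER ALL` can switch off (which, as far as I understand, requires superuser rights), that the script is run from within psql, and that `belongtos.sql` contains no transaction control of its own.

```sql
-- Untested sketch; run inside psql as a superuser.
-- One transaction, so the triggers are not left disabled if the import fails.
BEGIN;

-- Switch off all triggers on the table, including the internal ones
-- that enforce the foreign key constraints.
ALTER TABLE place_belongtos DISABLE TRIGGER ALL;

-- psql meta-command: execute the INSERT statements from the file.
\i belongtos.sql

-- Switch the triggers back on. Rows inserted while they were disabled
-- are NOT re-checked, so this relies on the data already being valid.
ALTER TABLE place_belongtos ENABLE TRIGGER ALL;

COMMIT;
```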