Hi,
(PostgreSQL 8.2, running on Windows XP)
I have a lot of complex queries that each take several seconds to run. They are not actually "views", but they can be treated as such.
I decided to persist the result records from those "views" into tables, which I call "aux" (auxiliary) tables.
I can guarantee that the data does not change after the "aux" tables have been calculated.
Let's take one example:
Suppose I have a query "X" and I persist its results in "Table_X". The record set is something like this:
PERSON*  FIELD_A*  FIELD_B  FIELD_C
===================================
1        10        Value1   Value2
1        20        Value3   Value4
1        30        Value5   Value6
-----------------------------------
2        10        Value1   Value2
2        20        Value3   Value4
-----------------------------------
3        20        Value3   Value4
3        30        Value5   Value6
-----------------------------------
etc.
(*) The primary key is (person, field_a).
As you can see, each "person" has its own subset of records in this table, so I can fetch a person's records quickly with just:

    select * from table_x where person = <person>

I will always fetch ONLY by <person>, and all my queries have the same shape: "PERSON" + some fields.
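For concreteness, one of these "aux" tables could be declared like this (column names and types are illustrative; only the key structure matters):

    CREATE TABLE table_x (
        person   integer NOT NULL,
        field_a  integer NOT NULL,
        field_b  varchar,
        field_c  varchar,
        PRIMARY KEY (person, field_a)  -- its index also serves "where person = ..."
    );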
IMPORTANT: all "aux" tables can be read (obviously, with "old" data until I commit) by other transactions while I am re-populating them. But I can guarantee that they are never updated by those transactions.
My current process is:
- START TRANSACTION;
- DO LOTS OF OPERATIONS ON THE DATABASE: INSERT / UPDATE / DELETE ON SEVERAL TABLES.
- AFTER THAT, RECALCULATE THE "AUX" TABLES:
  - LOOP THROUGH ALL MY "QUERIES" (CALL THE CURRENT ONE "X"):
    - LOOP THROUGH ALL "PERSONS" (CALL THE CURRENT ONE <person>):
      - DELETE FROM <TABLE_X> WHERE PERSON = <person>;
      - INSERT INTO <TABLE_X> (PERSON, FIELD_A, FIELD_B, FIELD_C)
        (SELECT <person>,
                FIELDS...
         FROM "LOTS OF TABLES"
         JOIN "COMPLEX SQL"...
         WHERE SOME_FIELD = <person>
        );
    - END LOOP "PERSONS"
  - END LOOP "QUERIES"
- COMMIT;
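In plain SQL, one iteration of the inner loop boils down to this (all names are illustrative; complex_query_x stands in for the real expensive SELECT over "lots of tables"):

    -- one iteration, for aux table "table_x" and person 123:
    DELETE FROM table_x WHERE person = 123;

    INSERT INTO table_x (person, field_a, field_b, field_c)
    SELECT 123, field_a, field_b, field_c
    FROM   complex_query_x      -- the expensive query, run once here
    WHERE  person = 123;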
Considerations:
Some of those tables have thousands of records, and often only a few records need to be updated / deleted / inserted compared with the record set already in the table.
Since deleting and re-inserting everything causes too much disk I/O (obviously), while only a few records actually need to change, I am trying to find an efficient way of doing this.
I tried doing the delete / update / insert in three separate steps, driving each one directly from the "complex query", but that takes too much time, because the query is executed 3 times (once for the delete, once for the update and once for the insert).
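To make that concrete, the three-step version looks roughly like this, assuming the complex query is wrapped in a view complex_query_x(person, field_a, field_b, field_c) (all names are illustrative). Note that the expensive query has to be evaluated once per statement:

    -- 1) delete rows that disappeared from the query result (1st execution)
    DELETE FROM table_x
    WHERE  person = 123
      AND  field_a NOT IN (SELECT field_a FROM complex_query_x
                           WHERE person = 123);

    -- 2) update rows whose non-key fields changed (2nd execution)
    UPDATE table_x
    SET    field_b = q.field_b,
           field_c = q.field_c
    FROM   complex_query_x q
    WHERE  table_x.person  = 123
      AND  q.person        = 123
      AND  table_x.field_a = q.field_a
      AND  (table_x.field_b IS DISTINCT FROM q.field_b
         OR table_x.field_c IS DISTINCT FROM q.field_c);

    -- 3) insert rows that are new in the query result (3rd execution)
    INSERT INTO table_x (person, field_a, field_b, field_c)
    SELECT 123, q.field_a, q.field_b, q.field_c
    FROM   complex_query_x q
    WHERE  q.person = 123
      AND  q.field_a NOT IN (SELECT field_a FROM table_x
                             WHERE person = 123);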
Any suggestions?