Hi world,

I intend to have a Python script perform many UPDATEs per second against a table of 2,433,000 rows. At the moment I keep the dynamic column in Python, as a value in a dict, but keeping that dict synchronized with changes to the other columns is becoming increasingly difficult, if not unworkable.

I know I could put autovacuum into overdrive, but I doubt that would be enough to keep up with the sheer volume of UPDATEs. If only I could associate a Python variable with each row...

I fear that the VACUUM and disk-write overhead will kill my server.

Any suggestions on how to associate extremely dynamic variables with rows/keys?

Thx

+3  A: 

PostgreSQL supports asynchronous notifications via the LISTEN and NOTIFY commands. An application (client) LISTENs for notifications under a given notification name (e.g. "table_updated"). The database can be made to issue notifications either manually, i.e. in the code that performs the insertions or modifications (useful when a large number of updates are made, since it lets you batch notifications), or automatically from a row-level UPDATE trigger.
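
For example, the trigger side might be set up along these lines (a minimal sketch: the table name my_table, the function notify_table_updated() and the notification name table_updated are made up for illustration, and the DDL is run here through psycopg2):

    import psycopg2

    conn = psycopg2.connect("dbname=mydb")  # hypothetical DSN
    cur = conn.cursor()

    # Row-level trigger: issue a NOTIFY on every UPDATE of my_table.
    cur.execute("""
        CREATE OR REPLACE FUNCTION notify_table_updated() RETURNS trigger AS $$
        BEGIN
            NOTIFY table_updated;
            RETURN NEW;
        END;
        $$ LANGUAGE plpgsql;

        CREATE TRIGGER my_table_updated
        AFTER UPDATE ON my_table
        FOR EACH ROW EXECUTE PROCEDURE notify_table_updated();
    """)
    conn.commit()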

You could use such notifications to keep your data structures up to date.
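
On the Python side, a listening loop might look roughly like this (a sketch assuming psycopg2; refresh_cache() is a hypothetical function that re-reads the affected rows into your dict):

    import select
    import psycopg2
    import psycopg2.extensions

    conn = psycopg2.connect("dbname=mydb")  # hypothetical DSN
    # Autocommit mode so notifications are delivered promptly,
    # not held back by an open transaction.
    conn.set_isolation_level(psycopg2.extensions.ISOLATION_LEVEL_AUTOCOMMIT)

    cur = conn.cursor()
    cur.execute("LISTEN table_updated;")

    while True:
        # Wait (up to 5 s) for the connection to become readable.
        if select.select([conn], [], [], 5) == ([], [], []):
            continue  # timeout, no notifications
        conn.poll()
        while conn.notifies:
            conn.notifies.pop()
            # Notifications fired in quick succession may be coalesced
            # (see the comment below), so treat this as "something
            # changed" rather than as one event per UPDATE.
            refresh_cache()  # hypothetical: re-read changed rows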

Alternatively (or in combination with the above), you could customize your Python dictionary by overriding its __getitem__() and __contains__() methods (plus has_key() on Python 2), having them perform lookups as needed and cache the results with timeouts etc.
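
A minimal sketch of that idea, assuming a hypothetical fetch_value(key) callable that SELECTs the current value for a key (and raises KeyError when the key is absent):

    import time

    class CachingDict(dict):
        # dict that re-fetches stale entries on lookup. fetch_value(key)
        # is a hypothetical callable that SELECTs the current value for
        # key and raises KeyError if the key does not exist.
        def __init__(self, fetch_value, max_age=1.0):
            dict.__init__(self)
            self._fetch = fetch_value
            self._max_age = max_age   # seconds before an entry is stale
            self._stamps = {}         # key -> time of last refresh

        def __getitem__(self, key):
            now = time.time()
            if now - self._stamps.get(key, 0) > self._max_age:
                dict.__setitem__(self, key, self._fetch(key))
                self._stamps[key] = now
            return dict.__getitem__(self, key)

        def __contains__(self, key):
            try:
                self[key]
                return True
            except KeyError:
                return False

        has_key = __contains__   # Python 2 spelling

The timeout bounds how stale an entry can get; combined with LISTEN/NOTIFY you could instead drop the timeout and invalidate entries only when a notification arrives.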

codelogic
From the PostgreSQL docs: "NOTIFY behaves like Unix signals in one important respect: if the same notification name is signaled multiple times in quick succession, recipients may get only one notification event for several executions of NOTIFY". Override __getitem__() instead to stay in sync.
j_random_hacker