I'm quite a newbie with PostgreSQL optimization and chosing whatever's appropriate job for it and whatever's not. So, I want to know whenever I'm trying to use PostgreSQL for inappropriate job, or it is suitable for it and I should set everything up properly.
Anyway, I have a need for a database with a lot of data that changes frequently.
For example, imagine an ISP, having a lot of clients, each having a session (PPP/VPN/whatever), with two self-describing frequently updated properties bytes_received
and bytes_sent
. There is a table with them, where each session is represented by a row with unique ID:
CREATE TABLE sessions(
id BIGSERIAL NOT NULL,
username CHARACTER VARYING(32) NOT NULL,
some_connection_data BYTEA NOT NULL,
bytes_received BIGINT NOT NULL,
bytes_sent BIGINT NOT NULL,
CONSTRAINT sessions_pkey PRIMARY KEY (id)
)
And as accounting data flows, this table receives a lot of UPDATEs like those:
-- There are *lots* of such queries!
UPDATE sessions SET bytes_received = bytes_received + 53554,
bytes_sent = bytes_sent + 30676
WHERE id = 42
When we receive a never ending stream with quite a lot (like 1-2 per second) of updates for a table with a lot (like several thousands) of sessions, probably thanks to MVCC, this makes PostgreSQL very busy. Are there any ways to speed everything up, or Postgres is just not exactly suitable for this task and I'd better consider it unsuitable for this job and put those counters to another storage like memcachedb, using Postgres only for fairly static data? But I'll miss an ability to infrequently query on this data, for example to find TOP10 downloaders, which is not really good.
Unfortunately, the amount of data cannot be lowered much. The ISP accounting example is all thought up to simplify the explanation. The real problem's with another system, which structure is somehow harder to explain.
Thanks for suggestions!