tags:
views: 340
answers: 2

I'm investigating a PostgreSQL-backed application.

CPU usage is constantly over 50% on a modern Xeon with 4GB RAM. Of that 50% CPU utilization, 67% is "user" and 33% is "system" (this is a Linux machine). The system is not waiting on I/O at all.

I'm wondering how I can see how this CPU time breaks down.

The queries are mostly ad-hoc SQL (no prepared statements) from what I can see.

Do you think this user CPU time could be significantly reduced by moving to prepared statements? That is, could SQL parse time, query planning time, etc. be taking up this much CPU? Some of the queries are quite chunky (500-1000 characters or more).

Can anyone confirm whether PostgreSQL automatically normalizes ad-hoc queries and caches query plans for them, in effect making them as efficient as a prepared statement (plus SQL parse time)?
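For reference, this is roughly what I mean by moving to prepared statements; the table and query below are made up purely to illustrate the parse-once/execute-many pattern:

    -- Hypothetical example: the query is parsed and planned once at PREPARE time.
    PREPARE get_orders (integer) AS
        SELECT * FROM orders WHERE customer_id = $1;

    -- Each EXECUTE re-uses the stored plan, skipping the parse and plan work.
    EXECUTE get_orders (42);
    EXECUTE get_orders (43);

    -- Prepared statements last for the session unless explicitly deallocated.
    DEALLOCATE get_orders;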

I will probably implement some higher-level caching to solve this problem, but am curious to know whether anyone thinks it's worth moving this app to prepared statements.

+3  A: 

Assuming you are VACUUMing the database regularly (neglecting to do so is the standard source of PostgreSQL performance problems), I think the way to win the most performance-wise is to

a) tune your installation for performance based on the machine you're on and

b) analyze each query and find out whether it can be optimized further (a quick sketch of both follows).
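A minimal sketch of both sides, assuming nothing about your schema (the table and column names are placeholders):

    -- Routine maintenance: reclaim dead rows and refresh the planner's statistics.
    VACUUM ANALYZE;

    -- Per-query analysis: run the query and see where the time actually goes.
    EXPLAIN ANALYZE
    SELECT * FROM some_table WHERE some_column = 'some value';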

I really don't think much will be gained by moving the queries into stored procedures.

Vinko Vrsalovic
Thanks for the links. Do you know if PostgreSQL caches query plans for ad-hoc SQL queries?
IIRC, it does. I'm not 100% sure though. A great resource for PostgreSQL is its mailing lists, where the main devs are active and answer a lot of questions.
Vinko Vrsalovic
+1 for links. Thanks!
Luc M
A: 

One trick you may not have seen yet is to use "top -c" to look at your system. PostgreSQL updates each backend's process title to reflect what it's currently doing, so with that flag you can see what each active Postgres process is actually working on.
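A complementary check from inside the database (not part of this answer, but a standard system view) is pg_stat_activity, which lists what every backend is running:

    -- Sketch: what is each backend doing right now?
    -- On 8.1-era releases the columns are procpid / current_query (and you need
    -- stats_command_string = on for current_query to be populated); newer
    -- releases call them pid / query instead.
    SELECT procpid, usename, current_query, query_start
    FROM pg_stat_activity;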

Query plans are not cached anywhere in the database outside of prepared statements. Even so, if you're not heavily re-using similar queries, it's unlikely prepared statements will drive query time down. You might even make things worse, because a prepared statement is planned before the actual parameter values are known, which gives the optimizer less information to work with.

1000 characters is far from a chunky query, and unless you have hundreds of connections at once it's really unlikely that query parsing or planning is your problem here. More likely suspects are locking issues, poor VACUUM procedures leading to bloated data that has to be scanned through to get any work done (really easy to run into on 8.1), slow constraints, excessive indexes, or a design that doesn't fully account for the overhead of moving data around in memory. Query overhead is very low on the list of suspects.
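A quick way to check the locking suspect, using the standard pg_locks view (this sketch should work on 8.1 and later):

    -- Any row returned here is a backend waiting on a lock held by someone else.
    SELECT pid, relation::regclass AS locked_table, mode, granted
    FROM pg_locks
    WHERE NOT granted;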

And if you do have hundreds of connections, you should consider using a connection pooler. PostgreSQL process creation is pretty heavy, and it doesn't do well on its own in that environment.
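If you want to see whether you're even in that territory, a quick sketch comparing the current backend count to the configured ceiling:

    -- Current number of connected backends versus the configured maximum.
    SELECT count(*) FROM pg_stat_activity;
    SHOW max_connections;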

Shoot, you're running such an old release even within the 8.1 series that you might be seeing a bug; 8.1.4 is full of them. 8.1.19 is the current 8.1 release (and even 8.3.5 is already several useful updates behind current). See the Versioning Policy for details on why running an older release is a bigger risk than upgrading in almost every situation.

Greg Smith