Keep in mind that I am a rookie in the world of sql/databases.

I am inserting/updating thousands of objects every second. Those objects are also being actively queried every few seconds.

What are some basic things I should do to performance tune my (postgres) database?

+1  A: 

The absolute minimum I'd recommend is the EXPLAIN ANALYZE command. It will show a breakdown of subqueries, joins, et al., along with the actual amount of time consumed by each operation. It will also alert you to sequential scans and other nasty trouble.

It is the best way to start.
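
As a minimal sketch of what that looks like in practice (the table `objects` and its columns are hypothetical names, not something from the question):

```sql
-- Hypothetical table and column names, for illustration only.
-- Shows the chosen plan plus the actual time spent in each step.
EXPLAIN ANALYZE
SELECT *
FROM objects
WHERE owner_id = 42;

-- EXPLAIN ANALYZE really executes the statement, so wrap writes
-- in a transaction and roll it back to avoid changing data.
BEGIN;
EXPLAIN ANALYZE
UPDATE objects SET updated_at = now() WHERE id = 42;
ROLLBACK;
```

In the output, a "Seq Scan" node on a large table is usually the first thing worth looking at.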

Tony k
+4  A: 

First and foremost, read the official manual's Performance Tips.

Running EXPLAIN on all your queries and understanding its output will let you know if your queries are as fast as they could be, and if you should be adding indexes.

Once you've done that, I'd suggest reading over the Server Configuration part of the manual. There are many options which can be fine-tuned to further enhance performance. Make sure to understand the options you're setting though, since they could just as easily hinder performance if they're set incorrectly.

Remember that every time you change a query or an option, test and benchmark so that you know the effects of each change.

Ben S
+6  A: 

It's a broad topic, so here's lots of stuff for you to read up on.

  • EXPLAIN and EXPLAIN ANALYZE are extremely useful for understanding what's going on in your db-engine
  • Make sure relevant columns are indexed
  • Make sure irrelevant columns are not indexed (insert/update-performance can go down the drain if too many indexes must be updated)
  • Make sure your postgresql.conf is tuned properly
  • Know what work_mem is, and how it affects your queries (mostly useful for larger queries)
  • Make sure your database is properly normalized
  • VACUUM for clearing out old data
  • ANALYZE for updating statistics (statistics target for amount of statistics)
  • Persistent connections (you could use a connection manager like pgpool or pgbouncer)
  • Understand how queries are constructed (joins, sub-selects, cursors)
  • Caching of data (e.g. memcached) is an option
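
A rough sketch of the VACUUM, ANALYZE, statistics target and work_mem items from the list above. The table and column names are hypothetical, and the numbers are placeholders to experiment with, not recommendations:

```sql
-- Hypothetical table/column names; values are placeholders.

-- Reclaim space from dead rows and refresh planner statistics in one go.
VACUUM ANALYZE objects;

-- Collect more detailed statistics for one frequently filtered column,
-- then re-run ANALYZE so the new target takes effect.
ALTER TABLE objects ALTER COLUMN owner_id SET STATISTICS 500;
ANALYZE objects;

-- Inspect work_mem, then raise it for the current session only,
-- e.g. before a large sort or hash join.
SHOW work_mem;
SET work_mem = '32MB';
```

Note that work_mem applies per sort or hash operation, not per connection, so a single complex query can use several multiples of it.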

And when you've exhausted those options: add more memory, faster disk-subsystem etc. Hardware matters, especially on larger datasets.

And of course, read all the other threads on postgres/databases. :)

A: 

Put fsync = off in your postgresql.conf if you trust your filesystem; otherwise each PostgreSQL operation will be immediately written to disk (with the fsync system call). We have had this option turned off on many production servers for about 10 years, and we have never had data corruption.

fredz
This is BAD advice. You risk corrupting your data. Of course you might get lucky for some years, as you have. The same gain can be had by using a RAID controller with a battery-backed write cache, with no additional risk.
We trust our ext3 filesystems. A write cache is limited. For example, we have maintained the Century21 France database for 8 years; more than 3000 people write to this database in real time. We have home-made middleware that mirrors all queries to another database in case of a server crash, but we have never had any problem. See: http://www.postgresql.org/docs/8.1/interactive/runtime-config-wal.html
fredz
+1  A: 

Actually, there are some simple rules which will get you enough performance in most cases:

  1. Indices are the first part. Primary keys are automatically indexed. I recommend putting indices on all foreign keys. Further, put indices on all columns which are frequently queried. If there are heavily used queries on a table where more than one column is queried, put an index on those columns together.

  2. Memory settings in your PostgreSQL installation. Set the following parameters higher:

shared_buffers, work_mem, maintenance_work_mem, temp_buffers

If it is a dedicated database machine you can easily set the first 3 of these to half the RAM (just be careful under Linux with shared buffers; you may have to adjust the shmmax parameter). In any other case it depends on how much RAM you would like to give to PostgreSQL.

http://www.postgresql.org/docs/8.3/interactive/runtime-config-resource.html
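
The index advice in rule 1 could look roughly like this. All table, column and index names here are hypothetical:

```sql
-- Hypothetical schema: objects(id PRIMARY KEY, owner_id, status, ...).

-- Index on a foreign key column.
CREATE INDEX objects_owner_id_idx ON objects (owner_id);

-- Multi-column index for a heavily used query that filters on both
-- columns, e.g. WHERE owner_id = ? AND status = ?
CREATE INDEX objects_owner_status_idx ON objects (owner_id, status);

-- List every index on the table, including the one PostgreSQL
-- created automatically for the primary key.
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'objects';
```

Each extra index also has to be maintained on every insert and update, so it is worth checking with EXPLAIN ANALYZE that a new index is actually used before keeping it.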

Mauli
PKs are auto-indexed? How come they do not show up under the "indexes" list in the pgAdmin tool?
Grasper