ansaurus

Question

Postgresql PITR backup: best practices to handle multiple databases?

Answer 1

+1 A:

Why do you want to separate the databases?

The way the PITR works, it is not possible to do since it works on the complete cluster. What you can do in that case is to create a data directory and a separate cluster for each of those databases (not recommended though since it will require different ports, and postmaster instances).

I believe that the benefits of using PITR instead of regular dumps outweigh having separate backups for each database, so perhaps you can re-think the reasons for why you need to separate it.

Another way could be to set up some replication with Slony-I but that would require a separate machine (or instance) that receives the data. On the other hand, that way you would have a replicated system in near real-time.

Update for comment:

To recover from mistakes, like deleting a table, PITR would be perfect since you can replay to a specific time. However, for 500 databases I understand that can be a lot of overhead. Slony-I would probably not work, since it is replicating. Not sure how it handles table deletions.

I am not aware of any other ways you can go. What I would do would probably still be going for PITR and just not do any mistakes ;). Jokes aside, depending how frequently mistakes are being made this could be a solution:

Set it up for PITR
have a second instance ready on standby.
When a mistake happens, replay the restore to the point in time on the second instance.
Do a pg_dump of the affected database from that instance.
Do a pg_restore on the production instance for that database.

However, it would require you to have a second instance ready, either on the same server or a different one (different is recommended). Also, the restore time would be a bit longer since it would require you to do one extra dump and restore.

Jimmy Stenke 2009-07-15 14:04:32

Becose every database is indipendent.If i delete a table in the database1, and after 10 minutes i see the mistake, i want to restore just the database1, not all of them... the pg_dump mode works fine, but i cant make pg_dump every 10 seconds (i have around 500 databases...)

DaNieL 2009-07-15 15:31:28

Answer 2

+2 A:

It is not possible with one instance of PostgresSQL.

You can divide your 500 tables between several instances, each listening on different port, but it would mean that they will not use resources like memory effectively (memory reserved but unused in one instance can not be used by another).

Slony will also not work here, as it does not replicate DDL statements, like dropping a table.

I'd recommend doing both:

continue to do your pg_dump backups, but try to smooth it - throttle pg_dump io bandwith, so it will not cripple a server, and run it continuously - when it finishes with the last database then immediately start with a first one;
additionally setup PITR.

This way you can restore a single database fast, but you can loose some data. If you'll decide that you cannot afford to loose that much data then you can restore your PITR backup to a temporary location (with fsync=off and pg_xlog symlinked to ramdisk for speed), pg_dump affected database from there and restore it to your main database.

Tometzky 2009-07-15 16:49:23

Can you please explain better your first point -throttle pg_dump io bandwith- ? i havent understood it completely

DaNieL 2009-07-16 06:10:25

Use for example a throttle program (http://klicman.org/throttle/ GPLv2 license). For example: "pg_dump dbname | throttle -M 3 > dbname.sql" will limit pg_dump to 3MBps, which will not have much impact on database performance. Choose your throttling wisely - smaller limit - it will take more time, bigger - database performance will suffer more.

Tometzky 2009-07-16 11:41:25

Mmmh, throttle can be really a good things..

DaNieL 2009-07-17 06:13:35

Thinking about it, throttle wont help me much after all.. but, this raise another idea, there is a way to run the pg_dump in 'background' ? I mean, can i set something like 2% cpu and.. dunno, 10Mb ram, and a way that avoid pg_dump to exceed those limits? But i think the problem is not the pg_dump itslef, but postgresql.. it have to be stressed at the same way to supply pg_dump... oh, im little confused -.-

DaNieL 2009-07-17 06:18:50

Answer 3

A:

After googling a lot and reading your answers, seems there is no a real-life method to use pitr keeping databases indipendens.

Creating multiple postmaster's instances, db clusters, is kindly impossible (the number of database is actually 420,but will grown up (i hope that) dinamycally... without human control).

So, i'll keep dumping the dbs every 2 hours, im just sorry for not being able to give my customers a pitr guarantee ;(

Thanks guys for the replyes guys.

DaNieL 2009-07-17 06:33:13

Answer 4

A:

I think the way you are doing this is flawed. You should have one database with multiple schemas and roles. Then you can use PITR. However PITR is not a replacement for dumps.

2009-07-17 19:51:47

This will let me to restore just 1 schema with PITR and with the dumps?

DaNieL 2009-07-19 10:37:40

If i use 1 database and many schemas, will PITR allow me to restore (easly) just 1 schema at time?

DaNieL 2009-07-20 08:44:27

ansaurus

tags:

views:

answers:

Postgresql PITR backup: best practices to handle multiple databases?

related questions