How would I implement separate databases for reading and writing operations?

+11 Q:

I am interested in implementing an architecture that has two databases: one for read operations and the other for writes. I have never implemented something like this and have always built single-database, highly normalised systems, so I am not quite sure where to begin. I have a few parts to this question.

1. What would be a good resource to find out more about this architecture?
2. Is it just a question of replicating between two identical schemas, or would the schemas differ depending on the operations? Would the degree of normalisation vary too?
3. How do you ensure that data written to one database is immediately available for reading from the second?

Any further help, tips, resources would be appreciated. Thanks.

EDIT
After some research I found this article, which may be informative for those interested:

http://www.codefutures.com/database-sharding/

I also found this highscalability article very informative.

+1 A:

Regarding question 2:

It really depends on what you are trying to achieve by having two databases. If it is for performance reasons (which I suspect it may be), I would suggest you look into denormalizing the read-only database as needed for performance. If performance isn't an issue, then I wouldn't mess with the read-only schema.

I've worked on similar systems where there was a read/write database that was only lightly used by administrative users. That database was then replicated to the read-only database during a nightly process.
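
As a rough illustration of the nightly process described above, here is a minimal sketch using two in-memory SQLite databases. The tables (customers, orders, customer_summary) and the function nightly_refresh are hypothetical names made up for this example; a real system would use whatever replication or ETL tooling the database provides. The write side stays normalised, while the read side is rebuilt as one flat, denormalized table.

```python
import sqlite3

# Hypothetical schema: normalised on the write side, one flat table on the read side.
write_db = sqlite3.connect(":memory:")
read_db = sqlite3.connect(":memory:")

write_db.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        total REAL NOT NULL
    );
    INSERT INTO customers (id, name) VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders (customer_id, total) VALUES (1, 10.0), (1, 5.5), (2, 20.0);
""")

read_db.execute(
    "CREATE TABLE customer_summary ("
    "customer_id INTEGER PRIMARY KEY, name TEXT, order_count INTEGER, total_spent REAL)"
)


def nightly_refresh(src, dst):
    """Rebuild the denormalized read table from the normalised write tables."""
    rows = src.execute("""
        SELECT c.id, c.name, COUNT(o.id), COALESCE(SUM(o.total), 0)
        FROM customers c
        LEFT JOIN orders o ON o.customer_id = c.id
        GROUP BY c.id, c.name
        ORDER BY c.id
    """).fetchall()
    with dst:  # delete and reload inside a single transaction
        dst.execute("DELETE FROM customer_summary")
        dst.executemany("INSERT INTO customer_summary VALUES (?, ?, ?, ?)", rows)


nightly_refresh(write_db, read_db)
summary = read_db.execute(
    "SELECT * FROM customer_summary ORDER BY customer_id"
).fetchall()
print(summary)
```

Readers then query customer_summary directly and never pay for the join; the cost is that the data is only as fresh as the last refresh.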

Question 3: How immediate are we talking here? Less than a second? 10 seconds? Minutes?

Abe Miessler 2010-05-26 16:43:12

+8 A:

I'm not a specialist, but a read/write master database with read-only slaves is a common pattern, especially for big applications doing mostly read accesses, or for data warehouses:

It lets you scale (you add more read-only slaves as required).
It lets you tune the databases differently (for either efficient reads or efficient writes).
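
To make the pattern above concrete, here is a minimal sketch of application-side routing: writes go to the master, reads are spread round-robin over the replicas. The class ReadWriteRouter and its method names are hypothetical, and the call to sqlite3's Connection.backup is only a stand-in for the replication that a real database server would perform by itself.

```python
import itertools
import sqlite3


class ReadWriteRouter:
    """Hypothetical helper: writes go to the master, reads rotate over replicas."""

    def __init__(self, master, replicas):
        self.master = master
        self._replicas = itertools.cycle(replicas)

    def execute_write(self, sql, params=()):
        with self.master:  # commit on success, roll back on error
            self.master.execute(sql, params)

    def execute_read(self, sql, params=()):
        replica = next(self._replicas)  # simple round-robin choice
        return replica.execute(sql, params).fetchall()


master = sqlite3.connect(":memory:")
replicas = [sqlite3.connect(":memory:") for _ in range(2)]
router = ReadWriteRouter(master, replicas)

router.execute_write("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
router.execute_write("INSERT INTO users (name) VALUES (?)", ("alice",))

# Stand-in for real replication: copy the master's contents to every replica.
for replica in replicas:
    master.backup(replica)

rows_first = router.execute_read("SELECT name FROM users")
rows_second = router.execute_read("SELECT name FROM users")
print(rows_first, rows_second)
```

Adding capacity for reads then amounts to adding another connection to the replicas list, which is the scaling property mentioned above.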

What would be a good resource to find out more about this architecture?

There are good resources available on the Internet. For example:

Highscalability.com has good examples (e.g. Wikimedia architecture, the master-slave category,...)
Handling Data in Mega Scale Systems (starting from slide 29)
MySQL Scale-Out approach for better performance and scalability as a key factor for Wikipedia’s growth
Chapter 24. High Availability and Load Balancing in PostgreSQL documentation
Chapter 16. Replication in MySQL documentation
http://www.google.com/search?q=read%2Fwrite+master+database+and+read-only+slaves

Is it just a question of replicating between two identical schemas, or would your schemas differ depending on the operations, would normalisation vary too?

I'm not sure - I'm eager to read answers from experts - but I think the schemas are identical in traditional replication scenarios (the tuning may be different though). Maybe people are doing more exotic things, but I wonder if they rely on database replication in that case; it sounds more like "real-time ETL".

How do you ensure that data written to one database is immediately available for reading from the second?

I guess you would need synchronous replication for that (which is of course slower than asynchronous). While some databases do support this mode, not all do AFAIK. But have a look at this answer or this one for SQL Server.
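
The difference between the two modes can be shown with a toy in-process model. This is only a sketch: Primary, Replica and ship_pending are hypothetical names, and no real database works on plain dictionaries. In synchronous mode the write returns only after the replica has applied the change; in asynchronous mode the write returns at once and the replica catches up later.

```python
from collections import deque


class Replica:
    """Toy read-only copy: just a dictionary that the primary updates."""

    def __init__(self):
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value


class Primary:
    """Toy primary that replicates either synchronously or asynchronously."""

    def __init__(self, replica, synchronous):
        self.data = {}
        self.replica = replica
        self.synchronous = synchronous
        self.pending = deque()  # changes not yet shipped (asynchronous mode only)

    def write(self, key, value):
        self.data[key] = value
        if self.synchronous:
            # The write is acknowledged only after the replica has the change.
            self.replica.apply(key, value)
        else:
            # The write is acknowledged immediately; the replica lags behind.
            self.pending.append((key, value))

    def ship_pending(self):
        """Deliver queued changes, as a background replication process would."""
        while self.pending:
            self.replica.apply(*self.pending.popleft())


async_replica = Replica()
async_primary = Primary(async_replica, synchronous=False)
async_primary.write("x", 1)
stale_read = async_replica.data.get("x")      # not replicated yet
async_primary.ship_pending()
caught_up_read = async_replica.data.get("x")  # visible after shipping

sync_replica = Replica()
sync_primary = Primary(sync_replica, synchronous=True)
sync_primary.write("x", 1)
fresh_read = sync_replica.data.get("x")       # visible as soon as write returns
```

The synchronous write is slower because it includes the replica's work, which is the trade-off mentioned above.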

Pascal Thivent 2010-05-26 16:58:20

@Pascal Thivent ~ Wouldn't this be http://en.wikipedia.org/wiki/Command-query_separation

drachenstern 2010-05-26 17:08:04

@drachenstern: I'm not sure CQS implies anything about the way you store data. But thanks, it's a very interesting link.

Pascal Thivent 2010-05-26 17:25:35

@Pascal Thivent ~ I was thinking more about the reference of having a pub/sub sync arch between the two. I was also thinking that article had images when I blind linked to it: http://www.udidahan.com/2008/08/11/command-query-separation-and-soa/

drachenstern 2010-05-26 17:30:29

@drachenstern: Ah, yes, didn't check that link and I'm going to read that. Thanks again.

Pascal Thivent 2010-05-26 18:33:53

@Pascal Thivent ~ Well I could always be wrong you know, might not be what he intends ... May be totally off base. Something I'm considering how to navigate ours to atm, but I don't think our databases are well suited for this.

drachenstern 2010-05-26 19:06:48

@drachenstern: I went through the second link and I see the "relation" but think that there is no equivalence (you can do CQS with a single database, CQS is IMHO more a pure OOP principle). Interesting read though.

Pascal Thivent 2010-05-26 22:20:07

+3 A:

You might look up data warehouses. These serve as 'normalized for reporting' type databases, while you can keep a normalized OLTP style instance for the data maintenance.

I don't think the idea of 'immediate' equivalence will be a reality. There will be some delay while the new data and changes are migrated into the other system. The schedule and scope will be your big decisions here.
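
One possible way to keep that delay short is to migrate only the rows added since the last run. The sketch below assumes a hypothetical events table with an ever-increasing id and a made-up function migrate_new_rows; it handles inserts only, so updates and deletes would need something more (change tracking, timestamps, or the database's own replication).

```python
import sqlite3

oltp = sqlite3.connect(":memory:")       # transactional side
reporting = sqlite3.connect(":memory:")  # reporting side

oltp.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT NOT NULL)"
)
reporting.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT NOT NULL)"
)


def migrate_new_rows(src, dst):
    """Copy rows whose id is above the highest id already in dst; return the count."""
    (watermark,) = dst.execute("SELECT COALESCE(MAX(id), 0) FROM events").fetchone()
    rows = src.execute(
        "SELECT id, payload FROM events WHERE id > ? ORDER BY id", (watermark,)
    ).fetchall()
    with dst:  # load the batch in one transaction
        dst.executemany("INSERT INTO events (id, payload) VALUES (?, ?)", rows)
    return len(rows)


with oltp:
    oltp.executemany("INSERT INTO events (payload) VALUES (?)", [("a",), ("b",)])
first_batch = migrate_new_rows(oltp, reporting)

with oltp:
    oltp.execute("INSERT INTO events (payload) VALUES (?)", ("c",))
second_batch = migrate_new_rows(oltp, reporting)
third_batch = migrate_new_rows(oltp, reporting)  # nothing new to move

total = reporting.execute("SELECT COUNT(*) FROM events").fetchone()[0]
```

How often migrate_new_rows runs is the "schedule" decision; which tables and columns it covers is the "scope" decision.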

Randy 2010-05-26 17:21:32

+1. Matt - look up OLTP and OLAP. Use one database to support fast transactional work (such as to support an app), and ETL data as required to another database built to be reported off. As they are separate, you don't get performance hits on one affecting the other. Moving data between the databases is a complex issue on its own, depending on how much data there is to move and how often. When people say they want "real-time" replication between sources, verify what they actually mean, because real-time to a computer is much faster than 'real-time' to a human.

Adrian K 2010-05-27 08:06:55
