I've been evaluating large key-value stores recently and I keep coming across the term 'read-repair', but I have no clue what it refers to. I think it has something to do with transactions, but I'm not sure.

Could someone please explain what it is and how it differs from the way traditional databases work? Maybe provide some pseudocode to help explain?

+1  A: 

I think read-repair means there are 2 copies of the data on separate nodes.

On: http://highscalability.com/drop-acid-and-think-about-data I found this:

Read repair - When a client does a read and the nodes disagree on the data it's up to the client to select the correct data and tell the nodes the new correct state.

I hope this is correct :)

Paul Janaway
Yeah, I think that's where I found it originally. Your reference, though, seemed to refer to 'eventually consistent' under Bigtable. Are we talking about the same thing?
feydr
+2  A: 

To improve scalability, many key-value stores allow a write to succeed once it has reached only a majority of the replicas for a piece of data. (So, if you have 5 replicas, you only have to write to 3 of them.) When you read, you make sure to read from a majority of the replicas as well. Any two majorities share at least one replica, so you're guaranteed to read at least one replica that has the newest value.
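If you want to convince yourself of the overlap claim, here is a small brute-force check in Python. It is only an illustration: the function name `quorums_always_overlap` and the parameters `n`, `w`, and `r` (replica count, write set size, read set size) are my own, not taken from any particular store.

```python
from itertools import combinations

def quorums_always_overlap(n, w, r):
    """Return True if every write set of size w shares at least one
    replica with every read set of size r, out of n replicas."""
    replicas = range(n)
    return all(
        set(write_set) & set(read_set)           # non-empty intersection
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )

print(quorums_always_overlap(5, 3, 3))  # True: two majorities of 5 always meet
print(quorums_always_overlap(5, 2, 3))  # False: 2 writers can miss all 3 readers
```

The general rule this demonstrates is that the sets are guaranteed to overlap whenever w + r > n.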

Read repair means that when a read detects that some of the replicas hold older values, you update those replicas with the newer value to reduce the number of obsolete values in the system. This is an example of an "anti-entropy" procedure.
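Since the question asked for pseudocode, here is a minimal in-memory sketch in Python of a majority write, a majority read, and the repair step. Everything in it is an assumption made for illustration: the `Replica` class, the `quorum_write` and `quorum_read` functions, and the plain integer version number used to decide which value is newest (real stores typically use timestamps or vector clocks instead).

```python
import random

class Replica:
    """Hypothetical replica: an in-memory map of key -> (version, value)."""
    def __init__(self):
        self.store = {}

    def get(self, key):
        # Version 0 with no value means "never written here".
        return self.store.get(key, (0, None))

    def put(self, key, version, value):
        # Accept the write only if it is newer than what we hold.
        if version > self.get(key)[0]:
            self.store[key] = (version, value)

def quorum_write(replicas, key, version, value):
    """Write to a randomly chosen majority of the replicas."""
    majority = len(replicas) // 2 + 1
    for replica in random.sample(replicas, majority):
        replica.put(key, version, value)

def quorum_read(replicas, key):
    """Read from a majority, return the newest value, repair stale replicas."""
    majority = len(replicas) // 2 + 1
    contacted = random.sample(replicas, majority)
    responses = [(replica, replica.get(key)) for replica in contacted]
    newest_version, newest_value = max(
        (response for _, response in responses), key=lambda vv: vv[0]
    )
    # Read repair: push the newest value to contacted replicas that are behind.
    for replica, (version, _) in responses:
        if version < newest_version:
            replica.put(key, newest_version, newest_value)
    return newest_value

replicas = [Replica() for _ in range(5)]
quorum_write(replicas, "k", 1, "old")
quorum_write(replicas, "k", 2, "new")
print(quorum_read(replicas, "k"))  # new
```

Note that in this sketch only the replicas contacted by the read get repaired. Replicas that were not part of the read stay stale until a later read or write reaches them.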

Anonymous