Amazon S3 provides an "eventual consistency" model, where data you store is eventually visible to all clients.

I can't find any official information as to whether write ordering is guaranteed or not. This is fundamentally important if you are building an architecture where a client might want to read data right after someone else stored it.

If write ordering is preserved, I can easily check whether the data is complete by having the writer store a guard (say, a special key) at the end of the write operation.
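
To make the guard idea concrete, here is a minimal sketch. It uses a plain Python dict as a stand-in for the object store, so nothing here is the S3 API; the names `GUARD_KEY`, `write_batch` and `read_batch` are hypothetical. The cross-check of listed names in the reader is an extra precaution of mine: it catches a missing object but not a stale overwrite.

```python
# A plain dict stands in for the object store (an assumption for illustration;
# a real S3 client would replace the dict operations).
GUARD_KEY = "_COMPLETE"  # hypothetical marker name


def write_batch(store, prefix, objects):
    """Write every data object first, then the guard as the last operation."""
    for name, data in objects.items():
        store[f"{prefix}/{name}"] = data
    # The guard lists the expected names (names must not contain commas).
    store[f"{prefix}/{GUARD_KEY}"] = ",".join(sorted(objects))


def read_batch(store, prefix):
    """Return the batch if the guard and all listed objects are visible, else None."""
    guard = store.get(f"{prefix}/{GUARD_KEY}")
    if guard is None:
        return None  # writer not finished, or guard not yet visible
    batch = {}
    for name in (guard.split(",") if guard else []):
        key = f"{prefix}/{name}"
        if key not in store:
            return None  # guard visible before the data it guards
        batch[name] = store[key]
    return batch
```

A reader would call `read_batch` and retry later whenever it returns `None`.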

If no write ordering is guaranteed, then I have a serious problem, because there is no way I can be sure I can safely read the data.

I read Werner Vogels's article on consistency (http://www.allthingsdistributed.com/2007/12/eventually_consistent.html), where he notes that systems without monotonic write consistency are notoriously hard to program, but he does not say whether S3 guarantees it or not.

+1  A: 

To answer your direct question, 'Does Amazon S3 guarantee write ordering?', I think the answer is no. I'd ask in the AWS forums just to be sure though.

If you can live with near-perfect consistency, I'd recommend having your writer check that every file it wrote can be read back before returning success. While this doesn't guarantee that the files will be available to all clients, it's good enough for most use cases. Personally, I've never had a problem with the consistency of S3 for clients when I've done this. I should note that this will take a bit more time and you'll pay a bit more for the extra requests, but in your case, it's probably worth it.
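
A rough sketch of that check, again with a dict-like object standing in for S3 rather than a real client; `write_and_verify`, `attempts` and `delay` are names I made up for illustration:

```python
import time


def write_and_verify(store, objects, attempts=5, delay=0.0):
    """Write all objects, then poll until each one reads back as written.

    `store` is any dict-like stand-in for the object store (an assumption).
    Returns True once every key reads back, False if attempts run out.
    """
    for key, data in objects.items():
        store[key] = data
    pending = set(objects)
    for _ in range(attempts):
        # Keep only the keys that still do not read back with the written value.
        pending = {k for k in pending if store.get(k) != objects[k]}
        if not pending:
            return True
        time.sleep(delay)
    return False
```

Each verification pass is one extra read per object, which is where the extra time and request cost come from.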

On the other hand, if you need guaranteed perfect consistency, I'd recommend using a region other than the US Standard region. All regions except for the US Standard region offer "read-after-write consistency for PUTS of new objects and eventual consistency for overwrite PUTS and DELETES". This should solve your problem for new writes, but again, it'll cost a bit more.

Best,

Zach

Developer, LongTail Video

zach at longtail
+1  A: 

It would be very foolish of Amazon to guarantee write ordering when its goal is massive scalability. Consider the following case:

  • A value X associated with key K exists in S3.
  • Client 1 (Hong Kong): writes value A to key K, serviced by Server A
  • Client 2 (Kansas City): writes value B to key K, serviced by Server B

Eventual consistency guarantees that all readers, anywhere in the world, will see value A, value B, or value X, and that eventually all readers will see either value A or value B, but not a mix.

If Clients 1 and 2 issue their writes at the same time, then the only way to guarantee write ordering is to associate the writes with each other along a timeline. However, clocks on opposite sides of the world won't be perfectly synchronized. The real question here is what you mean by write ordering when two clients at opposite ends of the world issue writes close together in time.

Update

The same holds true for a single client. Suppose that a value is served from 2 locations, and a single client issues 2 consecutive updates. If both writes are served by the same end-point then it is likely your ordering will be preserved. However, nothing prevents a read from being satisfied from a second location (routing, network split, etc...).
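
As a toy illustration of this single-client case (not a model of how S3 is actually built; `TwoSiteStore` and its methods are invented for the example), picture two replicas where a write lands on one site and reaches the other only when `sync()` runs:

```python
class TwoSiteStore:
    """Toy model: a write lands on one site and is copied to the other on sync()."""

    def __init__(self):
        self.sites = [{}, {}]  # one dict per location
        self.pending = []      # writes not yet replicated, in arrival order

    def put(self, key, value, site=0):
        self.sites[site][key] = value
        self.pending.append((site, key, value))

    def get(self, key, site):
        return self.sites[site].get(key)

    def sync(self):
        # Replay pending writes on the other site, preserving their order.
        for site, key, value in self.pending:
            self.sites[1 - site][key] = value
        self.pending.clear()


store = TwoSiteStore()
store.put("K", "X")
store.sync()                      # both sites now hold X
store.put("K", "A")               # single client, two consecutive updates
store.put("K", "B")
fresh = store.get("K", site=0)    # served by the site that took the writes
stale = store.get("K", site=1)    # served by the other site before replication
```

Here `fresh` is `"B"` while `stale` is still `"X"`; only after another `sync()` do both sites agree.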

Noah Watkins
What I mean by write-ordering is "single-client" write ordering. E.g. when a _single client_ makes a bunch of writes, will all others see the writes in the same order they were made?
Jan Rychter
I updated the description: The same holds true for a single client. Suppose that a value is served from 2 locations, and a single client issues 2 consecutive updates. If both writes are served by the same end-point then it is likely your ordering will be preserved. However, nothing prevents a read from being satisfied from a second location (routing, network split, etc...).
Noah Watkins