views:

74

answers:

2

Hello!

I'm writing an application that will have to be able to handle many concurrent accesses to it, either by threads as by processes. So no mutex'es or locks should be applied to this.

To make the use of locks go down to a minimum, I'm designing for the file to be "append-only", so all data is first appended to disk, and then the address pointing to the info it has updated, is changed to refer to the new one. So I will need to implement a small lock system only to change this one int so it refers to the new address. How is the best way to do it?

I was thinking about maybe putting a flag before the address, that when it's set, the readers will use a spin lock until it's released. But I'm afraid that it isn't at all atomic, is it? e.g.

  • a reader reads the flag, and it is unset
  • on the same time, a writer writes the flag and changes the value of the int
  • the reader may read an inconsistent value!

I'm looking for locking techniques but all I find is either for thread locking techniques, or to lock an entire file, not fields. Is it not possible to do this? How do append-only databases handle this?

edit: I was looking at how append-only db's (couchDB) do it, and it seems they use a thread only to serialize the writes to file. Does that mean it isn't possible to make them embeddable, like sqlite, without locking the entire file with file system locks?

Thanks! Cauê

+1  A: 

When I need to do something like this, typically, I write a process that accepts multiple connections from other processes to get data. This logging process can maintain a single file pointer where it is writing all the data without running the risk of multiple writes going to the same place.

Each thread in the logging process will just listen for new input and submit it to the queue, without blocking the process that generated the data. Trying to do this (writing out to disk) in the threads that generate the data to be logged will eventually put you in a position where you have to have locking operations and suffer whatever performance hit they require.

Thanks for your answer!But still, if a reader happens to be reading the data the writer process is writing to - even if it's only an int pointer, it can still catch it in an inconsistent state, can't it?Also, I'd like to avoid using this kind of process communication, since I'd like to make this database prototype embeddable, like sqlite.
Waneck
+1  A: 

Be careful about the append semantics of your filesystem - it probably doesn't provide atomic append operations.

One option is to memory map (mmap) your file as shared, then do atomic memory operations like compare-and-swap on the pointer. Your success will depend on whether your OS has such an operation (Linux, OSX do).

A correct (although I'm not sure it is fast) way accomplish what you want is with rename - it is an atomic file operation on most filesystems. Keep the most up-to-date data in an official file location. To update the data, write your new data to a temporary file, then rename that temporary file to the official location.

Keith Randall
Thanks for your answer! I didn't know mmap could also write to file, I thought it would only read from it. It could very much be an option, if the windows equivalent (MapViewOfFile) also works as expected. But I don't know if there is a cross-processor way to use a compare-and-swap function.About the append semantics, I think a normal filesystem file lock would work fine on this case, wouldn't it?A rename would be out of question. It's a database prototype I'm working on, and it should have to handle with very big files, and constant writes.
Waneck
Is it a good workflow to mmap pages to arrays and then use compare-and-swap on them if write is needed. Would mmap pages be slow? Is mmap guaranteed to be atomic? I haven't found any reference that says it is, but none that says it isn't either!
Waneck
Sounds like a fine workflow to me. mmap'd pages aren't slow (at least on linux), in fact they should be faster than using read/write as they avoid any copying (the virtual memory is mapped directly to the page in the filesystem cache). It has got to be the same on Windows for the scheme to work at all - the two mmap'd regions in the two processes need to map to the same physical memory for compare-and-swap to work at all.The mmap call being atomic or not is irrelevant to your situation. The only atomicity you need is the operation of compare-and-swap on already mmap'd memory.
Keith Randall
But can't it happen that I use a compare-and-swap on the mmap, but the time it will take to actually update it from memory to the actual hard disk could cause some sync issues?
Waneck
Oh, I see, they will all be in memory!It could actually be a pretty neat way to solve this! THe problem is that it isn't really cross-process... But I think I can live with that limitation, no two processes will be able to modify the same file.
Waneck
It should work cross-process as well, as long as you mmap it shared, the same physical memory page will back all of the mmap'd regions.
Keith Randall
Really?? Very Cool!!! Thank you very much!
Waneck