I need to access a file concurrently with multiple threads. This needs to be done concurrently, without thread serialisation for performance reasons.

The file in question was created with the 'temporary' file attribute, which encourages Windows to keep the file in the system cache. This means that most of the time a read won't go near the disk; it will be served from the portion of the file held in the system cache.

Being able to concurrently access this file will significantly improve performance of certain algorithms in my code.

So, there are two questions here:

  1. Is it possible for Windows to access the same file concurrently from different threads?
  2. If so, how do you provide this ability? I've tried creating the temp file and opening the file again to provide two file handles, but the second open does not succeed.

Here's the create:

FFileSystem := CreateFile(PChar(FFileName),
                          GENERIC_READ + GENERIC_WRITE,
                          FILE_SHARE_READ + FILE_SHARE_WRITE,
                          nil,
                          CREATE_ALWAYS,
                          FILE_ATTRIBUTE_NORMAL OR
                          FILE_FLAG_RANDOM_ACCESS OR
                          FILE_ATTRIBUTE_TEMPORARY OR
                          FILE_FLAG_DELETE_ON_CLOSE,
                          0);

Here's the second open:

FFileSystem2 := CreateFile(PChar(FFileName),
                          GENERIC_READ,
                          FILE_SHARE_READ,
                          nil,
                          OPEN_EXISTING,
                          FILE_ATTRIBUTE_NORMAL OR
                          FILE_FLAG_RANDOM_ACCESS OR
                          FILE_ATTRIBUTE_TEMPORARY OR
                          FILE_FLAG_DELETE_ON_CLOSE,
                          0);

I've tried various combinations of the flags with no success so far. The second file open always fails, with a message to the effect that the file cannot be accessed because it is in use by another process.

Edit:

OK, some more information (I was hoping to not get lost in the weeds here...)

The process in question is a Win32 server process running on WinXP 64. It maintains large spatial databases and would like to keep as much of the spatial database as possible in memory in an L1/L2 cache structure. L1 already exists. L2 exists as a 'temporary' file that stays in the Windows system cache (it's somewhat of a dirty trick, but it partly gets around Win32 memory limitations). Win64 means I can have lots of memory used by the system cache, so memory used to hold the L2 cache does not count towards process memory.

Multiple (potentially many) threads want to concurrently access information contained in the L2 cache. Currently, access is serialised, which means one thread gets to read its data while most (or the rest) of the threads are blocked pending completion of that operation.

The L2 cache file does get written to, but I'm happy to globally serialise/interleave read and write type operations as long as I can perform concurrent reads.

I'm aware there are nasty potential thread concurrency issues, and I'm aware there are dozens of ways to skin this cat in other contexts. I have this particular context, and I'm trying to determine if there is a way to permit concurrent thread read access within the file and within the same process.

Another approach I have considered would be to split the L2 cache into multiple temporary files, where each file serialises thread access the way the current single L2 cache file does.

And yes, this somewhat desperate approach is because 64-bit Delphi won't be with us any time soon :-(

Thanks, Raymond.

+1  A: 

I need to access a file concurrently with multiple threads. This needs to be done concurrently, without thread serialisation for performance reasons.

Either you don't need to use the same file within different threads, or you do need some kind of serialization.

Otherwise, you're just setting yourself up for heartache down the road.

Eric H.
Given that I only intend to read the file in concurrent contexts, I'm interested in what you think the heartache is. Currently my code does serialise access to the file; this is what I'm trying to do without :-) If the file were a disk-resident file, I'd be inclined not to bother, due to the natural serialisation imposed by the physical disk. However, this is not the case, which is why I'm trying...
Raymond Wilson
If you don't serialize it, you're almost certain to run into problems with something being half written or half read. And I can promise this will happen late at night, a week before a big demo. You will never be able to reproduce the error, until you're in the middle of that big demo. Seriously, I'm trying to help you out here. Either you need concurrent access (with some kind of serialization, lock-free or not), or you don't. If you need concurrent access, but skimp on the serialization, you will regret it later.
Eric H.
I appreciate your concern, and I'm all too aware of the potential pitfalls with thread concurrency (and I agree, they usually occur late at night)! So, if we restrict ourselves to the reading case, are you suggesting that concurrent reads are not supported in the OS? I'm aware that attempting to do concurrent reads with the same file handle is definitely begging for trouble, which is why part of my question revolves around how to either open the file multiple times (or perhaps just clone the file handle) so the concurrent reads don't step on each other's toes while being serviced by Windows.
Raymond Wilson
+2  A: 

You can do it this way...

The first thread, which has read/write access, must create the file first:

FileHandle := CreateFile(
  PChar(FileName),
  GENERIC_READ or GENERIC_WRITE,
  FILE_SHARE_READ,
  nil,
  CREATE_ALWAYS,
  FILE_ATTRIBUTE_NORMAL,
  0);

The second thread, with read-only access, then opens the same file:

  FileHandle := CreateFile(
    PChar(FileName),
    GENERIC_READ,
    FILE_SHARE_READ + FILE_SHARE_WRITE,
    nil,
    OPEN_EXISTING,
    FILE_ATTRIBUTE_NORMAL,
    0);

I didn't test whether it works with the FILE_ATTRIBUTE_TEMPORARY and FILE_FLAG_DELETE_ON_CLOSE attributes...

GJ
I tried your suggestion, but I still get the SysErrorMessage of 'The process cannot access the file because it is being used by another process'. Perhaps this is not a valid action with the temporary and delete on close flags...
Raymond Wilson
It works well in my application. The main process reads the file, which is created by a thread. I also call FlushFileBuffers(FileHandle) in the thread!
GJ
Have you tried adding the other flags to the first CreateFile call?
Raymond Wilson
Yes, I have just tested it with these attributes as well and it works! Write the shortest possible test case; it must work. The second open (the reader) must use normal attributes, as in my example above.
GJ
You were correct, but it seems the key issue was the absence of the FILE_SHARE_DELETE flag on the second file open as pointed out by Rob Kennedy.
Raymond Wilson
+3  A: 

Update #2

I wrote some test projects in C to try to figure this out, although Rob Kennedy beat me to the answer while I was away. Both conditions are possible, including cross-process, as he outlines. Here's a link if anyone else would like to see this in action.

SharedFileTests.zip (VS2005 C++ Solution) @ meklarian.com

There are three projects:

InProcessThreadShareTest - Test a creator and client thread.
InProcessThreadShareTest.cpp Snippet @ gist.github

SharedFileHost - Create a host that runs for 1 minute and updates a file.
SharedFileClient - Create a client that runs for 30 seconds and polls a file.
SharedFileHost.cpp and SharedFileClient.cpp Snippet @ gist.github

All of these projects assume the location C:\data\tmp\sharetest.txt is creatable and writable.


Update

Given your scenario, it sounds like you need a very large chunk of memory. Instead of gaming the system cache, you can use AWE to access more than 4 GB of memory, although you will need to map portions at a time. This should cover your L2 scenario, as you wish to ensure that physical memory is used.

Address Windowing Extensions @ MSDN

Use AllocateUserPhysicalPages and VirtualAlloc to reserve memory.

AllocateUserPhysicalPages Function (Windows) @ MSDN
VirtualAlloc Function (Windows) @ MSDN


Initial

Given that you are using the flag FILE_FLAG_DELETE_ON_CLOSE, is there any reason you wouldn't consider using a memory-mapped file instead?

Managing Memory-Mapped files in Win32 @ MSDN

From what I see in your CreateFile statements, it appears that you want to share data across-thread or across-process, with regard only to having the same file present while any sessions are open. A memory mapped file allows you to use the same logical filename in all sessions. Another benefit is that you can map views and lock portions of the mapped file with safety across all sessions. If you have a strict server with N-client scenario, it should be easy to implement. If you have a case where any client may be the opening server, you may wish to consider using some other mechanism to ensure that only one client gets to initiate the serving file first (via a global mutex, perhaps).

CreateMutex @ MSDN
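
For anyone curious what mapping "portions at a time" might look like from Delphi, here is a minimal sketch. MapWindow is a hypothetical helper name, not something from the question or from the linked articles, and it assumes the file handle was opened with at least GENERIC_READ and that the file is not empty.

```pascal
uses
  Windows, SysUtils;

// Hypothetical helper: maps ViewSize bytes of an open file, read-only,
// starting at Offset. Offset must be a multiple of the system allocation
// granularity (see GetSystemInfo). Release the result with UnmapViewOfFile.
function MapWindow(FileHandle: THandle; Offset: Int64;
  ViewSize: DWORD): Pointer;
var
  Mapping: THandle;
begin
  // A maximum size of 0/0 means "use the current size of the file".
  Mapping := CreateFileMapping(FileHandle, nil, PAGE_READONLY, 0, 0, nil);
  if Mapping = 0 then
    RaiseLastOSError;
  try
    Result := MapViewOfFile(Mapping, FILE_MAP_READ,
      Int64Rec(Offset).Hi, Int64Rec(Offset).Lo, ViewSize);
    if Result = nil then
      RaiseLastOSError;
  finally
    // An existing view stays valid after the mapping handle is closed.
    CloseHandle(Mapping);
  end;
end;
```

Only the mapped window consumes address space in the process, so the window size is the knob to turn when the process is short of virtual memory.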

If you only need one-way transmission of data, perhaps you could use named pipes instead.
(edit) This is best for 1 server to 1 client.

Named Pipes (Windows) @ MSDN

meklarian
I want to share data (well, access to the file anyway) across threads in the same process (a server). I've considered using memory mapped files, however the quantities of data involved on top of memory used by the Win32 server process itself don't make this practical.
Raymond Wilson
If all the threads are in the same process, you could simply pass the file handle around to all the threads that need it. There aren't any restrictions on reusing the file handle in the same process. However, as Eric H. mentions, all bets are off if you aren't serializing access to the file. You can use LockFile/UnlockFile to manually restrict views, but this also may not be desirable for your situation.
meklarian
Currently the threads all use the same file handle anyway; they are just serialised in their use of it. Currently none of the file reads will overlap, though I would not have thought that would be a problem anyway.
Raymond Wilson
Regarding AWE (and PAE): these would work; however, the OSes we support (e.g. WinXP) do not allow us to use them as an option. Yes, I'm gaming the system. I don't really have a choice. :-(
Raymond Wilson
Raymond, AWE and PAE were both introduced in Windows 2000; Windows XP should not prevent you from using them. (Maybe there *is* something that bars those options for you, but merely Windows XP isn't it.)
Rob Kennedy
Hmm... IIRC, the limitations of accessible memory in WinXP via AWE are quite restricted (4GB?). I've seen documentation in the past that outlines the precise limitations and restrictions on AWE (basically, unless you were using Windows Server it wasn't a lot of use), but I can't track it down now :-(
Raymond Wilson
Documentation on PAE-based AWE support is here. XP is limited to 4GB, Win2k is limited to 4GB except the Datacenter edition, and Plain Win2k3 Server is limited to 4GB. Everything else has higher limits. No mention of 64-bit limitations though. http://www.microsoft.com/whdc/system/platform/server/PAE/PAEmem.mspx
meklarian
+6  A: 

Yes, it's possible for a program to open the same file multiple times from different threads. You'll want to avoid reading from the file at the same time you're writing to it, though. You can use TMultiReadExclusiveWriteSynchronizer to control access to the entire file. It's less serialized than, say, a critical section. For more granular control, take a look at LockFileEx to control access to specific regions of the file as you need them. When writing, request an exclusive lock; when reading, a shared lock.
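
As a rough illustration of the region-locking idea, the sketch below wraps LockFileEx and UnlockFileEx. LockRegion and UnlockRegion are hypothetical helper names introduced here, and the Wait parameter is an added convenience; treat this as a sketch under those assumptions rather than a tested part of the answer.

```pascal
uses
  Windows, SysUtils;

// Hypothetical helper: locks Count bytes starting at Offset.
// Exclusive = True asks for a writer lock, False for a shared reader lock.
// Wait = False makes the call fail at once instead of blocking.
function LockRegion(FileHandle: THandle; Offset: Int64; Count: DWORD;
  Exclusive, Wait: Boolean): Boolean;
var
  Ov: TOverlapped;
  Flags: DWORD;
begin
  FillChar(Ov, SizeOf(Ov), 0);
  Ov.Offset := Int64Rec(Offset).Lo;      // low 32 bits of the start offset
  Ov.OffsetHigh := Int64Rec(Offset).Hi;  // high 32 bits of the start offset
  Flags := 0;
  if Exclusive then
    Flags := Flags or LOCKFILE_EXCLUSIVE_LOCK;
  if not Wait then
    Flags := Flags or LOCKFILE_FAIL_IMMEDIATELY;
  Result := LockFileEx(FileHandle, Flags, 0, Count, 0, Ov);
end;

// Hypothetical helper: releases a region locked by LockRegion.
// Offset and Count must match the values used when locking.
function UnlockRegion(FileHandle: THandle; Offset: Int64;
  Count: DWORD): Boolean;
var
  Ov: TOverlapped;
begin
  FillChar(Ov, SizeOf(Ov), 0);
  Ov.Offset := Int64Rec(Offset).Lo;
  Ov.OffsetHigh := Int64Rec(Offset).Hi;
  Result := UnlockFileEx(FileHandle, 0, Count, 0, Ov);
end;
```

A reader would call LockRegion(Handle, Offset, Count, False, True), read, then call UnlockRegion; a writer would pass True for Exclusive. Region locks conflict between different handles, so this fits a design where each thread opens its own handle.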

As for the code you posted, specifying File_Share_Write in the initial sharing flags means that all subsequent open operations must also share the file for writing. Quoting from the documentation:

If this flag is not specified, but the file or device has been opened for write access or has a file mapping with write access, the function fails.

Your second open request was saying that it did not want anybody else to be allowed to write to the file while that handle remained open. Since there was already another handle open that did allow writing, the second request could not be fulfilled. GetLastError should have returned 32, which is Error_Sharing_Violation, exactly what the documentation says should happen.
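
To see the error code for yourself, something like the following sketch could be used. TryOpenForRead is a hypothetical helper name, not part of the original code; it simply returns the Win32 error code of the open attempt.

```pascal
uses
  Windows, SysUtils;

// Hypothetical helper: tries to open an existing file for reading with the
// given share mode. Returns ERROR_SUCCESS, or the Win32 error code on failure
// (ERROR_SHARING_VIOLATION = 32 when the share modes are incompatible).
function TryOpenForRead(const FileName: string; ShareMode: DWORD;
  out Handle: THandle): DWORD;
begin
  Handle := CreateFile(PChar(FileName), GENERIC_READ, ShareMode, nil,
    OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, 0);
  if Handle = INVALID_HANDLE_VALUE then
    Result := GetLastError   // read the code straight after the failed call
  else
    Result := ERROR_SUCCESS;
end;
```

With a handle already open for writing, passing only FILE_SHARE_READ should yield ERROR_SHARING_VIOLATION, while adding FILE_SHARE_WRITE should let the open succeed.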

Specifying File_Flag_Delete_On_Close means all subsequent open requests need to share the file for deletion. The documentation again:

Subsequent open requests for the file fail, unless the FILE_SHARE_DELETE share mode is specified.

Then, since the second open request shares the file for deletion, all other open handles must have also shared it for deletion. The documentation:

If there are existing open handles to a file, the call fails unless they were all opened with the FILE_SHARE_DELETE share mode.

The bottom line is that either everybody shares alike or nobody shares at all.

FFileSystem := CreateFile(PChar(FFileName),
  Generic_Read or Generic_Write,
  File_Share_Read or File_Share_Write or File_Share_Delete,
  nil,
  Create_Always,
  File_Attribute_Normal or File_Flag_Random_Access
    or File_Attribute_Temporary or File_Flag_Delete_On_Close,
  0);

FFileSystem2 := CreateFile(PChar(FFileName),
  Generic_Read,
  File_Share_Read or File_Share_Write or File_Share_Delete,
  nil,
  Open_Existing,
  File_Attribute_Normal or File_Flag_Random_Access
    or File_Attribute_Temporary or File_Flag_Delete_On_Close,
  0);

In other words, all the parameters are the same except for the second (the desired access) and the fifth (the creation disposition).

These rules apply to two attempts to open on the same thread as well as attempts from different threads.
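
Putting the pieces together, here is one possible sketch of per-thread reader handles guarded by TMultiReadExclusiveWriteSynchronizer. CacheLock, OpenReaderHandle, ReadBlock and WriteBlock are hypothetical names introduced for illustration, and CacheLock is assumed to be created once at start-up before any thread uses it. Passing an OVERLAPPED offset to ReadFile or WriteFile on a synchronous handle makes the call start at that offset, so no separate SetFilePointer call is needed.

```pascal
uses
  Windows, SysUtils;

var
  // Hypothetical process-wide lock: many readers or one writer.
  // Create it once, e.g. CacheLock := TMultiReadExclusiveWriteSynchronizer.Create;
  CacheLock: TMultiReadExclusiveWriteSynchronizer;

// Opens an extra read-only handle; every handle shares read, write and delete.
function OpenReaderHandle(const FileName: string): THandle;
begin
  Result := CreateFile(PChar(FileName), GENERIC_READ,
    FILE_SHARE_READ or FILE_SHARE_WRITE or FILE_SHARE_DELETE, nil,
    OPEN_EXISTING, FILE_FLAG_RANDOM_ACCESS, 0);
  if Result = INVALID_HANDLE_VALUE then
    RaiseLastOSError;
end;

// Reads Count bytes at Offset through the calling thread's own handle.
// Returns the number of bytes actually read.
function ReadBlock(Handle: THandle; Offset: Int64; var Buffer;
  Count: DWORD): DWORD;
var
  Ov: TOverlapped;
begin
  FillChar(Ov, SizeOf(Ov), 0);
  Ov.Offset := Int64Rec(Offset).Lo;
  Ov.OffsetHigh := Int64Rec(Offset).Hi;
  CacheLock.BeginRead;           // shared: other readers may proceed
  try
    if not ReadFile(Handle, Buffer, Count, Result, @Ov) then
      RaiseLastOSError;
  finally
    CacheLock.EndRead;
  end;
end;

// Writes Count bytes at Offset while holding the lock exclusively.
procedure WriteBlock(Handle: THandle; Offset: Int64; const Buffer;
  Count: DWORD);
var
  Ov: TOverlapped;
  Written: DWORD;
begin
  FillChar(Ov, SizeOf(Ov), 0);
  Ov.Offset := Int64Rec(Offset).Lo;
  Ov.OffsetHigh := Int64Rec(Offset).Hi;
  CacheLock.BeginWrite;          // exclusive: waits for readers to finish
  try
    if not WriteFile(Handle, Buffer, Count, Written, @Ov) then
      RaiseLastOSError;
  finally
    CacheLock.EndWrite;
  end;
end;
```

Each reader thread would call OpenReaderHandle once and keep the handle for its lifetime, so reads on different threads never share a file pointer.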

Rob Kennedy
Funnily enough - my first version of the code was very similar, except I did not specify File_Share_Delete (which seems obvious now ;-) ). The code I placed in the initial question was, in hindsight and your explanation, quite wrong. I'll give that a spin and see how it goes.
Raymond Wilson
+1, wrote some test projects and verified this in C. Updating my answer w/a link to a download for my test projects. Works reliably across processes too.
meklarian
I added the FILE_SHARE_DELETE flag to the first create file and copied for the second and hey presto that works!
Raymond Wilson