Does anyone have any suggestions (product, toolsets, methods or other) for the storage and processing of custom data (delphi collections, binary trees, DIContainers etc) that DOES NOT restrict itself to a standard win32 memory address space? To put that in the extreme, is there anything off the shelf that can do the equivalent of holding a 10GB TList, thereby blowing the /3GB switch barrier and the 4GB 'windows on windows' limit?

What we ideally need is something that is pretty transparent to the Delphi application programmer, but allows very fast access to the data held in its structures, preferably via key lookup. The equivalent of a Delphi collection container would be fine, but its memory usage needs to be via AWE. It would also need to take care of mapping and unmapping the physical space it uses into the Win32 process making use of it, i.e. that would be the transparent bit...

Moving the data into a database is not the answer - the information needs to remain memory resident for very fast access. The in-memory databases/tables that we've tried do not make use of AWE and also are slow at accessing. Our current Delphi data structures are fine, but straining the limits of win32 address space.

A: 

There are system calls that can do this, but they are not supported on all versions of Windows (in particular, Windows XP does not support AWE).

Transparency would be something of an issue as the API could not return pointers to objects. Mapping more than 4GB of RAM into a 4GB address space means that a 32 bit pointer could be ambiguous - you could potentially map different objects into the same location.

This ambiguity means that you would have to generate proxies for the objects which hold a handle that could be used to access the 'record'. Some SQL server versions use this technique to store disk buffers in AWE memory. An approach like this would probably work for something like rows in a matrix where the operations are done on the whole row. Finer grained access would be more fiddly.

In order to provide direct access to the mapped object you would have to implement a protocol where a temporary pointer to the mapped memory was made available. This would also require the object to be locked in memory while in use - again, bang goes your transparency.
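
To make the handle-plus-lock protocol described above concrete, here is a minimal sketch in Python (chosen only for brevity; it simulates the idea and is not AWE code). `PagedStore`, `RecordProxy` and `lock` are hypothetical names, and a list of bytearrays stands in for physical pages that a real implementation would allocate with AllocateUserPhysicalPages and map with MapUserPhysicalPages. The caller holds a handle, never an address, and the temporary "pointer" stops working as soon as the lock is released.

```python
from contextlib import contextmanager

PAGE = 4096  # x86 page size in bytes


class PagedStore:
    """Stand-in for AWE memory: pages that are not directly addressable."""

    def __init__(self, page_count):
        self._pages = [bytearray(PAGE) for _ in range(page_count)]
        self._locked = set()  # handles that are currently locked

    def proxy(self, handle):
        return RecordProxy(self, handle)


class RecordProxy:
    """Holds a handle (here simply a page number), never a raw address."""

    def __init__(self, store, handle):
        self._store = store
        self.handle = handle

    @contextmanager
    def lock(self):
        store = self._store
        if self.handle in store._locked:
            raise RuntimeError("record is already locked")
        store._locked.add(self.handle)
        # Temporary "pointer"; real code would map the page into a window here.
        view = memoryview(store._pages[self.handle])
        try:
            yield view
        finally:
            view.release()  # the temporary pointer becomes unusable
            store._locked.discard(self.handle)


store = PagedStore(page_count=8)
record = store.proxy(3)
with record.lock() as buf:
    buf[0:5] = b"hello"  # direct access is only valid inside the lock
```

Using `buf` after the `with` block raises `ValueError`, which is the simulated equivalent of a dangling pointer into an unmapped window.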

Assuming you can get a 64 bit version of Delphi now you might be better off going to a 64 bit version of Windows for customers that need more RAM.

ConcernedOfTunbridgeWells
I think you are pretty much on the track of what we need to do. The 4GB address space shouldn't be an issue in our case: we need large amounts of data held outside the 2GB range, but only small amounts of it mapped into the addressable range at any one time. I don't see the locking in memory as an issue either. Data is only available to the one main thread and is processed sequentially and mechanically (a complex but standard data processing task). If only there were a 64bit Delphi now... FreePascal would be an option, but we'd also need 64 bit MS SQL access, which I don't think currently exists...
Paul
I'm sure SQL Server has a 64 bit OLEDB driver if you don't mind using OLEDB for database access.
ConcernedOfTunbridgeWells
AFAIK Windows XP does support AWE (see minimum requirements at http://msdn.microsoft.com/en-us/library/aa366753(VS.85).aspx), but it is of little use because 32-bit XP does not support more than 4GB of physical RAM. I don't know if it can be used to access more than 4GB of memory on XP/Vista 64 from a 32 bit application.
ldsandon
A: 

You state that you do not want to move to a database, but what about a database that specifically uses AWE?

I've not tried it personally, but would consider using products from this company for my own projects.

[Edit]: NexusDB is Delphi-friendly: it originated from the old Turbopower FlashFiler development (but has moved on a long way since then).

IanH
I had given Nexus a very quick look, but I've not actually tried it out. In fact I had missed the fact that it has an AWE version available. I will try it, but I'm very dubious about it being applicable to our need, i.e. access speeds equivalent to a TList.Find etc... Our data is already read and written to MSSQL server, we process in memory for speed. Lots of keyed lookups and trawls through records - the sort of processing you would normally use a database for if speed were NOT an issue...
Paul
A: 

The issue with AWE is that it works very much like the old, DOS-based EMS and XMS - if you ever used them. Basically, a range of addressable memory is reserved, and the memory outside the addressable range is then mapped to the addressable range when needed, and unmapped when no longer needed, allowing other memory to be mapped at the same addresses. Therefore most non-AWE aware data structures or containers wouldn't work in such a scenario - probably a TMemoryStream descendant is easier to build. It should be easy enough to build a TList or the like that stores data in AWE memory: it should keep track of where the data are really stored and recall them when needed, adjusting addresses as well when data are mapped to addressable memory. I am not aware of any Delphi containers library using AWE, and there is another issue: desktop 32 bit operating systems can't use more than 4GB of physical RAM, so a server version would be required, and the supported physical RAM depends on what version is used, see here for a complete list.
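
As a rough illustration of a list-like container that keeps track of where its data really live, here is a small Python simulation (not Delphi and not real AWE). `WindowedList`, `add`, `find` and `remaps` are made-up names for this sketch. A Python list of bytearrays plays the role of the non-addressable pages, and a single "window" plays the role of the reserved address range; in a real AWE version the `_map` step would be a MapUserPhysicalPages call.

```python
PAGE = 4096  # x86 page size in bytes


class WindowedList:
    """Keyed container whose items live in 'far' pages, one page mapped at a time."""

    def __init__(self):
        self._far = []           # stand-in for pages outside the address space
        self._directory = {}     # key -> (page number, offset, length)
        self._mapped_no = None   # which far page the window currently shows
        self._window = None      # stand-in for the reserved address range
        self._fill = PAGE        # bytes used in the last page (PAGE = none yet)
        self.remaps = 0          # how often the window had to be remapped

    def _map(self, page_no):
        if page_no != self._mapped_no:
            self._window = self._far[page_no]  # real code: remap the window
            self._mapped_no = page_no
            self.remaps += 1
        return self._window

    def add(self, key, data):
        if len(data) > PAGE:
            raise ValueError("item is larger than one page")
        if self._fill + len(data) > PAGE:      # does not fit: start a new page
            self._far.append(bytearray(PAGE))
            self._fill = 0
        page_no = len(self._far) - 1
        window = self._map(page_no)
        window[self._fill:self._fill + len(data)] = data
        self._directory[key] = (page_no, self._fill, len(data))
        self._fill += len(data)

    def find(self, key):
        page_no, offset, length = self._directory[key]
        window = self._map(page_no)
        # Copy the bytes out, because the window may show another page later.
        return bytes(window[offset:offset + length])


items = WindowedList()
items.add("a", b"x" * 2000)
items.add("b", b"y" * 2000)
items.add("c", b"z" * 2000)   # no room left in the first page
assert items.find("a") == b"x" * 2000
```

The lookup itself is an ordinary in-process dictionary access; the cost that matters is how often `find` forces a remap, which is why access order and window count dominate performance in this kind of design.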

ldsandon
Yes, this is exactly as I see and understand it. The version of Windows is not the issue - we are talking about a solution for a commercial product where we already throw reasonably specified Windows 2003 servers at the job. The issue does boil down, as you described it, to a TList type container that can map / unmap memory in via AWE - in a similar way to DOS EMS as you put it.
Paul
When using AWE you basically need two data structures: some kind of "directory" of where data reside, and the actual data. When accessing data, it should check if the data are mapped and bring them into the addressable memory before using them. The application should be aware that pointers to this data may not always be valid, and there should be a way to "lock/unlock" data while in use. Also, because AWE allocates memory using pages (4k on x86), not bytes, it's best suited for allocating data with that granularity, unless the access class is smart enough to suballocate them.
ldsandon
I think you have a little more clarity on this than I do - my thoughts have been cloudy but in a similar place to what you are describing. Ideally we'd happily pay a good price for someone to write a commercial component and not have to even think about it. As you say the granularity of 4K is an issue and I think suballocation may be necessary for the general case. For our purposes, you might get away with 4K allocations, but invariably we'd no doubt need 4 and a bit and so waste virtually all of the second 4K block allocated.
Paul
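
The "4 and a bit" waste mentioned in the comment above is easy to put numbers on. The following Python snippet is only arithmetic; the 100-byte overhang and the count of 1000 records are made-up figures for illustration, and `page_granular` and `pages_needed` are hypothetical helper names.

```python
PAGE = 4096  # x86 page size in bytes


def page_granular(record_size):
    """Return (bytes allocated, bytes wasted) when a record gets whole pages."""
    pages = -(-record_size // PAGE)  # ceiling division
    allocated = pages * PAGE
    return allocated, allocated - record_size


def pages_needed(record_size, count, packed):
    """Pages for `count` records, packed back to back or one allocation each."""
    if packed:
        return -(-(record_size * count) // PAGE)
    return count * (-(-record_size // PAGE))


print(page_granular(4096 + 100))        # (8192, 3996)
print(pages_needed(4196, 1000, False))  # 2000
print(pages_needed(4196, 1000, True))   # 1025
```

Packing nearly halves the page count in this example, at the price that a record may straddle a page boundary, so both pages would have to be mapped next to each other while it is in use.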
A: 

Assuming the data is loaded once in bulk and fits available memory, NexusDB AWE will be very, very fast. The database can be created as an in-memory only DB and will then not need any further hard drive access while manipulating the data.

NexusFan
I am going to give the trial NexusAWE a look today. I'm sure it is a great product, but I do have my doubts about its responsiveness. It will no doubt beat traditional SQL databases on speed, but it has to struggle against in-process memory resident data structures like TLists?!
Paul
+2  A: 

I'm going to be a complete dork, and tell you that I've made something even more advanced than what you're describing.... at work. So it's all closed source I'm afraid. Never saw anything like this anywhere. We combine VM, AWE, MMF and (soon) 32<>64 bit IPC into one big, mean data-processing machine, addressing up to 64 GB of memory, while processing hundreds of datasets, tens of GBs each.

But I can give you a few tips: AWE view-swapping is rather slow, because it forcibly pauses all running threads during the swap. Therefore, choose your window-sizes wisely (the smaller, the faster the swap - but call-overhead is lower with larger sizes of course). We've settled on AWE view-sizes equal to the Windows default page-size (4 KB), but only because random-access performs best that way. Linear data-access could run faster with bigger view-sizes.

Each view can map to any part of the allocated AWE memory, so one thing that can help is mapping only those pages into a view that need to be accessed - and try to save on unnecessary view-swaps (a priority-queue comes to mind).

Also, there should be a registration-mechanism somewhere in your design that handles the linkage between a view and the AWE memory behind this. And this better be thread-safe!
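
Since the actual implementation is closed source, the following is only a guess at what such a registration mechanism could look like, written as a Python simulation. `ViewRegistry`, `acquire` and `swaps` are hypothetical names, and a least-recently-used order is swapped in here for the priority queue mentioned above because it is the simplest policy to show. The registry records which page each of a fixed number of views currently shows, and a lock keeps that bookkeeping consistent across threads.

```python
import threading
from collections import OrderedDict


class ViewRegistry:
    """Tracks which far page each of a fixed number of views currently shows."""

    def __init__(self, view_count):
        self._mapped = OrderedDict()          # page number -> view slot, LRU first
        self._free = list(range(view_count))  # view slots not yet used
        self._lock = threading.Lock()
        self.swaps = 0                        # number of (expensive) remaps

    def acquire(self, page_no):
        """Return the view slot showing page_no, remapping the LRU view if needed."""
        with self._lock:
            if page_no in self._mapped:       # already visible: no swap needed
                self._mapped.move_to_end(page_no)
                return self._mapped[page_no]
            if self._free:
                slot = self._free.pop()
            else:
                # Evict the least recently used page and reuse its view slot.
                _, slot = self._mapped.popitem(last=False)
            self._mapped[page_no] = slot      # real code would remap the view here
            self.swaps += 1
            return slot


registry = ViewRegistry(view_count=2)
for page in (1, 2, 1, 3, 1):
    registry.acquire(page)
print(registry.swaps)  # 3
```

A production version would also need to pin views that are in use so they cannot be evicted under a reader; that part is left out here.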

As for general usage: No, this doesn't fit in with regular Delphi classes. You should switch over to another concept altogether - and base your data-structures on that.

Anyway, good luck mate! You're going to need it... ;-)

PatrickvL
Thanks Patrick, I think! If you were willing to share some more detail on the basic techniques that would be great, but looking at your product I can tell that this is the crown jewels so it's understandable that you stay tight lipped. Just as an aside, what you're doing with your product is very similar to what we're doing, and (I'm going to make it even less likely that you'll share now!) you're working in broadly the same market as we do. Years ago I was a customer of a far sighted MRP tool provider processing data in this manner, and I've forever since tried to use such techniques.
Paul
The problem as you know only too well is memory space. All well and good when the host data is small and can fit inside the 3Gb limit (or 4 on WoW). We started our tool with the aim of addressing small customers only. However, we've been tackling increasingly large customers over the last 2 - 3 years and are now always banging our head on the 3Gb limit. It's all a bit of a pain really! 64bit Delphi would have waved the magic wand for us, but with Embarcadero's ever stretching delivery dates we're left looking at MMF / AWE techniques I guess. Nice to know that at least somebody has taken this route.
Paul
A: 

Sounds to me like you guys might consider dropping the current database SQL backend and going to a 100% NexusDB + AWE solution.

(Or rather, dropping the day to day access to the SQL backend, and having an export/sync function that can write out any required NexusDB reporting data to an MSSQL reporting db.)

Warren P
I can't see us dropping SQL as the main data repository, but I have considered looking at Nexus + AWE. I haven't had a chance to evaluate it yet, but my gut feeling is that we'd need them to license an in-process version, as a "standard" database that is AWE enabled won't cut it; we could use SQL Server in that way...
Paul
A: 

Your situation sounds similar to ours: our application uses a huge datafile that we store in a memory-mapped file. The files are around 750MB, and we allocate data structures from them that use up to 1.5GB of RAM.

We have found no solution to the 4GB limit other than moving some of it off to FPC/Lazarus until Delphi is 64-bit, unfortunately. AWE does not work with Vista Home versions, and we also couldn't get it to work with MMFs.

You could try memory-mapped files with a sliding window, meaning you dynamically create views of different chunks of the file depending on what part of it the application is using. Sounds like that won't work though because you need the entire file in memory at once.
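
For reference, a sliding window over a memory-mapped file can be sketched as below. This is Python using the standard `mmap` module rather than Delphi, and `SlidingWindow`, `read` and `remaps` are made-up names for the sketch. Only one view of the file is mapped at a time; when a read falls outside it, the view is closed and a new one is created at an offset aligned to the allocation granularity.

```python
import mmap
import os

GRAN = mmap.ALLOCATIONGRANULARITY  # view offsets must be a multiple of this


class SlidingWindow:
    """Reads a large file through one movable memory-mapped view."""

    def __init__(self, path, window_size=16 * GRAN):
        self._file = open(path, "rb")
        self._size = os.path.getsize(path)
        self._window_size = window_size
        self._view = None   # the currently mapped view, if any
        self._start = 0     # file offset at which the current view begins
        self.remaps = 0

    def read(self, position, length):
        end = position + length
        if position < 0 or end > self._size:
            raise ValueError("read outside of file")
        inside = (self._view is not None
                  and position >= self._start
                  and end <= self._start + len(self._view))
        if not inside:
            self._remap(position, length)
        rel = position - self._start
        return self._view[rel:rel + length]   # slicing an mmap returns bytes

    def _remap(self, position, length):
        if self._view is not None:
            self._view.close()
        start = (position // GRAN) * GRAN     # align the view start downwards
        span = max(self._window_size, position - start + length)
        span = min(span, self._size - start)  # never map past the end of the file
        self._view = mmap.mmap(self._file.fileno(), span,
                               access=mmap.ACCESS_READ, offset=start)
        self._start = start
        self.remaps += 1

    def close(self):
        if self._view is not None:
            self._view.close()
        self._file.close()
```

Sequential or clustered reads stay inside one view and cost nothing extra, while scattered reads across the whole file remap on almost every call, which is the limitation noted above.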

Alan Clark
Thanks Alan. Your last sentence captures the reality - we need access to everything. However I could maybe see something akin to that using hash tables with memory mapped files. It would be a complete rewrite of our process but may work. By the time we'd done that 64bit "will" be available. I know that "will" is somewhat hopeful and optimistic, but we have several possible directions to go in as a business, one of which is to move off Delphi altogether due to the continual chopping and changing of roadmap priority for 64bit.
Paul