views: 3953
answers: 10

With very large amounts of RAM these days, I was wondering: is it possible to allocate a single chunk of memory that is larger than 4GB? Or would I need to allocate a bunch of smaller chunks and handle switching between them?

Why??? I'm working on processing some OpenStreetMap XML data, and these files are huge. I'm currently streaming them in, since I can't load them all in one chunk, but I just got curious about the upper limits of malloc and new.
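To make it concrete, this is roughly what I'm asking about (a sketch; the 5GB figure is arbitrary):

    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        uint64_t want = 5ULL << 30;  /* 5GB - arbitrary example size */

        /* On a 32-bit build, size_t can't even express the request. */
        if (want > (uint64_t)SIZE_MAX) {
            printf("size_t is too narrow for a %llu-byte request\n",
                   (unsigned long long)want);
            return 1;
        }

        char *p = malloc((size_t)want);
        if (p == NULL) {
            printf("malloc refused\n");
            return 1;
        }

        p[0] = 1;
        p[want - 1] = 1;  /* touch both ends of the chunk */
        printf("got a single %llu-byte chunk\n", (unsigned long long)want);
        free(p);
        return 0;
    }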

+9  A: 

It depends on which C compiler you're using, and on what platform (of course), but there's no fundamental reason why you cannot allocate the largest chunk of contiguously available memory - which may be less than you need. And of course you may have to be using a 64-bit system to address that much RAM...

See malloc for history and details.

Call HeapMax in alloc.h to get the largest available block size.
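If your compiler has nothing like HeapMax, one crude but portable way to find the largest allocatable chunk is to binary-search malloc itself - a sketch, with the caveat that an OS that overcommits (Linux, by default) will report wildly optimistic numbers:

    #include <stdio.h>
    #include <stdlib.h>

    /* Binary-search for the largest single malloc that succeeds. */
    static size_t largest_block(void)
    {
        size_t lo = 0, hi = (size_t)-1;
        while (hi - lo > 1) {
            size_t mid = lo + (hi - lo) / 2;
            void *p = malloc(mid);
            if (p != NULL) {
                free(p);
                lo = mid;   /* mid bytes worked; try bigger */
            } else {
                hi = mid;   /* mid bytes failed; try smaller */
            }
        }
        return lo;
    }

    int main(void)
    {
        printf("largest single malloc: %zu bytes\n", largest_block());
        return 0;
    }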

Steven A. Lowe
+15  A: 

Short answer: Not likely

First, in order for this to work, you would have to use a 64-bit processor. Second, it would depend on the operating system's support for allocating more than 4GB of RAM to a single process.

In theory, it would be possible, but you would have to read the documentation for the memory allocator. You would also be more susceptible to memory fragmentation issues.

There is good information on Windows memory management.

Benoit
32-bit Intel processors actually have 36-bit physical addressing, good for 64GB of memory - it's just that the licenses for desktop versions of Windows limit you to 4GB. Linux/BSD can access 64GB on a 32-bit CPU.
Martin Beckett
AFAIK even then you are still stuck with a 3GB per-process limit.
Marco van de Voort
+7  A: 

Have you considered using memory-mapped files? Since you are loading in really huge files, it would seem that this might be the best way to go.
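A sketch of what that looks like with POSIX mmap (Windows has CreateFileMapping/MapViewOfFile); the filename is a placeholder, and mapping the whole file in one go assumes a 64-bit address space - on 32-bit you would map smaller windows instead:

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void)
    {
        const char *path = "planet.osm";            /* placeholder filename */
        int fd = open(path, O_RDONLY);
        if (fd < 0) { perror("open"); return 1; }

        struct stat st;
        if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }

        /* The kernel pages the file in on demand; nothing is read up front. */
        const char *data = mmap(NULL, (size_t)st.st_size,
                                PROT_READ, MAP_PRIVATE, fd, 0);
        if (data == MAP_FAILED) { perror("mmap"); return 1; }

        /* ... parse data[0 .. st.st_size - 1] as ordinary memory ... */

        munmap((void *)data, (size_t)st.st_size);
        close(fd);
        return 0;
    }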

1800 INFORMATION
+10  A: 

This shouldn't be a problem with a 64-bit OS (and a machine that has that much memory).

If malloc can't cope, then the OS will certainly provide APIs that allow you to allocate memory directly. Under Windows you can use the VirtualAlloc API.
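Something like this (a sketch - assumes a 64-bit build; the 5GB figure is arbitrary):

    #include <stdio.h>
    #include <windows.h>

    int main(void)
    {
        SIZE_T size = (SIZE_T)5 << 30;  /* 5GB - arbitrary example */

        void *p = VirtualAlloc(NULL, size,
                               MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);
        if (p == NULL) {
            printf("VirtualAlloc failed: %lu\n", GetLastError());
            return 1;
        }

        /* ... use the block ... */

        VirtualFree(p, 0, MEM_RELEASE);
        return 0;
    }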

Rob Walker
+5  A: 

It depends on whether the OS will give you virtual address space that allows addressing memory above 4GB and whether the compiler supports allocating it using new/malloc.

For 32-bit Windows you won't be able to get a single chunk bigger than 4GB, as the pointer size is 32-bit, thus limiting your virtual address space to 4GB. (You could use Physical Address Extension to get more than 4GB of memory; however, I believe you have to map that memory into the 4GB virtual address space yourself.)

For 64-bit Windows, the VC++ compiler supports 64-bit pointers, with a theoretical limit of 8TB on the virtual address space.

I suspect the same applies to Linux/gcc - 32-bit does not allow it, whereas 64-bit does.
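A quick sketch to check what a given compiler/platform combination actually gives you:

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        printf("sizeof(void *) = %u\n", (unsigned)sizeof(void *));
        printf("sizeof(size_t) = %u\n", (unsigned)sizeof(size_t));
        /* A single >4GB chunk is only expressible if SIZE_MAX > 2^32. */
        printf("SIZE_MAX > 4GB? %s\n",
               SIZE_MAX > 0xFFFFFFFFull ? "yes" : "no");
        return 0;
    }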

Franci Penov
Just curious, where'd you get 8TB from? The Intel docs give a 48-bit actual address size, which gives a 256TB address space.
Branan
I should've phrased it as something other than "theoretical". The actual number comes from Memory Limits for Windows Releases - http://msdn.microsoft.com/en-us/library/aa366778(VS.85).aspx
Franci Penov
+18  A: 

The advantage of memory-mapped files is that you can open a file much bigger than 4GB (almost infinite on NTFS!) and have multiple <4GB memory windows into it.
It's much more efficient than opening a file and reading it into memory; on most operating systems it uses the built-in paging support.
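A sketch of the windowing idea using POSIX mmap (on Windows, MapViewOfFile takes an analogous offset argument); the 256MB window size is arbitrary, and offsets must be page-aligned:

    #include <sys/mman.h>
    #include <sys/types.h>

    /* Map one read-only window of 'len' bytes starting at 'offset'.
       'offset' must be a multiple of the system page size. */
    static void *map_window(int fd, off_t offset, size_t len)
    {
        return mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, offset);
    }

    /* Usage - slide a 256MB window over a file of any size:
       for (off_t off = 0; off < file_size; off += WINDOW) {
           void *w = map_window(fd, off, WINDOW);
           ...process the window...
           munmap(w, WINDOW);
       }
       (the last window may need clipping to the end of the file) */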

Martin Beckett
what does "almost much more efficient" mean?
andy
Sorry, I changed the sentence and left in an extra word.
Martin Beckett
A memory-mapped file avoids copying the whole file into committed memory - and probably paging most of it back out to the paging file - because it just uses the original file as the backing store for the committed memory.
QBziZ
The mathematician in me wonders what "almost infinite" means...
pauldoo
The maximum file size on NTFS is 2^64 bytes, which isn't infinite, but is close enough to go around for drinks!
Martin Beckett
The maximum file size in NTFS is actually much less than that, mgb. http://en.wikipedia.org/wiki/NTFS. Theoretical: 16 EiB minus 1 KiB (2^64 − 2^10 bytes). Implementation: 16 TiB minus 64 KiB (2^44 − 2^16 bytes).
1800 INFORMATION
+14  A: 

A primer on physical and virtual memory layouts

You would need a 64-bit CPU and O/S build and almost certainly enough memory to avoid thrashing your working set. A bit of background:

A 32-bit machine (by and large) has registers that can store one of 2^32 (4,294,967,296) unique values. This means that a 32-bit pointer can address any one of 2^32 unique memory locations, which is where the magic 4GB limit comes from.

Some 32-bit systems such as the SPARC V8 or Xeon have MMUs that pull a trick to allow more physical memory. I won't go into the details, but this presentation (warning: PowerPoint) describes how it works. However, for a single process looking at a virtual address space, only 2^32 distinct physical locations can be mapped by a 32-bit pointer. Some O/S's have facilities (such as those described here - thanks to FP above) to manipulate the MMU and swap different physical locations into the virtual address space under user-level control.

The operating system and memory-mapped I/O will take up some of the virtual address space, so not all of that 4GB is necessarily available to the process. As an example, Windows defaults to taking 2GB of this, but can be set to take only 1GB if the /3GB switch is invoked on boot. This means that a single process on a 32-bit architecture of this sort can only build a contiguous data structure of somewhat less than 4GB in memory.

This means you would have to explicitly use the PAE facilities on Windows or equivalent facilities on Linux to manually swap in the overlays. This is not necessarily that hard, but it will take some time to get working.
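On Windows the relevant API is the AWE family - a heavily hedged sketch of the overlay idea (it also needs the "Lock Pages in Memory" privilege enabled for the account, which is omitted here; the 64MB window size is arbitrary):

    #include <stdio.h>
    #include <stdlib.h>
    #include <windows.h>

    int main(void)
    {
        SYSTEM_INFO si;
        GetSystemInfo(&si);

        /* Grab 64MB worth of physical pages (needs SeLockMemoryPrivilege). */
        ULONG_PTR pages = (64u << 20) / si.dwPageSize;
        ULONG_PTR *pfns = malloc(pages * sizeof *pfns);

        if (!AllocateUserPhysicalPages(GetCurrentProcess(), &pages, pfns)) {
            printf("AllocateUserPhysicalPages failed: %lu\n", GetLastError());
            return 1;
        }

        /* Reserve a virtual window that physical pages get swapped into. */
        void *window = VirtualAlloc(NULL, pages * si.dwPageSize,
                                    MEM_RESERVE | MEM_PHYSICAL, PAGE_READWRITE);

        MapUserPhysicalPages(window, pages, pfns);   /* overlay in */
        /* ... use the window ... */
        MapUserPhysicalPages(window, pages, NULL);   /* overlay out */

        FreeUserPhysicalPages(GetCurrentProcess(), &pages, pfns);
        return 0;
    }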

Alternatively you can get a 64-bit box with lots of memory and these problems more or less go away. A 64 bit architecture with 64 bit pointers can build a contiguous data structure with as many as 2^64 (18,446,744,073,709,551,616) unique addresses, at least in theory. This allows larger contiguous data structures to be built and managed. If getting 64-bit hardware is an option (and it can be done relatively cheaply if you know where to look) then you might find getting a 64-bit box to be cheaper than building the overlay manager.

How to buy a 64-bit machine with lots of memory on the cheap

Several widely available architectures such as SPARC, Opteron, various flavours of POWER and more recent Xeons are natively capable of running in a 64-bit mode. Windows is a little behind the times - XP 64-bit was never really a success, but Vista or Windows Server 2008 (see this link for a detailed treatise on using Windows Server 2008 as a workstation O/S) does come in 64-bit flavours. Linux, MacOS and Solaris also have mature support for 64-bit platforms - I'm writing this on an Opteron box running FC7 x64.

If you want to buy a 64-bit computer on the cheap, eBay is your friend. Opteron workstations such as the HP XW9300, SPARC-based systems like the Sun Blade 2500, or PPC64-based Power Macs can be purchased on eBay for modest prices - especially compared to their list prices when new. 2GB, 4GB or even 8GB DDR memory kits can also be obtained at large discounts off retail from eBay. Get a nice fast disk or two while you're at it.

Note: make sure your machine takes standard memory; Sun Blade 1000 and 2000 systems don't.

This is a little more than just pontification. About 18 months ago I had occasion to buy some XW9300's when I was working with Analysis Services 2005. This really likes a 64-bit box - the recommended memory configuration for a production server is 6-8GB - and I was having trouble with cube builds running out of memory. I went through the exercise of some false starts (IntelliStation A Pros that were unreliable to the point of being nearly unusable) and a not inconsiderable amount of wasted money. The XW9300's I got in the end are great. They've put up with my ham-fisted hardware installation, quite a bit of dust, and some fairly rough handling in transit.

Buying this sort of kit new is eye-wateringly expensive, but most of the people who buy it are media types who want the latest and greatest CPUs, or people running specific graphics applications where the vendor insists on certified hardware as part of the T&Cs for support. Media types tend to turn the boxes over quite quickly, and secondhand ones are relatively cheap. Memory for older models (DDR1 in this case) loses its sex appeal and tends to surface on the secondary market at large discounts (sometimes 80% or more) off list price. There's also quite a substantial secondary market in SCSI disks through outfits like scsi4me.com.

I found HP's internet and phone service quite OK when ordering miscellaneous parts like air ducts, fans and heat sinks for the XW9300's as I built them. There's also quite a substantial third-party market of outfits that buy secondhand machines at auction and strip them for parts. A quick trawl through eBay turned up several XW9300's for US$600-$1000, one for about $1000 with 8GB of RAM.

This Stack Overflow posting has a more in-depth discussion of buying Xeon- and Opteron-based workstation systems.

ConcernedOfTunbridgeWells
Or you could just pick up a simple AMD Phenom or Intel quad-core. I recently put together a quad-core machine with 8GB of RAM and 4x500GB hard drives for just over $1000 CDN. You can get a lot of power for cheap as long as you keep the video card out of the picture.
Kibbee
Consumer-grade kit will always be cheaper, but most of the XW9300's I've bought are still under warranty. I just got one for £225, 16GB of registered memory for £300, 6x72GB 15k drives for £55 each, and an Adaptec RAID controller for £120.
ConcernedOfTunbridgeWells
(contd.) It's a question of scale. With the depressed memory prices now, you can get 64GB of DDR2 for about £2000. This would let you build an XW8600 or similar machine with 64GB for maybe £3-3.5k.
ConcernedOfTunbridgeWells
A: 

Like everyone else said, getting a 64-bit machine is the way to go. But even on a 32-bit Intel machine, you can address more than 4GB of memory if your OS and your CPU support PAE. Unfortunately, 32-bit WinXP does not do this (does 32-bit Vista?). Linux lets you do this by default, but you will be limited to 4GB areas, even with mmap(), since pointers are still 32-bit.

What you should do, though, is let the operating system take care of the memory management for you. Get into an environment that can handle that much RAM, then read the XML file(s) into (a) data structure(s), and let it allocate the space for you. Then operate on the data structure in memory, instead of operating on the XML file itself.

Even on 64-bit systems, though, you're not going to have a lot of control over which portions of your program actually sit in RAM, in cache, or are paged to disk, at least in most instances, since the OS and the MMU handle this themselves.

ebencooke
A: 
RandomNickName42
+2  A: 

If size_t is greater than 32 bits on your system, you've cleared the first hurdle. But the C and C++ standards aren't responsible for determining whether any particular call to new or malloc succeeds (except malloc with a 0 size). That depends entirely on the OS and the current state of the heap.
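A sketch of checking that first hurdle at compile time; the second hurdle can only be tested by calling malloc and looking at the result:

    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* First hurdle, at compile time: the build fails on this line
       if size_t cannot express a request larger than 4GB. */
    typedef char size_t_is_wide_enough[SIZE_MAX > 0xFFFFFFFFull ? 1 : -1];

    int main(void)
    {
        /* Second hurdle, at run time: the OS and heap state decide. */
        void *p = malloc((size_t)5 << 30);  /* 5GB - arbitrary example */
        printf("5GB malloc %s\n", p ? "succeeded" : "failed");
        free(p);
        return 0;
    }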

Dan Olson