Is there a max length for an array in C++?

Is it a C++ limit or does it depend on my machine? Is it tweakable? Does it depend on the type the array is made of?

Can I break that limit somehow, or do I have to search for a better way of storing information? And if so, what would be the simplest way?

What I have to do is store long long ints in an array; I'm working in a Linux environment. My question is: what do I have to do if I need to store an array of N long long integers with N > 10 digits?

I need this because I'm writing a cryptographic algorithm (such as the p-Pollard) for school, and I hit this wall of integer and array-length representation.

A: 

Max array length depends on the size of an int on your hardware, I would say. Most of the time this would be 32 bit in this day.

Tomalak
It's size_t, not int; for example, on x86_64 systems sizeof(int) < sizeof(size_t).
quinmars
In this day? Are you living in 2002? We're in 2008 and 64bit is reality, at least for PC market.
Milan Babuškov
@Milan Babuškov, I doubt the percentage of PC users running a 64 bit O/S is actually that high as yet. My guess would be less than 10%.
Shane MacLaughlin
I would second that. And 64bit hardware != 64bit OS, the majority of 64 bit processors run 32bit platforms today, as far as home computing goes.
Tomalak
In simple machines, the memory limit is 2^n - 1, where n is the number of bits in the address space. However, the OS, *BIOS* and other programs can eat up this space. Some processors may have a 64 bit address space but gag when trying to go past 2GB or 4GB due to system components (such as the motherboard).
Thomas Matthews
@Shane: I don't know about in 2008 but now in June 2010 64-bit OS use among Steam users is over 30%. See: http://store.steampowered.com/hwsurvey/
Zan Lynx
@Zan - changing over is pretty smooth alright; with reasonably good 32 bit support for transition / legacy apps and low memory costs, the reasons for not putting a 64 bit OS on a new box are few.
Shane MacLaughlin
+32  A: 

There are two limits, neither of which is enforced by C++ but rather by the hardware.

The first limit (which should never be reached) is set by the restrictions of the size type used to describe an index into the array (and its size). It is given by the maximum value the system's std::size_t can take. This data type should always be the largest integer type of a system.

The other limit is a physical memory limit. The larger the objects in your array are, the sooner this limit is reached because memory is full. For example, a vector<int> of a given size n typically takes about four times as much memory as a vector<char> of the same size (minus a small constant value). Therefore, a vector<char> may contain more items than a vector<int> before memory is full. The same applies to the native C-style arrays int[] and char[].

Additionally, this upper limit may be influenced by the type of allocator used to construct the vector, because an allocator is free to manage memory any way it wants. A very odd but nonetheless conceivable allocator could pool memory in such a way that identical instances of an object share resources. This way, you could insert a lot of identical objects into a container that would otherwise use up all the available memory.

Apart from that, C++ doesn't enforce any limits.
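
For example, you can query both limits at runtime. A minimal sketch (the exact numbers are platform- and implementation-dependent):

    #include <cstddef>
    #include <iostream>
    #include <limits>
    #include <vector>

    int main() {
        // Theoretical ceiling imposed by the index/size type.
        std::cout << "max std::size_t:          "
                  << std::numeric_limits<std::size_t>::max() << '\n';

        // What the default allocator claims a vector could ever hold;
        // physical memory runs out long before this is reached.
        std::cout << "vector<char>::max_size(): "
                  << std::vector<char>().max_size() << '\n';
        std::cout << "vector<int>::max_size():  "
                  << std::vector<int>().max_size() << '\n';
    }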

Konrad Rudolph
Also, you can normally easily hit stack size limits, especially when using threads, which again is implementation specific (but can be changed).
Alaric
@Alaric: True. I didn't want to go too deep into system specifics because they differ very much and I'm no expert in any of them.
Konrad Rudolph
@Konrad, interesting point about allocator types and not something I was aware of. Thanks for the info.
Shane MacLaughlin
std::size_t is usually (always?) the size of a pointer, not the size of the biggest integer that has native hardware support in the integer math unit. On every x86 OS I've used, size_t is 32-bits for a 32-bit OS and 64-bits for a 64-bit OS.
Mr Fooz
My understanding is that the maximum limit of an *array* is the maximum value of the processor's *word*. This is due to the indexing operator. For example, a machine may have a word size of 16 bits but an addressing register of 32 bits. A chunk of memory is limited in size by the parameter passed to `new` or `malloc`. A chunk of memory larger than an array can be accessed via pointer.
Thomas Matthews
Assuming wikipedia is correct ( http://en.wikipedia.org/wiki/Word_(computing) ), on x86, a word is 16-bits (and this is consistent with Microsoft's terminology at very least). I wonder what size_t would be for real mode 8088 (since the address space is 20 bits, it uses a 32-bit representation, but standard pointer arithmetic only works in 16 bit chunks).
Mr Fooz
+3  A: 

I would agree with the above, that if you're initializing your array with

 int myArray[SIZE]

then SIZE is limited by the size of an integer. But you can always malloc a chunk of memory and have a pointer to it, as big as you want, so long as malloc doesn't return NULL.
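
A minimal sketch of that approach, storing long long values as the question asks (the element count here is made up; the check against a null return is the important part):

    #include <cstdio>
    #include <cstdlib>

    int main() {
        // Hypothetical size -- adjust to what your system can actually provide.
        std::size_t n = 100000000;  // 100 million long long elements (~800 MB)

        long long *data = static_cast<long long *>(std::malloc(n * sizeof *data));
        if (data == NULL) {
            std::fprintf(stderr, "allocation of %zu bytes failed\n", n * sizeof *data);
            return 1;
        }

        // ... use data[0] .. data[n - 1] ...

        std::free(data);
        return 0;
    }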

Tarski
+4  A: 

Looking at it from a practical rather than theoretical standpoint, on a 32 bit Windows system, the maximum total amount of memory available for a single process is 2 GB. You can break the limit by going to a 64 bit operating system with much more physical memory, but whether to do this or look for alternatives depends very much on your intended users and their budgets. You can also extend it somewhat using PAE.

The type of the array is very important, as default structure alignment on many compilers is 8 bytes, which is very wasteful if memory usage is an issue. If you are using Visual C++ to target Windows, check out the #pragma pack directive as a way of overcoming this.
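
For instance, a small sketch of the effect (the struct here is hypothetical; #pragma pack is supported by Visual C++ and also understood by GCC and Clang):

    #include <cstdio>

    struct Unpacked {
        char      flag;   // padded so that 'value' stays 8-byte aligned
        long long value;
    };

    #pragma pack(push, 1)
    struct Packed {
        char      flag;   // no padding inserted
        long long value;
    };
    #pragma pack(pop)

    int main() {
        std::printf("sizeof(Unpacked) = %zu\n", sizeof(Unpacked));  // typically 16
        std::printf("sizeof(Packed)   = %zu\n", sizeof(Packed));    // typically 9
    }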

Another thing to do is to look at what in-memory compression techniques might help you, such as sparse matrices, on-the-fly compression, etc. Again, this is highly application dependent. If you edit your post to give some more information as to what is actually in your arrays, you might get more useful answers.

Edit: Given a bit more information on your exact requirements, your storage needs appear to be between 7.6 GB and 76 GB uncompressed, which would require a rather expensive 64 bit box to store as an array in memory in C++. It raises the question of why you want to store the data in memory, where one presumes it is for speed of access and to allow random access. The best way to store this data outside of an array is pretty much based on how you want to access it. If you need to access array members randomly, for most applications there tend to be ways of grouping clumps of data that tend to get accessed at the same time. For example, in large GIS and spatial databases, data often gets tiled by geographic area. In C++ programming terms you can overload the [] operator to fetch portions of your data from external storage as required.
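
A minimal sketch of that last idea, assuming the values live in a raw binary file of long long elements (the file path and class are hypothetical, and a real implementation would cache whole tiles rather than single elements):

    #include <cstdint>
    #include <cstdio>

    class DiskArray {
        std::FILE *f;
    public:
        explicit DiskArray(const char *path) : f(std::fopen(path, "rb")) {}
        ~DiskArray() { if (f) std::fclose(f); }

        // Fetch one element from external storage on demand.
        // On 32-bit systems you would want fseeko()/fseeko64() for files > 2 GB.
        long long operator[](std::uint64_t i) const {
            long long v = 0;
            std::fseek(f, static_cast<long>(i * sizeof v), SEEK_SET);
            std::fread(&v, sizeof v, 1, f);
            return v;
        }
    };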

Shane MacLaughlin
There are system calls that allow allocation of memory outside the program space; but this is OS dependent and not portable. We used them in embedded systems.
Thomas Matthews
+2  A: 

One thing I don't think has been mentioned in the previous answers.

I always sense a "bad smell", in the refactoring sense, when people use such things in their design.

That's a huge array, and possibly not the best way to represent your data, from both an efficiency and a performance point of view.

cheers,

Rob

Rob Wells
Have you got any suggestions on what I should use?
luiss
If you can tell us what the data is that you're storing then maybe we can. (-:
Rob Wells
Sorry Luis, my first response was very flippant. It will be driven by the nature of your data. The relationships of your data will drive the model you use to represent it. Then the collection *should* be apparent from that. If not, I would worry about the data model.
Rob Wells
A: 

As has already been pointed out, array size is limited by your hardware and your OS (man ulimit). Your software though, may only be limited by your creativity. For example, can you store your "array" on disk? Do you really need long long ints? Do you really need a dense array? Do you even need an array at all?

One simple solution would be to use 64 bit Linux. Even if you do not physically have enough RAM for your array, the OS will allow you to allocate memory as if you do, since the virtual memory available to your process is likely much larger than the physical memory. If you really need to access everything in the array, this amounts to storing it on disk. Depending on your access patterns, there may be more efficient ways of doing this (e.g. using mmap(), or simply storing the data sequentially in a file, in which case 32 bit Linux would suffice).
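
A minimal sketch of the mmap() route on 64-bit Linux (the file name and element count are invented; error handling is kept short):

    #include <fcntl.h>
    #include <sys/mman.h>
    #include <unistd.h>
    #include <cstddef>
    #include <cstdio>

    int main() {
        const std::size_t n = 1000000000ULL;             // one billion elements (assumed)
        const std::size_t bytes = n * sizeof(long long); // ~8 GB backing file

        int fd = open("bigarray.bin", O_RDWR | O_CREAT, 0644);
        if (fd < 0) { std::perror("open"); return 1; }
        if (ftruncate(fd, bytes) != 0) { std::perror("ftruncate"); return 1; }

        long long *a = static_cast<long long *>(
            mmap(NULL, bytes, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0));
        if (a == MAP_FAILED) { std::perror("mmap"); return 1; }

        a[0] = 42;        // pages are faulted in and written back on demand
        a[n - 1] = 7;

        munmap(a, bytes);
        close(fd);
        return 0;
    }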

ejgottl
Hmm, disks, arrays, ... anybody hear of *virtual memory*? OSes that support *virtual memory* will start using an external device for memory, such as a hard disk, and swap out chunks with internal memory.
Thomas Matthews
+18  A: 

Nobody mentioned the limit on the size of the stack frame.

There are two places memory can be allocated:

  • On the heap (dynamically allocated memory).
    The size limit here is a combination of available hardware and the OS's ability to simulate space by using other devices to temporarily store unused data (i.e. move pages to hard disk).
  • On the stack (Locally declared variables).
    The size limit here is compiler defined (with possible hardware limits). If you read the compiler documentation you can often tweak this size.

Thus if you allocate an array dynamically, the limit is large (and described in detail by other posts):

int* a1 = new int[SIZE];  // SIZE limited only by OS/Hardware

Alternatively if the array is allocated on the stack then you are limited by the size of the stack frame. N.B. vectors and other containers have a small presence in the stack but usually the bulk of the data will be on the heap.

int a2[SIZE]; // SIZE limited by COMPILER to the size of the stack frame
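
On Linux, the effective stack limit can be queried at runtime; a small sketch (this is the same value that `ulimit -s` reports and can adjust):

    #include <sys/resource.h>
    #include <cstdio>

    int main() {
        struct rlimit rl;
        if (getrlimit(RLIMIT_STACK, &rl) == 0) {
            // rl.rlim_cur is the soft limit; it may also be RLIM_INFINITY.
            std::printf("stack soft limit: %llu bytes\n",
                        static_cast<unsigned long long>(rl.rlim_cur));
        }
        return 0;
    }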
Martin York
Valid and important point.
Konrad Rudolph
Preferred allocation of large arrays is not on the stack or globally defined, but rather through dynamic allocation (via `new` or `malloc`).
Thomas Matthews
@Thomas Matthews: Not in my world. Dynamically allocated objects require management. If it needs to be dynamically allocated, I would use a stack object that represents the dynamically allocated memory, like a std::vector.
Martin York
A: 

What I would really like to know is what you need a vector for with ...

store an array of N integers with N > 10 digits

João Augusto
+1  A: 

If you have to deal with data that large you'll need to split it up into manageable chunks. It won't all fit into memory on any small computer. You can probably load a portion of the data from disk (whatever reasonably fits), perform your calculations and changes to it, store it to disk, then repeat until complete.
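
A minimal sketch of that loop, assuming the values already sit in a raw binary file "data.bin" of long long elements (the chunk size and the in-place update are arbitrary placeholders):

    #include <cstdio>
    #include <vector>

    int main() {
        const std::size_t chunk_elems = 1 << 20;           // ~1M elements (~8 MB) per pass
        std::vector<long long> buf(chunk_elems);

        std::FILE *f = std::fopen("data.bin", "r+b");      // file must already exist
        if (!f) return 1;

        std::size_t got;
        while ((got = std::fread(buf.data(), sizeof(long long), chunk_elems, f)) > 0) {
            for (std::size_t i = 0; i < got; ++i)
                buf[i] += 1;                               // placeholder computation

            // Write the modified chunk back in place, then continue reading.
            std::fseek(f, -static_cast<long>(got * sizeof(long long)), SEEK_CUR);
            std::fwrite(buf.data(), sizeof(long long), got, f);
            std::fseek(f, 0, SEEK_CUR);                    // required between a write and the next read
        }
        std::fclose(f);
        return 0;
    }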

Jay
See also Merge Sort (external sorting) for an example of an algorithm that handles data too large to fit into memory.
Thomas Matthews