views:

180

answers:

7

Hello, I hope someone can help me. I'm trying to create an int[400000000] (400 million) array in my application using Visual C++ 2010, but it generates an overflow error. The same code runs on Linux with g++. I need this because I'm working with large matrices. Thank you in advance.

+1  A: 

I'm not sure if in your case it wouldn't even be better to use STXXL.

Klaim
+14  A: 

If you are using a 32-bit application then by default you have just 2GB of user address space. 400 million integers occupy about 1.5GB (400,000,000 × 4 bytes), and you are very likely not to have that much contiguous address space. It is possible to force 32-bit Windows to allocate a 3GB user address space for each process, but this may just be a stopgap for your situation.

If you can move to a 64-bit architecture then this should not be an issue; otherwise you should find a way of storing your matrix data in a way that does not require a single block of contiguous storage, for example storing it in chunks.

Charles Bailey
+1, yup. 650 MB is about all you can expect. Storing a matrix in a smarter data structure is *very* important for speed as well.
Hans Passant
For speed it depends on your access patterns. For a straight linear pass through the data, the difference between the operating system paging out old pages and a manual caching strategy may be nothing. If (e.g.!) you're performing matrix multiplication on 20000 x 20000 matrices it's very difficult to come up with a fast strategy, as you need to combine numbers from all areas of the matrices in all combinations.
Charles Bailey
@Charles: Not *that* difficult. Just need to use blocking :)
jalf
@jalf: Just so I'm clear, what do you mean by 'blocking' in this context?
Charles Bailey
@Charles: http://netlib.org/utk/papers/autoblock/node2.html for example -- basically split the matrix into smaller blocks (sized to fit in cache, or main memory), and do as much work on a single block as possible before swapping it out for the next one. A common way to get better cache behavior on algorithms such as matrix multiplication
jalf
@jalf: I must admit that I haven't ever had the need to implement a blocked algorithm for matrix multiplication, but my impression was that you'd very quickly need to combine (i.e. add) the results of multiplying sub-blocks from across the address range, so it's not obvious to me that significant speed gains are easy to obtain.
Charles Bailey
@Charles: I think we're veering off topic here, but it does dramatically cut down on the number of cache misses (or page faults, or whatever you're trying to avoid). Assuming of course that those are what's limiting your performance, it's a very useful family of optimizations. I used variations of it on a few research projects as a student, and the speed gain can be significant, but it depends a lot on the hardware, the size of your input and so on.
jalf
A: 

Does the whole array really need to be allocated? Do you really use the whole array? Is it an array with lots of zeros? If so, that would explain why it works better on Linux.

In that case using a sparse array might be more appropriate. Using an existing sparse array implementation would reduce the memory footprint and maybe allow faster computation.

BatchyX
+3  A: 

I think what you need is a divide-and-conquer algorithm, not more memory.

Elroy
+1  A: 

Perhaps sparse matrices are of use in your application. This concept is used when dealing with big matrices that have a lot of zero entries, which is the case in quite a lot of applications.

And by the way, you do not gain anything by storing such a huge amount of data on the heap. Consider that your CPU cache is perhaps 12 MB! At least use some intelligent dynamic memory allocation mechanism.

Danvil
A: 

Thank you very much for your answers! I'm using a 64-bit architecture with 12GB of RAM. Unfortunately there is no way to reduce the matrix (it's not a sparse one) and I need to work on each element. I tried to allocate it dynamically, and it's exactly the same if I create 20 separate arrays. It works well under Linux on the same computer, and I'm not familiar with Visual. Is it possible to bypass the heap size limit with Visual (I'm using the Express edition)? I followed MSDN's instructions to set the heap reserve size and heap commit size to 2GB, but the setting didn't seem to change anything at all.

Are you running a 64-bit OS? Are you compiling a 64-bit binary?
morechilli
A: 
Your solution here is to grab the address space really early. Don't rely on this working for such a large block on 32 bits; it will be fragile to the environment.
morechilli