views: 240

answers: 5
I know that all arrays in .NET are limited to 2 GB. Under this premise, I try not to allocate more than n = ((2^31) - 1) / 8 doubles in an array. Nevertheless, even that number of elements doesn't seem to be valid. Does anyone know how I can determine, at run time, the maximum number of elements given sizeof(T)?

I know that any quantity approaching that number is just a lot of elements but, for all intents and purposes, let's say I need it.

Note: I'm in a 64-bit environment, my application's target platform is AnyCPU, and there are at least 3100 MB of free RAM.

Update: Thank you all for your contributions, and sorry I was so quiet. I apologise for the inconvenience. I have not been able to rephrase my question, but I can add that what I am looking for is solving something like this:

template <class T>
array<T>^ allocateAnUsableArrayWithTheMostElementsPossible(){
    return gcnew array<T>( ... );
}

The results in my own answer are kinda satisfactory but not good enough. Furthermore, I haven't tested it on another machine (it's kind of hard to find another machine with more than 4 GB). Besides, I have been doing some research on my own and it seems there's no cheap way to calculate this at run time. Anyhow, that was just a plus; none of the users of what I am trying to accomplish can expect to use the feature I am trying to implement without having the capacity.

So, in other words, I just want to understand why the maximum number of elements of an array doesn't add up to 2 GB, ceteris paribus. A top maximum is all I need for now.
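For illustration only, here is a C# sketch of the brute-force approach (the question's code is C++/CLI; the method name and the stepping strategy here are made up): start at the theoretical maximum for the element size and shrink until an allocation succeeds.

using System;
using System.Runtime.InteropServices;

static class Probe
{
    public static T[] AllocateLargestPossibleArray<T>() where T : struct
    {
        // Marshal.SizeOf only approximates the CLR's internal layout, but it
        // is good enough to pick a starting point for simple value types.
        int elementSize = Marshal.SizeOf(typeof(T));
        long length = int.MaxValue / elementSize;   // theoretical maximum

        while (length > 0)
        {
            try
            {
                return new T[length];
            }
            catch (OutOfMemoryException)
            {
                // Step down; a larger step (e.g. length / 1000) converges faster.
                length--;
            }
        }
        throw new OutOfMemoryException("No array of " + typeof(T) + " could be allocated.");
    }
}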

A: 

Your process space is limited to 2GB unless you're [compiled anycpu or x64] and running in an x64 process [on an x64 machine]. This is what you're probably actually running into. Calculating the headroom you have in the process is not an exact science by any means.

(Nitpickers corner: There is a /3GB switch and stacks of other edge cases which impact this. Also, the process needs to have virtual or physical space to be allocated into too. The point is that at the present time, most people will more often run into the OS per process limit than any .NET limit)
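As a quick sanity check of the x86/x64 point (a sketch, not part of the original answer): `IntPtr.Size` reports 4 in a 32-bit process and 8 in a 64-bit one, on any framework version.

using System;

class BitnessCheck
{
    static void Main()
    {
        // 8 bytes per pointer means the process really is running as x64.
        Console.WriteLine("Pointer size: {0} bytes", IntPtr.Size);
        Console.WriteLine("64-bit process: {0}", IntPtr.Size == 8);
    }
}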

Ruben Bartelink
This was obviously a much more relevant answer prior to you editing in your x64 environment (I take it you've checked you're definitely being loaded as x64?)
Ruben Bartelink
@Ruben: As far as I'm aware, the CLR has a 2GB max object size limit regardless of whether you're running in 32-bit or 64-bit. Even if the OP has 3GB of free RAM, I'm not terribly surprised that they can't find 2GB of contiguous space to allocate an array.
LukeH
@Luke: My point was exactly that (and this was before x64 was mentioned by the OP) - the chances of there being 2GB free in a process are low (I was assuming a simple test and hence contiguity within the CLR process's Large Object Heap not being a concern) given there is only a max of 2GB of addressable space before putting in OS overhead, thread stacks and the CLR's usage. But in the question, he says he's on x64, so assuming it's an x64 process and there is free [phys + virtual] memory to meet the need, we're down to CLR constraints. My points were only relevant to x86.
Ruben Bartelink
Even with a 3 GB process, btw, your largest contiguous block is still something under 2 GB. Reason being, the libraries that are normally loaded to the top of the 2 GB space are still loaded there in 3 GB processes, thereby partitioning it into a chunk somewhat under 2 GB, and a chunk of about 1 GB.
DrPizza
@DrPizza: Nice, I didn't know that (and am glad I didn't!) I'm glad I put in the nitpickers' corner, though!
Ruben Bartelink
@Ruben: maybe not an exact science, but you can determine it for any particular situation rather easily, as I found out while experimenting. See my new answer for a code example of the "scientific" approach.
Abel
@Abel: My post is for x86 contexts. In x86 contexts, the limiting factor (assuming the .NET limit is >1.999 GB) is going to be the max contiguous block available to allocate, which is largely influenced by the headroom available in the process address space. This depends on what's used at any time. If you start a new thread, boom goes 1MB for the stack. If something else like, say, an IO completion port thread does something... If a GC happens, things might change.
Ruben Bartelink
What you're talking about is different. The questioner will not be able to say var myDoubles = new double[ MyMagicFunctionWhichAlwaysTellsMeHowMuchICanAllocate()]. Even if the right value was computed, there's a race condition in that the available space might be used up. Does this make any sense in terms of explaining my angle when saying that?
Ruben Bartelink
I see now what you're getting at. My (new) post, which you also commented on, shows that you're better off using jagged arrays. You are not limited to a contiguous block of memory. Finding the maximum contiguous block of memory available is possible with Win32 API functions and rather trivial, but rather useless in the light of .NET. If you want to pass the 2GB limit, you need to split into jagged (or similar) arrays. You can use up all memory, far above the 2GB limit. With a contiguous block, this is hardly ever possible.
Abel
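A minimal sketch of the jagged-array idea described above (assumes an x64 process with enough free physical and virtual memory; chunk sizes are illustrative):

using System;

class JaggedDemo
{
    static void Main()
    {
        // Each inner array stays well under the 2 GB per-object limit, but the
        // total across all chunks exceeds it: 8 x 512 MB = 4 GB of doubles.
        const int chunkElements = 64 * 1024 * 1024;   // 64M doubles = 512 MB per chunk
        const int chunkCount = 8;

        double[][] data = new double[chunkCount][];
        for (int i = 0; i < chunkCount; i++)
            data[i] = new double[chunkElements];

        // Element n lives at data[n / chunkElements][n % chunkElements].
        long n = 400000000L;   // beyond the ~268M-element cap of a single double[]
        data[n / chunkElements][n % chunkElements] = 1.0;
        Console.WriteLine(data[n / chunkElements][n % chunkElements]);
    }
}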
About the race condition: there's always a chance that others use up memory before you (re)claim it. My suggestion to the asker: let your array grow dynamically (see my answer) and don't GC at the OOM but just remove a few elements to have some room to play.
Abel
@Abel: Yes, jagged arrays cope with [relatively] low memory better and your example is interesting. But that's not the asker's context; he's on x64 and wants to understand exactly what the .NET limit is *assuming he is on x64 and available process VM is not the bottleneck*. But I might be wrong. I'm sure @AZ will be stepping in with comments and votes any time now (his answer is the best IMO - it's the only one anyone has voted for either!)
Ruben Bartelink
Missed that answer before. I don't see how it is related, but hopefully AZ comes around sometime to explain. If it is really that theoretical, the answer to the question becomes totally trivial, really, since we have to assume the 2GB object limit, take the object overhead, find the .NET internal mem alignment, and you're about done. But he mentions "at runtime", so theoretical limits are of little value then. Anyway, I think I'm totally lost at what the actual intent of the question is.
Abel
A: 

You also need to add the pointer size (System.IntPtr.Size) to each sizeof(T) to account for the pointer to the object in any given array element.

x0n
Not sure what you're driving at. If it's a value type (the OP's talking doubles), there isn't such an overhead on a per-item basis - though there's a .Length and other small bits of bookkeeping baggage for the array, but none of that is on a per-item basis. If it's a ref type, then you're storing units of `IntPtr.Size`. Can you clarify?
Ruben Bartelink
-1 as nobody home to answer concern.
Ruben Bartelink
Sorry, I wasn't around - I think I misunderstood the question. I didn't notice that the allocations were Doubles (value types).
x0n
A: 
Abel
If it's a ref type, that makes sense. The questioner has doubles, which are value types and get inlined into the array object.
Ruben Bartelink
And even with references, you'll still be limited by the actual size taken up by the array - either 4*count or 8*count depending on the CLR.
Jon Skeet
@Abel: The size of the elements *does* limit the maximum number of elements, at least in theory. Although the current max number of elements is limited to `int.MaxValue`, you're also limited by .NET's 2GB max object size limit. So *theoretically* an array of `bool` or `byte` could have somewhere close to `2^31` elements before it hits the 2GB limit, whereas an array of `long` could only have a theoretical max of roughly `2^28` elements before it approaches 2GB.
LukeH
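A back-of-the-envelope version of that arithmetic (it ignores the array header and other overhead, so the real limits come out slightly lower):

using System;

class TheoreticalMax
{
    static void Main()
    {
        long maxBytes = int.MaxValue;                    // the ~2 GB object cap
        Console.WriteLine(maxBytes / sizeof(byte));      // ~2^31 elements for byte
        Console.WriteLine(maxBytes / sizeof(long));      // ~2^28 elements for long
        Console.WriteLine(maxBytes / sizeof(double));    // ~2^28 elements for double
    }
}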
Agreed, you are. The question says "2GB is the limit for .NET". This depends on the system. If the system is 32 bits, then yes. If it is 64 bits, then no. And when UAE is enabled, more memory may be available, but I'm uncertain whether .NET supports UAE.
Abel
@Luke: is that really correct? I thought the object size limit is dependent on CLR and bitness of the system.
Abel
As far as I'm aware the max object size in .NET is restricted to 2GB. This was certainly the case in the past and I'm not aware of any recent changes, although I'm no expert and it's difficult to find any definitive information.
LukeH
@Luke: this seems rather definitive: http://stackoverflow.com/questions/1087982/single-objects-still-limited-to-2-gb-in-size-in-clr-4-0/1088044#1088044 (added it to my answer as well)
Abel
On note 2: It might compile but I am thinking it might throw an overflow exception.
Anzurio
@AZ: it _might_ throw an overflow? The system with that much memory has yet to be invented! ;-)
Abel
@Luke: just a note, while reading on on the subject, I found that the 2GB object size limit *does not count for arrays of ref types*. You won't need too much magic to change arrays of doubles into an array of refs (while keeping speed or computability, suggestion: jagged arrays). See also my new answer.
Abel
@Abel: The 2GB limit still applies to arrays of ref types. Each object in the array can (theoretically) be 2GB in size, but the array object itself is also limited to 2GB, so the max number of elements is limited by the system's reference size. So on 32-bit an array of refs can have up to `2^29` elements, and up to `2^28` elements on 64-bit.
LukeH
@Luke: maybe we misunderstood each other. Indeed, the references themselves count towards the limit. Using jagged arrays, you can have unlimited space. The referenced object can have a certain size; this does not count towards the array limit. The total is roughly `array items * pointer size + array items * object size`, which can exceed the 2GB limit.
Abel
+1  A: 
Abel
The questioner is on x64 so this issue is *not* process address space headroom, it's .NET's limits after those issues have been removed. The questioner may or may not be comfortable with inducing GC into his workflow.
Ruben Bartelink
Well, the question title says *"[..]ACTUAL maximum [..] a .NET array [..] can be allocated?"*. With *actual*, I understand it is what you *can* use in your array, not what the system possibly has but that you cannot use. The code above gives, regardless the architecture, what **actually is available** to you. I don't see why that would not be an answer to the question.
Abel
Careful, I believe OutOfMemoryExceptions are uncatchable in future .NET versions.
sixlettervariables
Yes and no. OOMs are not always uncatchable. In this case, because you *reserve* the memory, but not *use* it, it is easy to recover. Future versions of .NET have no plans on disallowing catching OOM. The problem is of a different kind: an OOM is usually very hard to recover from. I don't say the code is a good idea in practice, but it answers the "How" of the question.
Abel
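A sketch of the grow-until-OOM idea from this thread (not the original answer's code; the chunk size is arbitrary): reserve fixed-size chunks until allocation fails, then hand a couple back as headroom.

using System;
using System.Collections.Generic;

class GrowUntilOom
{
    static void Main()
    {
        const int chunkElements = 16 * 1024 * 1024;      // 128 MB of doubles per chunk
        var chunks = new List<double[]>();

        try
        {
            while (true)
                chunks.Add(new double[chunkElements]);   // reserve another chunk
        }
        catch (OutOfMemoryException)
        {
            // Recoverable here, per the argument above: the memory was only
            // reserved, not used. Drop two chunks to leave some room to play.
            if (chunks.Count > 2)
                chunks.RemoveRange(chunks.Count - 2, 2);
        }

        Console.WriteLine("Reserved roughly {0} MB", chunks.Count * 128L);
    }
}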
To the down-voters: I don't mind, but please be so cordial as to explain what the misinformation in my code is.
Abel
Rewritten answer completely to reflect my new understanding of the question.
Abel
Thank you very much Abel! I'm accepting your answer. It really helped me a lot and I hope it didn't create too much inconvenience! :)
Anzurio
No inconvenience whatsoever, it was a fun ride to find out these nitty-gritty details about arrays. Any additional thoughts on the 16 or 24 bytes we couldn't fully explain? Have you tried the Memory View of an array while debugging on 64 bits (just type the variable name in the memory view window)?
Abel
+1  A: 

So, I ran a li'l program to find out some hard values and this is what I found:

  • Given a type T, f(sizeof(T)) = N - d

    • Where f is the real maximum size of an array of Ts.
    • N is the theoretical maximum size, that is: Int32::MaxValue / sizeof(T).
    • And d is the difference between N and f(sizeof(T)).

Results:

  • f(1) = N - 56
  • f(2) = N - 28
  • f(4) = N - 14
  • f(8) = N - 7
  • f(16) = N - 3
  • f(32) = N - 1

I can see that every time the element size doubles, the difference between the real size and the theoretical size roughly halves, but not in powers of 2. Any ideas why?

Edit: d is an amount of elements of type T. To find d in bytes, compute sizeof(T) * d.
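The li'l program itself isn't shown; a hypothetical reconstruction of the probe (the structure and names below are guesses, not the original code) could look like this:

using System;

static class GapFinder
{
    // For a given element size, start at N = Int32.MaxValue / sizeof(T) and
    // walk down until an allocation succeeds; d is then N minus that length.
    public static long FindGap<T>(int sizeOfT) where T : struct
    {
        long n = int.MaxValue / sizeOfT;          // theoretical maximum N
        for (long len = n; len > 0; len--)
        {
            try
            {
                var probe = new T[len];           // first success = real maximum f
                GC.KeepAlive(probe);
                return n - len;                   // d, measured in elements of T
            }
            catch (OutOfMemoryException) { }
        }
        return -1;                                // nothing could be allocated at all
    }
}

// GapFinder.FindGap<double>(sizeof(double)) would be expected to return 7
// on the setup described above.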

Anzurio
+1: Nice work. No guess on why :D
Ruben Bartelink
Your li'l program sounds somewhat interesting (though OT), but even more so if we can see it. *About the difference:* `Int32.MaxValue` is an odd number. If `32` is the size in bytes of a value type in the last example, then `d(32) == Int32.MaxValue - (Int32.MaxValue / 32) * 32` which yields `31`, a power of two minus one for `d`. Btw, I have a hard time understanding how this is related; you asked for the *actual* max in your question, while your own answer is highly theoretical (and false to some extent, because it is implementation dependent: Mono (or even Micro?) have other maxes).
Abel
@abel, what's OT? I don't understand why this is not related and highly theoretical. I ask for the actual maximum "number of elements" (sorry that I quote this, but in your comments you twice omitted it). I don't mean to be rude or anything but some of your comments kinda depict me as a "duh questioner", which is not appreciated, or at least, so I feel.
Anzurio
@AZ, I very much apologize if you feel like that. I love this question, would I otherwise spend so much time? OT means "off topic", which is, of course, my opinion (not being native, I sometimes sound harsh, sorry). Either I didn't understand your code or I didn't understand your question; I wrote "OT" because I thought this was a side-track (an interesting one nonetheless). I figured you need to know how many bytes are available before you can divide that by the size of your elements, to find the total. I didn't see that in your code. Still, I'm wondering how to interpret it. What did I miss?
Abel
@Abel, I'm thrilled you liked my question. But in general, I think my question is simpler than you think. Say I have enough physical memory available (right now 4 GB free); that'd make me think that I can easily allocate 2 GB in a 1D array. Nevertheless, if I do something like new double[Int32::MaxValue / 8], it will throw a memory exception, but (Int32::MaxValue / 8) - d won't, for d >= 7. That's 56 bytes shy of 2 GB.
Anzurio
And another note, just for sanity, I created two arrays of ((Int32::MaxValue / 8) - 7) doubles just to make sure that, when I tried the first time, I didn't run out of physical memory. Given that I could have these two arrays at the same time, made me think that the physical memory wasn't an issue :).
Anzurio
@AZ: you are absolutely right, I was totally on the wrong track. And I misunderstood `d`, which I took as an amount of bytes rather than elements (in bytes it is `sizeof(T) * d`). I can explain both the 56-byte and the 32-byte gap, but not entirely satisfactorily. I'll need your help with the last details ;-). See my (again updated) answer.
Abel
Altogether I haven't come up with a magic function for you, but I've tried to explain how you could find the gap size, which can help in building this function. Obviously, the conclusions may be different for different CLR implementations, even between two updates of the same version. I tested everything with .NET CLR 3.5 SP1.
Abel