Recently I've noticed that the following statement is not true given std::string s.

s.max_size() == s.get_allocator().max_size();

I find this interesting. By default, std::string uses std::allocator<char>, which has a theoretical limit of size_type(-1) (yes, I know I'm assuming two's complement, but that's unrelated to the actual question). I know that the practical limit will be significantly lower than this. On a typical 32-bit x86 system, the kernel occupies 2GB (perhaps 1GB) of the address space, leaving a much smaller practical upper limit.

Anyway, GNU libstdc++'s std::basic_string<>::max_size() appears to return the same value regardless of what the allocator it is using says (something like 1073741820).
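A minimal sketch for reproducing the observation, assuming a C++11 or later compiler. The helper names (string_max_size, allocator_max_size, print_limits) are made up for illustration. The allocator limit is queried through std::allocator_traits, since std::allocator::max_size() itself was deprecated in C++17 and removed in C++20.

```cpp
#include <cstddef>
#include <iostream>
#include <memory>
#include <string>

// Largest size() the string implementation claims to support.
std::size_t string_max_size(const std::string& s) {
    return s.max_size();
}

// Largest element count the string's allocator claims it can hand out.
// Queried via allocator_traits so it also compiles as C++20.
std::size_t allocator_max_size(const std::string& s) {
    auto alloc = s.get_allocator();
    return std::allocator_traits<decltype(alloc)>::max_size(alloc);
}

// Hypothetical helper: print both limits side by side.
void print_limits(const std::string& s) {
    std::cout << "string max_size:    " << string_max_size(s) << '\n'
              << "allocator max_size: " << allocator_max_size(s) << '\n';
}
```

Calling print_limits(std::string()) from main prints both values. The exact numbers depend on the standard library, its version, and the target's pointer width, so no particular output is claimed here.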

So the question remains, why doesn't std::basic_string<>::max_size() just return get_allocator().max_size()? It seems to me that this is the hypothetical upper limit. And if the allocation comes up short, it'll just throw a std::bad_alloc, so why not try?

This is more of a curiosity than anything else; I was just wondering why the two are defined separately in at least this one implementation.

+3  A: 

I am not entirely sure, but as far as I know the current standard does not require std::basic_string to store the string in contiguous memory. For example, it may store it in several chunks. Each such chunk is then limited to std::allocator::max_size(), but the sum may be larger than that.

This also seems to be the case with STL containers, and after all std::basic_string is a container.

Adam Badura
Except probably nobody implements it this way, the next standard might not allow it (not entirely sure), and the OP is seeing that it is significantly *smaller* than what the allocator allows.
UncleBens
+8  A: 

A bug related to your question was posted on Microsoft Connect. Microsoft gave an interesting answer to it:

We've resolved it as By Design according to our interpretation of the Standard, which doesn't clearly explain what the intended purpose for max_size() is. Allocator max_size() is described as "the largest value that can meaningfully be passed to X::allocate()" (C++03 20.1.5 [lib.allocator.requirements]/Table 32), but container max_size() is described as "size() of the largest possible container" (23.1 [lib.container.requirements]/Table 65). Nothing describes whether or how container max_size() should be derived from allocator max_size(). Our implementation for many years has derived container max_size() directly from allocator max_size() and then used this value for overflow checks and so forth. Other interpretations of the Standard, such as yours, are possible, but aren't unambiguously correct to us. The Standard's wording could certainly benefit from clarification here. Unless and until that happens, we've decided to leave our current implementation unchanged for two reasons: (1) other customers may be depending on our current behavior, and (2) max_size() fundamentally doesn't buy anything. At most, things that consume allocators (like containers) could use allocator max_size() to predict when allocate() will fail - but simply calling allocate() is a better test, since the allocator will then decide to give out memory or not. Things that consume containers could use container max_size() as a guarantee of how large size() could be, but a simpler guarantee is size_type's range.
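The point in the quote about simply attempting the allocation can be sketched as follows. This is only an illustration: try_reserve and ReserveOutcome are hypothetical names, not part of any library. It relies on std::string::reserve throwing std::length_error when the request exceeds max_size(), and on the allocator throwing std::bad_alloc (or a type derived from it) when it cannot supply the memory.

```cpp
#include <cstddef>
#include <new>
#include <stdexcept>
#include <string>

// Hypothetical result type for the probe below.
enum class ReserveOutcome { Ok, LengthError, BadAlloc };

// Try to reserve room for n characters and report what happened.
ReserveOutcome try_reserve(std::size_t n) {
    std::string s;
    try {
        s.reserve(n);
        return ReserveOutcome::Ok;
    } catch (const std::length_error&) {
        // The container rejected the request: n > s.max_size().
        return ReserveOutcome::LengthError;
    } catch (const std::bad_alloc&) {
        // The container accepted the request, but the allocator failed.
        return ReserveOutcome::BadAlloc;
    }
}
```

A small request such as try_reserve(16) should succeed, while try_reserve(std::string().max_size() + 1) should be rejected by the container before the allocator is ever asked. What happens for sizes between those extremes depends on the platform and available memory.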

Additionally, see Core Issue #197. The committee considered a request to improve the wording of the Standard, but it was declined.

So the answer to your question "Why..?" is that the Standard doesn't clearly explain what the intended purpose for max_size() is.

Kirill V. Lyadvinsky
+1  A: 

GCC's implementation has a comment explaining how max_size is calculated (one has to subtract the size of the internal housekeeping object, which is allocated as a single block with the string), and then adds that max_size() returns a quarter of that. There is no rationale given, so perhaps it is just a safety margin? (It should also provide a rope class, which perhaps one would use for such large strings?)
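The arithmetic can be checked against the 1073741820 figure from the question. This is a sketch under assumptions, not GCC's actual code: the 12-byte header (three 32-bit fields) and the extra one character reserved for the terminating null are my assumptions for a 32-bit target, and quarter_rule_max_size is a made-up name.

```cpp
#include <cstdint>

// Assumed rule: take the largest size_type value, subtract the
// housekeeping header, convert bytes to characters, leave room for
// one terminating character, then keep a quarter of the result.
constexpr std::uint32_t quarter_rule_max_size(std::uint32_t npos,
                                              std::uint32_t header_bytes,
                                              std::uint32_t char_bytes) {
    return ((npos - header_bytes) / char_bytes - 1) / 4;
}

// 32-bit size_type, assumed 12-byte header, 1-byte char:
// (4294967295 - 12) / 1 - 1 = 4294967282, and 4294967282 / 4 = 1073741820.
static_assert(quarter_rule_max_size(0xFFFFFFFFu, 12u, 1u) == 1073741820u,
              "matches the value reported in the question");
```

Under these assumptions the result depends only on the character size and the header size, which would be consistent with the value not changing with the allocator's own limit.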

With VC++, max_size() returns one less than allocator.max_size(), probably to account for the terminating null character.

UncleBens