Hi, I am dealing with a lot of strings in my program. This string data doesn't change for its whole lifetime after it has been read into my program.

But since the C++ string reserves extra capacity, a lot of space is wasted that will definitely never be used. I tried to release that space, but it didn't work.

The following is the simple code that I tried:

string temp = "1234567890123456";
string str;

cout << str.capacity() << endl;   

str.reserve(16);    
cout << str.capacity() << endl;  
// capacity is 31 on my computer    

str += temp;    
cout << str.capacity() << endl;    

str.reserve(16);    
cout << str.capacity() << endl;  
// can't release. The capacity is still 31.

(The compiler is Visual C++)

How could I release it?

+4  A: 

Why don't you use a char array then?

JRL
+1. A char array can be turned into a std::string when you wish to use it with the library, and destroyed later.
GogaRieger
Assuming you mean dynamic allocation, even that is likely to "waste" at least 4 bytes to hold the length, and the internal dynamic allocator probably allocates in blocks no smaller than 16/32 bytes for efficiency reasons.
j_random_hacker
+11  A: 

When you call reserve, you're making a request to change the capacity. Implementations only guarantee that the resulting capacity is greater than or equal to the requested amount. Therefore, a request to shrink capacity may be safely ignored by a particular implementation.

However, I encourage you to consider whether this isn't premature optimization. Are you sure that you're really making so many strings that it's a memory bottleneck for you? Are you sure that it's actually memory that's the bottleneck?

From the documentation for reserve:

This can expand or shrink the size of the storage space in the string, although notice that the resulting capacity after a call to this function is not necessarily equal to res_arg but can be either equal or greater than res_arg, therefore shrinking requests may or may not produce an actual reduction of the allocated space in a particular library implementation. In any case, it never trims the string content (for that purposes, see resize or clear, which modify the content).
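Since C++11 the standard library also provides `std::string::shrink_to_fit()`, which is likewise a non-binding request, so the same caveat applies to it. The following is a minimal sketch, assuming a C++11 or later compiler; the helper name `capacity_after_shrink` is made up for illustration and is not part of any library.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: reserve a large buffer, store a short value,
// then ask the library to drop the unused capacity.
// shrink_to_fit() is only a request, so the returned capacity may still
// exceed the string's size (for example because of a small-string buffer
// or allocator rounding).
std::size_t capacity_after_shrink(const std::string& value, std::size_t reserved) {
    std::string s;
    s.reserve(reserved);   // capacity() is now at least `reserved`
    s = value;             // content may be much shorter than the buffer
    s.shrink_to_fit();     // non-binding request to reduce capacity()
    return s.capacity();
}
```

The only thing the standard promises about the result is that it is at least `value.size()`; how close it gets depends on the library.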

John Feminella
+1 sounds like a premature optimization to me.
Steve Rowe
+2  A: 

I think you can use the swap method to free the data. Swap it with an empty local string so that the memory is freed when the local string goes out of scope.

Naveen
+1. This is a good (though not guaranteed) way to return memory if you want to set the string to something else. It won't help trim space if the underlying allocator rounds this up to the nearest 16 bytes (as MSVC++ seems to do), but nothing will help with that.
j_random_hacker
A: 

There is no guaranteed minimum capacity for std::string. You can request whatever capacity you want by calling reserve, but a particular implementation only guarantees to set the capacity to some amount greater than or equal to the requested size.

Here's a modified version of your program which tests several methods of string shrinking:

#include <string>
#include <iostream>
using namespace ::std;

template< typename S >
S & reserve_request( S & s, typename S::size_type n ) {
    s.reserve( n ); return s;
}

template< typename S >
S & shrink_request1( S & s ) { s.reserve(); return s; }

template< typename S >
S & shrink_request2( S & s ) { S( s ).swap( s ); return s; }

template< typename S >
S & shrink_request3( S & s ) { S( s.c_str() ).swap( s ); return s; }

template< typename S >
void test( S & s ) { cout << s.capacity() << endl; }

int main() {
    string temp = "1234567890123456";    // length 16
    string str;

    test( str );                         // 15
    test( reserve_request( str, 16 ) );  // 31
    test( str += temp );                 // 31
    test( reserve_request( str, 16 ) );  // 31
    test( shrink_request1( str ) );      // 31
    test( shrink_request2( str ) );      // 31
    test( shrink_request3( str ) );      // 31
    return 0;
}

It would appear that Visual C++'s std::string typically keeps some spare capacity.

If your project loads large numbers of strings read in from an external source whose size then never changes, you might be better off (as others have suggested) storing them in a single big block of character memory separated by '\0' characters (i.e., as C-strings). If you like, you could provide wrapper functions that return std::strings on the fly.
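The single-block idea could be sketched roughly as follows. `StringPool` and its member names are hypothetical, chosen only for this illustration, and the sketch assumes the stored strings contain no embedded '\0' characters.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical StringPool: stores many immutable strings back to back in
// one character buffer, each terminated by '\0'.
class StringPool {
public:
    // Appends s plus a terminating '\0' and returns the string's index.
    std::size_t add(const std::string& s) {
        offsets_.push_back(chars_.size());
        chars_.insert(chars_.end(), s.begin(), s.end());
        chars_.push_back('\0');
        return offsets_.size() - 1;
    }

    // Returns the i-th string as a C-string pointing into the pool.
    // The pointer is invalidated by later calls to add().
    const char* c_str(std::size_t i) const {
        assert(i < offsets_.size());
        return chars_.data() + offsets_[i];
    }

    // Wrapper that builds a std::string copy on the fly.
    std::string get(std::size_t i) const { return std::string(c_str(i)); }

    std::size_t size() const { return offsets_.size(); }

private:
    std::vector<char> chars_;           // all characters, '\0'-separated
    std::vector<std::size_t> offsets_;  // start of each string in chars_
};
```

Note that the offset table costs one `std::size_t` per string, and the two vectors keep spare capacity of their own while loading, so whether this saves memory overall depends on how long the strings are.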

jwfearn
+4  A: 

Spelling out Naveen's answer:

string x = "abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz";
cerr << x.capacity() << "\n";    // MSVC++: 63    g++: 52

// This tends not to work (although in theory it could):
//x = "XYZ";
//cerr << x.capacity() << "\n";  // MSVC++: 63    g++: 52

// This tends to work (although in theory it might not):
string("XYZ").swap(x);
cerr << x.capacity() << "\n";    // MSVC++: 15    g++: 3

Note that if the underlying allocator allocates more than n bytes when constructing a string of length n (e.g. by rounding up to the nearest 32 as MSVC++ appears to do), there's no way to make it use fewer bytes. But you probably wouldn't want to do that anyway, as this "rounding up" is done to make the dynamic memory allocation process more efficient, and also has the side effect of making concatenation of short strings faster on average (since fewer reallocations need to occur).

j_random_hacker
+1  A: 

This is mostly implementation specific. The idea is to minimize allocation requests and memory fragmentation. Growing the block by a constant factor (typically doubling) each time it is expanded keeps the number of reallocations logarithmic in the final size, so appending stays cheap on average. Therefore STL container implementations will typically multiply the existing block size by a fixed factor, often 2, when expanding.
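To make the effect of doubling concrete, the sketch below counts how many reallocations a buffer goes through while it grows one element at a time. It is a plain arithmetic model, not the code of any real library; the function names and the starting capacity of 1 are assumptions made for the illustration.

```cpp
#include <cstddef>

// Counts the reallocations needed to append n elements one at a time
// when the capacity doubles whenever it is exhausted (starting at 1).
std::size_t reallocations_doubling(std::size_t n) {
    std::size_t capacity = 1, count = 0;
    for (std::size_t size = 1; size <= n; ++size) {
        if (size > capacity) { capacity *= 2; ++count; }
    }
    return count;
}

// Same model, but the capacity grows by exactly one element each time.
std::size_t reallocations_exact(std::size_t n) {
    std::size_t capacity = 1, count = 0;
    for (std::size_t size = 1; size <= n; ++size) {
        if (size > capacity) { capacity += 1; ++count; }
    }
    return count;
}
```

In this model, 1000 appends cost 10 reallocations with doubling and 999 with exact-fit growth. The price of doubling is the slack the question complains about: up to half of the buffer can be unused.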

There are a few things you can do. First, use a custom allocator that will not allocate more than necessary. Next, construct your std::string objects only when you no longer need to manipulate them (or, when done manipulating, just swap into a new std::string object, which is basically what others have done in their answers). Finally, you can use a pooled memory allocator to minimize memory fragmentation and wasted slack and to improve performance.

See:

http://www.codeguru.com/cpp/cpp/cpp_mfc/stl/article.php/c4079
http://www.sjbrown.co.uk/2004/05/01/pooled-allocators-for-the-stl/
http://www.codeproject.com/KB/stl/blockallocator.aspx

Search for "STL Allocator" and "Memory Pool"

Ash
+1  A: 

Try the std::string swap-trick to shrink your strings:

std::string( str.data(), str.size() ).swap( str );

Where str is the string you want to cut down to size.
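Wrapped in a function, the trick might look like the sketch below. The name `shrink_by_swap` is made up for this example, and the standard does not promise that the capacity actually goes down; it merely tends to on common implementations.

```cpp
#include <string>

// Hypothetical helper wrapping the swap trick: build a temporary copy
// sized from the current content, then exchange buffers with it.
// The old, larger buffer is released when the temporary is destroyed
// at the end of the statement. The content of str is unchanged.
void shrink_by_swap(std::string& str) {
    std::string(str.data(), str.size()).swap(str);
}
```

Passing both `data()` and `size()` to the constructor, rather than `c_str()` alone, keeps any embedded '\0' characters intact.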

ceretullis
A: 

With the Dinkumware STL, capacity will NEVER be less than 15. For char strings, std::basic_string contains a union that holds either a pointer to an allocated buffer or a 16-byte inline buffer (used when capacity() <= 15).

See the xstring header file.

In the example you give, where you reserve 16, you are actually reserving 17 (one extra for the null terminator). That is more than 16, so the data is heap-allocated rather than stored in the 16-byte buffer inside the union. That allocation doubles the previous size (16), so you get 32. The capacity of that string is then probably 31.
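One way to observe the inline buffer from the outside is to check whether the character data lives inside the string object itself. The helper `stored_inline` below is hypothetical and only a rough probe; it assumes an implementation with a small-string buffer, as described here for Dinkumware, and tells you nothing on libraries without one.

```cpp
#include <functional>
#include <string>

// Hypothetical check: returns true if the character data is located
// inside the std::string object itself (small-string buffer) rather
// than in a separately allocated block.
bool stored_inline(const std::string& s) {
    const char* begin = reinterpret_cast<const char*>(&s);
    const char* end = begin + sizeof(std::string);
    const char* data = s.data();
    // std::less gives a total order even for unrelated pointers.
    std::less<const char*> before;
    return !before(data, begin) && before(data, end);
}
```

On an implementation with such a buffer, a short string like "abc" would be expected to report true and a 100-character string false, which matches the switch from inline storage to a heap allocation described above.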

But this is STL implementation dependent.

Changing the allocator template parameter in a declaration of std::basic_string is NOT enough: the choice of when to allocate, and how much, is made by std::basic_string's growth algorithm, NOT by the allocator. Doubling the previous size (and shrinking when less than 1/4 is in use) is standard material in Introduction to Algorithms by Cormen, Leiserson, Rivest, and Stein.

I'm not sure about the shrinking algorithm in Dinkumware.