I was doing a quick performance test on a block of code:

#include <climits>  // SHRT_MAX
#include <cstddef>  // size_t
#include <vector>

void ConvertToFloat( const std::vector< short >& audioBlock, 
                     std::vector< float >& out )
{
    const float rcpShortMax = 1.0f / (float)SHRT_MAX;
    out.resize( audioBlock.size() );
    for( size_t i = 0; i < audioBlock.size(); i++ )
    {
        out[i] = (float)audioBlock[i] * rcpShortMax;
    }
}

I was happy with the speed-up over the original, very naive implementation: it takes just over 1 msec to process 65536 audio samples.

However, just for fun, I tried the following:

void ConvertToFloat( const std::vector< short >& audioBlock, 
                     std::vector< float >& out )
{
    const float rcpShortMax = 1.0f / (float)SHRT_MAX;
    // push_back appends, so this assumes out is empty on entry
    out.reserve( audioBlock.size() );
    for( size_t i = 0; i < audioBlock.size(); i++ )
    {
        out.push_back( (float)audioBlock[i] * rcpShortMax );
    }
}

I fully expected this to give exactly the same performance as the first version. However, the loop now takes about 900 usec (i.e. it's roughly 100 usec faster than the other implementation).

Can anyone explain why this would give better performance? Does resize() initialize the newly allocated vector where reserve just allocates but does not construct? This is the only thing I can think of.

PS: this was tested on a single-core 2 GHz AMD Turion 64 ML-37.

+9  A: 

Does resize initialize the newly allocated vector where reserve just allocates but does not construct?

Yes.

sepp2k
SGI's STL reference explains that resize "inserts or erases elements at the end", while reserve just does the memory allocation. http://www.sgi.com/tech/stl/Vector.html
sixlettervariables
How does that work? malloc?
Eduardo León
It will use whatever the allocator is set to for the vector.
sixlettervariables
If you benchmark after the resize/reserve call you can see if this is the reason.
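One way to try this is to start the clock only after resize() or reserve() has returned, so that the allocation (and, for resize(), the zero-fill) is excluded from the measured loop. A minimal sketch, assuming C++11 and <chrono>; the names timeLoopAfterResize, timeLoopAfterReserve and microsSince are made up for illustration and are not part of the original code:

```cpp
#include <cassert>
#include <chrono>
#include <climits>
#include <cstddef>
#include <vector>

using Clock = std::chrono::steady_clock;

// Hypothetical helper: microseconds elapsed since `start`.
static double microsSince(Clock::time_point start)
{
    return std::chrono::duration<double, std::micro>(Clock::now() - start).count();
}

// Times only the indexed-write loop; resize() runs before the clock starts.
double timeLoopAfterResize(const std::vector<short>& in, std::vector<float>& out)
{
    const float rcpShortMax = 1.0f / static_cast<float>(SHRT_MAX);
    out.resize(in.size());                  // not timed: allocates and zero-fills

    const Clock::time_point start = Clock::now();
    for (std::size_t i = 0; i < in.size(); ++i)
        out[i] = static_cast<float>(in[i]) * rcpShortMax;
    const double elapsed = microsSince(start);

    assert(out.size() == in.size());
    return elapsed;
}

// Times only the push_back loop; reserve() runs before the clock starts.
double timeLoopAfterReserve(const std::vector<short>& in, std::vector<float>& out)
{
    const float rcpShortMax = 1.0f / static_cast<float>(SHRT_MAX);
    out.clear();                            // push_back appends, so start empty
    out.reserve(in.size());                 // not timed: allocates only

    const Clock::time_point start = Clock::now();
    for (std::size_t i = 0; i < in.size(); ++i)
        out.push_back(static_cast<float>(in[i]) * rcpShortMax);
    const double elapsed = microsSince(start);

    assert(out.size() == in.size());
    return elapsed;
}
```

If the two loop timings come out close once the setup call is excluded, that would point at resize() itself as the source of the difference. A single run is noisy, so it is worth repeating the measurement several times.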
Laserallan
@Eduardo - this works using the Allocator for the vector (which you usually don't see because the default one 'just works' for most applications). Allocators have an interface that includes a function for allocating raw memory (`allocate()`) and a function for constructing an object in-place in that raw memory (`construct()`) - among other functions. `allocate()` might well be implemented by `malloc()`, but that's not a requirement. See Stephan T. Lavavej's article on "the Mallocator" for insight into how one might work: http://blogs.msdn.com/vcblog/archive/2008/08/28/the-mallocator.aspx
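The allocate()/construct() split can be tried out directly with std::allocator. A minimal sketch, assuming C++11 and std::allocator_traits; the function name allocateThenConstructSum is made up for illustration, and this only roughly mirrors what reserve() followed by push_back() does rather than reproducing any particular library's vector:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical demo: allocates raw storage for n floats, constructs each one
// as 1.0f in place, sums them, then destroys and deallocates.
float allocateThenConstructSum(std::size_t n)
{
    assert(n > 0);

    using Alloc  = std::allocator<float>;
    using Traits = std::allocator_traits<Alloc>;

    Alloc alloc;
    float* raw = Traits::allocate(alloc, n);        // raw memory only, no floats yet

    for (std::size_t i = 0; i < n; ++i)
        Traits::construct(alloc, raw + i, 1.0f);    // object created in place

    float sum = 0.0f;
    for (std::size_t i = 0; i < n; ++i)
        sum += raw[i];

    for (std::size_t i = 0; i < n; ++i)
        Traits::destroy(alloc, raw + i);            // end each object's lifetime
    Traits::deallocate(alloc, raw, n);              // release the raw memory

    return sum;
}
```

The point of the split is that allocate() alone never touches the memory it hands back; writing values into it is a separate step.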
Michael Burr
DDJ also has a nice article by Matt Austern on Allocators: http://www.ddj.com/cpp/184403759
Michael Burr
+1  A: 

out.resize( audioBlock.size() );

Since out's size (= 0) is less than audioBlock.size(), additional elements are created and appended to the end of out. The new elements are value-initialized, which for float means each one is set to 0.0f.

Reserve only allocates the memory.
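The difference is easy to observe through size() and capacity(). A minimal sketch; the struct SizeAndCapacity and the three function names are made up for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct SizeAndCapacity
{
    std::size_t size;
    std::size_t capacity;
};

// resize(n): the vector really contains n elements afterwards.
SizeAndCapacity afterResize(std::size_t n)
{
    std::vector<float> v;
    v.resize(n);
    assert(v.capacity() >= v.size());
    return SizeAndCapacity{ v.size(), v.capacity() };
}

// reserve(n): room for n elements, but the vector is still empty.
SizeAndCapacity afterReserve(std::size_t n)
{
    std::vector<float> v;
    v.reserve(n);
    assert(v.capacity() >= n);
    return SizeAndCapacity{ v.size(), v.capacity() };
}

// The elements created by resize(n) all hold 0.0f.
bool resizeZeroFills(std::size_t n)
{
    std::vector<float> v;
    v.resize(n);
    for (std::size_t i = 0; i < v.size(); ++i)
        if (v[i] != 0.0f)
            return false;
    return true;
}
```

For example, afterResize(8).size is 8 while afterReserve(8).size is 0, even though both vectors have a capacity of at least 8.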

aJ
+1  A: 

The first version writes to out[i], which boils down to begin() + i (i.e. an addition). The second version uses push_back, which probably writes immediately to a known pointer equivalent to end() (i.e. no addition). You could probably make the first run as fast as the second by using iterators rather than integer indexing.

Edit: also to clarify some other comments: the vector contains floats, and constructing a float is effectively a no-op (the same way declaring "float f;" does not emit code, only tells the compiler to save room for a float on the stack). So I think that any performance difference between resize() and reserve() for a vector of floats is not to do with construction.
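For anyone who wants to measure the iterator idea, here is a minimal sketch of the resize() version rewritten with iterators. The name ConvertToFloatIter is made up for illustration; whether it is actually faster depends on the compiler, and an optimizing build may well generate the same code for both forms:

```cpp
#include <cassert>
#include <climits>
#include <vector>

// Hypothetical iterator-based variant of the resize() version.
void ConvertToFloatIter(const std::vector<short>& audioBlock,
                        std::vector<float>& out)
{
    const float rcpShortMax = 1.0f / static_cast<float>(SHRT_MAX);
    out.resize(audioBlock.size());

    // Walk source and destination in step instead of computing begin() + i.
    std::vector<float>::iterator dst = out.begin();
    for (std::vector<short>::const_iterator src = audioBlock.begin();
         src != audioBlock.end(); ++src, ++dst)
    {
        *dst = static_cast<float>(*src) * rcpShortMax;
    }

    assert(dst == out.end());
}
```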

AshleysBrain
Sorry, but your construction point is untrue. "float f = 0.0f;" is obviously slower than just "float f;". The latter IS a no-op; the former is not.
Goz
Oh, fair point, didn't know constructing a float assigned it 0. Vector assigns T() to each element when resizing, which is float(), which is 0. Still, using iterators instead of integer indexing might be faster.
AshleysBrain
A: 

Resize()

Modifies the container so that it has exactly n elements, inserting elements at the end or erasing elements from the end if necessary. If any elements are inserted, they are copies of t (the optional second argument to resize, which defaults to T()). If n > a.size(), this expression is equivalent to a.insert(a.end(), n - size(), t). If n < a.size(), it is equivalent to a.erase(a.begin() + n, a.end()).

Reserve()

If n is less than or equal to capacity(), this call has no effect. Otherwise, it is a request for allocation of additional memory. If the request is successful, then capacity() is greater than or equal to n; otherwise, capacity() is unchanged. In either case, size() is unchanged.

Memory will be reallocated automatically if more than capacity() - size() elements are inserted into the vector. Reallocation does not change size(), nor does it change the values of any elements of the vector. It does, however, increase capacity().

Reserve causes a reallocation manually. The main reason for using reserve() is efficiency: if you know the capacity to which your vector must eventually grow, then it is usually more efficient to allocate that memory all at once rather than relying on the automatic reallocation scheme.
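The effect of reserving up front can be seen by counting how often capacity() changes while elements are pushed. A minimal sketch; the function name countReallocations is made up for illustration, and the exact count without reserve() depends on the library's growth policy:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: pushes n floats and counts how many times the
// vector's capacity changed along the way (i.e. how often it reallocated).
std::size_t countReallocations(std::size_t n, bool reserveFirst)
{
    std::vector<float> v;
    if (reserveFirst)
        v.reserve(n);                       // one allocation up front

    std::size_t reallocations = 0;
    std::size_t lastCapacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i)
    {
        v.push_back(0.0f);
        if (v.capacity() != lastCapacity)   // capacity grew: storage was reallocated
        {
            ++reallocations;
            lastCapacity = v.capacity();
        }
    }

    assert(v.size() == n);
    return reallocations;
}
```

With reserveFirst set to true the count is 0, because capacity() already covers all n elements. Without it, the vector has to grow at least once for any non-trivial n.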

sat