I was doing a quick performance test on the following block of code:

    void ConvertToFloat( const std::vector< short >& audioBlock,
                         std::vector< float >& out )
    {
        const float rcpShortMax = 1.0f / (float)SHRT_MAX;
        out.resize( audioBlock.size() );
        for( size_t i = 0; i < audioBlock.size(); i++ )
        {
            out[i] = (float)audioBlock[i] * rcpShortMax;
        }
    }

I was happy with the speed-up over the original, very naive implementation: it takes just over 1 msec to process 65536 audio samples.
However, just for fun, I tried the following:

    void ConvertToFloat( const std::vector< short >& audioBlock,
                         std::vector< float >& out )
    {
        const float rcpShortMax = 1.0f / (float)SHRT_MAX;
        out.reserve( audioBlock.size() );
        for( size_t i = 0; i < audioBlock.size(); i++ )
        {
            out.push_back( (float)audioBlock[i] * rcpShortMax );
        }
    }

I fully expected this to give exactly the same performance as the first version. However, the loop now takes 900 usec (i.e. it's roughly 100 usec faster than the resize() implementation).
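For anyone who wants to reproduce the comparison, here is a rough sketch of a timing harness. It is not the exact code I measured with: the names ConvertResize, ConvertReserve and AverageMicros are made up for illustration, and it assumes a C++11 compiler for std::chrono. Each run starts from a fresh, empty output vector so that the allocation is included in the timing for both variants.

```cpp
#include <cassert>
#include <chrono>
#include <climits>
#include <cstddef>
#include <vector>

// Variant 1 (resize + indexed write), renamed so both variants can coexist.
void ConvertResize( const std::vector< short >& in, std::vector< float >& out )
{
    const float rcpShortMax = 1.0f / (float)SHRT_MAX;
    out.resize( in.size() );
    for( std::size_t i = 0; i < in.size(); i++ )
    {
        out[i] = (float)in[i] * rcpShortMax;
    }
}

// Variant 2 (reserve + push_back). Assumes `out` is empty on entry.
void ConvertReserve( const std::vector< short >& in, std::vector< float >& out )
{
    const float rcpShortMax = 1.0f / (float)SHRT_MAX;
    out.reserve( in.size() );
    for( std::size_t i = 0; i < in.size(); i++ )
    {
        out.push_back( (float)in[i] * rcpShortMax );
    }
}

// Hypothetical helper: average time per call in microseconds over `runs` calls.
template< typename Fn >
double AverageMicros( Fn convert, const std::vector< short >& in, int runs )
{
    assert( runs > 0 );
    typedef std::chrono::steady_clock Clock;
    double total = 0.0;
    for( int r = 0; r < runs; r++ )
    {
        std::vector< float > out;  // fresh vector, so allocation is timed too
        Clock::time_point t0 = Clock::now();
        convert( in, out );
        Clock::time_point t1 = Clock::now();
        total += std::chrono::duration< double, std::micro >( t1 - t0 ).count();
    }
    return total / runs;
}
```

Calling AverageMicros( ConvertResize, samples, 100 ) and AverageMicros( ConvertReserve, samples, 100 ) on the same 65536-sample input gives two averages to compare. The absolute numbers will depend on the compiler, optimisation flags and machine.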
Can anyone explain why this would give better performance? Does resize() initialize the newly allocated elements, whereas reserve() just allocates but does not construct them? This is the only thing I can think of.
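To make that guess concrete, here is a minimal sketch of the difference as I understand it. The helper names are made up for illustration. resize(n) should leave the vector with n elements that are all value-initialized (0.0f for float), while reserve(n) should only grow the capacity and leave the size at zero.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: true if resize(n) yields n elements, all equal to 0.0f.
bool ResizeZeroFills( std::size_t n )
{
    std::vector< float > v;
    v.resize( n );  // size becomes n; each new float is value-initialized
    if( v.size() != n ) return false;
    for( std::size_t i = 0; i < v.size(); i++ )
    {
        if( v[i] != 0.0f ) return false;
    }
    return true;
}

// Hypothetical helper: true if reserve(n) allocated room but constructed nothing.
bool ReserveLeavesEmpty( std::size_t n )
{
    std::vector< float > v;
    v.reserve( n );  // capacity grows to at least n; size stays 0
    return v.empty() && v.capacity() >= n;
}

// Runs both checks for the block size used in the question.
void CheckHypothesis()
{
    assert( ResizeZeroFills( 65536 ) );
    assert( ReserveLeavesEmpty( 65536 ) );
}
```

If both checks hold, the resize() version writes every element twice (once to zero it, once with the converted sample), which would be one plausible source of the extra time. I have not confirmed that it accounts for the whole difference.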
PS: This was tested on a single-core 2 GHz AMD Turion 64 ML-37.