I have a library written in C and I have 2 applications written in C++ and C. This library is a communication library, so one of the API calls looks like this:
int source_send( source_t* source, const char* data );
In the C app the code does something like this:
source_t* source = source_create();
for( int i = 0; i < count; ++i )
source_send( source, "test" );
Where as the C++ app does this:
struct Source
{
Source()
{
_source = source_create();
}
bool send( const std::string& data )
{
source_send( _source, data.c_str() );
}
source_t* _source;
};
int main()
{
Source* source = new Source();
for( int i = 0; i < count; ++i )
source->send( "test" );
}
On a Intel Core i7 the C++ code produces almost exactly 50% more messages per second.. Whereas on a Intel Core 2 Duo it produces almost exactly the same amount of messages per second. ( The core i7 has 4 cores with 2 processing threads each )
I am curious what kind of magic the hardware performs to pull this off. I have some theories but I thought I would get a real answer :)
Edit: Additional information from comments
Compiler is visual C++, so this is a windows box (both of them)
The implementation of the communication library creates a new thread to send messages on. The source_create is what creates this thread.