Hello all :)
I'm writing code that compresses CLSID structures, storing them as a compressed stream of 128-bit integers. However, the code also has to be able to place invalid CLSIDs into the stream. To handle these, I have left them as one big string. On disk, it would look something like this:
+--------------------------+-----------------+------------------------+
|                          |                 |                        |
| Length of Invalid String | Invalid String  | Compressed Data Stream |
|                          |                 |                        |
+--------------------------+-----------------+------------------------+
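For illustration, here is a minimal sketch of how that layout could be assembled. The function name buildRecord is hypothetical, and std::uint8_t / std::uint32_t stand in for BYTE / DWORD so the sketch compiles without windows.h; it assumes the length is written least significant byte first.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not from the original code): builds
// [4-byte length][invalid string][compressed data stream].
std::vector<std::uint8_t> buildRecord(const std::string& invalidClsids,
                                      const std::vector<std::uint8_t>& compressedStream)
{
    std::vector<std::uint8_t> out;
    out.reserve(4 + invalidClsids.size() + compressedStream.size());

    // Length prefix, least significant byte first (an assumption of this sketch).
    const std::uint32_t length = static_cast<std::uint32_t>(invalidClsids.size());
    for (int shift = 0; shift < 32; shift += 8)
        out.push_back(static_cast<std::uint8_t>(length >> shift));

    // The invalid string, followed by the compressed stream.
    out.insert(out.end(), invalidClsids.begin(), invalidClsids.end());
    out.insert(out.end(), compressedStream.begin(), compressedStream.end());
    return out;
}
```

Reserving the full size up front avoids reallocations while the three parts are appended.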
To encode the length of the string, I need to output its 32-bit length one byte at a time. Here's my current code:
std::vector<BYTE> compressedBytes;
DWORD invalidLength = (DWORD) invalidClsids.length();
compressedBytes.push_back((BYTE) (invalidLength & 0x000000FF));
compressedBytes.push_back((BYTE) ((invalidLength >>= 8) & 0x000000FF));
compressedBytes.push_back((BYTE) ((invalidLength >>= 8) & 0x000000FF));
compressedBytes.push_back((BYTE) (invalidLength >>= 8));
This code won't be called often, but a similar structure in the decoding stage will be called many thousands of times. Is this the most efficient method, or can someone come up with a better one?
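For reference, the decoding counterpart might look something like the following minimal sketch. The name readLength and the std::uint8_t buffer type are hypothetical, chosen for illustration, and it assumes the length was written least significant byte first, as in the code above.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: rebuilds a 32-bit length from four bytes stored
// least significant byte first, starting at `offset`.
// The caller must guarantee that offset + 4 <= bytes.size().
std::uint32_t readLength(const std::vector<std::uint8_t>& bytes, std::size_t offset = 0)
{
    return  static_cast<std::uint32_t>(bytes[offset])
         | (static_cast<std::uint32_t>(bytes[offset + 1]) << 8)
         | (static_cast<std::uint32_t>(bytes[offset + 2]) << 16)
         | (static_cast<std::uint32_t>(bytes[offset + 3]) << 24);
}
```

Each byte is widened to 32 bits before shifting so that the shift by 24 cannot overflow a signed int.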
Thanks all!
Billy3
EDIT: After looking over some of the answers, I created this mini test program to see which was the fastest:
// temp.cpp : Defines the entry point for the console application.
//
#include "stdafx.h"
#include <windows.h>
#include <cstdio>
#include <cstdlib>
#include <ctime>
#include <iostream>
#include <string>
#include <vector>
void testAssignedShifts();
void testRawShifts();
void testUnion();
int _tmain(int argc, _TCHAR* argv[])
{
    // First pass: time each variant in turn.
    std::clock_t startTime = std::clock();
    for (register unsigned __int32 forLoopTest = 0; forLoopTest < 0x008FFFFF; forLoopTest++)
    {
        testAssignedShifts();
    }
    std::clock_t assignedShiftsFinishedTime = std::clock();
    for (register unsigned __int32 forLoopTest = 0; forLoopTest < 0x008FFFFF; forLoopTest++)
    {
        testRawShifts();
    }
    std::clock_t rawShiftsFinishedTime = std::clock();
    for (register unsigned __int32 forLoopTest = 0; forLoopTest < 0x008FFFFF; forLoopTest++)
    {
        testUnion();
    }
    std::clock_t unionFinishedTime = std::clock();
    // clock_t is not guaranteed to be unsigned int, so cast to match %08u.
    std::printf(
        "Execution time for assigned shifts: %08u clocks\n"
        "Execution time for raw shifts: %08u clocks\n"
        "Execution time for union: %08u clocks\n\n",
        (unsigned) (assignedShiftsFinishedTime - startTime),
        (unsigned) (rawShiftsFinishedTime - assignedShiftsFinishedTime),
        (unsigned) (unionFinishedTime - rawShiftsFinishedTime));
    // Second pass: repeat the same measurements.
    startTime = std::clock();
    for (register unsigned __int32 forLoopTest = 0; forLoopTest < 0x008FFFFF; forLoopTest++)
    {
        testAssignedShifts();
    }
    assignedShiftsFinishedTime = std::clock();
    for (register unsigned __int32 forLoopTest = 0; forLoopTest < 0x008FFFFF; forLoopTest++)
    {
        testRawShifts();
    }
    rawShiftsFinishedTime = std::clock();
    for (register unsigned __int32 forLoopTest = 0; forLoopTest < 0x008FFFFF; forLoopTest++)
    {
        testUnion();
    }
    unionFinishedTime = std::clock();
    std::printf(
        "Execution time for assigned shifts: %08u clocks\n"
        "Execution time for raw shifts: %08u clocks\n"
        "Execution time for union: %08u clocks\n\n"
        "Finished. Terminate!\n\n",
        (unsigned) (assignedShiftsFinishedTime - startTime),
        (unsigned) (rawShiftsFinishedTime - assignedShiftsFinishedTime),
        (unsigned) (unionFinishedTime - rawShiftsFinishedTime));
    system("pause");
    return 0;
}
void testAssignedShifts()
{
    std::string invalidClsids("This is a test string");
    std::vector<BYTE> compressedBytes;
    DWORD invalidLength = (DWORD) invalidClsids.length();
    compressedBytes.push_back((BYTE) invalidLength);
    compressedBytes.push_back((BYTE) (invalidLength >>= 8));
    compressedBytes.push_back((BYTE) (invalidLength >>= 8));
    compressedBytes.push_back((BYTE) (invalidLength >>= 8));
}
void testRawShifts()
{
    std::string invalidClsids("This is a test string");
    std::vector<BYTE> compressedBytes;
    DWORD invalidLength = (DWORD) invalidClsids.length();
    compressedBytes.push_back((BYTE) invalidLength);
    compressedBytes.push_back((BYTE) (invalidLength >> 8));
    compressedBytes.push_back((BYTE) (invalidLength >> 16));
    compressedBytes.push_back((BYTE) (invalidLength >> 24));
}
typedef union _choice
{
    DWORD dwordVal;
    BYTE bytes[4];
} choice;
void testUnion()
{
    std::string invalidClsids("This is a test string");
    std::vector<BYTE> compressedBytes;
    choice invalidLength;
    invalidLength.dwordVal = (DWORD) invalidClsids.length();
    compressedBytes.push_back(invalidLength.bytes[0]);
    compressedBytes.push_back(invalidLength.bytes[1]);
    compressedBytes.push_back(invalidLength.bytes[2]);
    compressedBytes.push_back(invalidLength.bytes[3]);
}
Running this a few times results in:
Execution time for assigned shifts: 00012484 clocks
Execution time for raw shifts: 00012578 clocks
Execution time for union: 00013172 clocks

Execution time for assigned shifts: 00012594 clocks
Execution time for raw shifts: 00013140 clocks
Execution time for union: 00012782 clocks

Execution time for assigned shifts: 00012500 clocks
Execution time for raw shifts: 00012515 clocks
Execution time for union: 00012531 clocks

Execution time for assigned shifts: 00012391 clocks
Execution time for raw shifts: 00012469 clocks
Execution time for union: 00012500 clocks

Execution time for assigned shifts: 00012500 clocks
Execution time for raw shifts: 00012562 clocks
Execution time for union: 00012422 clocks

Execution time for assigned shifts: 00012484 clocks
Execution time for raw shifts: 00012407 clocks
Execution time for union: 00012468 clocks
Looks to be about a tie between assigned shifts and union. Since I'm going to need the value later, union it is! Thanks!
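One caveat on the union version: it emits the bytes in the host's memory order, which matches the shift versions only on a little-endian machine (such as x86 Windows). Reading a union member other than the one last written is also formally undefined in standard C++, although it commonly works in practice. The sketch below illustrates the byte-order point using memcpy, which copies the same in-memory representation; the helper names are hypothetical, and std::uint32_t / std::uint8_t stand in for DWORD / BYTE.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Portable: always least significant byte first, whatever the host order.
std::vector<std::uint8_t> bytesViaShifts(std::uint32_t value)
{
    std::vector<std::uint8_t> out;
    for (int shift = 0; shift < 32; shift += 8)
        out.push_back(static_cast<std::uint8_t>(value >> shift));
    return out;
}

// Host order: copies the value's in-memory representation, which is
// the same byte sequence the union's bytes[] array would expose.
std::vector<std::uint8_t> bytesViaMemcpy(std::uint32_t value)
{
    std::vector<std::uint8_t> out(sizeof value);
    std::memcpy(out.data(), &value, sizeof value);
    return out;
}

// True when the lowest-addressed byte holds the least significant bits.
bool hostIsLittleEndian()
{
    const std::uint16_t probe = 1;
    std::uint8_t first = 0;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}
```

If the file only ever needs to be read on the same kind of machine that wrote it, the two approaches produce the same bytes; otherwise the shift version keeps the on-disk format fixed.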
Billy3