views:

1805

answers:

11

Through profiling I've discovered that the sprintf here takes a long time. Is there a better performing alternative that still handles the leading zeros in the y/m/d h/m/s fields?

SYSTEMTIME sysTime;
GetLocalTime( &sysTime );
char buf[80];
for (int i = 0; i < 100000; i++)
{

 sprintf(buf, "%4d-%02d-%02d %02d:%02d:%02d",
  sysTime.wYear, sysTime.wMonth, sysTime.wDay, 
  sysTime.wHour, sysTime.wMinute, sysTime.wSecond);

}


Note: The OP explains in the comments that this is a stripped-down example. The "real" loop contains additional code that uses varying time values from a database. Profiling has pinpointed sprintf() as the offender.

A: 

StringStream is the suggestion that I got from Google.

http://bytes.com/forum/thread132583.html

Jason Lepack
Why do you think that will be faster?
Roddy
+5  A: 

You could try filling each char in the output in turn.

buf[0] = (sysTime.wYear / 1000) % 10 + '0' ;
buf[1] = (sysTime.wYear / 100) % 10 + '0';
buf[2] = (sysTime.wYear / 10) % 10 + '0';
buf[3] = sysTime.wYear % 10 + '0';
buf[4] = '-';

... etc...

Not pretty, but you get the picture. If nothing else, it may help explain why sprintf isn't going to be that fast.

OTOH, maybe you could cache the last result. That way you'd only need to generate one every second.

Roddy
In real life, the time value is different every iteration of the loop (from a db). I just simplified the code for the example,
Corey Trager
+5  A: 

Printf needs to deal with a lot of different formats. You certainly could grab the source for printf and use it as a basis to roll your own version that deals specifically with the sysTime structure. That way you pass in one argument, and it does just exactly the work that needs to be done and nothing more.

EvilTeach
+1  A: 

You would probably get w perf increase by hand rolling a routine that lays out the digits in the return buf, since you could avoid repeatedly parsing a format string and would not have to deal with a lot of the more complex cases sprintf handles. I am loathe to actually recommend doing that though.

I would recommend trying to figure out if you can somehow reduce the amount you need to generate these strings, are they optional somegtimes, can they be cached, etc.

Louis Gerbarg
A: 

It's hard to imagine that you're going to beat sprintf at formatting integers. Are you sure sprintf is your problem?

Ned Batchelder
It's not formatting integers that sprintf takes its time on, it's parsing the format string. Since it doesn't change in the loop, parsing it each time around is unnecessary.
Steve Jessop
+2  A: 

What do you mean by a "long" time -- since the sprintf() is the only statement in your loop and the "plumbing" of the loop (increment, comparison) is negligible, the sprintf() has to consume the most time.

Remember the old joke about the man who lost his wedding ring on 3rd Street one night, but looked for it on 5th because the light was brighter there? You've built an example that's designed to "prove" your assumption that sprintf() is ineffecient.

Your results will be more accurate if you profile "actual" code that contains sprintf() in addition to all the other functions and algorithms you use. Alternatively, try writing your own version that addresses the specific zero-padded numeric conversion that you require.

You may be surprised at the results.

Adam Liss
The code snippet is a simplified example. In the real example, there are many things in the loop. (char*) _bstr_t, itoa.... It's the sprint here that's bad.
Corey Trager
Was it Einstein who said everything should be made as simple as possible, but no simpler? :-) Edited your question to reflect this.
Adam Liss
+1  A: 
Jaywalker
no. My code snippet is simplified from the real thing.
Corey Trager
+11  A: 

If you were writing your own function to do the job, a lookup table of the string values of 0 .. 61 would avoid having to do any arithmetic for everything apart from the year.

edit: Note that to cope with leap seconds (and to match strftime()) you should be able to print seconds values of 60 and 61.

char LeadingZeroIntegerValues[62][] = { "00", "01", "02", ... "59", "60", "61" };

Alternatively, how about strftime()? I've no idea how the performance compares (it could well just be calling sprintf()), but it's worth looking at (and it could be doing the above lookup itself).

therefromhere
+1 from an old embedded engineer. When time is more critical than size, it's hard to beat a lookup table!
Adam Liss
Nice. But be careful how you use this. Calling strcat() to append the strings is also likely to be ugly, performance-wise.
Roddy
Yeah, since you know they're all 2 characters you can just memcpy.Also, something that just occurred to me while looking at the strtime page, the array of strings should probably be up to "61", since strtime does that, presumably to account for leap seconds.
therefromhere
Actually, forget memcpy, you can just assign the characters.
therefromhere
I'd be inclined to call memcpy anyway - a decent compiler will replace the call with opcodes for the most efficient 2 byte copy. And if your compiler isn't decent, then trying to optimise code like this for speed is hopeless, you just have to write in assembler.
Steve Jessop
+1  A: 

I would do a few things...

  • cache the current time so you don't have to regenerate the timestamp every time
  • do the time conversion manually. The slowest part of the printf-family functions is the format-string parsing, and it's silly to be devoting cycles to that parsing on every loop execution.
  • try using 2-byte lookup tables for all conversions ({ "00", "01", "02", ..., "99" }). This is because you want to avoid moduluar arithmetic, and a 2-byte table means you only have to use one modulo, for the year.
Tom Barta
+1  A: 

Looks like Jaywalker is suggesting a very similar method (beat me by less than an hour).

In addition to the already suggested lookup table method (n2s[] array below), how about generating your format buffer so that the usual sprintf is less intensive? The code below will only have to fill in the minute and second every time through the loop unless the year/month/day/hour have changed. Obviously, if any of those have changed you do take another sprintf hit but overall it may not be more than what you are currently witnessing (when combined with the array lookup).


static char fbuf[80];
static SYSTEMTIME lastSysTime = {0, ..., 0};  // initialize to all zeros.

for (int i = 0; i < 100000; i++)
{
    if ((lastSysTime.wHour != sysTime.wHour)
    ||  (lastSysTime.wDay != sysTime.wDay)
    ||  (lastSysTime.wMonth != sysTime.wMonth)
    ||  (lastSysTime.wYear != sysTime.wYear))
    {
        sprintf(fbuf, "%4d-%02s-%02s %02s:%%02s:%%02s",
                sysTime.wYear, n2s[sysTime.wMonth],
                n2s[sysTime.wDay], n2s[sysTime.wHour]);

        lastSysTime.wHour = sysTime.wHour;
        lastSysTime.wDay = sysTime.wDay;
        lastSysTime.wMonth = sysTime.wMonth;
        lastSysTime.wYear = sysTime.wYear;
    }

    sprintf(buf, fbuf, n2s[sysTime.wMinute], n2s[sysTime.wSecond]);

}
shank
New to this... does my entire answer show up for anyone? To me it looks like only about 4 lines are visible.
shank
you need to use < for < and > for >
quinmars
Aah, of course. Thanks.
shank
A: 

I'm working on a similar problem at the moment.

I need to log debug statements with timestamp, filename, line number etc on an embedded system. We already have a logger in place but when I turn the knob to 'full logging', it eats all our proc cycles and puts our system in dire states, states that no computing device should ever have to experience.

Someone did say "You cannot measure/observe something without changing that which you are measuring/observing."

So I'm changing things to improve performance. The current state of things is that Im 2x faster than the original function call (the bottleneck in that logging system is not in the function call but in the log reader which is a separate executable, which I can discard if I write my own logging stack).

The interface I need to provide is something like- void log(int channel, char *filename, int lineno, format, ...). I need to append the channel name (which currently does a linear search within a list! For every single debug statement!) and timestamp including millisecond counter. Here are some of the things Im doing to make this faster-

  • Stringify channel name so I can strcpy rather than search the list. define macro LOG(channel, ...etc) as log(#channel, ...etc). You can use memcpy if you fix the length of the string by defining LOG(channel, ...) log("...."#channel - sizeof("...."#channel) + 11) to get fixed 10 byte channel lengths
  • Generate timestamp string a couple of times a second. You can use asctime or something. Then memcpy the fixed length string to every debug statement.
  • If you want to generate the timestamp string in real time then a look up table with assignment (not memcpy!) is perfect. But that works only for 2 digit numbers and maybe for the year.
  • What about three digits (milliseconds) and five digits (lineno)? I dont like itoa and I dont like the custom itoa (digit = ((value /= value) % 10)) either because divs and mods are slow. I wrote the functions below and later discovered that something similar is in the AMD optimization manual (in assembly) which gives me confidence that these are about the fastest C implementations.

    void itoa03(char *string, unsigned int value) { *string++ = '0' + ((value = value * 2684355) >> 28); *string++ = '0' + ((value = ((value & 0x0FFFFFFF)) * 10) >> 28); *string++ = '0' + ((value = ((value & 0x0FFFFFFF)) * 10) >> 28); string++ = ' ';/ null terminate here if thats what you need */ }

Similarly, for the line numbers,

void itoa05(char *string, unsigned int value)
{
   *string++ = ' ';
   *string++ = '0' + ((value = value * 26844 + 12) >> 28);
   *string++ = '0' + ((value = ((value & 0x0FFFFFFF)) * 10) >> 28);
   *string++ = '0' + ((value = ((value & 0x0FFFFFFF)) * 10) >> 28);
   *string++ = '0' + ((value = ((value & 0x0FFFFFFF)) * 10) >> 28);
   *string++ = '0' + ((value = ((value & 0x0FFFFFFF)) * 10) >> 28);
   *string++ = ' ';/* null terminate here if thats what you need */
}

Overall, my code is pretty fast now. The vsnprintf() I need to use takes about 91% of the time and the rest of my code takes only 9% (whereas the rest of the code i.e. except vsprintf() used to take 54% earlier)

Hope this helps somebody, Pranav Swaroop