I have a char * holding a date string that I want to parse. The format is very simple: 2010-10-28T16:23:31.428226 (the common ISO 8601 format).

I know I can parse this with Boost, but at a horrible cost: I have to create a stringstream, possibly a string, and then copy data back and forth. Is there any way to parse the char * without allocating any additional memory? Stack objects are fine, and so is reusing a heap object.

An easy way would also be great! ;)

Edit: I need the result in microseconds since the epoch.

+1  A: 

Sometimes plain old C is simpler. You can almost do it with strptime(...):

struct tm parts = {0};
strptime("2010-10-28T16:23:31", "%Y-%m-%dT%H:%M:%S", &parts);

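Unfortunately, you'd have to grab the fractional seconds separately. strptime(...) returns a pointer to the first character it did not parse (here the '.'), so you can pick up from there. I suppose you could do it with sscanf(...) too: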

unsigned int year, month, day, hour, min;
double sec;
int got = sscanf(
     "2010-10-28T16:23:31.428226", 
     "%u-%u-%uT%u:%u:%lf",
     &year, &month, &day, &hour, &min, &sec
);
assert(got == 6);
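
A variation on the sscanf(...) idea, as a minimal sketch: reading the seconds as a double and scaling to microseconds can round badly, so this version reads the fraction as digits into a small stack buffer and builds an integer. The names iso_fields and parse_iso_sscanf are made up for illustration and are not part of any library.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical holder for the parsed fields; not part of any library. */
struct iso_fields {
    unsigned year, month, day, hour, min, sec;
    unsigned usec; /* fractional second, in microseconds */
};

/* Returns 1 on success, 0 on a malformed string. No heap allocation. */
static int parse_iso_sscanf(const char *s, struct iso_fields *out)
{
    assert(s != NULL && out != NULL);

    char frac[7] = "";   /* up to 6 fraction digits plus the terminator */
    int got = sscanf(s, "%4u-%2u-%2uT%2u:%2u:%2u.%6[0-9]",
                     &out->year, &out->month, &out->day,
                     &out->hour, &out->min, &out->sec, frac);
    if (got < 6)
        return 0;        /* the date or time part did not match */

    /* Right-pad the fraction with zeros so that ".5" means 500000 us. */
    size_t n = strlen(frac);
    unsigned usec = 0;
    for (size_t i = 0; i < 6; ++i)
        usec = usec * 10 + (i < n ? (unsigned)(frac[i] - '0') : 0u);
    out->usec = usec;
    return 1;
}
```

Note that sscanf(...) is lenient (%u skips leading whitespace, for instance), and nothing here range-checks the fields, so this is only suitable when the input is trusted.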
xscott
Oops, should have mentioned I want the result in microseconds since the epoch... though I can just call mktime then.
edA-qa mort-ora-y
Yeah, I think you'd want to let strptime(...) do most of the work for you then, and pull off the floating point seconds with atof(...) or similar. Also beware that the date routines can be affected by the TZ environment variable under Unix. I assume there is a similar timezone thing under Windows. I "setenv TZ UTC" to avoid this problem.
xscott
The TZ aspect is quite annoying, since I can't control it from within the code. I've gone back to Boost date_time and will continue with it; for now it appears I can cache the stream and several of the objects involved. I'll have to do a performance comparison of these two methods later.
edA-qa mort-ora-y
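
On the TZ concern raised in the comments above: one way to sidestep mktime(...) and the environment entirely is to do the calendar arithmetic by hand, assuming the input string is already in UTC. The sketch below follows the widely circulated "days from civil" algorithm published by Howard Hinnant; the function name utc_to_epoch_usec is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Days from 1970-01-01 to the given proleptic Gregorian date (may be
   negative). m is 1..12, d is 1..31. */
static int64_t days_from_civil(int64_t y, unsigned m, unsigned d)
{
    assert(m >= 1 && m <= 12 && d >= 1 && d <= 31);

    /* Shift the year so that it starts on March 1; this puts the leap
       day at the end of the shifted year. */
    y -= m <= 2;
    const int64_t era = (y >= 0 ? y : y - 399) / 400;   /* 400-year cycle */
    const unsigned yoe = (unsigned)(y - era * 400);     /* [0, 399] */
    const unsigned doy = (153 * (m > 2 ? m - 3 : m + 9) + 2) / 5 + d - 1;
    const unsigned doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    return era * 146097 + (int64_t)doe - 719468;
}

/* Hypothetical helper: UTC fields to microseconds since the Unix epoch.
   Ignores leap seconds, as POSIX time does. */
static int64_t utc_to_epoch_usec(int year, unsigned month, unsigned day,
                                 unsigned hour, unsigned min, unsigned sec,
                                 unsigned usec)
{
    int64_t days = days_from_civil(year, month, day);
    int64_t secs = days * 86400 + hour * 3600 + min * 60 + sec;
    return secs * 1000000 + usec;
}
```

For the timestamp in the question, utc_to_epoch_usec(2010, 10, 28, 16, 23, 31, 428226) returns 1288283011428226. It is pure integer arithmetic, so there is no allocation and no dependence on TZ.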
A: 

If it's a fixed format, why don't you take advantage of the positional information?

For example, you know that the year is always the first four characters, so:

const char* date = "2010-10-28T16:23:31.428226";
// For ASCII digits '0'..'9' (0x30..0x39), c ^ 0x30 gives the same value as c - '0'.
int year = ((date[0] ^ 0x30) * 1000) + ((date[1] ^ 0x30) * 100) + ((date[2] ^ 0x30) * 10) + (date[3] ^ 0x30);
int month = ((date[5] ^ 0x30) * 10) + (date[6] ^ 0x30);
int day = ((date[8] ^ 0x30) * 10) + (date[9] ^ 0x30);

etc.

The microsecond segment is a little trickier, depending on whether it's zero-padded or not, but you know the format, so I assume it's trivial.

This should be significantly faster than a general-purpose library routine, since it does no format interpretation and no allocation. Of course it's very fragile, but if you control the input, why not?
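
To round out the "etc." above, here is one way the positional idea could be extended to the time and microsecond fields, as a sketch only. It uses c - '0' instead of the xor (same result for digits) and adds a digit check to take some of the fragility out. The names iso_parts, read_digits and parse_iso_positional are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical holder for the parsed fields. */
struct iso_parts { int year, month, day, hour, min, sec, usec; };

/* Read n ASCII digits starting at p; return -1 if any is not a digit.
   A '\0' is not a digit, so this never reads past the end of the string. */
static int read_digits(const char *p, size_t n)
{
    int v = 0;
    for (size_t i = 0; i < n; ++i) {
        if (p[i] < '0' || p[i] > '9')
            return -1;
        v = v * 10 + (p[i] - '0');
    }
    return v;
}

/* Layout: YYYY-MM-DDTHH:MM:SS[.ffffff], offsets fixed by the format.
   Returns 1 on success, 0 on a malformed string. */
static int parse_iso_positional(const char *s, struct iso_parts *out)
{
    assert(s != NULL && out != NULL);

    /* Each separator is checked only after the digits before it passed,
       so a short string is rejected before anything past it is read. */
    if ((out->year  = read_digits(s, 4)) < 0      || s[4]  != '-') return 0;
    if ((out->month = read_digits(s + 5, 2)) < 0  || s[7]  != '-') return 0;
    if ((out->day   = read_digits(s + 8, 2)) < 0  || s[10] != 'T') return 0;
    if ((out->hour  = read_digits(s + 11, 2)) < 0 || s[13] != ':') return 0;
    if ((out->min   = read_digits(s + 14, 2)) < 0 || s[16] != ':') return 0;
    if ((out->sec   = read_digits(s + 17, 2)) < 0) return 0;

    /* Fraction: accept 0 to 6 digits, scaled as if right-padded with zeros. */
    int usec = 0, scale = 100000;
    if (s[19] == '.') {
        for (const char *p = s + 20; scale > 0 && *p >= '0' && *p <= '9'; ++p) {
            usec += (*p - '0') * scale;
            scale /= 10;
        }
    }
    out->usec = usec;
    return 1;
}
```

This still does no range checking (a month of 13 is accepted) and ignores anything after the sixth fraction digit, which may or may not be acceptable depending on how much the input can be trusted.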

Nim
That xor 0x30 thing is clever. I hadn't seen that before. I notice that you stopped short of using it to expand the seconds field though. :-)
xscott