views:

396

answers:

3

In my binary to text decoding application (.NET 2.0) I found that the line:

logEntryTime.ToString("dd.MM.yy HH:mm:ss:fff")

takes 33% of total processing time. Does anyone have any ideas on how to make it faster?

EDIT: This app is used to process some binary logs and it currently takes 15 hours to run. So 1/3 of this will be 5 hours.

EDIT: I am using NProf for profiling. App is processing around 17 GBytes of binary logs.

+5  A: 

It's unfortunate that .NET doesn't have a sort of "formatter" type which can parse a pattern and remember it.

If you're always using the same format, you might want to hand-craft a formatter to do exactly that. Something along the lines of:

public static string FormatDateTime(DateTime dt)
{
    char[] chars = new char[21];
    Write2Chars(chars, 0, dt.Day);
    chars[2] = '.';
    Write2Chars(chars, 3, dt.Month);
    chars[5] = '.';
    Write2Chars(chars, 6, dt.Year % 100);
    chars[8] = ' ';
    Write2Chars(chars, 9, dt.Hour);
    chars[11] = ' ';
    Write2Chars(chars, 12, dt.Minute);
    chars[14] = ' ';
    Write2Chars(chars, 15, dt.Second);
    chars[17] = ' ';
    Write2Chars(chars, 18, dt.Millisecond / 10);
    chars[20] = Digit(dt.Millisecond % 10);

    return new string(chars);
}

private static Write2Chars(char[] chars, int value, int offset)
{
    chars[offset] = Digit(value / 10);
    chars[offset+1] = Digit(value % 10);
}

private static char Digit(int value)
{
    return (char) (value + '0');
}

This is pretty ugly, but it's probably a lot more efficient... benchmark it, of course!

Jon Skeet
lol; I think we were working on some very similar code!
Marc Gravell
@Marc great minds think alike!!
DrG
... and fools rarely differ :)
Jon Skeet
+7  A: 

Are you sure it takes 33% of the time? How have you measured that? It sounds more than a little suspicious to me...

This makes things a little bit quicker:

Basic: 2342ms
Custom: 1319ms

Or if we cut out the IO (Stream.Null):

Basic: 2275ms
Custom: 839ms


using System.Diagnostics;
using System;
using System.IO;
static class Program
{
    static void Main()
    {
        DateTime when = DateTime.Now;
        const int LOOP = 1000000;

        Stopwatch basic = Stopwatch.StartNew();
        using (TextWriter tw = new StreamWriter("basic.txt"))
        {
            for (int i = 0; i < LOOP; i++)
            {
                tw.Write(when.ToString("dd.MM.yy HH:mm:ss:fff"));
            }
        }
        basic.Stop();
        Console.WriteLine("Basic: " + basic.ElapsedMilliseconds + "ms");

        char[] buffer = new char[100];
        Stopwatch custom = Stopwatch.StartNew();
        using (TextWriter tw = new StreamWriter("custom.txt"))
        {
            for (int i = 0; i < LOOP; i++)
            {
                WriteDateTime(tw, when, buffer);
            }
        }
        custom.Stop();
        Console.WriteLine("Custom: " + custom.ElapsedMilliseconds + "ms");
    }
    static void WriteDateTime(TextWriter output, DateTime when, char[] buffer)
    {
        buffer[2] = buffer[5] = '.';
        buffer[8] = ' ';
        buffer[11] = buffer[14] = buffer[17] = ':';
        Write2(buffer, when.Day, 0);
        Write2(buffer, when.Month, 3);
        Write2(buffer, when.Year % 100, 6);
        Write2(buffer, when.Hour, 9);
        Write2(buffer, when.Minute, 12);
        Write2(buffer, when.Second, 15);
        Write3(buffer, when.Millisecond, 18);
        output.Write(buffer, 0, 21);
    }
    static void Write2(char[] buffer, int value, int offset)
    {
        buffer[offset++] = (char)('0' + (value / 10));
        buffer[offset] = (char)('0' + (value % 10));
    }
    static void Write3(char[] buffer, int value, int offset)
    {
        buffer[offset++] = (char)('0' + (value / 100));
        buffer[offset++] = (char)('0' + ((value / 10) % 10));
        buffer[offset] = (char)('0' + (value % 10));
    }
}
Marc Gravell
I don't entirely disbelieve it - when I did some log tests I found that formatting and parsing dates and times dominated the CPU access. Admittedly it was doing virtually nothing but logging...
Jon Skeet
Your microbenchmark is IO bound, if you want to measure CPU cycles you probably should ditch the writing to a file.
Pop Catalin
Yes, but the OP is writing to a file, so it is a bit artificial to ignore this... but indeed, using Stream.Null would show more of the difference.
Marc Gravell
A: 

Do you know how big each record in the binary and text logs are going to be? If so you can split the processing of the log file across a number of threads which would give better use of a multi core/processor PC. If you don't mind the result being in separate files it would be a good idea to have one hard disk per core that way you will reduce the amount the disk heads have to move.

Martin Brown