views:

634

answers:

11

I'm currently doing something like this in some code I'm working on right now:

public CommandType GetCommandTypeFromCommandString(String command)
{
   if(command.StartsWith(CommandConstants.Acknowledge))
      return CommandType.Acknowledge;
   else if (command.StartsWith(CommandConstants.Status))
      return CommandType.Status;
   else if (command.StartsWith(CommandConstants.Echo))
      return CommandType.Echo;
   else if (command.StartsWith(CommandConstants.Warning))
     return CommandType.Warning;
     // and so on

   return CommandType.None;
}

I'd like to know if there is a more efficient way of doing this in C#. This code needs to execute many, many times a second, and I'm not too happy with the time it takes to do all of those string comparisons. Any suggestions? :)

+2  A: 

It's quite a bit of work, but you could construct an FSM based on each of the tokens. The FSM would accept characters from the command string one by one; it would have one final state for each token and an additional final state to which to go when the command doesn't start with any of the tokens.

Vojislav Stojkovic
+1  A: 

Create an

IDictionary<string,CommandType>

and populate it with the values.

You don't need to compare everything...just look it up in the table.

You'll also need to define your command syntax a little better. Require a space between the command and the rest of the line for example...

Jason Punyon
I don't really understand how this answer helps solve the problem of detecting strings that start with the command token.
mquander
The problem is that I get command strings like this: ER099. I can't do a lookup with something like that.
unforgiven3
What is the syntax of that command though? How do you define a bad command (other than it doesn't start with any of the command strings)
Jason Punyon
If it doesn't start with a command string, it's a bad command. The protocol I'm working with seems like it was thrown together quickly and without much thought, so there isn't a good syntax that I can rely on.
unforgiven3
Short of writing a parser, which I really don't want to do - I would consider that overkill in this case.
unforgiven3
Are there any commands that contain other commands (like ER, ERR ?)
Jason Punyon
Yes there are - KV and KVS is a good example.
unforgiven3
@unforgiven3: If this is the case, then regular expressions comes into its own. For instance, if your string starts with ER or ERR followed by some numeric, then @"^ERR?\d+(?=\s)" could match any string starting with that token. This could then be mapped to CommandType.Error as an example...
BenAlabaster
+1  A: 

How many times is "many many"? I seriously doubt this part of the code is the bottleneck. You could probably optimize it a tiny tiny little bit with a switch statement based on the first letter of each command, assuming they are all different.

But then again, it is really useful? I wouldn't bet much on it.

Serge - appTranslator
+6  A: 

Similar in concept to the FSM answer by Vojislav, you could try putting the constants into a trie. You can then replace your sequence of comparisons with a single traversal of the trie.

There is a C# trie implementation described here.

Mike Houston
Thanks! That may be what I end up doing - I like this approach a lot. I forgot about tries!
unforgiven3
I like this approach too.
Dominic Cooney
Thanks, I forgot about tries!
Vojislav Stojkovic
A: 

NOTE: I'm not demonstrating using Exception handling to control program flow. Enum.Parse will throw an exception if the string name of the Enum doesn't exist. The Catch clause just returns the default CommandType of None like the questioner's sample code does.

If the object is just to return the actual Enum object given the string name couldn't you use:

try
{
    return (CommandType)Enum.Parse(typeof(CommandType), strCmdName, true);
}
catch (Exception)
{
    return CommandType.None;
}
Using exceptions to control program flow is a very bad idea.
unforgiven3
I'm not controlling program flow. Enum.Parse throws an exception if the given string value can't be mapped to an existing Enum. The Catch just returns a default Command of None - just as the questioner's sample code does. I think the down-vote was uncalled for.
I forgot, Enum has no TryParse method. I take it back, undo my downvote, and apologize. :)
unforgiven3
Thanks "unforgiven3". Most people wouldn't have the courtesy to 'fess up and fix an oversight. It is much appreciated.
I'm not sure I'd downvote it, but It's still not a good answer: it doesn't address the primary question, which is not "how do I convert a string into an enum?" but "how do I convert a string into an enum more efficiently than the method I'm using now?"
Robert Rossney
Robert - While I admit I didn't do benchmark testing my thought was that one line of code HAD to be more efficient than walking an if / if else tree. I take your point though that I should have tested it.
+2  A: 

I think you should go for more readability than worry about efficiency. These operations are pretty fast. I second Serge, it's unlikely this part of the code is the bottleneck. I would do something like this:

public CommandType GetCommandTypeFromCommandString(String command)
{
   for(String currentCommand : allCommands) {
       if(command.StartsWith(currentCommand))
           return currentCommand;
   }
   return CommandType.None;
}

EDIT: As an afterthought, if you know which commands are used most frequently, you could order your array so those commands are at the start... you can also do this with if statements if you keep them.

armandino
+5  A: 

One optimization would be to use the StringComparison enum to specify that you only want ordinal comparison. Like this:

if(command.StartsWith(CommandConstants.Acknowledge, StringComparison.Ordinal))
    return CommandType.Acknowledge;

If you don't specify a string comparison method the current culture will be used for comparison and that slows things down a bit.

I did some (really really naive) benchmarking:

var a = "foo bar foo";
var b = "foo";

int numTimes = 1000000;

Benchmark.Time(() => a.StartsWith(b, StringComparison.Ordinal), "ordinal", numTimes);
Benchmark.Time(() => a.StartsWith(b), "culture sensitive", numTimes);

Which produced the following results:

ordinal ran 1000000 times in 35.6033 ms
culture sensitive ran 1000000 times in 175.5859 ms

You should also order your comparisons so that the most likely tokens are being compared first (happy path).

Theese optimizations are a simple way of making your current implementation perform better but if performance really is critical (and I mean really critical) you should be looking towards implementing some sort of state machine.

Markus Olsson
Wow - nice tip! Thanks!
unforgiven3
Great tip... you must have some kick ass machine, to get that to run in 35ms though... I thought my machine was good, but it runs your code over a million iterations in 10,000ms. In comparison, using a regex comes in around 6000ms.
BenAlabaster
Nope, no supercomputer over here... I put up some code for timing that you may try on your computer over at http://snipplr.com/view/11585/timing-different-variants-of-stringstartswith/
Markus Olsson
@Markus: Thanks, I'll check it out.
BenAlabaster
@Markus: Okay, it uses the same stopwatch I use for timing... I wonder why my timings were so far different than yours.
BenAlabaster
@Markus: Okay, I learned something new about the StopWatch from you - sw.Elapsed.TotalMilliseconds is not the same as sw.ElapsedMilliseconds... I never noticed there was another way of getting the elapsed time so thanks!
BenAlabaster
@balabaster: Yeah that's a dangerous caveat indeed! sw.Elapsed returns a simple timespan but you should always save a reference to that timespan instead of accessing it multiple times through the Stopwatch though since the Elapsed property generates a new timespan on every access.
Markus Olsson
@Markus: Okay, I take it back, after a whole bunch of tests, StartsWith in combination with StringComparison.Ordinal *is* the best performing in this situation, although potentially not as flexible as Regex. I will edit my answer...
BenAlabaster
+2  A: 

I think you could do better with a regular expression and a Dictionary:

static Regex reCommands = new Regex("^(cmd1|cmd2|cmd3|cmd4)", RegexOptions.Compiled);
static Dictionary<string, CommandType> Commands = new Dictionary<string, CommandType>();
private static InitDictionary()
{
    Commands.Add("cmd1", cmdType1);
    Commands.Add("cmd2", cmdType2);
    Commands.Add("cmd3", cmdType3);
    Commands.Add("cmd4", cmdType4);
}
public CommandType GetCommandTypeFromCommandString(String command)
{
    Match m = reCommands.Match(command);
    if (m.Success)
    {
        return Commands[m.Groups[1].Value];
    }
    return CommandType.None; // no command
}
Jim Mischel
+1  A: 

I did something similar as an extension method:

public static bool StartsWith(this string s, params string[] candidate)
{
    string match = candidate.FirstOrDefault(t => s.StartsWith(t));
    return match != default(string);
}

Now this is just a predicate that returns whether or not a string in an array starts with a given string, but you could modify it somewhat:

public static int PrefixIndex(this string s, params string[] candidate)
{
    int index = -1;
    string match = candidate.FirstOrDefault(t => { index++; return s.StartsWith(t); });
    return match == default(string) ? -1 : index;
}

and in usage it would be:

int index = command.PrefixIndex(tokenStrings);

if (index >= 0) {
    // convert to your enum
}

In a comment I saw that you wanted to do 30 string compares in 1/40th of a second. My friend, you should be able to do that on a 1 MHz machine. It should be no sweat to do thousands of string compares in 1/40th of a second.

plinth
+2  A: 

Edit: In light of a misunderstanding of one of the caveats of StopWatch, my original answer doesn't perform so well as StartsWith in combination with StringComparison.Ordinal. Even if you compile the regex with all the correct options it is a little slower, with performance comparable to using StartsWith without any StringComparison settings. However, the regular expression route does give you more flexibility to match patterns whereas StartsWith doesn't, so I've left my original answer for posterity...

Original Answer:

I have to admit, I'm not sure exactly what you're looking for - however, it seems to me that this kind of code is generally parsing a log file of some description looking to pull out useful information. To save you doing all the string comparisons long hand you could generate a regular expression of the values in your enumeration and then parse the correct enum item using the matched item:

public enum CommandType
{
    Acknowledge,
    Status,
    Echo,
    Warning
}
static public CommandType? GetCommandTypeFromString(String command)
{
    var CommandTypes = String.Join("|", Enum.GetNames(typeof(CommandType)));
    var PassedConstant = Regex.Match(command, String.Format("(?i:^({0}))", CommandTypes)).Value;
    if (PassedConstant != String.Empty)
        return (CommandType)Enum.Parse(typeof(CommandType), PassedConstant, true);
    return null;
}

static void Main(string[] args)
{
    Console.WriteLine(GetCommandTypeFromString("Acknowledge that I am great!").ToString());
}

This would pull out CommandType.Acknowledge from the beginning of my string, but only if it existed at the start of the string... it would also pull out the other correct CommandTypes.

Doing similar benchmarking to the accepted answer, I got about a 40% performance increase. I ran the accepted code over a million iterations in 10421ms, but mine ran in only 6459ms.

Of course, while the if statement you're using may not look as efficient as you'd like, it's still easier to read than using the regex form...

BenAlabaster
+1  A: 

A trie's almost certainly going to be the fastest approach. If the prefixes were the same length, you might be able to come up with a faster implementation by hashing the prefix to get an index into an array, but that breaks down if you don't know how many bytes to hash.

Robert Rossney