views:

1312

answers:

21

2009-12-04 UPDATE: For profiling results on a number of the suggestions posted here, see below!


The Question

Consider the following very harmless, very straightforward method, which uses a switch statement to return a defined enum value:

public static MarketDataExchange GetMarketDataExchange(string ActivCode) {
    if (ActivCode == null) return MarketDataExchange.NONE;

    switch (ActivCode) {
        case "": return MarketDataExchange.NBBO;
        case "A": return MarketDataExchange.AMEX;
        case "B": return MarketDataExchange.BSE;
        case "BT": return MarketDataExchange.BATS;
        case "C": return MarketDataExchange.NSE;
        case "MW": return MarketDataExchange.CHX;
        case "N": return MarketDataExchange.NYSE;
        case "PA": return MarketDataExchange.ARCA;
        case "Q": return MarketDataExchange.NASDAQ;
        case "QD": return MarketDataExchange.NASDAQ_ADF;
        case "W": return MarketDataExchange.CBOE;
        case "X": return MarketDataExchange.PHLX;
        case "Y": return MarketDataExchange.DIRECTEDGE;
    }

    return MarketDataExchange.NONE;
}

My colleague and I batted around a few ideas today about how to actually make this method faster, and we came up with some interesting modifications that did in fact improve its performance rather significantly (proportionally speaking, of course). I'd be interested to know what sorts of optimizations anyone else out there can think up that might not have occurred to us.

Right off the bat, let me just offer a quick disclaimer: this is for fun, and not to fuel the whole "to optimize or not to optimize" debate. That said, if you count yourself among those who dogmatically believe "premature optimization is the root of all evil," just be aware that I work for a high-frequency trading firm, where everything needs to run absolutely as fast as possible--bottleneck or not. So, even though I'm posting this on SO for fun, it isn't just a huge waste of time, either.

One more quick note: I'm interested in two kinds of answers--those that assume every input will be a valid ActivCode (one of the strings in the switch statement above), and those that do not. I am almost certain that making the first assumption allows for further speed improvements; anyway, it did for us. But I know that improvements are possible either way.


The Results

Well, it turns out that the fastest solution so far (that I've tested) came from João Angelo, whose suggestion was actually very simple, yet extremely clever. The solution that my coworker and I had devised (after trying out several approaches, many of which were thought up here as well) came in second place; I was going to post it, but it turns out that Mark Ransom came up with the exact same idea, so just check out his answer!

Since I ran these tests, some other users have posted even newer ideas... I will test them in due time, when I have a few more minutes to spare.

I ran these tests on two different machines: my personal computer at home (a dual-core Athlon with 4 Gb RAM running Windows 7 64-bit) and my development machine at work (a dual-core Athlon with 2 Gb RAM running Windows XP SP3). Obviously, the times were different; however, the relative times, meaning, how each method compared to every other method, were the same. That is to say, the fastest was the fastest on both machines, etc.

Now for the results. (The times I'm posting below are from my home computer.)

But first, for reference--the original switch statement:
1000000 runs: 98.88 ms
Average: 0.09888 microsecond

Fastest optimizations so far:

  1. João Angelo's idea of assigning values to the enums based on the hash codes of the ActivCode strings and then directly casing ActivCode.GetHashCode() to MarketDataExchange:
    1000000 runs: 23.64 ms
    Average: 0.02364 microsecond
    Speed increase: 329.90%

  2. My colleague's and my idea of casting ActivCode[0] to an int and retrieving the appropriate MarketDataExchange from an array initialized on startup (this exact same idea was suggested by Mark Ransom):
    1000000 runs: 28.76 ms
    Average: 0.02876 microsecond
    Speed increase: 253.13%

  3. tster's idea of switching on the output of ActivCode.GetHashCode() instead of ActivCode:
    1000000 runs: 34.69 ms
    Average: 0.03469 microsecond
    Speed increase: 185.04%

  4. The idea, suggested by several users including Auraseer, tster, and kyoryu, of switching on ActivCode[0] instead of ActivCode:
    1000000 runs: 36.57 ms
    Average: 0.03657 microsecond
    Speed increase: 174.66%

  5. Loadmaster's idea of using the fast hash, ActivCode[0] + ActivCode[1]*0x100:
    1000000 runs: 39.53 ms
    Average: 0.03953 microsecond
    Speed increase: 153.53%

  6. Using a hashtable (Dictionary<string, MarketDataExchange>), as suggested by many:
    1000000 runs: 88.32 ms
    Average: 0.08832 microsecond
    Speed increase: 12.36%

  7. Using a binary search:
    1000000 runs: 1031 ms
    Average: 1.031 microseconds
    Speed increase: none (performance worsened)

Let me just say that it has been really cool to see how many different ideas people had on this simple problem. This was very interesting to me, and I'm quite thankful to everyone who has contributed and made a suggestion so far.

+3  A: 

I would put it in dictionary instead of using a switch statement. That being said, it may not make a difference. Or it might. See C# switch statement limitations - why?.

cletus
I think if you measured it you'd find that wouldn't yield a significant performance improvement.
Dan Tao
It will likely not make any difference because C# compiler can (and does) that on its own, if there are enough cases for it to be worthwhile.
Pavel Minaev
+6  A: 

I'd use a dictionary for the key value pairs and take advantage of the O(1) lookup time.

justinlatimer
If you profiled this against the switch statement above, you'd find basically even results (at least using the latest C# compiler and Microsoft's CLR you would).
Dan Tao
You would also pay for the hash code calculations.
Jonathan Allen
Heh, as I read somewhere else a little while ago: The 1 in O(1) may be a value that is higher than the total of O(n), depending on your n.
Quibblesome
C# will itself produce a hashtable-based implementation for string switches with sufficient number of branches.
Pavel Minaev
A: 

Put the cases in a sorted structure with non linear access (like a hash table). The switch that you have will have a linear time.

Nestor
+6  A: 

Do you have any statistics on which strings are more common? So that those could be checked first?

quip
That's definitely a good thought, though I believe with a `switch` statement of this length it won't actually matter. If the function used an `if`/`else` block instead, it would. I could be wrong about that; but anyway, I believe in our case all strings are going to be more or less equally common.
Dan Tao
+1 This was my first thought. @Dan are you sure that the strings are all equally likely. I don't know your project but I would assume that NYSE and NASDAQ are going to be searched much more than ARCA (which returned the Alberta roofing contractors association when googled).
Lawrence Barsanti
@Larry, to be honest, no, not sure at all. In fact if anything the most common code is likely to be the empty string, which represents an NBBO tick. But it appears to be a moot point, as our tests suggest that the order of the `switch` statement actually does not matter (this is consistent with the compiler converting it into a jump table).
Dan Tao
A: 

I would do nothing, since its a switch the compiler already optimizes it.

Woot4Moo
The compiler optimizes as best it can, but it can't be optimized for every specific set of data. I provided the exact strings and enums in the question for a reason.
Dan Tao
The compiler can't optimize the string comparisons as they are given.
Loadmaster
A: 

You can get a mild speed-up by ordering the codes according to which ones are most used.

But I agree with Cletus: the best speed-up I can think of would be to use a hash map with plenty of room (so that there are no collisions.)

Chip Uni
The MS C# compiler converts switch statements of this length to a jump table anyway; a hash map therefore doesn't actually provide a significant improvement (at least according to our measurements).
Dan Tao
+1  A: 

+1 for using a dictionary. Not necessarily for optimization, but it'd be cleaner.

I would probably use constants for the strings as well, though i doubt that'd buy you anything performance wise.

Josh
A: 

A couple of random thoughts, that may not all be applicable together:

Switch on the first character in the string, rather than the string itself, and do a sub-switch for strings which can contain more than one letter?

A hashtable would certainly guarantee O(1) retrieval, though it might not be faster for smaller numbers of comparisons.

Don't use strings, use enums or something like a flyweight instead. Using strings in this case seems a bit fragile anyway...

And if you really need it to be as fast as possible, why aren't you writing it in assembly? :)

kyoryu
The strings are what we get from a vendor. Otherwise I agree with you that it is not an ideal form of data. (Notice the function actually converts strings *into* enums -- which are used everywhere else in the application.) This isn't in assembly because, well, basically, we are only humans after all... our application has a huge amount of functionality and the demand for new features is quite extreme in terms of time constraints.
Dan Tao
Fair 'nuff. The assembly comment was in semi-snark, as if you really need to optimize the code that much, whether you should do it in managed code is a valid question... Anyway, based on the additional info, I'd still either look at a hashtable, or directly comparing the characters at appropriate indices instead of doing the full string comparison. Either way, putting a profiler block around it to actually determine speed is probably the first step.
kyoryu
Good point casting the string to a smallint would work on most of the supplied values eg. "QD" -> x'5144' -> 20844 , "Q" -> x'5100' -> 20736, etc. (Assuming ASCI and Null terminated strings). The empty string would 'cuase a bit of a problem as it would be '00' followed by somethin random, but you could just check for < 256.
James Anderson
You could just check the first character, which would be sufficient for any of the one character strings, or the empty string. In the case where a two-character string was possible, multiple two character strings are possible, or it could be a one or two-character string, only *then* check the second character.
kyoryu
You can't assume ASCII because .NET strings are always Unicode, so two bytes per char. On the bright side, this also means that null terminator is `00 00`.
Pavel Minaev
+8  A: 

If you know how often the various codes show up, the more common ones should go at the top of the list, so fewer comparisons are done. But let's assume you don't have that.

Assuming the ActivCode is always valid will of course speed things up. You don't need to test for null or the empty string, and you can leave off one test from the end of the switch. That is, test for everything except Y, and then return DIRECTEDGE if no match is found.

Instead of switching on the whole string, switch on its first letter. For the codes that have more letters, put a second test inside the switch case. Something like this:

switch(ActivCode[0])
{
   //etc.
   case 'B':
      if ( ActivCode.Length == 1 ) return MarketDataExchange.BSE; 
      else return MarketDataExchange.BATS;
      // etc.
}

It would be better if you could go back and change the codes so they are all single characters, because you would then never need more than one test. Better yet would be using the numerical value of the enum, so you can simply cast instead of having to switch/translate in the first place.

Auraseer
+1 for actually making a suggestion that yields a performance improvement! You're right that assuming valid data obviates the need for `if (ActivCode == null)`. However, notice the empty string is actually a valid code (`MarketDataExchange.NBBO`). Anyway, this is actually fairly close to the best option we came up with...
Dan Tao
+1, my first thought was 'use a trie'.
Annabelle
This throws an exception on the empty string case.
gWiz
Yeah, I didn't notice that empty string is a valid value. The obvious fix for that would be to do the tests in the other order-- switch() by the length first, then inside each case of nonzero length, switch() by the first character.
Auraseer
A: 

Can we cast the ActivCode to int and then use int in our case statements?

indy
The C# compiler won't allow it.
Loadmaster
Not directly, but you can always use bit-shifting to get the same result. Or unsafe code.
Pavel Minaev
+5  A: 

With a valid input could use

if (ActivCode.Length == 0)
    return MarketDataExchange.NBBO;

if (ActivCode.Length == 1)
    return (MarketDataExchange) (ActivCode[0]);

return (MarketDataExchange) (ActivCode[0] | ActivCode[1] << 8);
Courtney de Lautour
Assuming that you can change the values of the original enumeration, I think this would probably be the fastest solution by far.
Chris
+3  A: 

Change the switch to switch on the HashCode() of the strings.

tster
How much does that hash code calculation code you?
Jonathan Allen
A whole 'nother function call with several integer arithimatic functions is going to be faster than an average 5 character compares?
James Anderson
It depends. Are the character compares inlined by the compiler? I mean we are really talking about compiler specific micro-optimizations here.
tster
I was extremely skeptical of this suggestion, but trying it out on my machine actually cut the time in half compared to the straight `switch`. This one simple change even rivals the fastest solution we came up with (though doesn't quite match it). +1.
Dan Tao
However, there is alwasy the chance that this solution *will break* given two values which recieve the same hash code.
Chris
True, but for such a small set, I think we are safe.
tster
@Chris: Well, naturally one would check that first. I don't know if tster did or not, but I did and we were good to go.
Dan Tao
+1  A: 

Messy but using a combination of nested ifs and hard coding might just beat the optimiser:-

   if (ActivCode < "N") {
         // "" to "MW"
         if (ActiveCode < "BT") {
            // "" to "B"
            if (ActiveCode < "B") {
                // "" or "A"
                if (ActiveCode < "A") {
                      // must be ""
                     retrun MarketDataExchange.NBBO;
                } else {
                     // must be "A"
                    return MarketDataExchange.AMEX;
                }
            } else {
                // must be "B"
                return MarketDataExchange.BSE;
            }
         } else {
            // "BT" to "MW"
            if (ActiveCode < "MW") {
                // "BT" or "C"
                if (ActiveCode < "C") {
                      // must be "BT"
                     retrun MarketDataExchange.NBBO;
                } else {
                     // must be "C"
                    return MarketDataExchange.NSE;
                }
            } else {
            // must be "MV"
                return MarketDataExchange.CHX;
            }
         }
    } else {
        // "N" TO "Y"
         if (ActiveCode < "QD") {
            // "N" to "Q"
            if (ActiveCode < "Q") {
                // "N" or "PA"
                if (ActiveCode < "PA") {
                      // must be "N"
                     retrun MarketDataExchange.NYSE;
                } else {
                     // must be "PA"
                    return MarketDataExchange.ARCA;
                }
            } else {
                // must be "Q"
                return MarketDataExchange.NASDAQ;
            }
         } else {
            // "QD" to "Y"
            if (ActiveCode < "X") {
                // "QD" or "W"
                if (ActiveCode < "W") {
                      // must be "QD"
                     retrun MarketDataExchange.NASDAQ_ADF;
                } else {
                     // must be "W"
                    return MarketDataExchange.CBOE;
                }
            } else {
            // "X" or "Y"
                if (ActiveCode < "Y") {
                      // must be "X"
                     retrun MarketDataExchange.PHLX;
                } else {
                     // must be "Y"
                    return MarketDataExchange.DIRECTEDGE;
                }
            }
         }
    }

This gets the right function with three or four compares. I wouldnt even think of doing this for real unless your piece of code is expected to run several times a second!

You further otimise it so that only single character compares occurred. e.g. replace '< "BT" ' with '>= "B" ' -- ever so slightly faster and even less readable!

James Anderson
Several times a second? Try thousands of times a second ;) Believe it or not, we actually did try a binary search... turns out a switch is significantly faster, probably because instead of three or four compares it only requires one (on the hash, since the compiler produces a jump table for the switch).
Dan Tao
To jumpy, I guess. Looks like jump prediction nightmare (though things have certainly improved since I last dablled with it). It would allow incorporating probabilistic optimizations, though.
peterchen
+2  A: 
  1. Avoid all string comparisons.
  2. Avoid looking at more than a single character (ever)
  3. Avoid if-else since I want the compiler to be able optimize the best it can
  4. Try to get the result in a single switch jump

code:

public static MarketDataExchange GetMarketDataExchange(string ActivCode) {
    if (ActivCode == null) return MarketDataExchange.NONE;
    int length = ActivCode.Length;
    if (length == 0) return MarketDataExchange.NBBO;

    switch (ActivCode[0]) {
        case 'A': return MarketDataExchange.AMEX;
        case 'B': return (length == 2) ? MarketDataExchange.BATS : MarketDataExchange.BSE;
        case 'C': return MarketDataExchange.NSE;
        case 'M': return MarketDataExchange.CHX;
        case 'N': return MarketDataExchange.NYSE;
        case 'P': return MarketDataExchange.ARCA;
        case 'Q': return (length == 2) ? MarketDataExchange.NASDAQ_ADF : MarketDataExchange.NASDAQ;
        case 'W': return MarketDataExchange.CBOE;
        case 'X': return MarketDataExchange.PHLX;
        case 'Y': return MarketDataExchange.DIRECTEDGE;
        default:  return MarketDataExchange.NONE;
    }
}
tster
There might be a way to do an arithmetic trick to avoid the branching on the B and Q case. For example, `switch(ActiveCode[0] + length + length)` then switch on whatever you calculate that to be. The problem is avoiding switch condition collisions.
tster
switch (ActiveCode[0] / length) gives you the same switch except the B and Q cases with length == 2 give you 33 and 40 respectively. Dunno if it's worth the divide though.
Ron Warholic
+4  A: 

I'd extrapolate tster's reply to "switch over a custom hash function", assuming that the code generator creates a lookup table, or - failing that - building the lookup table myself.

The custom hash function should be simple, e.g.:

(int)ActivCode[0]*2 + ActivCode.Length-1

This would require a table of 51 elements, easily kept in L1 cache, under the following assumptions:

  • Input data must already be validated
  • empty string must be handled sepsarately
  • no two-character-codes start with the same character
  • adding new cases is hard

the empty string case could be incorporated if you could use an unsafe access to ActivCode[0] yielding the '\0' terminator.

peterchen
+19  A: 

I'd roll my own fast hash function and use an integer switch statement to avoid string comparisons:

int  h = 0;  

// Compute fast hash: A[0] + A[1]*0x100
if (ActivCode.Length > 0)
    h += (int) ActivCode[0];
if (ActivCode.Length > 1)
    h += (int) ActivCode[1] << 8;  

// Find a match
switch (h)
{
    case 0x0000:  return MarketDataExchange.NBBO;        // ""
    case 0x0041:  return MarketDataExchange.AMEX;        // "A"
    case 0x0042:  return MarketDataExchange.BSE;         // "B"
    case 0x5442:  return MarketDataExchange.BATS;        // "BT"
    case 0x0043:  return MarketDataExchange.NSE;         // "C"
    case 0x574D:  return MarketDataExchange.CHX;         // "MW"
    case 0x004E:  return MarketDataExchange.NYSE;        // "N"
    case 0x4150:  return MarketDataExchange.ARCA;        // "PA"
    case 0x0051:  return MarketDataExchange.NASDAQ;      // "Q"
    case 0x4451:  return MarketDataExchange.NASDAQ_ADF;  // "QD"
    case 0x0057:  return MarketDataExchange.CBOE;        // "W"
    case 0x0058:  return MarketDataExchange.PHLX;        // "X"
    case 0x0059:  return MarketDataExchange.DIRECTEDGE;  // "Y"
    default:      return MarketDataExchange.NONE;
}

My tests show that this is about 4.5 times faster than the original code.

If C# had a preprocessor, I'd use a macro to form the case constants.

This technique is faster than using a hash table and certainly faster than using string comparisons. It works for up to four-character strings with 32-bit ints, and up to 8 characters using 64-bit longs.

Loadmaster
Now replace `switch` with a jump table, and it should squeeze a little bit more out of that!
Pavel Minaev
Very nice. I followed a very similar technique with a simple checksum and achieved similar performance. But yours is better.
Steve Wortham
I did achieve an 8% improvement by tweaking the if statement... int h; if (ActivCode.Length == 1) h = (int)ActivCode[0]; else if (ActivCode.Length == 2) h = (int)ActivCode[1] << 8 + (int)ActivCode[0]; else h = 0;
Steve Wortham
I was not able to verify the 4.5x speed improvement claim. Instead I only got 3x improvement, and is practically the same as my answer. My results when invoking the reference method 50,000,000 times for each case took 1min 30sec; this method took 27.7sec; my method took 28.5sec.
gWiz
This is smart, and it is certainly an improvement on the switch. However, it is not the fastest solution posted here so far. I am going to post some test results later today...
Dan Tao
+15  A: 

Assuming every input will be a valid ActivCode, that you can change the enumeration values and highly coupled to the GetHashCode implementation:

enum MarketDataExchange
{
    NONE,
    NBBO = 371857150,
    AMEX = 372029405,
    BSE = 372029408,
    BATS = -1850320644,
    NSE = 372029407,
    CHX = -284236702,
    NYSE = 372029412,
    ARCA = -734575383,
    NASDAQ = 372029421,
    NASDAQ_ADF = -1137859911,
    CBOE = 372029419,
    PHLX = 372029430,
    DIRECTEDGE = 372029429
}

public static MarketDataExchange GetMarketDataExchange(string ActivCode)
{
    if (ActivCode == null) return MarketDataExchange.NONE;

    return (MarketDataExchange)ActivCode.GetHashCode();
}
João Angelo
Awesome answer.
Dan Tao
Eric Lippert warned about using string hashes here: http://blogs.msdn.com/ericlippert/archive/2005/10/24/do-not-use-string-hashes-for-security-purposes.aspx - those values may change in a future version of the CLR.
Jason
@Jason: I am aware of this. In fact, the values that João Angelo has provided here did not work for me; I had to figure out the hash codes from the CLR version on my machine first. It would be a maintenance headache, having to revisit the function every time we upgraded the CLR; but since this question was about optimization, not maintainability, I have to tip my hat to João for coming up with something quite blazingly fast.
Dan Tao
If you're worried about CLR versions tripping you up, you can always devise your own hash function as suggested in some of the other answers.
Mark Ransom
I wonder though if you're trading a speedup in one area for a slowdown in another. Will the code be less efficient in the places where you use these enumerations?
Mark Ransom
@Jason: Completely agree, but as Dan explained even better than I would this was just about a fun optimization challenge.
João Angelo
+2  A: 

If the enumeration values are arbitrary you could do this...

public static MarketDataExchange GetValue(string input)
{
    switch (input.Length)
    {
        case 0: return MarketDataExchange.NBBO;
        case 1: return (MarketDataExchange)input[0];
        case 2: return (MarketDataExchange)(input[0] << 8 | input[1]);
        default: return MarketDataExchange.None;
    }
}

... if you want to go totally nuts you can also use an unsafe call with pointers as noted by Pavel Minaev ... The pure cast version above is faster than this unsafe version.

unsafe static MarketDataExchange GetValue(string input)
{
    if (input.Length == 1)
        return (MarketDataExchange)(input[0]);
    fixed (char* buffer = input)
        return (MarketDataExchange)(buffer[0] << 8 | buffer[1]);
}


public enum MarketDataExchange
{
    NBBO = 0x00, //
    AMEX = 0x41, //A
    BSE = 0x42, //B
    BATS = 0x4254, //BT
    NSE = 0x43, //C
    CHX = 0x4D57, //MW
    NYSE = 0x4E, //N
    ARCA = 0x5041, //PA
    NASDAQ = 0x51, //Q
    NASDAQ_ADF = 0x5144, //QD
    CBOE = 0x57, //W
    PHLX = 0x58, //X
    DIRECTEDGE = 0x59, //Y

    None = -1
}
Matthew Whited
If unsafe code is allowed, you can optimize this even further: get a pointer to string data and return it directly - `fixed (MarketDataExchange* p = (MarketDataExchange*)(char*)input) { return *p; }`. This can be done for both 1-char and 2-char strings without additional checks, because a 1-char string will have `\0` (i.e `00 00`) in string representation immediately after data, and this is accessible via pointer. So only `Length==0` and `Length==2` needs to be checked, and if we assume valid input, then only `Length==0`.
Pavel Minaev
That's a really good point... I will add that solution as well.
Matthew Whited
The point of using `fixed` was not so much to avoid the index check, as it was to avoid bit-shifting operators (I'm not sure JIT can optimize them away). The edit you made to answer will solely avoid the index check, however - you really need that pointer cast there to get the most out of it.
Pavel Minaev
your sample as you had it would not compile for me. Eitherway a shift should be very fast. There are plenty of other issues in doing this such as what happens when you pass in an invaild value such as "Z" or even "a"
Matthew Whited
I just tried it again without the shifting and the fact that the string is stored as UTF-16 by .net would also make me change the enumeration values. But this is well past the point of crazy anyway.
Matthew Whited
The unsafe version is slower than the shift/cast version. If there is interest I will post the results on my blog.
Matthew Whited
+1  A: 

All your strings are at most 2 chars long, and ASCII, so we can use 1 byte per char. Furthermore, more likely than not, they also never can have \0 appear in them (.NET string allows for embedded null characters, but many other things don't). With that assumption, we can null-pad all your strings to be exactly 2 bytes each, or an ushort:

""   -> (byte) 0 , (byte) 0   -> (ushort)0x0000
"A"  -> (byte)'A', (byte) 0   -> (ushort)0x0041
"B"  -> (byte)'B', (byte) 0   -> (ushort)0x0042
"BT" -> (byte)'B', (byte)'T'  -> (ushort)0x5442

Now that we have a single integer in a relatively (64K) short range, we can use a lookup table:

MarketDataExchange[] lookup = {
    MarketDataExchange.NBBO, 
    MarketDataExchange.NONE, 
    MarketDataExchange.NONE, 
    ...
    /* at index 0x041 */
    MarketDataExchange.AMEX,
    MarketDataExchange.BSE,
    MarketDataExchange.NSE,
    ...
};

Now, obtaining the value given a string is:

public static unsafe MarketDataExchange GetMarketDataExchange(string s)
{
   // Assume valid input
   if (s.Length == 0) return MarketDataExchange.NBBO;

   // .NET strings always have '\0' after end of data - abuse that
   // to avoid extra checks for 1-char strings. Skip index checks as well.
   ushort hash;
   fixed (char* data = s)
   {
       hash = (ushort)data[0] | ((ushort)data[1] << 8);
   }

   return lookup[hash];
}
Pavel Minaev
+4  A: 

Forgive me if I get something wrong here, I'm extrapolating from my knowledge of C++. For example, if you take ActivCode[0] of an empty string, in C++ you get a character whose value is zero.

Create a two dimensional array which you initialize once; the first dimension is the length of the code, the second is a character value. Populate with the enumeration value you'd like to return. Now your entire function becomes:

public static MarketDataExchange GetMarketDataExchange(string ActivCode) {
    return LookupTable[ActivCode.Length][ActivCode[0]];
}

Lucky for you all the two-character codes are unique in the first letter compared to the other two-character codes.

Mark Ransom
Of course this only works for valid input, since it only checks one of the two characters.
Mark Ransom
+1 for coming up with the same solution as my colleague and me -- and a good one, in my opinion ;) I guess great minds think alike.
Dan Tao
Yes, same basic principle as my solution. Thumbs up!
gWiz
I should probably point out that there is one difference in our approaches: although in C++ you may have the luxury of checking `ActivCode[0]` and getting 0, in C# that will throw an exception. So we had to add `if (ActivCode.Length < 1) return MarketDataExchange.NBBO;` first.
Dan Tao
+3  A: 

Trade memory for speed by pre-populating an index table to leverage simple pointer arithmetic.

public class Service 
{
    public static MarketDataExchange GetMarketDataExchange(string ActivCode) {
    {
        int x = 65, y = 65;
        switch(ActivCode.Length)
        {
            case 1:
                x = ActivCode[0];
                break;
            case 2:
                x = ActivCode[0];
                y = ActivCode[1];
                break;
        }
        return _table[x, y];
    }

    static Service()
    {
        InitTable();
    }

    public static MarketDataExchange[,] _table = 
        new MarketDataExchange['Z','Z'];

    public static void InitTable()
    {
        for (int x = 0; x < 'Z'; x++)
            for (int y = 0; y < 'Z'; y++)
                _table[x, y] = MarketDataExchange.NONE;

        SetCell("", MarketDataExchange.NBBO);
        SetCell("A", MarketDataExchange.AMEX);
        SetCell("B", MarketDataExchange.BSE);
        SetCell("BT", MarketDataExchange.BATS);
        SetCell("C", MarketDataExchange.NSE);
        SetCell("MW", MarketDataExchange.CHX);
        SetCell("N", MarketDataExchange.NYSE);
        SetCell("PA", MarketDataExchange.ARCA);
        SetCell("Q", MarketDataExchange.NASDAQ);
        SetCell("QD", MarketDataExchange.NASDAQ_ADF);
        SetCell("W", MarketDataExchange.CBOE);
        SetCell("X", MarketDataExchange.PHLX);
        SetCell("Y", MarketDataExchange.DIRECTEDGE);
    }

    private static void SetCell(string s, MarketDataExchange exchange)
    {
        char x = 'A', y = 'A';
        switch(s.Length)
        {
            case 1:
                x = s[0];
                break;
            case 2:
                x = s[0];
                y = s[1];
                break;
        }
        _table[x, y] = exchange;
    }
}

Make the enum byte-based to save a little space.

public enum MarketDataExchange : byte
{
    NBBO, AMEX, BSE, BATS, NSE, CHX, NYSE, ARCA, 
    NASDAQ, NASDAQ_ADF, CBOE, PHLIX, DIRECTEDGE, NONE
}
gWiz
+1 for creativity.
tster
Yes, very nice. Practically the same as my own (and Mark Ransom's) idea, but with some thoughtful modifications.
Dan Tao