2009-12-04 UPDATE: For profiling results on a number of the suggestions posted here, see below!
The Question
Consider the following very harmless, very straightforward method, which uses a switch
statement to return a defined enum value:
public static MarketDataExchange GetMarketDataExchange(string ActivCode) {
if (ActivCode == null) return MarketDataExchange.NONE;
switch (ActivCode) {
case "": return MarketDataExchange.NBBO;
case "A": return MarketDataExchange.AMEX;
case "B": return MarketDataExchange.BSE;
case "BT": return MarketDataExchange.BATS;
case "C": return MarketDataExchange.NSE;
case "MW": return MarketDataExchange.CHX;
case "N": return MarketDataExchange.NYSE;
case "PA": return MarketDataExchange.ARCA;
case "Q": return MarketDataExchange.NASDAQ;
case "QD": return MarketDataExchange.NASDAQ_ADF;
case "W": return MarketDataExchange.CBOE;
case "X": return MarketDataExchange.PHLX;
case "Y": return MarketDataExchange.DIRECTEDGE;
}
return MarketDataExchange.NONE;
}
My colleague and I batted around a few ideas today about how to actually make this method faster, and we came up with some interesting modifications that did in fact improve its performance rather significantly (proportionally speaking, of course). I'd be interested to know what sorts of optimizations anyone else out there can think up that might not have occurred to us.
Right off the bat, let me just offer a quick disclaimer: this is for fun, and not to fuel the whole "to optimize or not to optimize" debate. That said, if you count yourself among those who dogmatically believe "premature optimization is the root of all evil," just be aware that I work for a high-frequency trading firm, where everything needs to run absolutely as fast as possible--bottleneck or not. So, even though I'm posting this on SO for fun, it isn't just a huge waste of time, either.
One more quick note: I'm interested in two kinds of answers--those that assume every input will be a valid ActivCode (one of the strings in the switch
statement above), and those that do not. I am almost certain that making the first assumption allows for further speed improvements; anyway, it did for us. But I know that improvements are possible either way.
The Results
Well, it turns out that the fastest solution so far (that I've tested) came from João Angelo, whose suggestion was actually very simple, yet extremely clever. The solution that my coworker and I had devised (after trying out several approaches, many of which were thought up here as well) came in second place; I was going to post it, but it turns out that Mark Ransom came up with the exact same idea, so just check out his answer!
Since I ran these tests, some other users have posted even newer ideas... I will test them in due time, when I have a few more minutes to spare.
I ran these tests on two different machines: my personal computer at home (a dual-core Athlon with 4 Gb RAM running Windows 7 64-bit) and my development machine at work (a dual-core Athlon with 2 Gb RAM running Windows XP SP3). Obviously, the times were different; however, the relative times, meaning, how each method compared to every other method, were the same. That is to say, the fastest was the fastest on both machines, etc.
Now for the results. (The times I'm posting below are from my home computer.)
But first, for reference--the original switch statement:
1000000 runs: 98.88 ms
Average: 0.09888 microsecond
Fastest optimizations so far:
João Angelo's idea of assigning values to the enums based on the hash codes of the ActivCode strings and then directly casing
ActivCode.GetHashCode()
toMarketDataExchange
:
1000000 runs: 23.64 ms
Average: 0.02364 microsecond
Speed increase: 329.90%My colleague's and my idea of casting
ActivCode[0]
to anint
and retrieving the appropriateMarketDataExchange
from an array initialized on startup (this exact same idea was suggested by Mark Ransom):
1000000 runs: 28.76 ms
Average: 0.02876 microsecond
Speed increase: 253.13%tster's idea of switching on the output of
ActivCode.GetHashCode()
instead ofActivCode
:
1000000 runs: 34.69 ms
Average: 0.03469 microsecond
Speed increase: 185.04%The idea, suggested by several users including Auraseer, tster, and kyoryu, of switching on
ActivCode[0]
instead ofActivCode
:
1000000 runs: 36.57 ms
Average: 0.03657 microsecond
Speed increase: 174.66%Loadmaster's idea of using the fast hash,
ActivCode[0] + ActivCode[1]*0x100
:
1000000 runs: 39.53 ms
Average: 0.03953 microsecond
Speed increase: 153.53%Using a hashtable (
Dictionary<string, MarketDataExchange>
), as suggested by many:
1000000 runs: 88.32 ms
Average: 0.08832 microsecond
Speed increase: 12.36%Using a binary search:
1000000 runs: 1031 ms
Average: 1.031 microseconds
Speed increase: none (performance worsened)
Let me just say that it has been really cool to see how many different ideas people had on this simple problem. This was very interesting to me, and I'm quite thankful to everyone who has contributed and made a suggestion so far.