(My apologies, this is a 2nd post to http://stackoverflow.com/questions/3390175/most-efficient-way-to-determine-if-a-string-length-0 but I can't figure out how to reply to people's answers, my reply becomes posted as an 'answer')
Ideally, what I'm looking for is the most efficient algorithm to do the following (which will be called 100million+ times). I'm using C# 4.0
Turn the string: "A B C D E " into the array: string["A","B","C","D","E"]
My algorithm is as follows:
public string[] SplitOnMultiSpaces(string text)
{
if (string.IsNullOrEmpty(text)) return new string[0];
var split = text.Split(' ');
int length = split.Length;
var data = new string[length];
int index = 0;
for (int i = 0; i<length; i++)
{
if (split[i].Length != 0)
{
data[index++] = split[i];
}
}
return data;
}
My problem is when I profile this against 100,000 strings, it takes 1.04 seconds to execute.
If I comment out the "if (split[i].Length != 0)" check, it takes only 0.2 seconds.
Can anybody tell me why this (simple) query against the string is taking 80% of the TOTAL execution time? (Especially, since I expected other areas to use more CPU) The only idea I've come up w/ is C# is trying to count the string length, which people tell me is not the case (that it's more like VB strings I guess?). But that wouldn't make sense for the time overhead.
I've considered trying to see if split[i][0] exists, but relying on an exception slows things WAAAAAAY down.
P.S. -- My algorithm also suffers in that the returned array is, more often than not, bigger than it needs to be, but that doesn't seem to be too much of an overhead.