tags:

views:

1116

answers:

7

Any thoughts on the efficiency of this? ...

CommentText.ToCharArray().Where(c => c >= 'A' && c <= 'Z').Count()
+1  A: 

You're only counting standard ASCII and not ÃÐÊ etc.

How about

CommentText.ToCharArray().Where(c => Char.IsUpper(c)).Count()
Dead account
Thanks for that - I was looking for c.Upper in intellisense - didn't think of the other.
porum
+12  A: 

Ok, just knocked up some code to time your method against this:

int count = 0;
for (int i = 0; i < s.Length; i++)
{
    if (char.IsUpper(s[i])) count++;
}

The result:

Yours: 19737 ticks

Mine: 118 ticks

Pretty big difference! Sometimes the most straight-forward way is the most efficient.

Edit

Just out of interest, this:

int count = s.Count(c => char.IsUpper(c));

Comes in at at around 2500 ticks. So for a "Linqy" one-liner it's pretty quick.

Matt Hamilton
+1 nice old school solution :)
Dead account
D'oh! Pipped at the post, almost exactly the same solution. Well done Matt, damn stupid slow fingers . . .
Binary Worrier
By the way, +1 mate, you earned it :)
Binary Worrier
Thanks very much - changing my code now :-)
porum
Could you try the same test having taken out the unnecessary call to ToCharArray?
Jon Skeet
@Binary right batch 'atcha, since I hadn't considered the "foreach" approach.
Matt Hamilton
@Jon Without the ToCharArray it's around 18000 ticks, so a tad faster.
Matt Hamilton
@Jon I've updated the post - removing the "Where" was the clincher.
Matt Hamilton
Make it faster! Remove the index checking by fixing the pointer, then if you define upper case, then you could probably do away with that branch as well. i.e. if upper case -eq c > 0x60, then it's easy, otherwise it's trickier.
John Leidegren
Tobias Hertkorn
@John "Remove the index checking by fixing the pointer"? Are you talking C/C++ here?
Matt Hamilton
A pointer could be used if fixing the object inside an unsafe block. That would make the loop slightly faster, but I think that the overhead of fixing the object makes the overall operation slower.
Guffa
@Matt Does s.Count(char.IsUpper) give the same timing as s.Count(c => char.IsUpper(c)) ?
porum
@porum Yes, performs the same (and the former is easier to read so I'd go with that).
Matt Hamilton
+2  A: 

Without even testing I'd say

int count = 0;
foreach (char c in commentText)
{
    if (Char.IsUpper(c))
        count++;
}

is faster, off now to test it.

Binary Worrier
Yep that's on par (or perhaps even a few ticks faster) than my "for" loop.
Matt Hamilton
Not bothering to test it, as Matt Hamilton was considerate enough to do this for me :)
Binary Worrier
+6  A: 

First there is no reason you need to call ToCharArray() since, assuming CommentText is a string it is already an IEnumerable<char>. Second, you should probably be calling char.IsUpper instead of assuming you are only dealing with ASCII values. The code should really look like,

CommentText.Count(char.IsUpper)

Third, if you are worried about speed there isn't much that can beat the old for loop,

int count = 0;
for (int i = 0; i < CommentText.Length; i++) 
   if (char.IsUpper(CommentText[i]) count++;

In general, calling any method is going to be slower than inlining the code but this kind of optimization should only be done if you are absolutely sure this is the bottle-neck in your code.

chuckj
Dude, should "there is much" be "there isn't much"?
Binary Worrier
Doh, yes. Fixed.
chuckj
Thanks for your answer. Very succinct linq code.
porum
+2  A: 

What you are doing with that code is to create a collection with the characters, then create a new collection containing only the uppercase characters, then loop through that collection only to find out how many there are.

This will perform better (but still not quite as good as a plain loop), as it doesn't create the intermediate collections:

CommentText.Count(c => Char.IsUpper(c))

Edit: Removed the ToCharArray call also, as Matt suggested.

Guffa
+1 for moving the predicate into the Count. Take away the ToCharArray as well and it speeds up dramatically.
Matt Hamilton
Good point. I did something similar before and then the ToCharArray was needed, but not in this case. :)
Guffa
+1  A: 

I've got this

Regex x = new Regex("[A-Z]{1}", 
  RegexOptions.Compiled | RegexOptions.CultureInvariant);
int c = x.Matches(s).Count;

but I don't know if its particularly quick. It won't get special chars either, I s'pose

EDIT:

Quick comparison to this question's answer. Debug in vshost, 10'000 iterations with the string:
aBcDeFGHi1287jKK6437628asghwHllmTbynerA

  • The answer: 20-30 ms
  • The regex solution: 140-170 ms
flq
Thanks for the comparison - I wondered how a regex would stack up.
porum
+2  A: 

sorry wrong comment

endeffects