ansaurus

Question

Fastest way to count number of uppercase characters in c#

Answer 1

+1 A:

You're only counting standard ASCII and not ÃÐÊ etc.

How about

CommentText.ToCharArray().Where(c => Char.IsUpper(c)).Count()

Dead account 2009-03-04 08:38:57

Thanks for that - I was looking for c.Upper in intellisense - didn't think of the other.

porum 2009-03-04 08:46:50

Answer 2

+12 A:

Ok, just knocked up some code to time your method against this:

int count = 0;
for (int i = 0; i < s.Length; i++)
{
    if (char.IsUpper(s[i])) count++;
}

The result:

Yours: 19737 ticks

Mine: 118 ticks

Pretty big difference! Sometimes the most straight-forward way is the most efficient.

Edit

Just out of interest, this:

int count = s.Count(c => char.IsUpper(c));

Comes in at at around 2500 ticks. So for a "Linqy" one-liner it's pretty quick.

Matt Hamilton 2009-03-04 08:39:11

+1 nice old school solution :)

Dead account 2009-03-04 08:40:55

D'oh! Pipped at the post, almost exactly the same solution. Well done Matt, damn stupid slow fingers . . .

Binary Worrier 2009-03-04 08:42:42

By the way, +1 mate, you earned it :)

Binary Worrier 2009-03-04 08:44:05

Thanks very much - changing my code now :-)

porum 2009-03-04 08:45:49

Could you try the same test having taken out the unnecessary call to ToCharArray?

Jon Skeet 2009-03-04 08:47:06

@Binary right batch 'atcha, since I hadn't considered the "foreach" approach.

Matt Hamilton 2009-03-04 08:47:26

@Jon Without the ToCharArray it's around 18000 ticks, so a tad faster.

Matt Hamilton 2009-03-04 08:53:34

@Jon I've updated the post - removing the "Where" was the clincher.

Matt Hamilton 2009-03-04 08:56:31

Make it faster! Remove the index checking by fixing the pointer, then if you define upper case, then you could probably do away with that branch as well. i.e. if upper case -eq c > 0x60, then it's easy, otherwise it's trickier.

John Leidegren 2009-03-04 09:13:36

Tobias Hertkorn 2009-03-04 09:15:17

@John "Remove the index checking by fixing the pointer"? Are you talking C/C++ here?

Matt Hamilton 2009-03-04 09:19:05

A pointer could be used if fixing the object inside an unsafe block. That would make the loop slightly faster, but I think that the overhead of fixing the object makes the overall operation slower.

Guffa 2009-03-04 09:28:44

@Matt Does s.Count(char.IsUpper) give the same timing as s.Count(c => char.IsUpper(c)) ?

porum 2009-03-04 09:38:45

@porum Yes, performs the same (and the former is easier to read so I'd go with that).

Matt Hamilton 2009-03-04 09:39:26

Answer 3

+2 A:

Without even testing I'd say

int count = 0;
foreach (char c in commentText)
{
    if (Char.IsUpper(c))
        count++;
}

is faster, off now to test it.

Binary Worrier 2009-03-04 08:41:32

Yep that's on par (or perhaps even a few ticks faster) than my "for" loop.

Matt Hamilton 2009-03-04 08:43:13

Not bothering to test it, as Matt Hamilton was considerate enough to do this for me :)

Binary Worrier 2009-03-04 08:43:23

Answer 4

+6 A:

First there is no reason you need to call ToCharArray() since, assuming CommentText is a string it is already an IEnumerable<char>. Second, you should probably be calling char.IsUpper instead of assuming you are only dealing with ASCII values. The code should really look like,

CommentText.Count(char.IsUpper)

Third, if you are worried about speed there isn't much that can beat the old for loop,

int count = 0;
for (int i = 0; i < CommentText.Length; i++) 
   if (char.IsUpper(CommentText[i]) count++;

In general, calling any method is going to be slower than inlining the code but this kind of optimization should only be done if you are absolutely sure this is the bottle-neck in your code.

chuckj 2009-03-04 08:45:25

Dude, should "there is much" be "there isn't much"?

Binary Worrier 2009-03-04 08:48:37

Doh, yes. Fixed.

chuckj 2009-03-04 08:51:04

Thanks for your answer. Very succinct linq code.

porum 2009-03-04 08:58:18

Answer 5

+2 A:

What you are doing with that code is to create a collection with the characters, then create a new collection containing only the uppercase characters, then loop through that collection only to find out how many there are.

This will perform better (but still not quite as good as a plain loop), as it doesn't create the intermediate collections:

CommentText.Count(c => Char.IsUpper(c))

Edit: Removed the ToCharArray call also, as Matt suggested.

Guffa 2009-03-04 08:45:46

+1 for moving the predicate into the Count. Take away the ToCharArray as well and it speeds up dramatically.

Matt Hamilton 2009-03-04 08:58:21

Good point. I did something similar before and then the ToCharArray was needed, but not in this case. :)

Guffa 2009-03-04 09:25:18

Answer 6

+1 A:

I've got this

Regex x = new Regex("[A-Z]{1}", 
  RegexOptions.Compiled | RegexOptions.CultureInvariant);
int c = x.Matches(s).Count;

~~but I don't know if its particularly quick. It won't get special chars either, I s'pose~~

EDIT:

Quick comparison to this question's answer. Debug in vshost, 10'000 iterations with the string:
aBcDeFGHi1287jKK6437628asghwHllmTbynerA

The answer: 20-30 ms
The regex solution: 140-170 ms

flq 2009-03-04 09:27:51

Thanks for the comparison - I wondered how a regex would stack up.

porum 2009-03-04 12:36:19

Answer 7

+2 A:

sorry wrong comment

endeffects 2009-12-14 17:48:56

ansaurus

tags:

views:

answers:

Fastest way to count number of uppercase characters in c#

related questions