I am working on a school project to implement Huffman coding on text. The first part, of course, requires a frequency analysis of the text. Is there a better way to do it than a giant switch and an array of counters?

i.e.:

int counters[N];  // N: one slot per character being counted

for (int i = 0; i < inString.length(); i++)
{
    switch (inString[i])
    {
    case 'A':
        counters[0]++;
        break;
    .
    .
    .

I would like to count all alphanumeric characters and punctuation. I am using C++.

+9  A: 

Why not:

int counters[256] = {0};
for (int i = 0; i < inString.length(); i++) {
    // Cast to unsigned char so a character above 127 cannot give a negative index.
    counters[(unsigned char)inString[i]]++;
}

std::cout << "Occurrences of 'a': " << counters['a'] << std::endl;
Alexander Gessler
That is very interesting. I will give it a shot, thanks.
Maynza
+6  A: 

You can use an array indexed by character:

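int counters[256] = {0};  // zero-initialise every counter
for (int i = 0; i < inString.length(); i++) {
    counters[(unsigned char)inString[i]]++;
}

The counters must start at zero, which `= {0}` takes care of here. The cast to unsigned char keeps the index non-negative on platforms where char is signed.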
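
For anyone who wants something to paste into a test file, here is a minimal self-contained sketch of the same idea. The function name `countBytes` and the choice of `std::array` with `std::size_t` counters are assumptions made for illustration (and need C++11); they are not part of the answer above.

```cpp
#include <array>
#include <cstddef>
#include <string>

// Hypothetical helper: count how often each byte value occurs in `text`.
// The array is value-initialised, so every counter starts at zero.
std::array<std::size_t, 256> countBytes(const std::string& text) {
    std::array<std::size_t, 256> counters{};
    for (std::size_t i = 0; i < text.length(); i++) {
        // char may be signed; the cast keeps the index in the range 0..255.
        counters[static_cast<unsigned char>(text[i])]++;
    }
    return counters;
}
```

Calling `countBytes("abracadabra")` and reading index `'a'` gives the count for that letter. Entries for characters that never appear stay at zero, so they can be skipped when building the Huffman tree.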

Greg Hewgill
And for those of us playing the optimization game at home for fun, `for (int i = inString.length()-1; i >= 0 ; i--)` instead.
Amber
@Dav: if you want to optimize, lift the call to `inString.length()` out of the loop instead. Counting backwards is more often counterproductive, simply because your cache may not expect that -- and a single cache miss will cost more than a lot of comparisons.
Jerry Coffin
It's more the fact that moving it from the conditional to the initializer results in fewer function calls to `.length()`. But yes, moving it out of the loop also works fine.
Amber
I usually write that as `for (int i = 0, imax = inString.length(); i < imax; i++)`.
Roland Illig
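
To make the loop variants from these comments concrete, here is a small sketch of two forward-counting forms that call `length()` at most once. The names `Counts`, `countHoisted` and `countRangeFor` are made up for this example, and the range-based `for` assumes C++11 or later. Whether either is measurably faster than the plain loop depends on the compiler and its optimisation settings, so it is worth measuring before relying on it.

```cpp
#include <array>
#include <cstddef>
#include <string>

using Counts = std::array<std::size_t, 256>;

// Variant 1: read the length once, in the loop initialiser.
Counts countHoisted(const std::string& text) {
    Counts counters{};
    for (std::size_t i = 0, n = text.length(); i < n; i++) {
        counters[static_cast<unsigned char>(text[i])]++;
    }
    return counters;
}

// Variant 2: a range-based for walks forward with no explicit index.
Counts countRangeFor(const std::string& text) {
    Counts counters{};
    for (unsigned char ch : text) {  // each char is converted to unsigned char
        counters[ch]++;
    }
    return counters;
}
```

Both variants still traverse the string front to back, which keeps the memory access pattern sequential.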
+2  A: 

Using a map seems completely applicable (requires #include <map>):

std::map<char, int> chcount;
for (int i = 0; i < inString.length(); i++) {
    char t = inString[i];
    chcount[t]++;  // operator[] creates a missing entry with a count of 0
}
dagoof
This is particularly true if you venture beyond the world of nationalized character sets into the big, wide world of Unicode.
Jerry Coffin
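
As a rough illustration of the Unicode point, here is a sketch that counts code points rather than bytes. It assumes the text has already been decoded to UTF-32 and stored in a `std::u32string` (C++11); decoding from UTF-8 or another encoding is left out, and the function name `countCodePoints` is hypothetical.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Assumption: `text` holds one char32_t per Unicode code point (UTF-32).
std::map<char32_t, std::size_t> countCodePoints(const std::u32string& text) {
    std::map<char32_t, std::size_t> counts;
    for (char32_t cp : text) {
        counts[cp]++;  // operator[] inserts a zero count the first time cp is seen
    }
    return counts;
}
```

A 256-entry array no longer fits here because the key space runs to over a million code points, while the map only stores entries for symbols that actually occur in the text.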