class CounterDict<TKey>
{
    public Dictionary<TKey, int> _dict = new Dictionary<TKey, int>();

    public void Add(TKey key)
    {
        if(_dict.ContainsKey(key))
            _dict[key]++;
        else
        {
            _dict.Add(key, 1);
        }
    }
}

class Program
{
    static void Main(string[] args)
    {
        string line =  "The woods decay the woods decay and fall.";

        CounterDict<string> freq = new CounterDict<string>();
        foreach (string item in line.Split())
        {
            freq.Add(item.Trim().ToLower());
        }

        foreach (string key in freq._dict.Keys)
        {
            Console.WriteLine("{0}:{1}",key,freq._dict[key]);
        }           
    }
}

I want to count the number of occurrences of every word in a string.
I think the code above will be slow at this task because of the lookups in the Add function:

    if(_dict.ContainsKey(key))
        _dict[key]++;
    else
    {
        _dict.Add(key, 1);
    }

Also, is keeping _dict public good practice? (I don't think it is.)

How should I modify this, or should I change it completely, to do the job?

+3  A: 

How about this:

Dictionary<string, int> words = new Dictionary<string, int>();
string input = "The woods decay the woods decay and fall.";
foreach (Match word in Regex.Matches(input, @"\w+", RegexOptions.ECMAScript))
{
    if (!words.ContainsKey(word.Value))
    {
        words.Add(word.Value, 1);
    }
    else
    {
        words[word.Value]++;
    }
}

The main point is replacing .Split with a regular expression, so you don't need to keep a big string array in memory and can work with one item at a time.

Rubens Farias
But what about "non-string" keys? I am planning to extend this to other key types as well.
TheMachineCharmer
"Or are there regexes for non-strings too?" :)
TheMachineCharmer
What do you mean by 'non-strings'? With RegexOptions.ECMAScript, `\w+` means `[a-zA-Z_0-9]+` (one or more letters from A to Z, underscores, or digits).
Rubens Farias
+2  A: 

From the MSDN documentation:

    // When a program often has to try keys that turn out not to
    // be in the dictionary, TryGetValue can be a more efficient 
    // way to retrieve values.
    string value = "";
    if (openWith.TryGetValue("tif", out value))
    {
        Console.WriteLine("For key = \"tif\", value = {0}.", value);
    }
    else
    {
        Console.WriteLine("Key = \"tif\" is not found.");
    }

I haven't tested it myself, but it might improve your efficiency.
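
Applied to the counter from the question, the TryGetValue idea might look like the sketch below. It is not a measured or definitive implementation. CounterDict and Add come from the question; GetCount and Counts are hypothetical members added here so that _dict can stay private. It relies on TryGetValue setting its out parameter to the default value (0 for int) when the key is missing.

```csharp
using System;
using System.Collections.Generic;

// Sketch only: CounterDict and Add are from the question,
// GetCount and Counts are hypothetical additions for illustration.
class CounterDict<TKey>
{
    // Private, so callers cannot change the counts behind the class's back.
    private readonly Dictionary<TKey, int> _dict = new Dictionary<TKey, int>();

    public void Add(TKey key)
    {
        // When the key is missing, TryGetValue sets count to 0,
        // so new and existing keys share the same code path:
        // one lookup to read, one to write.
        int count;
        _dict.TryGetValue(key, out count);
        _dict[key] = count + 1;
    }

    // Returns 0 for keys that were never added.
    public int GetCount(TKey key)
    {
        int count;
        return _dict.TryGetValue(key, out count) ? count : 0;
    }

    // Read-only view of the counts for enumeration.
    public IEnumerable<KeyValuePair<TKey, int>> Counts
    {
        get { return _dict; }
    }
}
```

Because the class is generic over TKey, it works for non-string keys as well; splitting the text into words stays a separate step. Whether this is noticeably faster than ContainsKey followed by the indexer would have to be measured.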

Asaf
A: 

Here are some ways to count occurrences of strings.

SwDevMan81