Hi, I have a RichTextBox control on a form and a text file. I read the text file into one array and richTextBox1.Text into another, then compare them and count the matching words. The problem is duplicates: for example, if the word "name" appears twice in the text file but three times in the RichTextBox, the third occurrence must be a wrong word and should not be counted, so the count for that word can never go above 2. HashSet only keeps unique values, so it ignores duplicates in the text file. I want to compare every word in the text file with the words in the RichTextBox. (Sorry for my English.)

My code:

        StreamReader sr = new StreamReader("c:\\test.txt",Encoding.Default);
        string[] word = sr.ReadLine().ToLower().Split(' ');
        sr.Close();
        string[] word2 = richTextBox1.Text.ToLower().Split(' ');
        var set1 = new HashSet<string>(word);
        var set2 = new HashSet<string>(word2);
        set1.IntersectWith(set2);

        MessageBox.Show(set1.Count.ToString());
+1  A: 

You need the counts to be the same? You need to count the words, then...

    static Dictionary<string, int> CountWords(string[] words) {
        // use (StringComparer.{your choice}) for case-insensitive
        var result = new Dictionary<string, int>();
        foreach (string word in words) {
            int count;
            if (result.TryGetValue(word, out count)) {
                result[word] = count + 1;
            } else {
                result.Add(word, 1);
            }
        }
        return result;
    }
        ...
        var set1 = CountWords(word);
        var set2 = CountWords(word2);

        var matches = from val in set1
                      where set2.ContainsKey(val.Key)
                         && set2[val.Key] == val.Value
                      select val.Key;
        foreach (string match in matches)
        {
            Console.WriteLine(match);
        }
Marc Gravell
+1  A: 

Inferring that you want:

file:

foo
foo
foo
bar

text box:

foo
foo
bar
bar

to result in '3' (2 foos and one bar)

Dictionary<string,int> fileCounts = new Dictionary<string, int>();
using (var sr = new StreamReader("c:\\test.txt",Encoding.Default))
{
    foreach (var word in sr.ReadLine().ToLower().Split(' '))
    {
        int c = 0;
        if (fileCounts.TryGetValue(word, out c))
        {
            fileCounts[word] = c + 1;
        }
        else
        {
            fileCounts.Add(word, 1);
        }     
    }
}
int total = 0;
foreach (var word in richTextBox1.Text.ToLower().Split(' '))
{
    int c = 0;
    if (fileCounts.TryGetValue(word, out c))
    {
        total++;
        if (c - 1 > 0)
            fileCounts[word] = c - 1;
        else
            fileCounts.Remove(word);
    }
}
MessageBox.Show(total.ToString());

Note that this destructively modifies the dictionary built from the file. You can avoid this (so the file only has to be read into the dictionary once) by simply counting the rich text box words in the same way, then taking the Min of the two counts for each word and summing them.
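
The non-destructive variant described above could look roughly like the sketch below. The class name `WordOverlap` and the methods `CountWords` and `CountMatches` are made up for illustration (this `CountWords` returns the same kind of dictionary as the one in Marc's answer, but is a separate helper), and the inputs are assumed to be already lower-cased and split on spaces, as in the question.

```csharp
using System;
using System.Collections.Generic;

static class WordOverlap
{
    // Hypothetical helper: counts how many times each word occurs.
    public static Dictionary<string, int> CountWords(IEnumerable<string> words)
    {
        var counts = new Dictionary<string, int>();
        foreach (var word in words)
        {
            int c;
            counts.TryGetValue(word, out c); // c is 0 when the word has not been seen yet
            counts[word] = c + 1;
        }
        return counts;
    }

    // Hypothetical helper: for every word present in both inputs,
    // adds the smaller of its two counts. Neither dictionary is modified.
    public static int CountMatches(IEnumerable<string> fileWords, IEnumerable<string> boxWords)
    {
        var fileCounts = CountWords(fileWords);
        var boxCounts = CountWords(boxWords);

        int total = 0;
        foreach (var pair in fileCounts)
        {
            int boxCount;
            if (boxCounts.TryGetValue(pair.Key, out boxCount))
            {
                total += Math.Min(pair.Value, boxCount);
            }
        }
        return total;
    }
}
```

With the example above, `WordOverlap.CountMatches("foo foo foo bar".Split(' '), "foo foo bar bar".Split(' '))` gives 3 (two foos and one bar), and `fileCounts` could be kept around and reused for later comparisons.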

ShuggyCoUk
Ah this is the same as Marc's solution, page refresh for the win
ShuggyCoUk
Thanks man, that's what I wanted.
Ibrahim AKGUN