Hello. I'm stuck on counting word occurrences in a string. The task notes gave me a tip to use compareToIgnoreCase, so I tried something like this:

splitwords = StringCont.split("\\s");
for(int i=0; i<splitwords.length; i++)
{
    if(splitwords[1].compareToIgnoreCase(splitwords[i]) == 0)
        splitcount++;
}

This is of course just what I could come up with, and it is probably a bad approach. When I run the code, I sometimes get an array index out of bounds exception and sometimes it runs. What is missing is: go through all the words, check them, and skip the words that were already counted. I would be happy to get any help with this so I can move on and understand how it can be coded. Thank you :)

Edit: It seems I did not explain the problem clearly enough, but I got a nice answer about the Map object, which easily put together what I needed. I did not know about Map. So yes, I was trying to find the number of times each word occurs in the string.

tangens: It is meant to take the first word (up to the first whitespace), splitwords[1], compare it to every other word in the string, splitwords[i], and do count++ if the result is 0 (equal).

Esko: There is indeed whitespace, as in a sentence, but I still got this exception. I don't know why, though.

A: 

Are you looking for the number of occurrences of a certain word in the sentence, or the cumulative count of all the words in the sentence? Based on your sample code I'd expect the former, but your explanation makes me think of the latter.

In any case, something obvious: if there isn't any whitespace in the input String, String#split() returns an array containing exactly one element, the original String itself. Since array indexes start from zero, splitwords[1] then causes the ArrayIndexOutOfBoundsException, because the only String available is at index [0].
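
To make the point above concrete, here is a minimal sketch of the single-element case and of a length check before indexing. The class name SplitDemo and the helper secondWordOrNull are made up for illustration; they do not come from the question.

```java
import java.util.Arrays;

class SplitDemo {
    // Hypothetical helper: returns the second whitespace-separated token,
    // or null when the input contains no second token.
    static String secondWordOrNull(String input) {
        String[] parts = input.split("\\s");
        // Guard the index: without whitespace, parts has length 1
        // and parts[1] would throw ArrayIndexOutOfBoundsException.
        return parts.length > 1 ? parts[1] : null;
    }

    public static void main(String[] args) {
        String[] single = "hello".split("\\s");
        System.out.println(single.length);                   // 1
        System.out.println(Arrays.toString(single));         // [hello]
        System.out.println(secondWordOrNull("hello"));       // null
        System.out.println(secondWordOrNull("hello world")); // world
    }
}
```

Checking the array length first, or simply starting from index 0, avoids the exception whenever the input happens to be a single word.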

Esko
A: 

You can use a Set, store each word in it, and get the size of the set at the end.

Of course, this gives you the number of distinct words, not the total number of words.
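
A minimal sketch of the Set idea, assuming words are separated by whitespace and that case should be ignored (as the compareToIgnoreCase tip in the question suggests). The class and method names are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

class DistinctWords {
    // Counts distinct words, ignoring case (an assumption taken from the question).
    static int countDistinct(String text) {
        Set<String> seen = new HashSet<String>();
        for (String word : text.toLowerCase().split("\\s+")) {
            // split() can yield an empty first element when the text
            // starts with whitespace or is empty; skip it.
            if (!word.isEmpty()) {
                seen.add(word); // adding a duplicate leaves the set unchanged
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(countDistinct("the cat and The dog")); // 4
    }
}
```

The set only remembers which words were seen, so it cannot say how often each one occurred.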

Patrick
With `HashSet` this is true, however with `TreeSet`... :)
Esko
Sorry, I didn't get it: a TreeSet is a set, so it will not contain duplicate elements, and its size will be the number of distinct words. No?
Patrick
A: 

If you want to count word occurrences, you should use a HashMap<String, Integer>. There you can store a counter for each word found.

tangens
+2  A: 

Store the words and their counts in a Map.

// Split on runs of whitespace; each array element is one word.
String[] words = string.toLowerCase().split("\\s+");
Map<String, Integer> wordCounts = new HashMap<String, Integer>();

for (String word : words) {
    // get() returns null the first time a word is seen.
    Integer count = wordCounts.get(word);
    if (count == null) {
        count = 0;
    }
    wordCounts.put(word, count + 1);
}

Note that I called toLowerCase() before split(), as you seem to want case insensitivity.
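
For reference, the same idea can be wrapped in a method and tried on a sample sentence. This is only a sketch: it assumes Java 8 or later for Map.merge, the names WordCounter and countWords are hypothetical, and it additionally trims the input so that leading whitespace does not produce an empty word.

```java
import java.util.HashMap;
import java.util.Map;

class WordCounter {
    // Returns a map from lower-cased word to the number of times it occurs.
    static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : text.trim().toLowerCase().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // blank input splits into a single empty string
            }
            // merge() stores 1 for a new word, otherwise adds 1 to the old count.
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = countWords("To be or not to be");
        System.out.println(counts.get("to"));  // 2
        System.out.println(counts.get("be"));  // 2
        System.out.println(counts.get("not")); // 1
    }
}
```

A HashMap does not keep its keys in any particular order, so a TreeMap could be swapped in if sorted output is wanted.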

BalusC
or use the Apache Commons Collections HashBag instead of the HashMap (+1)
Bozho