views:

356

answers:

5

String a="(Yeahhhh) I have finally made it to the (top)";

Given the String above, there are 4 occurrences of '(' and ')' altogether.

My idea is to count them with the String.charAt method. However, this seems rather slow, as I have to perform this counting at least 10000 times due to the nature of my project.

Does anyone have a better idea or suggestion than using the .charAt method?

Sorry for not explaining clearly earlier on: by 10000 times I meant that there are 10000 sentences to be analyzed, and the String a above is just one of those sentences.
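For reference, the charAt approach described in the question might look like the sketch below. The names ParenCount and countParens are made up for illustration and do not come from the original post.

```java
// Baseline sketch: count '(' and ')' in a single pass using charAt.
// ParenCount and countParens are illustrative names, not from the question.
class ParenCount {
    static int countParens(String s) {
        int count = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '(' || c == ')') {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String a = "(Yeahhhh) I have finally made it to the (top)";
        System.out.println(countParens(a)); // prints 4
    }
}
```

This is a single O(n) pass per sentence, so for 10000 sentences of this size it is unlikely to be the bottleneck, though only a measurement can confirm that.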

+5  A: 

Sounds like homework, so I'll try to keep it at the "nudge in the right direction".

What if you removed all characters NOT the character you are looking for, and look at the length of that string?

There is a String method that will help you with this.
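One possible reading of this hint, assuming the method meant is String.replaceAll with a negated character class (the answer does not name it), is sketched below. The class and method names are hypothetical.

```java
// Sketch: remove every character that is NOT '(' or ')', then take the length.
// Assumes the hinted method is String.replaceAll; RemoveOthers is an illustrative name.
class RemoveOthers {
    static int countParens(String s) {
        // Inside a character class, '(' and ')' are literals; '^' negates the class.
        return s.replaceAll("[^()]", "").length();
    }

    public static void main(String[] args) {
        String a = "(Yeahhhh) I have finally made it to the (top)";
        System.out.println(countParens(a)); // prints 4
    }
}
```

This is short to write, but it builds a new String and runs the regex engine on every call, so it should not be expected to outperform a plain loop.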

MarkPowell
Wow... I guess that is one of the best solutions. Thanks a lot for the idea!
Mr CooL
Btw, can you tell me which method it is? I'm using the .replaceAll method to remove the character I'm looking for and then comparing the result with the original. I'm just curious whether there is a method that replaces everything but a given character.
Mr CooL
Could you please explain how creating a new string without `n` occurrences of a character should be faster than finding `n`?
sfussenegger
@sfussenegger: Of course it wouldn't be faster to *execute*, but typing in the source code would be a bit quicker. Isn't that what the OP meant? :P
Alan Moore
@alan hmmm ... maybe if he writes a method for each of his 10,000 strings ;)
sfussenegger
+2  A: 

You can use toCharArray() once and iterate over that. It might be faster.

Why do you need to do this 10000 times per String? Why don't you simply remember the result from the first time? That would save a lot more than speeding up a single count.
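A minimal sketch of the toCharArray() variant follows. The names are illustrative, and whether it beats charAt depends on the JVM and the input, so it is worth measuring both.

```java
// Sketch: copy the characters out once with toCharArray() and iterate over the array.
// ArrayCount and countParens are illustrative names.
class ArrayCount {
    static int countParens(String s) {
        int count = 0;
        for (char c : s.toCharArray()) {
            if (c == '(' || c == ')') {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String a = "(Yeahhhh) I have finally made it to the (top)";
        System.out.println(countParens(a)); // prints 4
    }
}
```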

Joachim Sauer
I was just about to say that. I too am not sure whether working with a char array would show a noticeable difference in speed. Obviously, depending on where the strings come from and what they contain, other techniques might or might not be orders of magnitude faster.
Tomislav Nakic-Alfirevic
Oh, sorry. What I meant is that there will be about 10000 sentences to be analyzed. Thanks for the suggestion anyway. ^^
Mr CooL
It might be faster to use toCharArray(), since charAt() contains a bounds check that might be more easily optimised away in a straightforward array iteration. Of course, toCharArray() has to do an array copy, which charAt() avoids. So it really depends on the exact situation you are in (and on how the JVM decides to optimise the code at runtime). You would probably have to try both strategies and measure them in your target environment.
flamingpenguin
+4  A: 

StringUtils.countMatches(wholeString, searchedString) (from commons-lang)

searchedString may be a single character, e.g. "("

It (as noted in the comments) calls charAt(..) multiple times. But what is the complexity? It's O(n), since charAt(..) is O(1), so I don't understand why you find it slow.

Bozho
Wow...thanks for sharing that....^^
Mr CooL
As far as I can tell, internally that function calls String.indexOf multiple times (each time starting from the end of the previous match). I can't really see how that can be faster (if performance is what is being asked about) than just iterating over the characters in the string (since we know that the thing we are looking for is one character long). I don't know how much slower it might be, of course; one would have to measure it...
flamingpenguin
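The indexOf-based loop described in that comment can be sketched with the JDK alone, as below. This is not the commons-lang source, only an illustration of the idea; the class name IndexOfCount is hypothetical.

```java
// Sketch of counting substring occurrences by calling indexOf repeatedly,
// each time resuming from the end of the previous match.
// Not the commons-lang implementation; IndexOfCount is an illustrative name.
class IndexOfCount {
    static int countMatches(String text, String sub) {
        if (text.isEmpty() || sub.isEmpty()) {
            return 0;
        }
        int count = 0;
        int idx = 0;
        while ((idx = text.indexOf(sub, idx)) != -1) {
            count++;
            idx += sub.length(); // skip past this match (non-overlapping count)
        }
        return count;
    }

    public static void main(String[] args) {
        String a = "(Yeahhhh) I have finally made it to the (top)";
        System.out.println(countMatches(a, "(") + countMatches(a, ")")); // prints 4
    }
}
```

Note that counting both '(' and ')' this way takes two passes over the string, one per searched character.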
Hey Bozho, I really appreciate the suggestion. That's very kind of you. The library you referenced helps a lot, because my project deals heavily with web content mining and text mining. It removes a lot of problems and tedium as well. ^^
Mr CooL
A: 

You can achieve this with the following method.

It returns a map with each character as the key and its number of occurrences in the input string as the value.

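import java.util.HashMap;
import java.util.Map;

Map<Character, Integer> countMap = new HashMap<Character, Integer>();

// Adds the character counts of inStr to countMap and returns the same map.
public Map<Character, Integer> updateCountMap(String inStr, Map<Character, Integer> countMap)
    {
        char[] chars = inStr.toCharArray();
        for (int i = 0; i < chars.length; i++)
        {
            // null means this character has not been seen yet
            Integer current = countMap.get(chars[i]);
            countMap.put(chars[i], current == null ? 1 : current + 1);
        }
        return countMap;
    }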

What we can do is read the file line by line and call the above method for every line. Each call adds that line's occurrence counts to the totals already in the map. Thus, the character array never gets too long and we still get what we need.

Advantages: a single iteration over the input string's characters; the character array never grows very large; the result map contains the number of occurrences of every character.
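A sketch of the line-by-line accumulation is shown below, using a variant of the counting method based on Map.merge (Java 8+). The two hard-coded lines stand in for lines read from a file, and the names LineCounts and addCounts are illustrative.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: accumulate per-character counts across several lines into one map.
// LineCounts and addCounts are illustrative names; the lines array stands in for file input.
class LineCounts {
    static void addCounts(String line, Map<Character, Integer> counts) {
        for (char c : line.toCharArray()) {
            // Insert 1 for a new character, otherwise add 1 to the existing count.
            counts.merge(c, 1, Integer::sum);
        }
    }

    public static void main(String[] args) {
        Map<Character, Integer> counts = new HashMap<>();
        String[] lines = {"(Yeahhhh) I have finally", "made it to the (top)"};
        for (String line : lines) {
            addCounts(line, counts);
        }
        int parens = counts.getOrDefault('(', 0) + counts.getOrDefault(')', 0);
        System.out.println(parens); // prints 4
    }
}
```

Counting every character costs boxing and hashing per character, so if only '(' and ')' matter, a plain counter is likely cheaper.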

Cheers

Mohd Farid
+1  A: 

You could do that with Regular Expressions:

Pattern pattern = Pattern.compile("[\\(\\)]"); //Pattern says either '(' or ')'
Matcher matcher = pattern.matcher("(Yeahhhh) I have finally made it to the (top)");
int count = 0;
while (matcher.find()) { //call find until nothing is found anymore
  count++;
}
System.out.println("count "+count);

The pro is that patterns are very flexible. You could also search for words enclosed in parentheses: "\\(\\w+\\)" (a '(' followed by one or more word characters, followed by a ')').

The con is that it may be overkill for very simple cases.

See the Javadoc of Pattern for more details on Regular Expressions
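To illustrate the more flexible pattern mentioned above, here is a small sketch that counts words enclosed in parentheses. The class name and the choice to precompile the Pattern once as a constant are illustrative assumptions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch: count words enclosed in parentheses with the pattern \(\w+\).
// ParenWords is an illustrative name; the Pattern is compiled once and reused.
class ParenWords {
    private static final Pattern WORD_IN_PARENS = Pattern.compile("\\(\\w+\\)");

    static int countParenWords(String s) {
        Matcher matcher = WORD_IN_PARENS.matcher(s);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String a = "(Yeahhhh) I have finally made it to the (top)";
        System.out.println(countParenWords(a)); // prints 2
    }
}
```

When the same pattern is applied to thousands of sentences, compiling it once rather than per call avoids repeated compilation work.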

Hardcoded