Hello, I would like to count the occurrences of a character in a string. Suppose I have the string "aaaab"; how would I count the number of a's in it?
A simple loop over the characters would do it.
public int countChars(char c, String s) {
    int result = 0;
    for (int i = 0, n = s.length(); i < n; i++) {
        if (s.charAt(i) == c) {
            result++;
        }
    }
    return result;
}
The code looks way easier to read if you don't use regular expressions.
int count = 0;
for (int i = 0; i < string.length(); i++)
    if (string.charAt(i) == 'a')
        count++;
count now contains the number of 'a's in your string. This runs in O(n) time, which is optimal, since every character has to be examined at least once.
Regular expressions are nice for pattern matching, but a plain loop will get the job done here.
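If you are on Java 8 or later, the same loop can also be written with String.chars(). This is only a sketch of an alternative, not something the answers above use; the class and method names (CharCountStream, count) are made up for illustration.

```java
class CharCountStream {
    // Hypothetical helper: counts how often c occurs in s.
    // String.chars() yields the chars of the string as an IntStream.
    static long count(String s, char c) {
        return s.chars().filter(ch -> ch == c).count();
    }

    public static void main(String[] args) {
        System.out.println(count("aaaab", 'a')); // prints 4
    }
}
```

Note that count() returns a long, so cast it if you need an int. The explicit loop is still the easier one to step through in a debugger.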
Try using Apache Commons' StringUtils:
int count = StringUtils.countMatches("aaaab", "a");
// count = 4
int count = 0;
for (char c : string.toCharArray())
    if (c == 'a')
        count++;
For your String s and character c, try this:
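int occurrences = 0;
int index = s.indexOf(c, 0);
while (index != -1) {
    occurrences++;
    // Resume just past the previous hit; searching from index itself would find the same position forever
    index = s.indexOf(c, index + 1);
}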
Guava's CharMatcher API is quite powerful and concise:
CharMatcher.is('a').countIn("aaaab"); //returns 4
Here is a really short solution without any extra libraries:
String input = "aaaab";
int i = -1, count = 0;
while( (i = input.indexOf( 'a', i + 1 ) ) != -1 ) count++;
System.out.println( count );
Regular expressions aren't particularly good at counting simple things. Think ant+sledgehammer. They are good at busting complex strings up into pieces.
Anyway, here's one solution the OP is interested in - using a Regexp to count a's:
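import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Reggie {
    public static void main(String[] args) {
        // Zero or more non-'a' characters followed by exactly one 'a'
        Pattern pattern = Pattern.compile("[^a]*a");
        Matcher matcher = pattern.matcher("aaabbbaaabbabababaaabbbbba");
        int count = 0;
        // Each successful find() consumes exactly one 'a'
        while (matcher.find()) {
            count++;
        }
        System.out.println(count + " matches");
    }
}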
This is a pretty slow way to do it, as pointed out by others. Worse, it isn't the easiest and certainly isn't the most likely to be bug-free. Be that as it may, the regexp becomes more appropriate as the thing you are looking for gets more complex than a single 'a'. For example, if you wanted to pick dollar amounts out of a long string, then a regexp could be the best answer.
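To make the dollar-amount case concrete, here is a minimal sketch. The pattern, the class name DollarAmounts and the sample sentence are my own assumptions for illustration; a real application would probably also need to handle thousands separators, negative amounts and so on.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DollarAmounts {
    // Assumed format: a '$', one or more digits, optionally a '.' and exactly two digits.
    private static final Pattern DOLLARS = Pattern.compile("\\$\\d+(?:\\.\\d{2})?");

    // Hypothetical helper: returns every dollar amount found in text, in order.
    static List<String> find(String text) {
        List<String> amounts = new ArrayList<>();
        Matcher matcher = DOLLARS.matcher(text);
        while (matcher.find()) {
            amounts.add(matcher.group());
        }
        return amounts;
    }

    public static void main(String[] args) {
        // prints [$12.50, $3, $19.99]
        System.out.println(find("Lunch was $12.50, parking $3 and the book $19.99."));
    }
}
```

Doing the same thing with charAt and hand-written loops is possible, but it gets long and error-prone quickly, which is the point where a regexp starts to pay for itself.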
Now, about the regexp: [^a]*a
The [^a]* means "match zero or more non-'a' characters". This lets us devour non-a crud from the beginning of a string: if the input is 'bbba' then [^a]* will match 'bbb'. It doesn't match the 'a'. Not to worry, the trailing 'a' in the regexp says "match exactly one 'a'". So our regexp says, "match zero or more non-a characters that are followed by an a."
Ok. Now you can read about Pattern and Matcher. In a nutshell, a Pattern is a compiled (read: efficient) regular expression. Compiling a regexp is expensive, so I make mine static so that they only get compiled once. A Matcher applies a Pattern to a string to see whether it matches. A Matcher keeps state that lets it crawl down a string, applying the Pattern repeatedly.
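As a rough sketch of the "compile once" idea: the Pattern lives in a static final field and every call only creates a new Matcher. The names CompiledOnce, A_PATTERN and countA are hypothetical, chosen just for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CompiledOnce {
    // Compiled a single time when the class is loaded, then shared by all calls.
    private static final Pattern A_PATTERN = Pattern.compile("[^a]*a");

    // Hypothetical helper: counts the 'a' characters in input using the shared Pattern.
    static int countA(String input) {
        Matcher matcher = A_PATTERN.matcher(input); // creating a Matcher is cheap compared to compiling
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countA("aaaab")); // prints 4
    }
}
```

A Pattern is immutable and safe to share between threads; a Matcher is not, so create a fresh one per call as above.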
The loop basically says, "matcher, crawl down the string finding me the next occurrence of the pattern. If we find it, increment the counter." Note that the character sequences found by the Matcher aren't just 'a'. It finds sequences like the following: a, bbba, bba, ba, etc. That is, strings that don't contain an 'a' except as their last character.
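If you want to see those sequences for yourself, you can collect matcher.group() on each hit. A small sketch follows; the class name MatchedSequences and the input string are assumptions picked to reproduce the sequences listed above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchedSequences {
    // Hypothetical helper: returns each substring matched by [^a]*a, in order.
    static List<String> sequences(String input) {
        List<String> found = new ArrayList<>();
        Matcher matcher = Pattern.compile("[^a]*a").matcher(input);
        while (matcher.find()) {
            found.add(matcher.group()); // the text consumed by this match
        }
        return found;
    }

    public static void main(String[] args) {
        // prints [a, bbba, bba, ba]
        System.out.println(sequences("abbbabbaba"));
    }
}
```

The size of the returned list is the number of a's in the input, since every match ends in exactly one 'a'.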