ansaurus

Question

Java regular expression to identify strings with more digits than non-digits

Answer 1

+12 A:

That's not a regular language, and thus it cannot be captured by a vanilla regex. It may be possible anyway, but it will almost certainly be easier not to use a regex:

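public static boolean moreDigitsThanNonDigits(String s) {
    // Running difference: +1 for every digit, -1 for every non-digit.
    int diff = 0;
    for(int i = 0; i < s.length(); ++i) {
        // Note: Character.isDigit accepts any Unicode digit, not only 0-9.
        if(Character.isDigit(s.charAt(i))) ++diff;
        else --diff;
    }
    return diff > 0;
}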

Dave 2009-05-25 19:06:13

Answer 2

A:

I'm not sure that using regular expressions would be the best solution here.

therefromhere 2009-05-25 19:09:27

I do not insist on using a regular expression; I just need to identify those strings somehow.

2009-05-25 19:12:36

Answer 3

+9 A:

You won't be able to write a regexp that does this. But you already said you're using Java, so why not mix in a little code?

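public boolean moreDigitsThanNonDigits(String input) {
    // replaceAll takes a regex; replace would search for the literal text "[0-9]".
    String nonDigits = input.replaceAll("[0-9]", "");
    // total > 2 * nonDigits is equivalent to digits > nonDigits.
    return input.length() > (nonDigits.length() * 2);
}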

waxwing 2009-05-25 19:10:22

Hi, can you please clarify something for me: by using the java.util.regex package, will I be able to search for any kind of pattern in text files, or in files of any format?

harigm 2010-02-21 03:07:37

Answer 4

A:

A regex alone can't do this (regexes don't count anything), but if you want to use them, just do two replacements: one that strips out all the digits and one that keeps only the digits. Then compare the lengths of the two resulting strings.
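
The two-replacement idea can be sketched roughly as follows. This is only an illustration: the class name DigitCompare is made up for the example, and it assumes "digit" means the ASCII characters 0-9.

```java
class DigitCompare {
    // Hypothetical helper illustrating the two-replacement approach.
    // Assumption: only ASCII 0-9 count as digits.
    static boolean moreDigitsThanNonDigits(String s) {
        String digitsOnly = s.replaceAll("[^0-9]", "");    // keep only the digits
        String nonDigitsOnly = s.replaceAll("[0-9]", "");  // strip out all the digits
        return digitsOnly.length() > nonDigitsOnly.length();
    }

    public static void main(String[] args) {
        System.out.println(moreDigitsThanNonDigits("a1b22")); // 3 digits, 2 non-digits
        System.out.println(moreDigitsThanNonDigits("12ab"));  // 2 digits, 2 non-digits
    }
}
```

Each call to replaceAll scans the whole string, so this makes two passes where the loop in Dave's answer makes one.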

Of course, I'd rather use Dave's answer.

Javier 2009-05-25 19:11:59

Hi, can you please clarify something for me: by using the java.util.regex package, will I be able to search for any kind of pattern in text files, or in files of any format?

harigm 2010-02-21 03:07:58

Since regular expressions are used for matching patterns in a string, my question is whether Google uses the same pattern-matching concept to search across all files.

harigm 2010-02-21 03:09:15

Answer 5

+2 A:

Regular expressions are conceptually unable to perform such a task. They are equivalent to regular languages, i.e. to finite automata. They have no notion of memory (or a stack), so they cannot count the occurrences of symbols without bound. The next step up in expressiveness is pushdown automata (stack machines), which correspond to context-free grammars. Rather than writing such a grammar for this task, using a method like the moreDigitsThanNonDigits above would be appropriate.

The MYYN 2009-05-25 20:00:44

Perl- (and Java-) style regular expressions are actually more powerful than regular languages, because of the "\number" syntax for backreferences to a captured group. They can recognize languages that are not regular. For example, the language of any string repeated twice (which is not regular, nor even context-free) can be recognized by "(.*)\1".
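
As a rough illustration of the backreference point, here is a minimal Java sketch. The class and method names are made up for the example, and the DOTALL flag is an added assumption so that "." also matches line breaks.

```java
import java.util.regex.Pattern;

class BackreferenceDemo {
    // \1 must match exactly the text that group 1 captured.
    // Assumption: DOTALL is added so '.' also matches line terminators.
    private static final Pattern DOUBLED =
            Pattern.compile("(.*)\\1", Pattern.DOTALL);

    // True when the whole string is some string w followed by w again.
    static boolean isDoubled(String s) {
        return DOUBLED.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isDoubled("abcabc")); // "abc" followed by "abc"
        System.out.println(isDoubled("abcab"));  // no split into two equal halves
    }
}
```

The matches() call requires the whole input to fit the pattern, so the engine has to find a split where the second half repeats the first exactly.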

newacct 2009-05-25 21:09:43

Thanks for pointing this out! Your example would be "(.*)\1\1", right? But length comparisons are still not possible, I would assume.

The MYYN 2009-05-25 21:27:06
