I use RegexBuddy when working with regular expressions. I copied the URL-matching regular expression from its library and tested it successfully within RegexBuddy. However, when I copied it as a Java string and pasted it into Java code, it did not work. The following class prints false:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexFoo {

    public static void main(String[] args) {
        String regex = "\\b(https?|ftp|file)://[-A-Z0-9+&@#/%?=~_|!:,.;]*[-A-Z0-9+&@#/%=~_|]";
        String text = "http://google.com";
        System.out.println(IsMatch(text, regex));
    }

    private static boolean IsMatch(String s, String pattern) {
        try {
            Pattern patt = Pattern.compile(pattern);
            Matcher matcher = patt.matcher(s);
            // matches() requires the entire input to match the pattern
            return matcher.matches();
        } catch (RuntimeException e) {
            return false;
        }
    }
}

Does anyone know what I am doing wrong?

+6  A: 

Try the following regex string instead. Your test was probably done in a case-insensitive manner. I have added the lowercase letters as well as a proper start-of-string anchor.

String regex = "^(https?|ftp|file)://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]";
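
A quick way to check the case-sensitivity diagnosis is to run both character classes against the same lowercase input. This is only a sketch: the class name CaseSensitivityCheck and the helper isMatch are made up for illustration and are not part of the question's code.

```java
import java.util.regex.Pattern;

class CaseSensitivityCheck {
    // Regex from the question: the character classes list uppercase letters only.
    static final String UPPER_ONLY =
        "\\b(https?|ftp|file)://[-A-Z0-9+&@#/%?=~_|!:,.;]*[-A-Z0-9+&@#/%=~_|]";
    // Regex from this answer: the character classes list both cases.
    static final String BOTH_CASES =
        "^(https?|ftp|file)://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]";

    // Hypothetical helper: true only if the whole input matches the regex.
    static boolean isMatch(String text, String regex) {
        return Pattern.compile(regex).matcher(text).matches();
    }

    public static void main(String[] args) {
        String text = "http://google.com";
        System.out.println(isMatch(text, UPPER_ONLY)); // false
        System.out.println(isMatch(text, BOTH_CASES)); // true
    }
}
```

With the uppercase-only classes, nothing after "://" can match the lowercase "g", so the whole match fails.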

This works too:

String regex = "\\b(https?|ftp|file)://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]";

Note:

String regex = "<\\b(https?|ftp|file)://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]>"; // matches <http://google.com>

String regex = "<^(https?|ftp|file)://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]>"; // does not match <http://google.com>
TomC
Using your regular expression I get false too.
Sergio del Amo
Did you catch my last edit? I fat-fingered the beginning of the string. I just copied it into Eclipse and I get "true".
TomC
Thanks, man. This is the first time I have seen the comments on Stack Overflow be useful.
Sergio del Amo
No problem. If you're using Eclipse, I like the RegEx Tester plugin, available here: http://www.brosinski.com/regex/
TomC
Thanks for the link. I am using Eclipse.
Sergio del Amo
I cannot edit your answer. Maybe you could edit it yourself and combine my last answer with yours.
Sergio del Amo
I have added your notes.
TomC
+3  A: 

I'll try a standard "Why are you doing it this way?" answer... Do you know about java.net.URL?

URL url = new URL( stringURL );

The above will throw a MalformedURLException if it can't parse the URL.
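
A minimal sketch of that check, wrapped so it returns a boolean instead of throwing. The class name UrlParseCheck and the helper isParsableUrl are hypothetical. Note that java.net.URL mainly verifies that the string has a protocol it knows, so it may accept strings the regex would reject, and on recent JDKs this constructor is marked deprecated.

```java
import java.net.MalformedURLException;
import java.net.URL;

class UrlParseCheck {
    // Hypothetical helper: true if java.net.URL can parse the string.
    static boolean isParsableUrl(String candidate) {
        try {
            new URL(candidate);
            return true;
        } catch (MalformedURLException e) {
            // Thrown, for example, when no protocol is present.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isParsableUrl("http://google.com")); // true
        System.out.println(isParsableUrl("not a url"));         // false
    }
}
```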

Bill James
I have to go down the regular-expression road. What I posted here is as simple as possible to make my question clear. In my program I am using the URL regex inside a more complex regex.
Sergio del Amo
That's cool. I didn't have a better answer regex-wise, so I thought I'd post an alternative. Didn't think I'd get down-ticked for it, though.
Bill James
You are right, maybe the down-tick was a bit too much. The "I'll try the standard" just sounded a bit offensive.
Sergio del Amo
cool (sorry, quick vacation). Ya, definitely wasn't intended that way. I just see that a lot here, and sometimes it even helps.
Bill James
A: 

This works too:

String regex = "\\b(https?|ftp|file)://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]";

Note:

String regex = "<\\b(https?|ftp|file)://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]>"; // matches <http://google.com>

String regex = "<^(https?|ftp|file)://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]>"; // does not match <http://google.com>

So probably the first one is more useful for general use.
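
The difference can be checked with a short test like the one below. It is a sketch only: AnchorInsideBrackets, URL_BODY and isMatch are names invented for illustration. The reason is that ^ only matches at the start of the input (unless multiline mode is on), so it can never succeed after a literal <, whereas \b only needs a word boundary between < and the h.

```java
import java.util.regex.Pattern;

class AnchorInsideBrackets {
    // The URL part shared by both variants, without any leading anchor.
    static final String URL_BODY =
        "(https?|ftp|file)://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]";

    // Hypothetical helper: true only if the whole input matches the regex.
    static boolean isMatch(String text, String regex) {
        return Pattern.compile(regex).matcher(text).matches();
    }

    public static void main(String[] args) {
        String text = "<http://google.com>";
        System.out.println(isMatch(text, "<\\b" + URL_BODY + ">")); // true
        System.out.println(isMatch(text, "<^" + URL_BODY + ">"));   // false
    }
}
```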

Sergio del Amo
A: 

Or you could use

Pattern patt = Pattern.compile(pattern, Pattern.CASE_INSENSITIVE);

to avoid changing the regex to match both uppercase and lowercase.
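
Put together with the regex exactly as pasted in the question, that might look like the sketch below. The class name CaseInsensitiveFlag and the one-argument isMatch helper are illustrative, not from the original code.

```java
import java.util.regex.Pattern;

class CaseInsensitiveFlag {
    // Regex unchanged from the question (uppercase-only character classes).
    static final String REGEX =
        "\\b(https?|ftp|file)://[-A-Z0-9+&@#/%?=~_|!:,.;]*[-A-Z0-9+&@#/%=~_|]";

    // Hypothetical helper: compiles with CASE_INSENSITIVE and requires a full match.
    static boolean isMatch(String text) {
        Pattern patt = Pattern.compile(REGEX, Pattern.CASE_INSENSITIVE);
        return patt.matcher(text).matches();
    }

    public static void main(String[] args) {
        System.out.println(isMatch("http://google.com")); // true
        System.out.println(isMatch("HTTP://GOOGLE.COM")); // true
    }
}
```

The flag applies to the literal scheme names as well as the character classes, which is why the all-uppercase input also matches.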

jm4
A: 

When using regular expressions from RegexBuddy's library, make sure to use the same matching modes in your own code as the regex from the library. If you generate a source code snippet on the Use tab, RegexBuddy will automatically set the correct matching options in the source code snippet. If you copy/paste the regex, you have to do that yourself.

In this case, as others pointed out, you missed the case insensitivity option.
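
If you do copy/paste the regex by hand, one way to keep the matching mode attached to it in Java is the embedded flag (?i) at the start of the pattern. The sketch below is an illustration added here, not something RegexBuddy is claimed to generate; the class name EmbeddedFlag and its helper are hypothetical.

```java
import java.util.regex.Pattern;

class EmbeddedFlag {
    // Same regex as in the question, prefixed with (?i) so that
    // case-insensitive matching travels with the pattern string itself.
    static final String REGEX =
        "(?i)\\b(https?|ftp|file)://[-A-Z0-9+&@#/%?=~_|!:,.;]*[-A-Z0-9+&@#/%=~_|]";

    // Hypothetical helper: Pattern.matches requires the whole input to match.
    static boolean isMatch(String text) {
        return Pattern.matches(REGEX, text);
    }

    public static void main(String[] args) {
        System.out.println(isMatch("http://google.com")); // true
    }
}
```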

Jan Goyvaerts