I'm having trouble finding a regular expression that matches the following string.

Korben;http://feeds.feedburner.com/KorbensBlog-UpgradeYourMind?format=xml;1

One problem is escaping the question mark. Java's pattern matcher doesn't seem to accept \? as a valid escape sequence, and the expression also fails in the tester at myregexp.com.

Here's what I have so far:

([a-zA-Z0-9])+;http://([a-zA-Z0-9./-]+);[0-9]+

Any suggestions?

Edit: The original intent was to match all URLs that could be found after the first semicolon.

+3  A: 

If you are putting the expression in a string, you need to escape the "\" as well. That is:

String expr = "([a-zA-Z0-9])+;http://([a-zA-Z0-9./\\-\\?]+);[0-9]+";

You also need to escape the "-" if it sits in the middle of a character class ([...]) construct, where it would otherwise be read as a range operator.
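As a rough illustration of the double escaping, here is a minimal sketch that compiles the expression from a Java string literal and tests it against the sample line from the question. The class and method names are hypothetical, and the "=" in the character class is an addition (taken from the follow-up comment in this thread) so that the sample's query string is covered.

```java
import java.util.regex.Pattern;

class FeedLineMatcher {
    // In Java source, "\\?" becomes the two-character regex \? (a literal question mark),
    // and "\\-" becomes \- (a literal hyphen inside the character class).
    // Assumption: "=" is added to the class so "?format=xml" can match.
    static final Pattern FEED_LINE =
            Pattern.compile("([a-zA-Z0-9])+;http://([a-zA-Z0-9./\\-\\?=]+);[0-9]+");

    // True when the whole line has the shape name;http://url;number.
    static boolean isFeedLine(String line) {
        return FEED_LINE.matcher(line).matches();
    }

    public static void main(String[] args) {
        String sample =
                "Korben;http://feeds.feedburner.com/KorbensBlog-UpgradeYourMind?format=xml;1";
        System.out.println(isFeedLine(sample));
    }
}
```

Writing "\?" with a single backslash in the string literal would not compile, because \? is not a valid Java string escape; that is most likely the error described in the question.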

Dean Harding
You can also put the - at the beginning of the character class.
Kibbee
Thanks codeka and everyone else that replied. After some testing, the following should match all URLs: ([a-zA-Z0-9])+;http://([a-zA-Z0-9./\\-\\?=~]+);[0-9]+
James P.
You want that first plus sign *inside* the parentheses: `([a-zA-Z0-9]+)`, not `([a-zA-Z0-9])+`. Also, as @DVK pointed out, you don't need to escape the question mark inside a character class; `[a-zA-Z0-9./?-]` works just fine.
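To make the difference visible, here is a small sketch (the class and helper names are made up for illustration) that prints what group 1 captures in each variant:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupPlacement {
    // Returns what group 1 captured for a match at the start of the input,
    // or null if the regex does not match there.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.lookingAt() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String input = "Korben;";
        // Quantifier outside the group: the group is overwritten on each repetition,
        // so only the last character survives.
        System.out.println(firstGroup("([a-zA-Z0-9])+;", input)); // n
        // Quantifier inside the group: the group captures the whole run.
        System.out.println(firstGroup("([a-zA-Z0-9]+);", input)); // Korben
    }
}
```

Both variants match the same text; they differ only in what `group(1)` returns afterwards.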
Alan Moore
+1  A: 

Inside a character class the question mark needs no escaping: [?] matches "?"
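A quick way to check this in Java, as a minimal sketch with a hypothetical helper name:

```java
import java.util.regex.Pattern;

class LiteralQuestionMark {
    // True when the regex matches the one-character string "?" in full.
    static boolean matchesQuestionMark(String regex) {
        return Pattern.matches(regex, "?");
    }

    public static void main(String[] args) {
        System.out.println(matchesQuestionMark("[?]"));  // true: literal inside a class
        System.out.println(matchesQuestionMark("\\?"));  // true: backslash-escaped form
    }
}
```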

DVK
I'd be tempted to use that in future instead of messing around with backslashes ;)
James P.
+1  A: 

Maybe you need to escape your backslash if your expression is in a Java string literal. Something like "\\?"

ongle
+1  A: 
([a-zA-Z0-9]+);http://([a-zA-Z0-9./-]+)(\?[^;]+);([0-9]+)

Works for me on that RegExp Editor website.
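In Java the same expression might be used roughly like this. It is only a sketch: the class and method names are invented for the example, and the single backslash before the question mark is doubled because the pattern lives in a string literal.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FeedLineParser {
    static final Pattern FEED_LINE = Pattern.compile(
            "([a-zA-Z0-9]+);http://([a-zA-Z0-9./-]+)(\\?[^;]+);([0-9]+)");

    // Returns {name, host and path, query string, trailing number},
    // or null when the line does not match.
    static String[] parse(String line) {
        Matcher m = FEED_LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3), m.group(4) };
    }

    public static void main(String[] args) {
        String[] parts = parse(
                "Korben;http://feeds.feedburner.com/KorbensBlog-UpgradeYourMind?format=xml;1");
        System.out.println(String.join(" | ", parts));
    }
}
```

Note that the query-string group is mandatory in this expression, so a line whose URL has no "?..." part would not match.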

poke