Hi folks!

I'm trying to extract the second half of strings that follow this template:

foo/BAR 

'BAR' is the one I'm trying to retrieve. I tried with something like:

\b(foo)/([a-zA-Z]+)

This works fine, but this also matches http://foo/BAR - which I don't want.

I also tried

\\s(foo)/([a-zA-Z]+)

but this doesn't match when the line starts with foo/BAR. (I'm using java.util.regex)

I'm a regex newbie, any help here is much appreciated!

Thanks, Raj

+1  A: 

How about

^(foo)/([a-zA-Z]+)

or

(?<!http://)(foo)/([a-zA-Z]+)
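A quick way to check what the lookbehind version accepts and rejects is a small test like the one below. This is only a sketch: the class and method names (`LookbehindDemo`, `firstBar`) and the sample inputs are made up for illustration, and only the regex itself comes from this answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LookbehindDemo {
    // Regex from the answer: "foo/" must not be immediately preceded by "http://".
    private static final Pattern NOT_AFTER_HTTP =
            Pattern.compile("(?<!http://)(foo)/([a-zA-Z]+)");

    // Hypothetical helper: returns the BAR part of the first match, or null if none.
    public static String firstBar(String input) {
        Matcher m = NOT_AFTER_HTTP.matcher(input);
        return m.find() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        System.out.println(firstBar("foo/BAR"));          // prints BAR
        System.out.println(firstBar("http://foo/BAR"));   // prints null
        System.out.println(firstBar("https://foo/BAR"));  // prints BAR
    }
}
```

The last case shows the limit of this approach: the lookbehind only rules out the literal prefix `http://`, so other prefixes such as `https://` still get through.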
S.Mark
Are you suggesting two different regexes and ORing them? Due to a restriction, my regex is placed in an external file, and I can specify only one combined regex.
Raj
It's two different regexes.
S.Mark
+1  A: 

\b is a word boundary, ^ is a start of line marker

^foo/(\w+)
enbuyukfener
start of line is not the only place the author expects foo/BAR to be
Antony Hatchkins
\w also matches digits and the underscore (and, in some regex flavors, national characters), which is presumably not what the author expects
Antony Hatchkins
Regarding where he wants to match it, I might have misread the question. When he said he doesn't want to match `http://foo/BAR`, I couldn't come to another conclusion besides wanting it only at the start of the line. Regarding \w, it was a suggestion (in hindsight, I should have mentioned it); he seems to have the know-how to revert it if need be. I'm thinking you and Jordan understood it right by using this at the start: `(?:^|\s)`
enbuyukfener
+4  A: 
(^|\s)foo/([a-zA-Z]+)
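For reference, here is a minimal sketch of how this pattern might be used with java.util.regex. The class and method names (`FooBarFinder`, `findBars`) and the sample input are assumptions made for illustration; only the regex comes from this answer. Note that the backslash has to be doubled in a Java string literal, while a pattern read verbatim from a plain text file would use a single `\s`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FooBarFinder {
    // Regex from the answer: "foo/" must be at the start of the input or follow whitespace.
    private static final Pattern FOO_BAR = Pattern.compile("(^|\\s)foo/([a-zA-Z]+)");

    // Hypothetical helper: collects the BAR part of every match in the input.
    public static List<String> findBars(String input) {
        List<String> bars = new ArrayList<>();
        Matcher m = FOO_BAR.matcher(input);
        while (m.find()) {
            // Group 1 is the leading anchor or whitespace, group 2 is the BAR part.
            bars.add(m.group(2));
        }
        return bars;
    }

    public static void main(String[] args) {
        // The URL is skipped because "foo" is preceded by '/', not whitespace.
        System.out.println(findBars("foo/BAR http://foo/BAZ see foo/Qux")); // prints [BAR, Qux]
    }
}
```

Without the MULTILINE flag, `^` only matches at the very start of the input, but a `foo/BAR` at the start of a later line is still found because `\s` matches the preceding newline.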
Antony Hatchkins
maybe `(?:^|\s)foo/(a-zA-Z)+`? `\b` isn't working well for the OP here.
Kobi
`\b` isn't working for OP only in the beginning of line.
Antony Hatchkins
`\b` allows special characters before foo, as in `http://foo/BAR`. Space, which was your original choice, seems right here. Also, the start of the line is matched by `\b`, so `^|` is redundant.
Kobi
aha, fixed that
Antony Hatchkins
This demonstrates nicely that you can, mostly, use logical operators with metacharacters and character escapes. That allows for many more possibilities with regexes.
Confusion
(a-zA-Z) should really be [a-zA-Z]
Jordan Stewart
Antony, this is very close to what I want. Hence accepting this. Thanks!
Raj
Jordan Stewart: thanks, fixed
Antony Hatchkins
+3  A: 

If you define a full "foo/BAR" token as both preceded and followed by whitespace (or the beginning/end of the line)

I.e. it would find "abc", "XyZ", and "def" in

"foo/abc 123 hhh foo/XyZ http://foo/BAR foo foo/ foo/ghi% foo/def"

then you want

(?:^|\s)foo/([a-zA-Z]+)(?:$|\s)
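To check this against the sample string above, something like the following sketch could be used. The class name `TokenFinder`, the helper `find`, and the constant names are made up for illustration. The second pattern, which swaps the trailing group for a lookahead, is not part of the original answer; it is a possible variant, included because the original pattern consumes the trailing whitespace and therefore skips a token that directly follows another one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TokenFinder {
    // Regex from the answer: token delimited by whitespace or start/end of input.
    public static final Pattern STRICT =
            Pattern.compile("(?:^|\\s)foo/([a-zA-Z]+)(?:$|\\s)");

    // Assumed variant (not from the answer): the trailing delimiter is only
    // checked with a lookahead, so it stays available for the next match.
    public static final Pattern LOOKAHEAD =
            Pattern.compile("(?:^|\\s)foo/([a-zA-Z]+)(?=$|\\s)");

    // Hypothetical helper: returns group 1 of every match of p in input.
    public static List<String> find(Pattern p, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = p.matcher(input);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        String sample = "foo/abc 123 hhh foo/XyZ http://foo/BAR foo foo/ foo/ghi% foo/def";
        System.out.println(find(STRICT, sample));               // prints [abc, XyZ, def]
        // Two adjacent tokens: the strict pattern consumes the separating space.
        System.out.println(find(STRICT, "foo/abc foo/def"));    // prints [abc]
        System.out.println(find(LOOKAHEAD, "foo/abc foo/def")); // prints [abc, def]
    }
}
```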
Jordan Stewart