I have a set of URLs:

/products

/categories

/customers

Now say a customer is named john, and I want to let john reach his own account page through a shorter URL:

before : /customers/john
after  : /john

(suppose customer names are unique)

I'm trying to figure out a proper regex dispatcher so that all customers can have this feature:

/marry
/james
/tony-the-red-beard

Here is what I have so far (in PHP):

'/^\/([^(products|categories|admin)].+)$/' => /customers/$1

This doesn't seem to work. Can anybody help me?

A: 

You're trying to use a negated character class the wrong way. A negated character class says 'do not match any of the contained characters'. What you want to say is 'do not match if the stuff I specified here is present'. To do that you have to be a bit more creative; you probably need a negative lookbehind. I'm not 100% sure about PHP's regex engine, but something similar to this should work.

/^\/(?<!(?:products|categories|admin))(.+)$/

So, the negative lookbehind (?<! ... ) says: don't match the .+ if products, categories, or admin precedes it. The alternatives sit inside a non-capturing group (?: ... ).

Check out Regular Expression Advanced Syntax Reference for extra help.
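To make the character-class point concrete, here is a small check. It is only a sketch, written with Python's re module rather than PHP (the syntax for these constructs is the same), and the sample paths are made up for illustration. The second pattern is an adaptation: Python insists on fixed-width lookbehind, so the alternation is split into three separate lookbehinds. It is there to show what happens when a lookbehind is placed right after the leading slash.

```python
import re

# The asker's pattern, minus PHP's / delimiters. The bracketed part is a
# character class: it excludes single characters, not whole words.
original = re.compile(r"^/([^(products|categories|admin)].+)$")

# A lookbehind placed right after the leading slash (adapted for Python,
# which needs fixed-width lookbehind). Only "/" precedes that position,
# so none of the three words can ever be found behind it.
misplaced = re.compile(r"^/(?<!products)(?<!categories)(?<!admin)(.+)$")

for path in ["/john", "/marry", "/tony-the-red-beard", "/products"]:
    print(path, bool(original.match(path)), bool(misplaced.match(path)))
```

With the character class, /marry and /tony-the-red-beard are rejected simply because 'm' and 't' appear somewhere in the listed words, while the lookbehind variant lets /products through.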

Qberticus
Er, you don't need to go as fancy as lookbehind (and in fact, I don't think that will work, since you haven't consumed those characters yet), and negative assertions are always non-capturing, because, well, they're negative; what would they capture?
Brian Campbell
Thanks to Brian's hint, here is a possible fix to your statement: /^\/(.+)(?<!\/(?:products|categories|admin))$/ This says "match '/anything' that doesn't come after /products, /categories, /admin". Correct me if I'm wrong.
Shawn
@Qberticus: same effect, but using an assertion is more intuitive.
Shawn
I always include non-capturing groups regardless. It's explicit and it's there in case something changes. Lookbehind isn't fancy; (?! vs (?<! is a matter of semantics with regard to this regex. One says don't match if the / is followed by, the other says don't match if the .+ is preceded by. I wonder which one is faster with the PHP engine though. Probably lookahead.
Qberticus
+2  A: 

What you need here is a negative lookahead assertion. What you want to say is "I want to match any string of characters, except for these particular strings." An assertion in a regex can match against a string, but it doesn't consume any characters, allowing those characters to be matched by the rest of your regex. You can specify a negative assertion by wrapping a pattern in (?! and ).

'/^\/(?!products|categories|admin)(.+)$/'

Note that you might want the following instead, if you don't allow customer names to include slashes:

'/^\/(?!products|categories|admin)([^\/]+)$/'
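A rough way to try the lookahead patterns out, again sketched with Python's re module instead of PHP, so the / delimiters are dropped. The helper customer_path and the sample paths are hypothetical, and the third pattern is an assumption added for comparison, not part of the answer.

```python
import re

# Lookahead patterns from the answer, written without PHP's / delimiters.
any_rest = re.compile(r"^/(?!products|categories|admin)(.+)$")
no_slash = re.compile(r"^/(?!products|categories|admin)([^/]+)$")

# Assumed variant: only block a reserved word when it is a whole path
# segment, so a name that merely starts with one is still accepted.
whole_word = re.compile(r"^/(?!(?:products|categories|admin)(?:/|$))(.+)$")

def customer_path(pattern, path):
    """Return the rewritten path, or None when the URL is left alone."""
    match = pattern.match(path)
    return "/customers/" + match.group(1) if match else None

print(customer_path(any_rest, "/john"))             # /customers/john
print(customer_path(any_rest, "/products"))         # None
print(customer_path(any_rest, "/john/orders"))      # /customers/john/orders
print(customer_path(no_slash, "/john/orders"))      # None
print(customer_path(any_rest, "/administrator"))    # None
print(customer_path(whole_word, "/administrator"))  # /customers/administrator
```

The last two lines show a side effect worth knowing about: the plain lookahead also rejects any customer whose name begins with a reserved word. Whether that matters depends on which names you allow.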
Brian Campbell
+1  A: 

This is entirely the wrong way to go about solving the problem, but it is possible to express fixed negative lookaheads without using negative lookaheads. Extra spacing for clarity:

^ (
( $ | [^/] |
  / ( $ | [^pc] |
    p ( $ | [^r] |
      r ( $ | [^o] |
        o ( $ | [^d] |
          d ( $ | [^u] |
            u ( $ | [^c] |
              c ( $ | [^t] |
                t ( $ | [^s] ))))))) |
    c ( $ | [^au] |
      a ( $ | [^t] |
        t ( $ | [^e] |
          e ( $ | [^g] |
            g ( $ | [^o] |
              o ( $ | [^r] |
                r ( $ | [^i] |
                  i ( $ | [^e] |
                    e ( $ | [^s] )))))))) |
      u ( $ | [^s] |
        s ( $ | [^t] |
          t ( $ | [^o] |
            o ( $ | [^m] |
              m ( $ | [^e] |
                e ( $ | [^r] |
                  r ( $ | [^s] ))))))))))
) .* ) $
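Nobody would want to write that by hand, so here is a sketch of how such a pattern could be generated. It is in Python, the function name without_lookahead is made up for illustration, and it assumes every forbidden prefix is a non-empty string. The idea follows the nesting above: at each step the string either ends, or continues with a character that leaves the set of forbidden prefixes, or follows a prefix that is not yet complete.

```python
import re

def without_lookahead(words):
    """Build a lookahead-free pattern for strings that start with none of `words`."""
    # Build a trie of the forbidden prefixes; the "" key marks the end of a word.
    trie = {}
    for word in words:
        node = trie
        for ch in word:
            node = node.setdefault(ch, {})
        node[""] = {}

    def emit(node):
        chars = [ch for ch in node if ch != ""]
        # Either the string ends here, or the next character leaves the trie...
        options = ["$", "[^" + "".join(re.escape(ch) for ch in chars) + "]"]
        for ch in chars:
            child = node[ch]
            # ...or it follows a branch that has not yet completed a forbidden word.
            if "" not in child:
                options.append(re.escape(ch) + emit(child))
        return "(?:" + "|".join(options) + ")"

    return "^" + emit(trie) + ".*$"

pattern = re.compile(without_lookahead(["/products", "/categories", "/customers"]))
print(bool(pattern.match("/john")))      # True
print(bool(pattern.match("/products")))  # False
```

The output should agree with ^(?!/products|/categories|/customers).*$ on ordinary single-line paths, which is an easy way to sanity-check it.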
ephemient
Wow. I'm not sure if I should be impressed or disturbed. Yes, it is possible to do negative lookahead in regular expressions, even if there is no syntactic sugar for them, just as it is possible to match a whole range of characters without character classes. I'm not sure why you would want to write `(a|b|c|d|e|f|g|h|i|j|k|l|m|n|o|p|q|r|s|t|u|v|w|x|y|z|A|B|C|D|E|F|G|H|I|J|K|L|M|N|O|P|Q|R|S|T|U|V|W|X|Y|Z)` instead of `[a-zA-Z]`, though, and likewise I'm not sure why you'd want to avoid negative lookahead assertions `(?!)`.
Brian Campbell
Nah... I'm simply scared. I believe there's a better solution. Besides, it's not the end of the world.
Shawn
Just for fun, really, since somebody else already posted the correct answer.
ephemient
@Brian Campbell: Believe it or not, there are regular expression flavors that do not support lookaheads. For those, this is the only approach. Looks scary, but if all else fails… +1
Tomalak