I want to extract "the-game" using a regex from URLs like:

  • http://www.somesite.com.domain.webdev.domain.com/en/the-game/another-one/another-one/another-one/
  • http://www.somesite.com.domain.webdev.domain.com/en/the-game/another-one/another-one/
  • http://www.somesite.com.domain.webdev.domain.com/en/the-game/another-one/
+1  A: 
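// subject is assumed to hold the URL string.
var result;
// Skip four "text followed by a slash" groups, then capture up to the next slash.
var myregexp = /^(?:[^\/]*\/){4}([^\/]+)/;
var match = myregexp.exec(subject);
if (match != null) {
    result = match[1];
} else {
    result = "";
}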

This matches whatever lies between the fourth and fifth slash and stores it in the variable result.
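
As a quick check, the pattern from this answer can be wrapped in a small function and run against the three URLs from the question. This is only a sketch; the function name textAfterFourthSlash is made up for illustration.

```javascript
// Hypothetical helper name; the regex is the one given in this answer.
function textAfterFourthSlash(subject) {
    // Four "non-slash text + slash" groups, then capture the next run of non-slash text.
    var match = /^(?:[^\/]*\/){4}([^\/]+)/.exec(subject);
    return match !== null ? match[1] : "";
}

var base = "http://www.somesite.com.domain.webdev.domain.com/en/the-game/";
var urls = [
    base + "another-one/another-one/another-one/",
    base + "another-one/another-one/",
    base + "another-one/"
];

urls.forEach(function (url) {
    console.log(textAfterFourthSlash(url)); // prints "the-game" for each URL
});
```

A string with fewer than four slashes yields the empty string, matching the else branch of the original snippet.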

Tim Pietzcker
cute... I was thinking that, but I didn't write it as an answer
xyld
Reading from the left, I am just looking for whatever text lies between the 4th and 5th slash (/).
Ah, you beat me on the update! Amazing how far a little clarification of requirements goes :)
BenV
+1  A: 

Which parts of the URL can vary and which parts are constant? The following regex will always match whatever lies between "/en/" and the next slash, which is "the-game" in your example.

(?<=/en/).*?(?=/)

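This one will match the contents of the second pair of slashes in the path of any URL containing "webdev", assuming the first path segment is a 2- or 3-character language code. Note that it relies on a variable-length lookbehind, which some regex engines (Python's re module, for example) do not support; .NET does.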

(?<=.*?webdev.*?/.{2,3}/).*?(?=/)

Hopefully you can tweak these examples to accomplish what you're looking for.
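
For illustration, here is a minimal sketch of the first pattern in JavaScript. It assumes an engine with lookbehind support (ES2018 or later, such as a recent Node.js); the function name segmentAfterEn is hypothetical.

```javascript
// Assumes a JavaScript engine that supports lookbehind (ES2018+).
// Hypothetical helper name, not part of any library.
function segmentAfterEn(url) {
    // Lookbehind anchors the match right after "/en/";
    // the lazy .*? then stops just before the next slash.
    var match = /(?<=\/en\/).*?(?=\/)/.exec(url);
    return match !== null ? match[0] : "";
}

console.log(segmentAfterEn(
    "http://www.somesite.com.domain.webdev.domain.com/en/the-game/another-one/"
)); // prints "the-game"
```

Because of the lookahead, a URL that ends right after the segment with no trailing slash (for example ".../en/the-game") produces no match, and the sketch returns an empty string.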

BenV
Reading from the left, I am just looking for whatever text lies between the 4th and 5th slash (/).
A: 

You should probably use some kind of URL parsing library rather than resorting to regex.

In Python:

# Python 3; in Python 2 the import is "from urlparse import urlparse".
from urllib.parse import urlparse
url = urlparse('http://www.somesite.com.domain.webdev.domain.com/en/the-game/another-one/another-one/another-one/')
print(url.path)

Which would yield:

/en/the-game/another-one/another-one/another-one/

From there, you can do simple things like stripping /en/ from the beginning of the path. Otherwise, you're bound to do something wrong with a regular expression. Don't reinvent the wheel!
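
The same approach can be sketched in JavaScript, the language of the first answer, assuming the standard URL class is available (modern browsers and Node.js). The helper name pathSegment is made up for illustration, and picking the segment by position rather than stripping "/en/" is just one way to do it.

```javascript
// Hypothetical helper: returns the path segment at a 0-based index, or "" if absent.
function pathSegment(urlString, index) {
    // For the question's URLs, pathname is e.g. "/en/the-game/another-one/".
    var path = new URL(urlString).pathname;
    // Splitting on "/" leaves empty strings at both ends; drop them.
    var segments = path.split("/").filter(function (s) { return s !== ""; });
    return index < segments.length ? segments[index] : "";
}

console.log(pathSegment(
    "http://www.somesite.com.domain.webdev.domain.com/en/the-game/another-one/",
    1
)); // prints "the-game"
```

Index 0 is the language code ("en") and index 1 is the segment the question asks for.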

xyld