What would be the most efficient way to retrieve folder1/folder22 in all of the following cases:

http://localhost:8080/folder1/folder22/file.jpg
or
http://domain.com/folder1/folder22/file.jpg
or
http://127.0.0.0.1:8080/folder1/folder22/file.jpg

So there may be one or more folders/sub-folders. Basically, I would like to strip the domain name, the port (if present), and the file name at the end.

Thanks for your time.

+3  A: 

What about the URL class and getPath()?

Maybe it's not the most efficient way, but one of the simplest I think:

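// Requires: import java.io.File; import java.net.URL;
// new URL(...) throws MalformedURLException, so the enclosing method must declare or catch it.
String[] urls = {
  "http://localhost:8080/folder1/folder22/file.jpg",
  "http://domain.com/folder1/folder22/file.jpg",
  "http://127.0.0.0.1:8080/folder1/folder22/file.jpg" };
for (String url : urls)
  // getPath() drops the scheme, host and port; getParent() drops the file name
  System.out.println(new File(new URL(url).getPath()).getParent());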
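
For a version that compiles on its own, here is a minimal sketch of the same idea. It uses java.net.URI and plain string handling instead of File, because File.getParent() uses the platform's separator and would print backslashes on Windows. The class and method names (UrlFolders, folderPath) are made up for illustration and are not part of the answer above.

```java
import java.net.URI;

// Hypothetical helper class, for illustration only.
class UrlFolders {

    // Returns the URL's path without host, port and trailing file name,
    // or "" when there is no folder between the host and the file name.
    static String folderPath(String url) {
        String path = URI.create(url).getPath();   // drops scheme, host and port
        if (path == null) {
            return "";                              // opaque URIs (e.g. mailto:) have no path
        }
        int lastSlash = path.lastIndexOf('/');
        return lastSlash > 0 ? path.substring(0, lastSlash) : "";
    }

    public static void main(String[] args) {
        String[] urls = {
            "http://localhost:8080/folder1/folder22/file.jpg",
            "http://domain.com/folder1/folder22/file.jpg"
        };
        for (String url : urls) {
            System.out.println(folderPath(url));
        }
    }
}
```

Both sample URLs should print /folder1/folder22. URI.create throws IllegalArgumentException on input it cannot parse, so untrusted input probably wants a try/catch around the call.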
splash
Nice one splash
Adnan
I would prefer this over a regex solution because a) it's easier to understand because it's more explicit, b) it's the right tool for the job, and c) it probably handles edge cases a lot nicer than a regex.
Tim Pietzcker
+1  A: 

You should probably use Java's URL parser for this, but if it has to be a regex:

\b(?=/).*(?=/[^/\r\n]*)

will match /folder1/folder22 in all your examples.

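// Requires: import java.util.regex.*;  (Pattern, Matcher, PatternSyntaxException)
String resultString = null;
try {
    Pattern regex = Pattern.compile("\\b(?=/).*(?=/[^/\r\n]*)");
    Matcher regexMatcher = regex.matcher(subjectString);
    if (regexMatcher.find()) {
        resultString = regexMatcher.group();
    }
} catch (PatternSyntaxException ex) {
    // Syntax error in the regular expression
}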

Explanation:

\b: Assert position at a word boundary (this will work before a single slash, but not between slashes or after a :)

(?=/): Assert that the next character is a slash.

.*: Match anything until...

(?=/[^/\r\n]*): ...exactly one last / (and anything else except slashes or newlines) follows.
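
To try the pattern explained above against the URLs from the question, a small self-contained sketch could look like this. The class and method names (RegexFolders, folders) are hypothetical; only the pattern itself comes from this answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper around the pattern above, for illustration only.
class RegexFolders {

    // Compiled once and reused; Pattern instances are immutable and thread-safe.
    private static final Pattern FOLDERS =
            Pattern.compile("\\b(?=/).*(?=/[^/\r\n]*)");

    // Returns the first match of the pattern, or null when nothing matches.
    static String folders(String url) {
        Matcher m = FOLDERS.matcher(url);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String[] urls = {
            "http://localhost:8080/folder1/folder22/file.jpg",
            "http://domain.com/folder1/folder22/file.jpg",
            "http://127.0.0.0.1:8080/folder1/folder22/file.jpg"
        };
        for (String url : urls) {
            System.out.println(folders(url));
        }
    }
}
```

The greedy .* is what makes the final lookahead land on the last slash: it first runs to the end of the line and then backtracks to the nearest position that is followed by a slash.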

Tim Pietzcker
Thanks, Tim Pietzcker, just what I needed.
Adnan
A: 
^.+/([^/]+/[^/]+)/[^/]+$
Alan Haggai Alavi
This only works for exactly two folders, but any number (>=1) is possible according to the question.
Tim Pietzcker
Ah! Did not notice that. Thanks for pointing it out.
Alan Haggai Alavi
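
Following the comment about folder depth, one possible adjustment is to anchor on the scheme and host and let a single group take everything between the host and the file name. The pattern and the names below (AnyDepthFolders, folders) are an illustrative sketch and not the pattern from the answer above; it assumes the input is a full URL of the form scheme://host/folders/file.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
class AnyDepthFolders {

    // scheme "://" host[:port] "/" (one or more folders) "/" file name
    private static final Pattern FOLDERS =
            Pattern.compile("^[^:/]+://[^/]+/(.+)/[^/]+$");

    // Returns the folder part without a leading slash,
    // or null when the URL has no folder before the file name.
    static String folders(String url) {
        Matcher m = FOLDERS.matcher(url);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(folders("http://localhost:8080/folder1/folder22/file.jpg"));
        System.out.println(folders("http://domain.com/a/b/c/file.jpg"));
    }
}
```

Unlike the pattern above, the captured group here has no leading slash, so the first call yields folder1/folder22 rather than /folder1/folder22.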
A: 

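One way to get the last two directories from a URL in PHP is the following:

preg_match("/\/((?:[^\/]+\/){2})[^\/]+$/", $path, $matches);

If it matches, $matches[1] contains the last two directories with a trailing slash (for example folder1/folder22/), whether $path is just a path or a full URL. The capturing group has to wrap the whole repetition: with ([^\/]+\/){2}, $matches[1] would hold only the last repetition (folder22/). Like the previous answer, this captures exactly two directories.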

flaab