ansaurus

Question

Python Find Question

Answer 1

A:

You can remove the slash at the end of your string before processing it:

if url[-1] == '/':
    url = url[:-1]
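
A slightly more defensive variant, as a sketch only: indexing with url[-1] raises IndexError on an empty string, whereas str.endswith simply returns False. The function name is made up for illustration, and the code assumes Python 3.

```python
def strip_one_trailing_slash(url):
    # endswith() is safe on an empty string, unlike url[-1]
    if url.endswith('/'):
        return url[:-1]
    return url

print(strip_one_trailing_slash("http://www.example.com/test.php/"))
```

Like the snippet above, this removes at most one slash; a URL ending in "//" would still end in "/".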

unbeknown 2008-10-23 11:23:48

Answer 2

A:

You could use

url = url.rstrip("/")
print url[url.rfind("/") + 1:]

Tim Pietzcker 2008-10-23 11:28:54

Answer 3

+1 A:

URLs with a slash at the end are technically still path definitions and indicate that the index file is to be read. If you actually have one that ends in test.php/, I would consider that an error. In any case, you can strip the / from the end before running your code as follows:

url = url.rstrip('/')

Steve Moyer 2008-10-23 11:31:12

Deestan 2008-10-23 15:25:35

Actually it will ... they both resolve to the same path and are redirected to http://www.reddit.com/r/gaming/. As was pointed out elsewhere, query strings are a completely different problem (which the OP didn't ask about).

Steve Moyer 2008-10-23 21:41:35

Answer 4

A:

There is a module called urlparse that will parse the URL for you, but it still doesn't remove the / at the end, so one of the above will be the best option.

Andrew Cox 2008-10-23 11:32:14

Answer 5

+8 A:

Just removing the slash at the end won't work, because you could have a URL that looks like this:

http://www.google.com/test.php?filepath=tests/hey.xml

...in which case you'll get back "hey.xml". Instead of manually checking for this, you can use urlparse to get rid of the parameters, then do the check other people suggested:

from urlparse import urlparse
url = "http://www.google.com/test.php?something=heyharr/sir/a.txt"
f = urlparse(url)[2].rstrip("/")
print f[f.rfind("/")+1:]
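
To make the difference concrete, here is a rough side-by-side sketch using the first example URL above. It assumes Python 3, where urlparse lives in urllib.parse; the two function names are hypothetical and only there for the comparison.

```python
from urllib.parse import urlparse

def last_segment_naive(url):
    # Works on the raw string, so slashes inside the query string count too
    trimmed = url.rstrip("/")
    return trimmed[trimmed.rfind("/") + 1:]

def last_segment(url):
    # Only the path component is examined; query and fragment are ignored
    path = urlparse(url).path.rstrip("/")
    return path[path.rfind("/") + 1:]

url = "http://www.google.com/test.php?filepath=tests/hey.xml"
print(last_segment_naive(url))  # hey.xml
print(last_segment(url))        # test.php
```

For a URL with no path at all, last_segment returns an empty string, so callers may want to handle that case separately.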

Claudiu 2008-10-23 11:32:46

Answer 6

A:

Just for fun, you can use a Regexp:

import re
print re.search('/([^/]+)/?$', url).group(1)
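
If you do go the regexp route, a slightly more careful sketch might precompile the pattern and guard against the no-match case, since re.search returns None when nothing matches and calling .group(1) on that raises AttributeError. This assumes Python 3; LAST_SEGMENT and last_segment_re are illustrative names, not anything from the question.

```python
import re

# Last run of non-slash characters, optionally followed by one trailing slash
LAST_SEGMENT = re.compile(r'/([^/]+)/?$')

def last_segment_re(url):
    match = LAST_SEGMENT.search(url)
    # search() returns None when nothing matches, e.g. for a string with no slash
    return match.group(1) if match else None

print(last_segment_re("http://www.example.com/test.php/"))
```

Note that this still looks at the whole string, so a query string containing slashes would trip it up just like the plain string methods.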

gimel 2008-10-23 11:38:13

Python isn't Perl; you don't always need to reach for regexps! For simple processing, the built-in string methods are likely to be more readable and faster. (In this case, on my machine, regexps were 60% slower, or 160% slower if not precompiled. Not that it probably matters for such simple code, but still.)

bobince 2008-10-23 11:56:49

I know. I also support the urlparse suggestion. Since no one brought up regexps, I thought I'd mention the possibility.

gimel 2008-10-23 13:12:21

Answer 7

+4 A:

Use [r]strip to remove trailing slashes:

url.rstrip('/').rsplit('/', 1)[-1]

If a wider range of URLs is possible, including URLs with ?queries, #anchors, or no path at all, do it properly with urlparse:

import urlparse

def last_path_segment(url):  # helper name is illustrative
    path = urlparse.urlparse(url).path
    return path.rstrip('/').rsplit('/', 1)[-1] or '(root path)'

bobince 2008-10-23 11:42:52

+1 for urlparse plus rstrip solution.

S.Lott 2008-10-23 13:03:04

Answer 8

A:

filter(None, url.split('/'))[-1]

(But urlparse is probably more readable, even if more verbose.)
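
One caveat if you are on Python 3: filter() returns an iterator there, so indexing its result with [-1] raises TypeError. A rough equivalent using a list comprehension could look like the following; the function name is hypothetical, and returning None for an input with no non-empty pieces is my own choice, not part of the one-liner above.

```python
def last_nonempty_segment(url):
    # split('/') yields '' around doubled or trailing slashes; drop those pieces
    parts = [part for part in url.split('/') if part]
    return parts[-1] if parts else None

print(last_nonempty_segment("http://www.example.com/test.php/"))
```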

fivebells 2008-10-23 13:10:34
