I'm curious whether there's a simpler way to remove a particular parameter from a URL. What I came up with is below, and it seems a bit verbose. Suggestions for libraries or a more Pythonic version would be appreciated.

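from urllib import urlencode
from urlparse import urlparse, urlunparse

parsed = urlparse(url)
if parsed.query != "":
    # Naive split: assumes every pair looks like key=value with no extra "="
    params = dict([s.split("=") for s in parsed.query.split("&")])
    if "page" in params:
        del params["page"]
    # Pass netloc and params through so the host is not dropped on rebuild
    url = urlunparse((parsed.scheme,
                      parsed.netloc,
                      parsed.path,
                      parsed.params,
                      urlencode(params.items()),
                      parsed.fragment,))
    parsed = urlparse(url)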
+6  A: 

Use urlparse.parse_qsl() (urllib.parse.parse_qsl() in Python 3) to crack the query string. You can then filter out the parameter in one go:

params = [(k,v) for (k,v) in parse_qsl(parsed.query) if k != 'page']
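
For context, here is a minimal sketch of how that filter might fit into a full round trip. It assumes Python 3, where these functions live in urllib.parse; the helper name remove_query_param and the example.com URL are made up for illustration and are not part of the original answer.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def remove_query_param(url, name):
    """Return url with every occurrence of the query parameter `name` removed."""
    parsed = urlparse(url)
    # keep_blank_values=True so pairs such as "flag=" survive the round trip
    kept = [(k, v)
            for (k, v) in parse_qsl(parsed.query, keep_blank_values=True)
            if k != name]
    # ParseResult is a namedtuple, so _replace swaps only the query component
    return urlunparse(parsed._replace(query=urlencode(kept)))


print(remove_query_param("http://example.com/list?page=3&sort=asc#top", "page"))
# prints http://example.com/list?sort=asc#top
```

Because the pairs stay in a list rather than a dict, repeated keys and their order are preserved.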
Marcelo Cantos
+1. Beautiful Python.
Xavier Ho
The url manipulation here seems tortured even with your minor change.
dnolen
@dnolen: I agree. Python's baked-in libraries aren't particularly good for simple URI manipulation. (Did you downvote me? If so, it hardly seems reasonable to downvote someone because of limitations in the language or its libraries.)
Marcelo Cantos
@Marcelo, I downvoted because your answer didn't attempt to address the actual problem, and some people seemed to think otherwise. In any case at 14.3k karma it hardly affects you.
dnolen
The rep isn't the issue here; what irks me is that you think I wasn't trying to help solve your problem. The title question was: "Is there a better way to write this URL Manipulation in Python?" My answer offers precisely that.
Marcelo Cantos
+3  A: 

I've created a small helper class to represent a URL in a structured way:

import urllib, urlparse

class Url(object):
    def __init__(self, url):
        """Construct from a string."""
        self.scheme, self.netloc, self.path, self.params, self.query, self.fragment = urlparse.urlparse(url)
        # urlparse.parse_qsl replaces the deprecated cgi.parse_qsl (Python 2.6+)
        self.args = dict(urlparse.parse_qsl(self.query))

    def __str__(self):
        """Turn back into a URL."""
        self.query = urllib.urlencode(self.args)
        return urlparse.urlunparse((self.scheme, self.netloc, self.path, self.params, self.query, self.fragment))

Then you can do:

u = Url(url)
u.args.pop('page', None)  # like del, but no KeyError if 'page' is absent
url = str(u)

More about this: Web development peeve.
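
For readers on Python 3, a rough port of the same idea might look like the sketch below. It assumes the urllib.parse module in place of the Python 2 modules used above; the example.com URL is illustrative only.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


class Url:
    """Structured view of a URL whose query arguments live in a dict."""

    def __init__(self, url):
        parts = urlparse(url)
        self.scheme, self.netloc, self.path = parts.scheme, parts.netloc, parts.path
        self.params, self.fragment = parts.params, parts.fragment
        # A dict keeps only the last value when a key is repeated
        self.args = dict(parse_qsl(parts.query, keep_blank_values=True))

    def __str__(self):
        """Turn back into a URL, rebuilding the query from self.args."""
        query = urlencode(self.args)
        return urlunparse((self.scheme, self.netloc, self.path,
                           self.params, query, self.fragment))


u = Url("http://example.com/list?page=3&sort=asc")
u.args.pop("page", None)  # no KeyError if "page" is absent
print(u)
# prints http://example.com/list?sort=asc
```

The dict makes single-valued parameters convenient to edit, at the cost of collapsing repeated keys.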

Ned Batchelder
A reasonable compromise. Far more useful than urlparse I would say ;)
dnolen