Hi,

url = 'abcdc.com'
print(url.strip('.com'))
Expect: abcdc
Result: abcd

Now I do url.rsplit('.com', 1)[0]

Is there a better way?

+16  A: 

You could do this:

url = 'abcdc.com'
if url.endswith('.com'):
    url = url[:-4]

Or using regular expressions:

import re
url = 'abcdc.com'
url = re.sub(r'\.com$', '', url)
Steef
You beat me to it with a better answer... +1
Daren Thomas
Which would be better: re.sub('\.com$', '', url) or url.rsplit('.com', 1)[0]? Or are both just different ways to solve the problem?
Ramya
I'd vote for the non regex method
Dominic Rodger
Yeah, I think the first example, with the endswith() test, is the better one; the regex one carries some performance penalty (compiling the regex, etc.). I wouldn't go with the rsplit() one, but that's because I don't know exactly what you're trying to achieve. I assume it's removing the .com if and only if it appears at the end of the url? The rsplit solution would give you trouble on domain names like 'www.commercialthingie.co.uk'.
Steef
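To make that last point concrete, here is a small sketch in Python 3 syntax. The helper names remove_com_rsplit and remove_com_endswith are made up for illustration; they just wrap the two approaches discussed above.

```python
def remove_com_rsplit(url):
    # Splits at the LAST occurrence of '.com', wherever it appears.
    return url.rsplit('.com', 1)[0]

def remove_com_endswith(url):
    # Removes '.com' only when it is the final four characters.
    if url.endswith('.com'):
        return url[:-4]
    return url

print(remove_com_rsplit('abcdc.com'))                      # abcdc
print(remove_com_rsplit('www.commercialthingie.co.uk'))    # www
print(remove_com_endswith('www.commercialthingie.co.uk'))  # www.commercialthingie.co.uk
```

The rsplit version cuts the second name at the '.com' inside '.commercialthingie', while the endswith version leaves it alone.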
+1  A: 

How about url[:-4]?

Daren Thomas
+9  A: 

strip treats its argument as a set of characters, not as a suffix, and removes any of them from both ends of the string; in your case it strips ".", "c", "o" and "m".

truppo
It will also remove those characters from the front of the string. If you just want it to remove from the end, use rstrip()
Andre Miller
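A quick sketch of what this looks like in practice (Python 3 syntax; the second sample string is an invented example). Note that rstrip() limits the stripping to the right-hand side, but it still works character by character, so it does not produce 'abcdc' either.

```python
url = 'abcdc.com'

# The argument is a set of characters: '.', 'c', 'o', 'm'.
both_ends = url.strip('.com')
right_only = url.rstrip('.com')
print(both_ends)    # abcd  (the trailing 'c' of 'abcdc' is stripped too)
print(right_only)   # abcd  (same result: only the right side had matching characters)

# With matching characters at the front, the two methods differ.
other = 'com.abcdc.com'
print(other.strip('.com'))   # abcd
print(other.rstrip('.com'))  # com.abcd
```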
A: 

This is a perfect use for regular expressions:

>>> import re
>>> re.match(r"(.*)\.com", "hello.com").group(1)
'hello'
Aaron Maenpaa
You should also add a $ to make sure that you're matching hostnames *ending* in ".com".
Cristian Ciupitu
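As a rough illustration of why the anchor matters, the sketch below compares the pattern from the answer with an anchored variant. The names UNANCHORED, ANCHORED and strip_com are hypothetical, and 'hello.community' is an invented test input.

```python
import re

UNANCHORED = re.compile(r"(.*)\.com")   # pattern from the answer
ANCHORED = re.compile(r"(.*)\.com$")    # same pattern with the suggested '$'

def strip_com(pattern, hostname):
    # Return the captured prefix, or the input unchanged when nothing matches.
    m = pattern.match(hostname)
    return m.group(1) if m else hostname

print(strip_com(UNANCHORED, 'hello.com'))        # hello
print(strip_com(UNANCHORED, 'hello.community'))  # hello
print(strip_com(ANCHORED, 'hello.com'))          # hello
print(strip_com(ANCHORED, 'hello.community'))    # hello.community
```

Without the '$', the pattern happily matches the '.com' at the start of '.community'.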
+1  A: 

If you know it's an extension, then

  url = 'abcdc.com'
  ...
  url.split('.')[0]

This works equally well with abcdc.com and abcdc.[anything], and is more extensible.

JohnMetta
You need to be careful with this, because if the supplied url changes to "www.abcdc.com", url.split('.')[0] is just the "www".
Neil
I didn't feel the need to include any sort of error checking in this snippet, but that's an excellent point, especially given my comment about extensibility.
JohnMetta
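For reference, a minimal sketch of the behaviour Neil describes, next to a variant that splits at the last dot instead. Both helper names are invented for this example.

```python
def first_label(url):
    # Everything before the FIRST dot.
    return url.split('.')[0]

def without_last_label(url):
    # Everything before the LAST dot.
    return url.rsplit('.', 1)[0]

print(first_label('abcdc.com'))             # abcdc
print(first_label('www.abcdc.com'))         # www
print(without_last_label('www.abcdc.com'))  # www.abcdc
```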
A: 

I don't see anything wrong with the way you're doing it with rsplit; it does exactly what you want. It all depends on how generic you want the solution to be. Do you always want to remove .com, or will it sometimes be .org? If that is the case, use one of the other solutions; otherwise, stick with rsplit().

The reason that strip() does not work the way you expect is that it works on each character individually. It removes leading and trailing characters for as long as they appear in the set you passed, from the end AND the front, and stops at the first character that is not in the set. So if your string started with 'c', that would also be gone. You would use rstrip to only strip from the back.

Andre Miller
A: 

Depends on what you know about your url and exactly what you're trying to do. If you know that it will always end in '.com' (or '.net' or '.org') then

 url=url[:-4]

is the quickest solution. If it's a more general URL then you're probably better off looking into the urlparse library that comes with Python.

If, on the other hand, you simply want to remove everything after the final '.' in a string then

url.rsplit('.',1)[0]

will work. Or if you just want everything up to the first '.' then try

url.split('.',1)[0]
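
For the more general URL case, here is a minimal sketch of the urlparse route. It assumes Python 3, where the urlparse module mentioned above lives in urllib.parse; the helper name hostname_of and the sample URL are made up for illustration.

```python
from urllib.parse import urlparse

def hostname_of(url):
    # urlparse only fills in the host when the URL has a scheme (or starts with '//').
    return urlparse(url).hostname

print(hostname_of('http://www.abcdc.com/index.html'))  # www.abcdc.com
print(hostname_of('abcdc.com'))                         # None
```

A bare 'abcdc.com' is parsed as a path rather than a host, so this only helps when the input really is a full URL.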
+1  A: 

Another one

url = '.'.join(url.split('.')[0:-1])
Nick D
If the first index of a slice is 0 you can leave it out and it will be implied. So your example could have [:-1] instead of [0:-1]. (Just a pet peeve of mine, like when people say range(0, 10) instead of range(10)).
Kiv
You are right, but in this case, if the url has no dot, it will not work.
Nick D
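A short sketch of the no-dot case, together with one possible way around it using str.rpartition. This is only an alternative under the assumption that a string without a dot should come back unchanged; both function names are invented.

```python
def drop_last_label_join(url):
    # The approach from the answer: rejoin everything but the last piece.
    return '.'.join(url.split('.')[:-1])

def drop_last_label_rpartition(url):
    # rpartition returns ('', '', url) when there is no dot at all.
    head, sep, _tail = url.rpartition('.')
    return head if sep else url

print(drop_last_label_join('www.abcdc.com'))     # www.abcdc
print(repr(drop_last_label_join('localhost')))   # ''
print(drop_last_label_rpartition('localhost'))   # localhost
```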
+3  A: 
def strip_end(text, suffix):
    # Guard against an empty suffix: text[:-0] would return ''.
    if suffix and text.endswith(suffix):
        return text[:-len(suffix)]
    return text
yairchu
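A usage sketch for a helper of this kind, in Python 3 syntax. The function is restated here with an explicit guard for an empty suffix so that the block runs on its own (text[:-0] is an empty string, so the guard matters). On Python 3.9 and later, the built-in str.removesuffix() does the same job.

```python
def strip_end(text, suffix):
    # Only cut when the suffix is non-empty and really is at the end.
    if suffix and text.endswith(suffix):
        return text[:-len(suffix)]
    return text

print(strip_end('abcdc.com', '.com'))             # abcdc
print(strip_end('www.computerhope.com', '.com'))  # www.computerhope
print(strip_end('abcdc.org', '.com'))             # abcdc.org
print(strip_end('abcdc.com', ''))                 # abcdc.com

# Built-in equivalent, available from Python 3.9 onwards.
if hasattr(str, 'removesuffix'):
    print('abcdc.com'.removesuffix('.com'))       # abcdc
```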
A: 

Actually the simplest way would be to use 'replace':

url = 'abcdc.com'
print(url.replace('.com', ''))
Charles Collis
That will also mangle a url like `www.computerhope.com`, because replace() removes every occurrence of '.com', not just the one at the end. Do a check with `endswith()` and you should be fine.
ghostdog74
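To show what goes wrong, here is a minimal sketch comparing plain replace() with a version that first checks endswith(). The function names are hypothetical.

```python
def remove_com_replace(url):
    # Removes EVERY occurrence of '.com', not just a trailing one.
    return url.replace('.com', '')

def remove_com_checked(url):
    # Only touches the string when '.com' is the actual ending.
    if url.endswith('.com'):
        return url[:-len('.com')]
    return url

print(remove_com_replace('abcdc.com'))             # abcdc
print(remove_com_replace('www.computerhope.com'))  # wwwputerhope
print(remove_com_checked('www.computerhope.com'))  # www.computerhope
```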