ansaurus

Question

Answer 1

+16 A:

You could do this:

url = 'abcdc.com'
if url.endswith('.com'):
    url = url[:-4]

Or using regular expressions:

import re
url = 'abcdc.com'
url = re.sub(r'\.com$', '', url)

Steef 2009-06-24 14:47:41

You beat me to it with a better answer... +1

Daren Thomas 2009-06-24 14:49:02

Which would be better: re.sub('\.com$', '', url) or url.rsplit('.com', 1)[0]? Or are both just different ways to solve the problem?

Ramya 2009-06-24 15:01:37

I'd vote for the non regex method

Dominic Rodger 2009-06-24 15:05:14

Yeah, I think the first example, with the endswith() test, is the better one; the regex version carries some performance penalty (compiling the regex, etc.). I wouldn't go with the rsplit() one, but that's because I don't know exactly what you're trying to achieve. I assume it's removing the .com if and only if it appears at the end of the url? The rsplit solution would give you trouble if you used it on domain names like 'www.commercialthingie.co.uk'.

Steef 2009-06-24 15:26:38
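To make the difference discussed in these comments concrete, here is a minimal sketch comparing the endswith() approach with the rsplit() one. The function names are made up for illustration; the hostnames are the ones mentioned in the thread.

```python
def remove_com_endswith(url):
    """Drop a trailing '.com' only when the string actually ends with it."""
    if url.endswith('.com'):
        return url[:-len('.com')]
    return url


def remove_com_rsplit(url):
    """Split on the last '.com' wherever it occurs and keep the left part."""
    return url.rsplit('.com', 1)[0]


for host in ('abcdc.com', 'www.commercialthingie.co.uk'):
    print(host, '->', remove_com_endswith(host), '|', remove_com_rsplit(host))
```

Both functions agree on 'abcdc.com', but for 'www.commercialthingie.co.uk' the rsplit() version cuts at the '.com' inside '.commercialthingie' and returns just 'www', while the endswith() version leaves the string alone.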

Answer 2

+1 A:

How about url[:-4]?

Daren Thomas 2009-06-24 14:48:21

Answer 3

+9 A:

strip removes the given characters from both ends of the string; in your case it strips ".", "c", "o" and "m".

truppo 2009-06-24 14:48:49

It will also remove those characters from the front of the string. If you just want it to remove from the end, use rstrip()

Andre Miller 2009-06-24 14:53:10
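A short sketch of the behaviour described above: the argument to strip() and rstrip() is treated as a set of characters, not as a suffix. The sample strings other than 'abcdc.com' are invented for illustration.

```python
chars = '.com'  # treated as the character set {'.', 'c', 'o', 'm'}

# Stripping stops at the first character that is not in the set,
# so the final 'c' of 'abcdc' is removed along with '.com'.
stripped = 'abcdc.com'.strip(chars)

# rstrip() leaves the front alone, but the end is still over-trimmed.
right_stripped = 'abcdc.com'.rstrip(chars)

# A string that also starts with characters from the set loses them too.
both_ends = 'mock.com'.strip(chars)
right_only = 'mock.com'.rstrip(chars)

print(stripped, right_stripped, both_ends, right_only)
```

With 'mock.com', strip() leaves only 'k', whereas rstrip() happens to give 'mock' because 'k' is not in the set. That it works there is a coincidence of the input, not a property of rstrip().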

Answer 4

A:

This is a perfect use for regular expressions:

>>> import re
>>> re.match(r"(.*)\.com", "hello.com").group(1)
'hello'

Aaron Maenpaa 2009-06-24 14:53:03

You should also add a $ to make sure that you're matching hostnames *ending* in ".com".

Cristian Ciupitu 2009-06-24 14:56:44
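As a sketch of the anchoring suggested in the comment, the pattern below requires '.com' to sit at the end of the string. The names DOT_COM_SUFFIX and remove_dot_com are hypothetical, and re.sub is used instead of re.match so that non-matching input is returned unchanged rather than producing None.

```python
import re

# Anchored pattern: the literal '.com' must be the last thing in the string.
DOT_COM_SUFFIX = re.compile(r'\.com$')


def remove_dot_com(host):
    """Return host without a trailing '.com'; other strings come back unchanged."""
    return DOT_COM_SUFFIX.sub('', host)


# Without the anchor, '.com' is matched in the middle of the string as well.
unanchored = re.match(r'(.*)\.com', 'hello.company').group(1)

print(remove_dot_com('hello.com'), remove_dot_com('hello.company'), unanchored)
```

Note that the unanchored re.match(...).group(1) form also raises AttributeError when there is no match at all, because re.match returns None in that case.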

Answer 5

+1 A:

If you know it's an extension, then

  url = 'abcdc.com'
  ...
  url.split('.')[0]

This works equally well with abcdc.com and abcdc.[anything], and is more extensible.

JohnMetta 2009-06-24 14:57:24

You need to be careful with this, because if the supplied url changes to "www.abcdc.com", url.split('.')[0] is just the "www".

Neil 2009-06-24 15:45:50

I didn't feel the need to include any sort of error checking in this snippet, but that's an excellent point, especially given my comment about extensibility.

JohnMetta 2009-06-24 15:55:27
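The caveat raised in the comments can be seen in a couple of lines. This is only an illustrative sketch; first_label and without_last_label are made-up names.

```python
def first_label(url):
    """Everything before the first dot."""
    return url.split('.')[0]


def without_last_label(url):
    """Everything before the last dot."""
    return url.rsplit('.', 1)[0]


for host in ('abcdc.com', 'www.abcdc.com'):
    print(host, '->', first_label(host), '|', without_last_label(host))
```

For 'abcdc.com' both give 'abcdc', but for 'www.abcdc.com' the first returns 'www' and the second returns 'www.abcdc', so which one is appropriate depends on what the input is expected to look like.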

Answer 6

A:

I don't see anything wrong with the way you're doing it with rsplit; it does exactly what you want. It all depends on how generic you want the solution to be. Do you always want to remove .com, or will it sometimes be .org? If so, use one of the other solutions; otherwise, stick with rsplit().

The reason that strip() does not work the way you expect is that it works on each character individually. It will scan through your string and remove all occurrences of the characters from the end AND the front. So if your string started with 'c', that would also be gone. You would use rstrip to only strip from the back.

Andre Miller 2009-06-24 14:58:14

Answer 7

A:

Depends on what you know about your url and exactly what you're trying to do. If you know that it will always end in '.com' (or '.net' or '.org') then

 url=url[:-4]

is the quickest solution. If it's a more general URL then you're probably better off looking into the urlparse library that comes with Python (urllib.parse in Python 3).

If, on the other hand, you simply want to remove everything after the final '.' in a string then

url.rsplit('.',1)[0]

will work. Or if you just want everything up to the first '.' then try

url.split('.',1)[0]

2009-06-24 14:59:35
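For the "more general URLs" case, a minimal sketch using the standard library parser might look like the following. It assumes Python 3, where the module is urllib.parse (in Python 2 it was called urlparse); the helper name hostname_of and the sample URL are invented for illustration.

```python
from urllib.parse import urlparse


def hostname_of(url):
    """Return the host part of a full URL, or None if the parser finds none."""
    return urlparse(url).hostname


full = hostname_of('http://www.abcdc.com/index.html')

# Without a scheme and '//' the parser treats the text as a path,
# so there is no hostname to return.
bare = hostname_of('abcdc.com')

print(full, bare)
```

Once the hostname has been isolated this way, any of the suffix-removal approaches in this thread can be applied to it without the path or query string getting in the way.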

Answer 8

+1 A:

Another one

url = '.'.join(url.split('.')[0:-1])

Nick D 2009-06-24 15:00:36

If the first index of a slice is 0 you can leave it out and it will be implied. So your example could have [:-1] instead of [0:-1]. (Just a pet peeve of mine, like when people say range(0, 10) instead of range(10)).

Kiv 2009-06-24 15:06:19

You are right, but in this case, if the url has no dot, it will not work.

Nick D 2009-06-24 15:24:08
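The no-dot case mentioned in the last comment can be checked directly. The sketch below also shows one possible guard using str.rpartition; the function names and the 'localhost' input are assumptions made for the example.

```python
def drop_last_label_join(url):
    """The split/join approach from the answer above."""
    return '.'.join(url.split('.')[:-1])


def drop_last_label_guarded(url):
    """Drop the part after the last dot, leaving dot-free input untouched."""
    head, sep, _tail = url.rpartition('.')
    # rpartition returns an empty separator when no dot was found.
    return head if sep else url


for host in ('www.abcdc.com', 'localhost'):
    print(host, '->', repr(drop_last_label_join(host)), '|', repr(drop_last_label_guarded(host)))
```

With 'localhost' the split/join version returns an empty string, because the only element is sliced away before joining, while the guarded version returns the input unchanged.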

Answer 9

+3 A:

def strip_end(text, suffix):
  # An empty suffix would make text[:-0] return '', so return text as-is.
  if not suffix or not text.endswith(suffix):
    return text
  return text[:-len(suffix)]

yairchu 2009-06-24 15:13:09
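A quick usage sketch for a helper of this shape, restated here so the snippet runs on its own. The explicit check for an empty suffix is an addition for illustration: text[:-0] is the same as text[:0], which would otherwise return an empty string.

```python
def strip_end(text, suffix):
    """Remove suffix from the end of text if present; otherwise return text."""
    if suffix and text.endswith(suffix):
        return text[:-len(suffix)]
    return text


print(strip_end('abcdc.com', '.com'))   # suffix present
print(strip_end('abcdc.org', '.com'))   # suffix absent, unchanged
print(strip_end('abcdc.com', ''))       # empty suffix, unchanged
```

On Python 3.9 and later, the built-in str.removesuffix() method covers the same use case.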

Answer 10

A:

Actually the simplest way would be to use 'replace':

url = 'abcdc.com'
print(url.replace('.com', ''))

Charles Collis 2010-03-06 15:41:45

That will also replace '.com' inside a url like `www.computerhope.com`. Do a check with `endswith()` first and it should be fine.

ghostdog74 2010-03-07 00:26:56
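To illustrate the problem pointed out in the comment, here is a small sketch using the same example hostname. The variable names are arbitrary.

```python
url = 'www.computerhope.com'

# replace() removes every occurrence, including the one inside the name.
replaced_everywhere = url.replace('.com', '')

# Limiting the count does not help: occurrences are replaced from the left.
replaced_once = url.replace('.com', '', 1)

# Checking the end of the string first only touches a real suffix.
guarded = url[:-len('.com')] if url.endswith('.com') else url

print(replaced_everywhere, replaced_once, guarded)
```

Only the guarded version yields 'www.computerhope'; the two replace() calls mangle the middle of the hostname.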


Python strip a string..
