ansaurus

Question

python: remove substring only at the end of string

Answer 1

+6 A:

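def rchop(thestring, ending):
  # Strip `ending` only when it is a non-empty suffix; with an empty
  # `ending`, thestring[:-0] would wrongly return ''.
  if ending and thestring.endswith(ending):
    return thestring[:-len(ending)]
  return thestring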

somestring = rchop(somestring, ' rec')

Jack Kelly 2010-09-07 23:34:32
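
For reference, a minimal sketch of how a helper like this behaves on a few inputs, written for Python 3. The guard for an empty `ending` and the sample strings are assumptions added for illustration. The last check only runs on Python 3.9 or later, where the built-in `str.removesuffix` exists.

```python
def rchop(thestring, ending):
    # Guard against an empty ending: thestring[:-0] would be ''.
    if ending and thestring.endswith(ending):
        return thestring[:-len(ending)]
    return thestring

print(rchop('this is some string rec', ' rec'))  # suffix present, so it is removed
print(rchop('rec at the start only', ' rec'))    # no trailing ' rec', so unchanged
print(rchop('unchanged', ''))                    # empty ending, so unchanged

# Python 3.9+ only: the built-in method should agree with the helper.
if hasattr(str, 'removesuffix'):
    sample = 'this is some string rec'
    assert sample.removesuffix(' rec') == rchop(sample, ' rec')
```

On versions older than 3.9 the helper is still needed, which is why the comparison is wrapped in a `hasattr` check.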

good stuff; just watch out for shadowing the built-in `str`

Adam Bernier 2010-09-07 23:47:31

Noted and edited. Thank you.

Jack Kelly 2010-09-07 23:49:18

@Jack, `string` is the name of a standard library module, which may _also_ be a bad idea to clash with, no less than a builtin...!-) Rather, I'd recommend getting used to identifiers such as `thestring`, `astring`, and the like, instead!-).

Alex Martelli 2010-09-08 00:20:51

And this is why we don't make hasty edits, people.

Jack Kelly 2010-09-08 00:24:34

Is `endswith` a carry over from early implementations of `str`? It seems kind of redundant. `def endswith(s, e): return s[-len(e):] == e`

Matt Joiner 2010-09-12 08:10:30

@Matt Joiner: I don't know. I'd suspect a convenience alias, given that words are easier for most people than reading slice notation.

Jack Kelly 2010-09-12 09:43:32
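
A small sketch of where the slice one-liner and `str.endswith` differ, in case it helps settle the "redundant" question. The name `slice_endswith` is hypothetical, chosen here so the built-in method is not shadowed.

```python
def slice_endswith(s, e):
    # The one-liner from the comment above, under a hypothetical name.
    return s[-len(e):] == e

# For non-empty suffixes the two agree.
assert slice_endswith('some string rec', ' rec') == 'some string rec'.endswith(' rec')
assert slice_endswith('some string', ' rec') == 'some string'.endswith(' rec')

# Empty suffix: s[-0:] is the whole string, so the slice version says False.
print('abc'.endswith(''))         # True
print(slice_endswith('abc', ''))  # False

# endswith also accepts a tuple of candidate suffixes.
print('photo.jpeg'.endswith(('.jpg', '.jpeg')))  # True
```

So the method is not purely an alias for the slice comparison: it treats the empty suffix as always matching and can test several suffixes in one call.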

Answer 2

+1 A:

You could use a regular expression as well:

from re import sub

str = r"this is some string rec"
regex = r"(.*)\srec$"
print sub(regex, r"\1", str)

Andrew Hare 2010-09-07 23:35:59

Capturing groups are overkill here. `sub(' rec$', '', str)` works.

Jack Kelly 2010-09-07 23:39:55
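
A sketch of the group-free substitution, written for Python 3. Two details are assumptions beyond the comment: `re.escape` is applied so the suffix is matched literally, and `\Z` is used instead of `$`, because `$` also matches just before a trailing newline. The names `text`, `suffix`, and `pattern` are illustrative.

```python
import re

text = "this is some string rec"
suffix = " rec"

# Anchor the escaped suffix at the very end; no capturing group is needed.
pattern = re.escape(suffix) + r"\Z"
result = re.sub(pattern, "", text)
print(result)
```

With `$` in place of `\Z`, a string such as `"abc rec\n"` would lose its ` rec` even though it does not strictly end with it.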

Answer 3

+6 A:

Since you have to get len(trailing) anyway (where trailing is the string you want to remove IF it's trailing), I'd recommend avoiding the slight duplication of work that .endswith would cause in this case. Of course, the proof of the code is in the timing, so, let's do some measurement (naming the functions after the respondents proposing them):

import re

astring = 'this is some string rec'
trailing = ' rec'

def andrew(astring=astring, trailing=trailing):
    regex = r'(.*)%s$' % re.escape(trailing)
    return re.sub(regex, r'\1', astring)

def jack0(astring=astring, trailing=trailing):
    if astring.endswith(trailing):
        return astring[:-len(trailing)]
    return astring

def jack1(astring=astring, trailing=trailing):
    regex = r'%s$' % re.escape(trailing)
    return re.sub(regex, '', astring)

def alex(astring=astring, trailing=trailing):
    thelen = len(trailing)
    if astring[-thelen:] == trailing:
        return astring[:-thelen]
    return astring

Say we've named this python file a.py and it's in the current directory; now, ...:

$ python2.6 -mtimeit -s'import a' 'a.andrew()'
100000 loops, best of 3: 19 usec per loop
$ python2.6 -mtimeit -s'import a' 'a.jack0()'
1000000 loops, best of 3: 0.564 usec per loop
$ python2.6 -mtimeit -s'import a' 'a.jack1()'
100000 loops, best of 3: 9.83 usec per loop
$ python2.6 -mtimeit -s'import a' 'a.alex()'
1000000 loops, best of 3: 0.479 usec per loop

As you see, the RE-based solutions are "hopelessly outclassed" (as often happens when one "overkills" a problem -- possibly one of the reasons REs have such a bad rep in the Python community!-), though the suggestion in @Jack's comment is way better than @Andrew's original. The string-based solutions, as expected, shine, with my endswith-avoiding one having a minuscule advantage over @Jack's (being just 15% faster). So, both pure-string ideas are good (as well as both being concise and clear) -- I prefer my variant a little bit only because I am, by character, a frugal (some might say, stingy;-) person... "waste not, want not"!-)

Alex Martelli 2010-09-08 00:36:21
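
The same kind of measurement can be sketched from inside Python with the `timeit` module instead of the command line. The loop count of 20000 and the output format are arbitrary choices made here, and the numbers will differ by machine and Python version, so none are shown.

```python
import re
import timeit

astring = 'this is some string rec'
trailing = ' rec'

def jack0(astring=astring, trailing=trailing):
    if astring.endswith(trailing):
        return astring[:-len(trailing)]
    return astring

def alex(astring=astring, trailing=trailing):
    thelen = len(trailing)
    if astring[-thelen:] == trailing:
        return astring[:-thelen]
    return astring

def jack1(astring=astring, trailing=trailing):
    return re.sub(r'%s$' % re.escape(trailing), '', astring)

loops = 20000  # arbitrary; raise it for steadier numbers
for func in (jack0, alex, jack1):
    # repeat() returns one total per run; the minimum is the least noisy.
    best = min(timeit.repeat(func, number=loops, repeat=3))
    print('%-6s %.3f usec per loop' % (func.__name__, best / loops * 1e6))
```

Taking the minimum of the repeats mirrors the "best of 3" figure that the command-line tool reports.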

Why do you have a space in the `import a' 'a.xxx`?

Blankman 2010-09-08 15:49:45

@Blankman, it's a bash command running Python: the setup (`-s`) is one argument, the code being timed the other. Each is quoted so I don't have to worry about it including spaces and/or special characters, of course. You always separate arguments with spaces in bash (and most other shells, including Windows' own cmd.exe, so I'm pretty surprised at your question!), and quoting arguments to a shell command to preserve spaces and special characters within each argument is also definitely not what I would call a peculiar, rare, or advanced usage of any shell...!-)

Alex Martelli 2010-09-08 17:30:12
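
To see the argument splitting concretely, here is a small sketch using the standard `shlex` module, which follows POSIX-style quoting rules much like bash does. It only tokenizes the command line and does not run anything.

```python
import shlex

command = "python2.6 -mtimeit -s'import a' 'a.andrew()'"

# Split the line the way a POSIX shell would before starting the program.
args = shlex.split(command)
print(args)
# ['python2.6', '-mtimeit', '-simport a', 'a.andrew()']
```

The quotes disappear and the space inside `import a` survives, so the setup code and the timed statement each arrive as a single argument.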

Oh I see you've bypassed `endswith` as I mentioned in Jack's answer. Caching the len also avoids Python's (and C's!) terrible call overhead.

Matt Joiner 2010-09-12 08:12:19

Answer 4

A:

As a kind of one-liner, joining a generator expression:

test = """somestring='this is some string rec'
this is some string in the end word rec
This has not the word."""
match = 'rec'
print('\n'.join((line[:-len(match)] if line.endswith(match) else line)
      for line in test.splitlines()))
""" Output:
somestring='this is some string rec'
this is some string in the end word 
This has not the word.
"""

Tony Veijalainen 2010-09-08 07:46:14
