tags:

views:

157

answers:

4

I have a bunch of strings.

Some of them end with ' rec'.

I want to remove that only if those are the last 4 characters.

So, in other words, given:

somestring='this is some string rec'

I want it to be:

somestring='this is some string'

What is the Pythonic way to approach this?

+6  A: 
def rchop(thestring, ending):
  if thestring.endswith(ending):
    return thestring[:-len(ending)]
  return thestring

somestring = rchop(somestring, ' rec')
Jack Kelly
good stuff; just watch out for shadowing the built-in `str`
Adam Bernier
Noted and edited. Thank you.
Jack Kelly
@Jack, `string` is the name of a standard library module, which may _also_ be a bad idea to nameclash with, no less than a builtin...!-) Rather, I'd recommend you try getting used to employing identifiers such as `thestring`, `astring`, and the like, instead!-).
Alex Martelli
And this is why we don't make hasty edits, people.
Jack Kelly
Is `endswith` a carry over from early implementations of `str`? It seems kind of redundant. `def endswith(s, e): return s[-len(e):] == e`
Matt Joiner
@Matt Joiner: I don't know. I'd suspect a convenience alias, given that words are easier for most people than reading slice notation.
Jack Kelly
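One edge case in the `rchop` approach above may be worth guarding against: with an empty `ending`, `endswith('')` is true and `thestring[:-0]` evaluates to `''`, so the whole string would be dropped. A minimal sketch of a guarded variant follows; the extra `ending and` check is an addition for illustration and is not part of the original answer.

```python
def rchop(thestring, ending):
    # An empty ending would make thestring[:-0] return '' instead of the
    # whole string, so only chop when the ending is non-empty.
    if ending and thestring.endswith(ending):
        return thestring[:-len(ending)]
    return thestring


somestring = rchop('this is some string rec', ' rec')
print(repr(somestring))            # the trailing ' rec' is gone
print(repr(rchop('abc', '')))      # unchanged, not emptied
print(repr(rchop('record', ' rec')))  # unchanged: ' rec' is not at the end
```

On Python 3.9 and later, the built-in `str.removesuffix(' rec')` covers the same need without a helper function.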
+1  A: 

You could use a regular expression as well:

from re import sub

str = r"this is some string rec"
regex = r"(.*)\srec$"
print(sub(regex, r"\1", str))
Andrew Hare
Capturing groups are overkill here. `sub(' rec$', '', str)` works.
Jack Kelly
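If the regex route is taken anyway, two details may matter: the suffix should be escaped in case it contains regex metacharacters, and `$` also matches just before a trailing newline, whereas `\Z` matches only at the very end of the string. A minimal sketch under those assumptions; the helper name `rchop_re` is hypothetical.

```python
import re


def rchop_re(text, ending):
    # re.escape keeps characters such as '.' in the ending literal;
    # \Z anchors at the true end of the string (unlike $, which also
    # matches before a final newline).
    pattern = re.escape(ending) + r'\Z'
    return re.sub(pattern, '', text)


print(repr(rchop_re('this is some string rec', ' rec')))
print(repr(rchop_re('this is some string rec\n', ' rec')))  # left untouched
print(repr(rchop_re('archive.rec', '.rec')))
```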
+6  A: 

Since you have to get len(trailing) anyway (where trailing is the string you want to remove IF it's trailing), I'd recommend avoiding the slight duplication of work that .endswith would cause in this case. Of course, the proof of the code is in the timing, so, let's do some measurement (naming the functions after the respondents proposing them):

import re

astring = 'this is some string rec'
trailing = ' rec'

def andrew(astring=astring, trailing=trailing):
    regex = r'(.*)%s$' % re.escape(trailing)
    return re.sub(regex, r'\1', astring)

def jack0(astring=astring, trailing=trailing):
    if astring.endswith(trailing):
        return astring[:-len(trailing)]
    return astring

def jack1(astring=astring, trailing=trailing):
    regex = r'%s$' % re.escape(trailing)
    return re.sub(regex, '', astring)

def alex(astring=astring, trailing=trailing):
    thelen = len(trailing)
    if astring[-thelen:] == trailing:
        return astring[:-thelen]
    return astring

Say we've saved this Python file as a.py in the current directory; now:

$ python2.6 -mtimeit -s'import a' 'a.andrew()'
100000 loops, best of 3: 19 usec per loop
$ python2.6 -mtimeit -s'import a' 'a.jack0()'
1000000 loops, best of 3: 0.564 usec per loop
$ python2.6 -mtimeit -s'import a' 'a.jack1()'
100000 loops, best of 3: 9.83 usec per loop
$ python2.6 -mtimeit -s'import a' 'a.alex()'
1000000 loops, best of 3: 0.479 usec per loop

As you see, the RE-based solutions are "hopelessly outclassed" (as often happens when one "overkills" a problem -- possibly one of the reasons REs have such a bad rep in the Python community!-), though the suggestion in @Jack's comment is way better than @Andrew's original. The string-based solutions, as expected, shine, with my endswith-avoiding one having a minuscule advantage over @Jack's (taking just about 15% less time per loop). So, both pure-string ideas are good (as well as both being concise and clear) -- I prefer my variant a little bit only because I am, by character, a frugal (some might say, stingy;-) person... "waste not, want not"!-)

Alex Martelli
Why do you have a space in the `import a' 'a.xxx`?
Blankman
@Blankman, it's a bash command running Python: the setup (`-s`) is one argument, the code being timed the other. Each is quoted so I don't have to worry about it including spaces and/or special characters, of course. You always separate arguments with spaces in bash (and most other shells, including Windows' own cmd.exe, so I'm pretty surprised at your question!), and quoting arguments to a shell command to preserve spaces and special characters within each argument is also definitely not what I would call a peculiar, rare, or advanced usage of any shell...!-)
Alex Martelli
Oh I see you've bypassed `endswith` as I mentioned in Jack's answer. Caching the len also avoids Python's (and C's!) terrible call overhead.
Matt Joiner
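For anyone who would rather not go through the shell, the comparison of the two string-based functions can also be sketched from inside Python with the standard `timeit` module. The loop count below is an arbitrary choice, and the numbers will differ by machine and Python version, so no figures are shown here.

```python
import timeit

astring = 'this is some string rec'
trailing = ' rec'


def jack0(astring=astring, trailing=trailing):
    if astring.endswith(trailing):
        return astring[:-len(trailing)]
    return astring


def alex(astring=astring, trailing=trailing):
    thelen = len(trailing)
    if astring[-thelen:] == trailing:
        return astring[:-thelen]
    return astring


loops = 100000  # arbitrary; the timeit command line picks this automatically
for func in (jack0, alex):
    # Take the best of 3 runs, as the command-line tool reports.
    best = min(timeit.repeat(func, number=loops, repeat=3))
    print('%s: %.3f usec per loop' % (func.__name__, best / loops * 1e6))
```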
A: 

As a kind of one-liner, joining the output of a generator expression:

test = """somestring='this is some string rec'
this is some string in the end word rec
This has not the word."""
match = 'rec'
print('\n'.join((line[:-len(match)] if line.endswith(match) else line)
      for line in test.splitlines()))
""" Output:
somestring='this is some string rec'
this is some string in the end word 
This has not the word.
"""
Tony Veijalainen