Hi guys, pretty new to Python, so be nice please!

I have a string of text. How do I remove all the text after certain characters? The text that comes after will change, so I want to cut everything following a piece of text that doesn't change.

Hope that makes sense!

Thanks

+2  A: 

Without a RE (which I assume is what you want):

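def remafterellipsis(text):
  # find() returns the index of the first '...', or -1 if there is none.
  where_ellipsis = text.find('...')
  if where_ellipsis == -1:
    return text
  # Keep everything up to and including the '...' (3 characters long).
  return text[:where_ellipsis + 3]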

or, with a RE:

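import re

# The pattern is compiled once, at definition time, as a default argument.
def remwithre(text, there=re.compile(re.escape('...')+'.*')):
  # Deletes each '...' together with the rest of its line; unlike the
  # version above, the '...' itself is removed too.
  return there.sub('', text)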
Alex Martelli
Might want to use sep='...' as a kwarg and use len(sep) instead of hard-coding the 3 to make it slightly more future-proof.
cdleary
Yep, but then you need to recompile the RE on each call, so performance suffers for the RE solution (no real difference for the non-RE solution). Some generality is free, some isn't...;-)
Alex Martelli
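For what it's worth, here is a minimal sketch of the non-RE version with the generalisation suggested in these comments. The name remove_after and the sep keyword are hypothetical, not from the answer above; len(sep) replaces the hard-coded 3, and the separator itself is kept, as in the original function:

```python
def remove_after(text, sep='...'):
    """Return text up to and including the first sep (hypothetical helper).

    If sep does not occur, the text is returned unchanged.
    """
    where = text.find(sep)
    if where == -1:
        return text
    # len(sep) instead of a hard-coded 3, so any separator works.
    return text[:where + len(sep)]


print(remove_after('some string... this part will be removed.'))  # some string...
print(remove_after('key=value # comment', sep='#'))               # key=value #
```

To drop the separator as well, slice with text[:where] instead.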
@Alex - Thanks for testing the solutions!
Ayman Hourieh
+4  A: 

Split on your separator at most once, and take the first piece:

sep = '...'
# [0] is the piece before the first separator, i.e. the part to keep.
head = text.split(sep, 1)[0]

You didn't say what should happen if the separator isn't present. Both this and Alex's solution will return the entire string in that case.
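
A quick sketch of both cases, using made-up sample strings (the variable names with_sep and without_sep are only for illustration):

```python
sep = '...'

# Separator present: everything from the first '...' onwards is dropped.
with_sep = 'some string... this part will be removed.'.split(sep, 1)[0]
print(with_sep)      # some string

# Separator absent: split() returns a one-element list, so [0] is the
# whole original string.
without_sep = 'no separator here'.split(sep, 1)[0]
print(without_sep)   # no separator here
```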

Ned Batchelder
Request is "remove all the text after" the separator, not "get" that text, so I think you want [0], not [-1], in your otherwise excellent solution.
Alex Martelli
Solihull
+7  A: 

This assumes your separator is '...', but it can be any string.

text = 'some string... this part will be removed.'
head, sep, tail = text.partition('...')

>>> print(head)
some string

If the separator is not found, head will contain all of the original string.
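
A small sketch of both cases, assuming made-up sample strings; the names head2, sep2 and tail2 are only there to keep the two results apart:

```python
# Separator found: head is the text before it, sep is the separator itself.
head, sep, tail = 'some string... this part will be removed.'.partition('...')
print(repr(head))        # 'some string'
print(repr(head + sep))  # 'some string...'

# Separator not found: head is the whole string, sep and tail are empty.
head2, sep2, tail2 = 'no separator here'.partition('...')
print(repr(head2))       # 'no separator here'
print(repr(sep2))        # ''
print(repr(tail2))       # ''
```

Using head + sep keeps the separator when it was found and leaves the string untouched when it was not, since sep is then empty.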

The partition method was added in Python 2.5.

partition(...)
    S.partition(sep) -> (head, sep, tail)

    Searches for the separator sep in S, and returns the part before it,
    the separator itself, and the part after it.  If the separator is not
    found, returns S and two empty strings.
Ayman Hourieh
Yet another excellent solution -- are we violating TOOOWTDI?-) Maybe worth a timeit run to check...
Alex Martelli
.partition wins -- 0.756 usec per loop, vs 1.13 for .split (comment formatting doesn't really let me show the exact tests, but I'm using @Ayman's text and separator) -- so, +1 for @Ayman's answer!
Alex Martelli
and btw, for completeness, the RE-based solution is 2.54 usec, i.e., way slower than either @Ayman's or @Ned's.
Alex Martelli
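A rough sketch of how the comparison in these comments could be reproduced with timeit. The exact tests were not shown, so everything about the setup here (the lambda wrappers, the call count, taking the best of 3 runs) is an assumption, and the figures it prints will vary by machine and Python version:

```python
import re
import timeit

text = 'some string... this part will be removed.'
sep = '...'
pattern = re.compile(re.escape(sep) + '.*')  # compiled once, outside the timed code

# Each candidate returns the text before the first separator.
candidates = {
    'partition': lambda: text.partition(sep)[0],
    'split': lambda: text.split(sep, 1)[0],
    're.sub': lambda: pattern.sub('', text),
}

number = 20000  # calls per timing run (arbitrary choice)
results = {}
for name, func in candidates.items():
    # Best of 3 runs, converted to microseconds per call.
    best = min(timeit.repeat(func, number=number, repeat=3))
    results[name] = best / number * 1e6
    print('%-10s %.3f usec per loop' % (name, results[name]))
```

The lambda call adds the same small overhead to every candidate, so the ranking is more meaningful than the absolute numbers.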
partition wins if you're in 2.5 land :) For us suckers stuck in 2.4, we have to live with the relatively glacial slowness of split.
Gregg Lind