How can I delete the \n and the following letters? Thanks a lot.
wordlist = ['Schreiben\nEs', 'Schreiben', 'Schreiben\nEventuell', 'Schreiben\nHaruki']
for x in wordlist:
...?
>>> import re
>>> wordlist = ['Schreiben\nEs', 'Schreiben', \
'Schreiben\nEventuell', 'Schreiben\nHaruki']
>>> [ re.sub("\n.*", "", word) for word in wordlist ]
['Schreiben', 'Schreiben', 'Schreiben', 'Schreiben']
Done via re.sub:
>>> help(re.sub)
Help on function sub in module re:

sub(pattern, repl, string, count=0)
    Return the string obtained by replacing the leftmost
    non-overlapping occurrences of the pattern in string by the
    replacement repl.  repl can be either a string or a callable;
    if a callable, it's passed the match object and must return
    a replacement string to be used.
You could use a regular expression to do so:
import re
wordlist = [re.sub("\n.*", "", word) for word in wordlist]
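The regular expression \n.* matches a \n and everything that follows it up to the end of that line (. does not match a newline by default). re.sub replaces every such match with nothing, so only the text before the first \n remains.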
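To make that behaviour concrete, here is a small sketch. The precompiled `pattern` and the three-line test string `"a\nb\nc"` are made up for illustration and are not part of the question; the snippet only shows that every newline-plus-rest-of-line match is removed, and what changes when `count=1` limits the substitution to the first match.

```python
import re

wordlist = ['Schreiben\nEs', 'Schreiben', 'Schreiben\nEventuell', 'Schreiben\nHaruki']

# Compile once; "." does not match "\n" unless re.DOTALL is given.
pattern = re.compile(r"\n.*")

cleaned = [pattern.sub("", word) for word in wordlist]

# Illustrative input with two newlines: both matches are removed.
multi = pattern.sub("", "a\nb\nc")

# With count=1 only the first match ("\nb") is removed.
first_only = pattern.sub("", "a\nb\nc", count=1)

print(cleaned)
print(repr(multi), repr(first_only))
```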
[w[:w.find('\n')] for w in wordlist]
A few timing tests:
$ python -m timeit -s "wordlist = ['Schreiben\nEs', 'Schreiben', 'Schreiben\nEventuell', 'Schreiben\nHaruki']" "[w[:w.find('\n')] for w in wordlist]"
100000 loops, best of 3: 2.03 usec per loop
$ python -m timeit -s "import re; wordlist = ['Schreiben\nEs', 'Schreiben', 'Schreiben\nEventuell', 'Schreiben\nHaruki']" "[re.sub('\n.*', '', w) for w in wordlist]"
10000 loops, best of 3: 17.5 usec per loop
$ python -m timeit -s "import re; RE = re.compile('\n.*'); wordlist = ['Schreiben\nEs', 'Schreiben', 'Schreiben\nEventuell', 'Schreiben\nHaruki']" "[RE.sub('', w) for w in wordlist]"
100000 loops, best of 3: 6.76 usec per loop
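Edit:
The solution above is wrong (see the comment from Peter Hansen): when a word contains no '\n', find returns -1, and the slice w[:-1] then drops the word's last character. Here is the corrected version: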
def truncate(words, s):
    for w in words:
        i = w.find(s)
        # find returns -1 when s is absent; keep the word unchanged then
        yield w[:i] if i != -1 else w
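Since truncate is a generator, it has to be consumed to get a list back. A minimal usage sketch, repeating the definition so the snippet runs on its own (the variable name `result` is just for illustration):

```python
def truncate(words, s):
    for w in words:
        i = w.find(s)
        # find returns -1 when s is absent; keep the word unchanged then
        yield w[:i] if i != -1 else w

wordlist = ['Schreiben\nEs', 'Schreiben', 'Schreiben\nEventuell', 'Schreiben\nHaruki']

# Wrap the generator in list() to materialise the truncated words.
result = list(truncate(wordlist, '\n'))
print(result)
```

Note that 'Schreiben', which has no newline, comes through intact, which is exactly the case the uncorrected slice got wrong.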
>>> wordlist = ['Schreiben\nEs', 'Schreiben', 'Schreiben\nEventuell', 'Schreiben\nHaruki']
>>> [ i.split("\n")[0] for i in wordlist ]
['Schreiben', 'Schreiben', 'Schreiben', 'Schreiben']
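A closely related variant, not from the answer above but using the built-in str.partition instead of split: it cuts at the first separator only and returns the whole string as the head when the separator is missing. The name `heads` is just for illustration.

```python
wordlist = ['Schreiben\nEs', 'Schreiben', 'Schreiben\nEventuell', 'Schreiben\nHaruki']

# partition returns (head, separator, tail); element 0 is the text before "\n".
heads = [w.partition("\n")[0] for w in wordlist]
print(heads)
```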