ansaurus

Question

Splitting a file into lines in Python using re.split

Answer 1

+2 A:

lines = file.readlines()

edit: or, if you don't want blank lines in there, you can do

lines = filter(lambda a:(a!='\n'), file.readlines())

edit^2: to remove trailing newlines, you can do

lines = [re.sub('\n','',line) for line in filter(lambda a:(a!='\n'), file.readlines())]

Alex 2009-05-04 03:40:00

Leaves the trailing newlines on. I'm not sure if that's an issue for OP.

Dave 2009-05-04 03:41:40

file.readlines() is not quite the same: it keeps the newline at the end of each line, and it includes empty lines.

Ashton 2009-05-04 03:42:53
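
To illustrate the two points in the comment above, here is a minimal sketch. It assumes Python 3 and uses io.StringIO with made-up sample text as a stand-in for an open file; the names sample, raw_lines and clean_lines are only for illustration.

```python
import io

# io.StringIO stands in for an open file; the sample text is made up.
sample = io.StringIO("first\n\ncomment line\nlast\n")

# readlines() keeps the trailing "\n" on every entry,
# and a blank line shows up as a bare "\n".
raw_lines = sample.readlines()

# Strip the line endings and drop blank lines in one pass.
clean_lines = [line.rstrip("\n") for line in raw_lines if line.strip()]
```

Printing raw_lines and clean_lines side by side makes the difference easy to see.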

That is enough code to work with. Thanks a lot :D

Ashton 2009-05-04 03:48:29

Answer 2

+5 A:

Put the regular expression hammer away :-)

You can iterate over a file directly; readlines() is almost obsolete these days.
Read about str.strip() (and its friends, lstrip() and rstrip()).
Don't use file as a variable name. It's bad form, because it shadows the built-in file type (in Python 2).

You can write your code as:

lines = []
f = open(filename)
for line in f:
    if not line.startswith('com'):
        lines.append(line.strip())

If you are still getting blank lines in there, you can add in a test:

lines = []
f = open(filename)
for line in f:
    if line.strip() and not line.startswith('com'):
        lines.append(line.strip())

If you really want it in one line:

lines = [line.strip() for line in open(filename) if line.strip() and not line.startswith('com')]

Finally, if you're on Python 2.6 or later, look at the with statement, which closes the file for you automatically.
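
A rough sketch of what the with version might look like. The helper name read_filtered_lines, its prefix parameter and the temporary demo file are assumptions made for illustration, not part of the original question.

```python
import os
import tempfile


def read_filtered_lines(filename, prefix="com"):
    """Return stripped, non-blank lines that do not start with `prefix`."""
    # The with block closes the file even if an exception is raised.
    with open(filename) as f:
        return [line.strip() for line in f
                if line.strip() and not line.startswith(prefix)]


# Demo on a throwaway file with made-up contents.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("alpha\ncomment: skip me\n\n  beta  \n")
    demo_path = tmp.name

result = read_filtered_lines(demo_path)
os.remove(demo_path)
```

The filtering logic is the same as in the one-liner above; the only change is that the file handle is closed as soon as the block exits.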

John Fouhy 2009-05-04 04:14:20

I haven't written any Python since last year, and I'm recovering from a short but nasty bout of Perl. Thanks to your answer, I'm getting back into the mindset :)

Ashton 2009-05-04 04:52:13

Answer 3

A:

This should work, and eliminate the regular expressions as well:

all_lines = (line.rstrip()
             for line in open(filename)
             if "com" not in line)
# filter out the empty lines
lines = filter(lambda x : x, all_lines)

Since you're using a list comprehension and not a generator expression (so the whole file gets loaded into memory anyway), here's a shortcut that drops the line endings and filters out empty lines in a single comprehension:

lines = [line
         for line in open(filename).read().splitlines()
         if line and "com" not in line]

Ryan Ginstrom 2009-05-04 05:21:06

Instead of 'filter(lambda x: x, all_lines)', you can just write 'filter(None, all_lines)'. Although I've never been totally happy with that short-cut :-)

John Fouhy 2009-05-06 00:12:58
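
For comparison, a small sketch of the spellings mentioned in the comment above, run on made-up sample data. The list() calls are there on the assumption of Python 3, where filter returns an iterator; on Python 2 they are harmless.

```python
all_lines = ["first", "", "second", "", "third"]  # made-up sample data

# filter(None, ...) keeps only the truthy items, same as the lambda version.
with_lambda = list(filter(lambda x: x, all_lines))
with_none = list(filter(None, all_lines))

# A list comprehension is a third spelling of the same idea.
with_comprehension = [line for line in all_lines if line]
```

All three produce the same list, so the choice is mostly a matter of which one reads best to you.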

Answer 4

+1 A:

blackkettle 2009-05-04 06:26:10
