Hi folks,

How does one ignore lines in a file?

Example:

If you know that the first lines in a file begin with, say, a or b and the remaining lines end with c, how does one parse the file so that lines beginning with a or b are ignored and lines ending with c are converted to a nested list?

What I have so far:

fname = raw_input('Enter file name: ')

z = open(fname, 'r')

#I tried this but it converts all lines to a nested list

z_list = [i.strip().split() for i in z]

I am guessing that I need a for loop.

for line in z:
    if line[0] == 'a':
        pass
    if line[0] == 'b':
        pass
    if line[-1] == 'c':
        list_1 = [line.strip().split()]

The above is the general idea but I am expert at making dead code! How does one render it undead?

Thanks, Seafoid.

+1  A: 

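One way to do it is to replace 'pass' with 'continue', which skips straight to the next line in the file without doing anything else. You will also need to create list_1 as an empty list before the loop and append each split line to it, rather than reassigning it. Note that for lines read from a file, line[-1] is the newline character, so strip the line before testing its last character:

if line.strip()[-1:] == 'c':
    list_1.append(line.strip().split())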
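Put together, the loop might look like the sketch below. The function name `keep_lines` and the `sample` list are hypothetical stand-ins for the real file, and the code is written to run on Python 3.

```python
def keep_lines(lines):
    """Collect split lines that end with 'c', skipping ones that start with 'a' or 'b'."""
    result = []  # must exist before the loop so append() has a list to add to
    for line in lines:
        stripped = line.strip()  # drop the trailing newline before testing the last character
        if not stripped:
            continue  # blank line: nothing to test
        if stripped[0] in ('a', 'b'):
            continue  # skip to the next line
        if stripped[-1] == 'c':
            result.append(stripped.split())
    return result


# Hypothetical sample standing in for the lines of the file.
sample = ["a header\n", "b another header\n", "1 2 c\n", "3 4 c\n"]
nested = keep_lines(sample)
```

Passing the open file object instead of `sample` should work the same way, since iterating over a file yields its lines.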
Mongoose
+2  A: 

You can add if conditions to list comprehensions.

z_list = [i.strip().split() for i in z if i.rstrip()[-1:] == 'c']

or

z_list = [i.strip().split() for i in z if (i[0] <> 'a' and i[0] <> 'b')]
Amber
<> is deprecated in favor of !=, and startswith and endswith are clearer in this context.
Cory Petosky
True. The main concept here was mostly that stated in the first sentence - the fact that you can filter list comprehensions via `if` conditions.
Amber
+7  A: 

startswith can take a tuple of strings to match, so you can do this:

[line.strip().split() for line in z if not line.startswith(('a', 'b'))]

This will work even if a and b are words or sentences, not just single characters. If some lines neither start with a or b nor end with c, you can extend the list comprehension as follows (the rstrip() removes the trailing newline, which would otherwise make endswith('c') fail):

[
    line.strip().split()
    for line in z if line.rstrip().endswith('c') and not line.startswith(('a', 'b'))
]
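As a quick check of the combined filter, here is a small sketch that uses `io.StringIO` in place of the opened file. The sample contents are made up for illustration, and the code assumes Python 3.

```python
import io

# Hypothetical file contents; io.StringIO stands in for the opened file z.
z = io.StringIO(
    "a first header\n"
    "b second header\n"
    "10 20 c\n"
    "30 40 c\n"
    "some other line\n"
)

rows = [
    line.split()
    for line in z
    # rstrip() removes the newline so endswith() sees the real last character
    if line.rstrip().endswith('c') and not line.startswith(('a', 'b'))
]
```

Only the two lines ending in c survive; the headers and the unrelated line are dropped.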
Nadia Alramli
Trying to think about how you could make this shorter... but I don't think you can.
Skurmedel
I like this and actually understand it. Unfortunately, it prints an empty list. I must be doing something wrong!
Seafoid
@Seafoid, are you sure? I just tried it on a local file and it worked just fine. Maybe you are reading the file twice? A file object can only be read through once; after that it yields nothing until you reopen or rewind it.
Nadia Alramli
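The "reading twice" problem can be reproduced with a short sketch. Here `io.StringIO` is a stand-in for the real file object (an assumption for illustration), and the code is written for Python 3.

```python
import io

z = io.StringIO("1 2 c\n3 4 c\n")  # stand-in for open(fname, 'r')

first_pass = [line.split() for line in z]
second_pass = [line.split() for line in z]  # the file is already at its end

z.seek(0)  # rewind to the beginning before reading again
third_pass = [line.split() for line in z]
```

The second pass comes back empty because the first one consumed every line; after `seek(0)` the lines are available again.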
Eventually! Thank you Nadia!
Seafoid
You real computer scientists must detest us lowly biologists making a mess of your territory!!!
Seafoid
No, it is actually a pleasure to see people outside IT programming :)
Nadia Alramli
`line.strip()` is unnecessary if you use `line.{startswith,endswith}(nonwhitespace)`.
J.F. Sebastian
+3  A: 

One very general approach is to "filter" the file by removing some lines:

import itertools
zlist = [l.strip().split() for l in itertools.ifilter(lambda line: line[0] not in 'ab', z)]

You can use itertools.ifilter any time you want to "selectively filter" an iterable, getting another iterable which contains only those items that satisfy some predicate, which is why I say this approach is very general. itertools has a lot of great, fast tools for dealing with iterables in myriad ways, and is well worth studying.

A similar but syntactically simpler approach, which suffices in your case (and which therefore I would recommend due to the virtue of simplicity), is to do the "filtering" with an if clause in the listcomp:

zlist = [l.strip().split() for l in z if l[0] not in 'ab']
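For readers on Python 3, where itertools.ifilter no longer exists, the built-in filter is lazy and plays the same role. A minimal sketch, with a hypothetical list of lines in place of the file z:

```python
# Hypothetical stand-in for the lines of the file z.
lines = ["a header\n", "b header\n", "1 2 c\n", "3 4 c\n"]

# filter() yields only the lines for which the predicate returns True.
kept = filter(lambda line: line[0] not in 'ab', lines)
zlist = [l.strip().split() for l in kept]
```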
Alex Martelli
I am still grappling with Python basics so I am trying to avoid importing modules. I need to reinvent the wheel in order to learn to code! I will note this for when I am more proficient. Thanks!
Seafoid
A: 
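f = open("file")
for line in f:
    li = line.strip()
    # Skip blank lines, then keep lines that end with "c"
    # and do not start with "a" or "b".
    if li and li[0] not in ("a", "b") and li[-1] == "c":
        print line.rstrip()
f.close()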
A: 

For those interested in the solution.

And also, another question!

Example file format:

c this is a comment
p m 1468 1 267
260 32 0
8 1 0

Code:

fname = raw_input('Please enter the name of file: ')

z = open(fname, 'r')

required_list = [line.strip().split() for line in z if not line.startswith(('c', 'p'))]

print required_list

Output:

[['260', '32', '0'], ['8', '1', '0']]

Any suggestions on how to convert the strings in the lists to integers and perform arithmetic operations?

Pseudocode to illustrate:

#for the second item in each sublist
     #if sum is > than first number in second line of file
         #pass
     #else
         #abort/raise error

Cheers folks for your suggestions so far, Seafoid.

@Nadia, my day seems a little more worthwhile now! I spent hours (days even) trying to crack this solo! Thanks!

Seafoid
int() converts a valid string (or another valid representation, like a float) into an integer. A short and naive way to convert a list of strings into numbers is map(int, numberStrings), which applies the function int to every string in the list. Just watch out for invalid values, which will raise a ValueError.
Skurmedel
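Applied to the nested list from the post, the conversion could be sketched as below. The `threshold` value and the exact form of the check are assumptions made for illustration, since the pseudocode does not pin them down; the code is written for Python 3.

```python
required_list = [['260', '32', '0'], ['8', '1', '0']]  # output shown in the post

# Convert every string in every sublist to an int.
int_rows = [[int(value) for value in row] for row in required_list]

# Hypothetical check: sum the second item of each sublist and
# compare the total with a threshold.
threshold = 30  # assumed value, not taken from the post
second_items_total = sum(row[1] for row in int_rows)
if second_items_total <= threshold:
    raise ValueError("sum of second items is too small: %d" % second_items_total)
```

In a real script the threshold would presumably be read from the file rather than hard-coded.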
@Seafoid: Please use the "Ask Question" button to ask new questions (if the questions are related, insert a link; each question/answer has a URL).
J.F. Sebastian