ansaurus

Question

Python: Import a file and convert to a list

Answer 1

A:

Hi there,

look at this: Input Output Python Documentation

The Keyword here is "readline".

If you have problems using it...reedit your question

bastianneu 2009-12-15 13:45:29

Answer 2

A:

If you want all of the values in a flat list, the code would look as follows:

ls=[]
for line in open( "input.txt", "r" ).readlines():
    for value in line.split( ' ' ):
        ls.append( value )

If you just want the lines in a list then you can stop at readlines().

j0rd4n 2009-12-15 13:46:09

i get a syntax error with the 'ls[]'

harpalss 2009-12-15 14:00:41

There was a typo in the post, its supposed to be ls=[], Ive updated the answer to reflect this.

mizipzor 2009-12-15 14:10:29

Any reason for all the downvotes on this? In my mind this is both simple and readable which is part of the Python mantra.

j0rd4n 2009-12-16 14:26:07

Answer 3

+3 A:

The following line will create a list where each item is a list. The inner list is one line thats split up into "words".

li = [i.strip().split() for i in open("input.txt").readlines()]

I put the code snippet you posted into a input.txt file in c:\temp and ran this line. Is the output similar to what you want?

C:\temp>python
Python 3.1.1 (r311:74483, Aug 17 2009, 17:02:12) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> print([i.strip().split() for i in open("input.txt").readlines()])
[['p', 'wfgh', '1111', '11111', '111111'], ['287', '48', '0'], ['65626', '-1818', '0'], ['4654', '21512', '02020', '0']]

mizipzor 2009-12-15 13:49:19

You do not have to use readlines, open is an iterator already.Also, imho, it would be better to use with to open the file, and then use your creation.

SurDin 2009-12-15 13:53:14

the call to readlines() were mainly to make the example a bit more verbose :)

mizipzor 2009-12-15 13:54:54

I don't think strip is a valid method on a list.

j0rd4n 2009-12-15 13:56:17

@j0rd4n: Correct, its not, thats why I call it on string instances.

mizipzor 2009-12-15 13:58:44

Answer 4

A:

fh=open("file")
mylist=[]
header=fh.readline().rstrip()
if not header.startswith("p wncf") :
    print "error"
header=header.split()
mylist.append(header)
if len(header) != 5:
    print "error"
if False in map(str.isdigit, header[2:]):
    print "Error"
for line in fh:
    line=line.rstrip().split()
    if False in map(str.isdigit, line[0:2]):
        print "Error"            
    elif line[-1] != 0: 
        print "Error"
    else:
        mylist.append(line)
fh.close()

2009-12-15 15:41:01

Answer 5

+1 A:

fileName=open("d:/foo.bar")
lines = [i for i in fileName.readlines()]

hope that helps :D

Ahmad Dwaik 2009-12-15 15:54:57

Three things: 1) `list(x)` is a simpler and more concise way of turning an iterable into a list than `[item for item in x]`. 2) `readlines` already returns a list; you don't need to iterate over it to turn it into one. 3) It's probably not a good practice to create a variable called `fileName` and then put something in it that's not a filename.

Robert Rossney 2009-12-15 20:17:31

thanks robert for the notes

Ahmad Dwaik 2009-12-16 16:01:29

Answer 6

A:

p = open('filename')

#List:
linelist = [line for line in p.readlines()]

"""
But I prefer creating a dictionary as I find them more useful at times. Example here is very trivial. You can use the list index as a line number also.
"""

#Dictionary:
linedict = dict([(no, line) for no, line in enumerate(p.readlines())])

AJ 2009-12-15 16:15:31

Answer 7

A:

To build a list of only the lines in the file that contain at least two integers and end with a zero, use a regular expression:

import re
p = re.compile(r'^((\-?\d*\s+){2,})0$')
with open(filename, 'rb') as f:
    seq = [line.strip() for line in f if p.match(line)]

Robert Rossney 2009-12-15 21:00:41

Answer 8

A:

You're not providing all the details, but i assume that:

there's only 1 header line at the beginning and you don't need what's in it
the other lines only contain integers
you don't need to retain the trailing '0'

I have to also assume your file may be very big, so reading the entire file in memory, or storing the entire resulting list in memory is not a very good idea.

Here's a quick solution which reads the file line-by-line and uses a generator to yield each line as a list. You can use the entire result as one list if you want, like so:

result_list = read_data('foo.dat')

or you can do as i did in the example call and use each result line as it's read out. You can call this file directly if you're on linux, otherwise just associate it with the python interpreter, and call it with the name of the data file as the first argument, and it'll print the results line by line - this will work even if your file is humongous. You can also just import the file as a module and use the read_data method and use the results in other calculations.

Note that it does some error checking (the header line starts with a p and the data lines end with a 0, and only contain integers), and you probably want to either not do this checking at all, or raise a proper exception when they're encountered.

#!/usr/bin/env python
import sys

def read_data(fn):
    """Reads in datafile

    data file is in format:
        p wfgh 1111 11111 111111
        287 48 0
        65626 -1818 0
        4654 21512 02020 0
    where first line begins with p and is a header, and following lines
    are comprised of at least 2 integers plus a tailing 0.
    Pass in the filename, the resulting list of lists of integers will be 
    returned.
    """
    f = open(fn, 'r')
    # check for header line
    assert(f.readline().split()[0]=='p')
    for l in f:
        d = [int(col) for col in l.split()]
        if not d:
            # skip empty lines
            continue
        # check we have at least 2 integers and the last column is 0
        assert(d[-1] == 0 and len(d) >= 3)
        # yield current line
        yield d[:-1]

if __name__ == '__main__':
    for l in read_data(sys.argv[1]):
        print unicode(l)

wiswaud 2009-12-16 14:28:34

ansaurus

tags:

views:

answers:

Python: Import a file and convert to a list

related questions