Hi,

Is it possible in Python, given a file with 10000 lines that all have this structure:

1, 2, xvfrt ert5a fsfs4 df f fdfd56 , 234

or similar, to read the whole string, and then to store in another string all characters from column 7 to column 17, including spaces, so the new string would be

"xvfrt ert5a" ?

Thanks a lot

A: 
for l in open("myfile.txt"):
   c7_17 = l[6:17]
   # Not sure what you want to do with c7_17 here, but go for it!
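
As a quick sanity check of the indices (a sketch that assumes the columns in the question are counted from 1), the sample line from the question can be sliced directly:

```python
# Sample line copied from the question.
line = "1, 2, xvfrt ert5a fsfs4 df f fdfd56 , 234\n"

# Columns 7..17 (1-based, inclusive) correspond to the 0-based slice [6:17]:
# the start shifts down by one, and the end stays at 17 because the end of
# a slice is exclusive.
c7_17 = line[6:17]
print(repr(c7_17))  # 'xvfrt ert5a'
```

The slice has 17 - 6 = 11 characters, which matches the length of the string the question asks for.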
Ned Batchelder
+5  A: 
another_list = []
for line in f:
    another_list.append(line[6:17])

Or as a generator (a memory friendly solution):

another_list = (line[6:17] for line in f)
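
A small sketch of what "memory friendly" means here. `io.StringIO` stands in for the open file `f`, and the second sample line is made up for illustration:

```python
import io

# Stand-in for an open file; only the first line comes from the question.
f = io.StringIO(
    "1, 2, xvfrt ert5a fsfs4 df f fdfd56 , 234\n"
    "3, 4, abcde fghij klmno pq r stuvwx , 567\n"
)

# Creating the generator reads nothing; each slice is produced on demand.
columns = (line[6:17] for line in f)

first = next(columns)   # reads and slices only the first line
rest = list(columns)    # consumes whatever lines remain
```

One caveat: a generator can be iterated only once, so if the values are needed again later, the list version is the one to use.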
Nadia Alramli
Actually it should be [6:17]. +1 for the generator version!
ema
@ema, right! thanks for the correction
Nadia Alramli
+1  A: 

You don't say how you want to store the data from each of the 10,000 lines -- if you want them in a list, you would do something like this:

my_list = []

for line in open(filename):
    my_list.append(line[6:17])
Ian Clelland
+7  A: 
lst = [line[6:17] for line in open(fname)]
SilentGhost
A: 

This function computes the string that you want for each line and prints it out:

def readCols(filepath):
    # Print characters 7 through 17 (slice [6:17]) of every line in the file.
    with open(filepath, 'r') as f:
        for line in f:
            newString = line[6:17]
            print(newString)
inspectorG4dget
+1  A: 

This technically answers the direct question:

lst = [line[6:17] for line in open(fname)]

but it has a serious weakness: it depends on fixed character positions. That is OK for throwaway code, but the data looks suspiciously like comma-separated values, and the third field may even be space-delimited chunks of data. It is far better to do it like this, so that it still works if the first two columns sprout an extra digit:

lst = [x[2].strip()[0:11] for x in [line.split(',') for line in open(fname)]]

And if those space-delimited chunks might get longer, then use this (note that it gives a list of the first two chunks for each line rather than a single string):

lst = [x[2].strip().split()[0:2] for x in [line.split(',') for line in open(fname)]]

Don't forget a comment or two to explain what is going on. Perhaps:

# on each line, get the 3rd comma-delimited field and break out the 
# first two space-separated chunks of the licence key

Assuming, of course, that those are licence keys. No need to be too abstract in comments.
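
To illustrate the difference, here is a rough sketch comparing the fixed slice with the field-based versions. The `wider` line and the helper name `third_field_chunks` are made up for this example; only `line` comes from the question.

```python
def third_field_chunks(line, n_chunks=2):
    """Return the first n_chunks space-separated chunks of the 3rd comma field."""
    field = line.split(',')[2]
    return field.split()[:n_chunks]

line = "1, 2, xvfrt ert5a fsfs4 df f fdfd56 , 234\n"
# Hypothetical variant where the first two columns have grown extra digits.
wider = "100, 20, xvfrt ert5a fsfs4 df f fdfd56 , 234\n"

fixed_slice = wider[6:17]                      # no longer lines up with the key
by_field = wider.split(',')[2].strip()[0:11]   # still finds the key
chunks = third_field_chunks(wider)             # list of the first two chunks

print(repr(fixed_slice))
print(repr(by_field))
print(chunks)
```

On the original `line` all three approaches agree on the same text; they only diverge once the leading columns change width.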

Michael Dillon
+2  A: 

I'm going to take Michael Dillon's answer a little further. If by "columns 7 through 17" you mean "the first 11 characters of the third comma-separated field", this is a good opportunity to use the csv module. Also, for Python 2.6 and above it's considered best practice to use the 'with' statement when opening files. Behold:

import csv
with open(filepath, 'rt') as f:
  lst = [row[2][:11] for row in csv.reader(f)]

This will preserve leading whitespace; if you don't want that, change the last line to

  lst = [row[2].lstrip()[:11] for row in csv.reader(f)]
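
For what it's worth, here is a small sketch of both variants run against the sample line from the question, with `io.StringIO` standing in for the real file. It also shows `skipinitialspace=True`, an option of `csv.reader` that drops whitespace right after each delimiter; that is an alternative I'm adding here, not part of the code above.

```python
import csv
import io

sample = "1, 2, xvfrt ert5a fsfs4 df f fdfd56 , 234\n"

# Plain reader: the third field keeps the space that follows the comma.
row = next(csv.reader(io.StringIO(sample)))
raw = row[2][:11]                 # leading space preserved
stripped = row[2].lstrip()[:11]   # leading space removed first

# Alternative: let the reader skip whitespace after each delimiter.
row2 = next(csv.reader(io.StringIO(sample), skipinitialspace=True))
skipped = row2[2][:11]

print(repr(raw), repr(stripped), repr(skipped))
```

With the leading space preserved, the 11-character slice starts one position early and so loses the final character of the key, which is why the `lstrip()` form is probably what the question wants.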
Larry Hastings