I'm using a for loop to read a file, but I only want to read specific lines, say line #26 and #30. Is there any built-in feature to achieve this?

Thanks

A: 

File objects have a .readlines() method which will give you a list of the contents of the file, one line per list item. After that, you can just use normal list slicing techniques.

http://docs.python.org/library/stdtypes.html#file.readlines
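A quick sketch of that approach (the file name is a stand-in, and the example writes its own throwaway file so it is self-contained):

```python
# Sketch of the readlines-plus-slicing idea described above.
# "example.txt" is just a stand-in name; we create a small sample file first.
with open("example.txt", "w") as f:
    f.write("".join("line %d\n" % n for n in range(1, 41)))

with open("example.txt") as f:
    lines = f.readlines()

line_26 = lines[25]  # zero-based: index 25 holds the 26th line
line_30 = lines[29]  # index 29 holds the 30th line
```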

Josh Wright
+1  A: 

How about this:

>>> with open('a', 'r') as fin: lines = fin.readlines()
>>> for i, line in enumerate(lines):
...     if i > 29: break
...     if i == 25: dox()   # 26th line (enumerate is zero-based)
...     if i == 29: doy()   # 30th line
Hamish Grubijan
True, this is less efficient than the one by Alok, but mine uses a with statement ;)
Hamish Grubijan
+9  A: 

The quick answer:

f=open('filename')
lines=f.readlines()
print lines[25]   # 26th line; the list is zero-indexed
print lines[29]   # 30th line

or:

lines=[25, 29]   # zero-based indices of lines 26 and 30
i=0
f=open('filename')
for line in f:
    if i in lines:
        print line
    i+=1

There is a more elegant solution for extracting many lines: linecache (courtesy of "python: how to jump to a particular line in a huge text file?", a previous stackoverflow.com question).

Quoting the python documentation linked above:

>>> import linecache
>>> linecache.getline('/etc/passwd', 4)
'sys:x:3:3:sys:/dev:/bin/sh\n'

Change the 4 to your desired line number, and you're done. Note that linecache uses one-based line numbers, so 4 returns the fourth line of the file.

If the file might be very large, and cause problems when read into memory, it might be a good idea to take @Alok's advice and use enumerate().

To Conclude:

  • Use fileobject.readlines() or for line in fileobject as a quick solution for small files.
  • Use linecache for a more elegant solution, which will be quite fast for reading many lines, possibly repeatedly.
  • Take @Alok's advice and use enumerate() for files which might be very large and won't fit into memory. Note that this method might be slow because the file is read sequentially.
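For completeness (this is not one of the approaches above), the standard-library itertools.islice can also pull out one line sequentially without loading the whole file; get_line below is a hypothetical helper name, and the example writes its own sample file:

```python
from itertools import islice

# Throwaway sample file so the sketch is self-contained.
with open("example.txt", "w") as f:
    f.write("".join("line %d\n" % n for n in range(1, 41)))

def get_line(path, lineno):
    """Return the one-based line `lineno`, or None if the file is shorter."""
    with open(path) as f:
        # islice skips lineno-1 lines and yields at most one
        return next(islice(f, lineno - 1, lineno), None)

line_26 = get_line("example.txt", 26)
```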
Adam Matan
Nice. I just looked at the source of `linecache` module, and looks like it reads the whole file in memory. So, if random access is more important than size optimization, `linecache` is the best method.
Alok
Thanks for the options! :)
Nimbuz
Your solution has an off by one error, btw :-)
Alok
Thanks, corrected it.
Adam Matan
+1  A: 

If you don't mind importing a module, fileinput does exactly what you need (that is, you can read the line number of the current line via fileinput.lineno()).
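A sketch of that approach (the file name is a stand-in, and the example writes its own sample file; note that fileinput.lineno() is one-based):

```python
import fileinput

# Throwaway sample file so the sketch is self-contained.
with open("example.txt", "w") as f:
    f.write("".join("line %d\n" % n for n in range(1, 41)))

wanted = {26, 30}
found = []
for line in fileinput.input(["example.txt"]):
    if fileinput.lineno() in wanted:  # fileinput line numbers are one-based
        found.append(line.rstrip())
    if fileinput.lineno() >= max(wanted):
        break  # no need to read past the last wanted line
fileinput.close()
```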

ennuikiller
+11  A: 

If the file to read is big, and you don't want to read the whole file in memory at once:

fp = open("file")
for i, line in enumerate(fp):
    if i == 25:
        pass  # 26th line: handle it here
    elif i == 29:
        pass  # 30th line: handle it here
    elif i > 29:
        break
fp.close()

Note that i == n-1 for the nth line.

Alok
+1 Better solution than mine if the entire file isn't loaded into memory as in `linecache`. Are you sure that `enumerate(fp)` doesn't do that?
Adam Matan
Haha. We both did +1 to each other's solutions. I like SO precisely because of things like this: learning from each other. :-)
Alok
`enumerate(x)` uses `x.next`, so it doesn't need the entire file in memory.
Alok
My small beef with this is that A) You want to use with instead of the open / close pair and thus keep the body short, B) But the body is not that short. Sounds like a trade-off between speed/space and being Pythonic. I am not sure what the best solution would be.
Hamish Grubijan
Great. I like SO for that precise reason too. I'll add a link to your answer into mine.
Adam Matan
with is overrated, python got along fine for over 13 years without it
Dan D
A: 

You can do a seek() call which positions your read head to a specified byte within the file. This won't help you unless you know exactly how many bytes (characters) are written in the file before the line you want to read. Perhaps your file is strictly formatted (each line is X number of bytes?) or, you could count the number of characters yourself (remember to include invisible characters like line breaks) if you really want the speed boost.

Otherwise, you do have to read every line prior to the line you desire, as per one of the many solutions already proposed here.
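As an illustration of the fixed-width case described above (the record size and file name are made up for the example): if every record is exactly 10 bytes, the byte offset of line n is simply 10 * (n - 1), so seek() can jump straight there.

```python
RECORD = 10  # bytes per line: "line " + 4 digits + "\n"

# Build a strictly formatted throwaway file, in binary mode so the
# byte arithmetic is exact on every platform.
with open("fixed.txt", "wb") as f:
    for n in range(1, 41):
        f.write(b"line %04d\n" % n)

with open("fixed.txt", "rb") as f:
    f.seek(RECORD * 25)  # jump straight to the start of the 26th line
    line_26 = f.readline().decode("ascii")
```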

Roman Stolper
+1  A: 
def getitems(iterable, items):
  items = list(items) # get a list from any iterable and make our own copy
                      # since we modify it
  if items:
    items.sort()
    for n, v in enumerate(iterable):
      if n == items[0]:
        yield v
        items.pop(0)
        if not items:
          break

print list(getitems(open("/usr/share/dict/words"), [25, 29]))
# ['Abelson\n', 'Abernathy\n']
# note that index 25 is the 26th item
Roger Pate
Roger, my favorite guy! This could benefit from a with statement.
Hamish Grubijan
A: 
f = open(filename, 'r')
totalLines = len(f.readlines())
f.close()
f = open(filename, 'r')

lineno = 1
while lineno <= totalLines:
    line = f.readline()

    if lineno == 26:
        doLine26Command(line)

    elif lineno == 30:
        doLine30Command(line)

    lineno += 1
f.close()
inspectorG4dget
this is as unpythonic as it gets.
SilentGhost
Gives the wrong result, as you can't use readlines and readline like that (they each change the current read position).
Roger Pate
I'm sorry for having overlooked a HUGE error in my first code. The error has been corrected and the current code should work as expected. Thanks for pointing out my error, Roger Pate.
inspectorG4dget
A: 

I prefer this approach because it's more general-purpose, i.e. you can use it on a file, on the result of f.readlines(), on a StringIO object, whatever:

def read_specific_lines(file, lines_to_read):
    """file is any iterable; lines_to_read is an iterable containing int values"""
    lines = set(lines_to_read)
    last = max(lines)
    for n, line in enumerate(file):
        if n + 1 in lines:
            yield line
        if n + 1 > last:
            return

>>> with open(r'c:\temp\words.txt') as f:
...     [s for s in read_specific_lines(f, [1, 2, 3, 1000])]
['A\n', 'a\n', 'aa\n', 'accordant\n']
Robert Rossney
+6  A: 

A fast and compact approach could be:

def picklines(thefile, whatlines):
  return [x for i, x in enumerate(thefile) if i in whatlines]

This accepts any open file-like object thefile (leaving up to the caller whether it should be opened from a disk file, or via e.g. a socket or other file-like stream) and a set of zero-based line indices whatlines, and returns a list, with low memory footprint and reasonable speed. If the number of lines to be returned is huge, you might prefer a generator:

def yieldlines(thefile, whatlines):
  return (x for i, x in enumerate(thefile) if i in whatlines)

which is basically only good for looping upon -- note that the only difference is the use of round parentheses rather than square brackets in the return statement, making a generator expression rather than a list comprehension.

Further note that despite the mention of "lines" and "file" these functions are much, much more general -- they'll work on any iterable, be it an open file or any other, returning a list (or generator) of items based on their progressive item-numbers. So, I'd suggest using more appropriately general names;-).
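To illustrate that generality, here is the same picklines function applied to a plain list rather than a file:

```python
# picklines as defined in the answer above, applied to a plain list to
# show it works on any iterable, not just open files.
def picklines(thefile, whatlines):
    return [x for i, x in enumerate(thefile) if i in whatlines]

words = ["alpha", "beta", "gamma", "delta", "epsilon"]
picked = picklines(words, {1, 3})  # zero-based indices
```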

Alex Martelli
IMO, `for i, x in enumerate(thefile):` `if i in whatlines:` `yield x` (across three lines) reads clearer than returning a generator expression.
ephemient
@ephemient, I disagree -- the genexp reads smoothly and perfectly.
Alex Martelli
A: 

@OP, you can use enumerate

for n, line in enumerate(open("file")):
    if n+1 in [26,30]: # or n in [25,29]
        print line.rstrip()
ghostdog74
+1  A: 

Here's my little 2 cents, for what it's worth ;)

def indexLines(filename, lines=[2,4,6,8,10,12,3,5,7,1]):
    fp   = open(filename, "r")
    src  = fp.readlines()
    data = [(index, line) for index, line in enumerate(src) if index in lines]
    fp.close()
    return data


# Usage below
filename = "C:\\Your\\Path\\And\\Filename.txt"
for line in indexLines(filename): # using default list, specify your own list of lines otherwise
    print "Line: %s\nData: %s\n" % (line[0], line[1])
AWainb
A: 

if you want line 7

line = open("file.txt", "r").readlines()[6]  # index 6 is the 7th line; lists are zero-based
MadSc13ntist