I'm using a for loop to read a file, but I only want to read specific lines, say line #26 and #30. Is there any built-in feature to achieve this?
Thanks
I'm using a for loop to read a file, but I only want to read specific lines, say line #26 and #30. Is there any built-in feature to achieve this?
Thanks
File objects have a .readlines() method which will give you a list of the contents of the file, one line per list item. After that, you can just use normal list slicing techniques.
How about this:
>>> with open('a', 'r') as fin: lines = fin.readlines()
>>> for i, line in enumerate(lines):
if i > 30: break
if i == 26: dox()
if i == 30: doy()
The quick answer:
f=open('filename')
lines=f.readlines()
print lines[26]
print lines[30]
or:
lines=[26,30]
i=0
f=open('filename')
for line in f:
if i in lines:
print i
i+=1
There is a more elegant solution for extracting many lines: linecache (courtesy of "python: how to jump to a particular line in a huge text file?", a previous stackoverflow.com question).
Quoting the python documentation linked above:
>>> import linecache
>>> linecache.getline('/etc/passwd', 4)
'sys:x:3:3:sys:/dev:/bin/sh\n'
Change the 4
to your desired line number, and you're on. Note that 4 would bring the fifth line as the count is zero-based.
If the file might be very large, and cause problems when read into memory, it might be a good idea to take @Alok's advice and use enumerate().
To Conclude:
fileobject.readlines()
or for line in fileobject
as a quick solution for small files. linecache
for a more elegant solution, which will be quite fast for reading many files, possible repeatedly.enumerate()
for files which might be very large, and won't fit into memory. Note that using this method might slow because the file is read sequentially.If you don't mind importing then fileinput does exactly what you need (this is you can read the line number of the current line)
If the file to read is big, and you don't want to read the whole file in memory at once:
fp = open("file")
for i, line in enumerate(fp):
if i == 25:
# 26th line
elif i == 29:
# 30th line
elif i > 29:
break
fp.close()
Note that i == n-1
for the n
th line.
You can do a seek() call which positions your read head to a specified byte within the file. This won't help you unless you know exactly how many bytes (characters) are written in the file before the line you want to read. Perhaps your file is strictly formatted (each line is X number of bytes?) or, you could count the number of characters yourself (remember to include invisible characters like line breaks) if you really want the speed boost.
Otherwise, you do have to read every line prior to the line you desire, as per one of the many solutions already proposed here.
def getitems(iterable, items):
items = list(items) # get a list from any iterable and make our own copy
# since we modify it
if items:
items.sort()
for n, v in enumerate(iterable):
if n == items[0]:
yield v
items.pop(0)
if not items:
break
print list(getitems(open("/usr/share/dict/words"), [25, 29]))
# ['Abelson\n', 'Abernathy\n']
# note that index 25 is the 26th item
f = open(filename, 'r')
totalLines = len(f.readlines())
f.close()
f = open(filename, 'r')
lineno = 1
while lineno < totalLines:
line = f.readline()
if lineno == 26:
doLine26Commmand(line)
elif lineno == 30:
doLine30Commmand(line)
lineno += 1
f.close()
I prefer this approach because it's more general-purpose, i.e. you can use it on a file, on the result of f.readlines()
, on a StringIO
object, whatever:
def read_specific_lines(file, lines_to_read):
"""file is any iterable; lines_to_read is an iterable containing int values"""
lines = set(lines_to_read)
last = max(lines)
for n, line in enumerate(file):
if n + 1 in lines:
yield line
if n + 1 > last:
return
>>> with open(r'c:\temp\words.txt') as f:
[s for s in read_specific_lines(f, [1, 2, 3, 1000])]
['A\n', 'a\n', 'aa\n', 'accordant\n']
A fast and compact approach could be:
def picklines(thefile, whatlines):
return [x for i, x in enumerate(thefile) if i in whatlines]
this accepts any open file-like object thefile
(leaving up to the caller whether it should be opened from a disk file, or via e.g a socket, or other file-like stream) and a set of zero-based line indices whatlines
, and returns a list, with low memory footprint and reasonable speed. If the number of lines to be returned is huge, you might prefer a generator:
def yieldlines(thefile, whatlines):
return (x for i, x in enumerate(thefile) if i in whatlines)
which is basically only good for looping upon -- note that the only difference comes from using rounded rather than square parentheses in the return
statement, making a list comprehension and a generator expression respectively.
Further note that despite the mention of "lines" and "file" these functions are much, much more general -- they'll work on any iterable, be it an open file or any other, returning a list (or generator) of items based on their progressive item-numbers. So, I'd suggest using more appropriately general names;-).
@OP, you can use enumerate
for n,line in enumerate(open("file")):
if n+1 in [26,30]: # or n in [25,29]
print line.rstrip()
Here's my little 2 cents, for what it's worth ;)
def indexLines(filename, lines=[2,4,6,8,10,12,3,5,7,1]):
fp = open(filename, "r")
src = fp.readlines()
data = [(index, line) for index, line in enumerate(src) if index in lines]
fp.close()
return data
# Usage below
filename = "C:\\Your\\Path\\And\\Filename.txt"
for line in indexLines(filename): # using default list, specify your own list of lines otherwise
print "Line: %s\nData: %s\n" % (line[0], line[1])