views:

308

answers:

3

I have a file which has about 12 million lines; each line looks like this:

0701648016480002020000002030300000200907242058CRLF

What I'm trying to accomplish is adding a row number before the data on each line; the numbers should have a fixed length.

The idea behind this is to be able to do a bulk insert of this file into a SQL Server table, and then perform certain operations with it that require each line to have a unique identifier. I've tried doing this on the database side, but I haven't been able to get good performance (under 4 minutes at least; under 1 minute would be ideal).

Right now I'm trying a solution in Python that looks something like this:

file = open('file.cas', 'r')
lines = file.readlines()  # reads all 12 million lines into memory at once
file.close()
text = ['%d %s' % (i, line) for i, line in enumerate(lines)]
output = open('output.cas', 'w')
output.writelines(text)  # writelines takes the list directly; no join needed
output.close()

I don't know if this will work, but it'll give me an idea of how it will perform and what the side effects are before I keep trying new things. I also thought about doing it in C so I'd have better control over memory.

Would it help to do it in a low-level language? Does anyone know a better way to do this? I'm pretty sure it has been done before, but I haven't been able to find anything.

Thanks.

+4  A: 

Oh god no, don't read all 12 million lines in at once! If you're going to use Python, at least do it this way:

file = open('file.cas', 'r')
try:
    output = open('output.cas', 'w')
    try:
        output.writelines('%d %s' % tpl for tpl in enumerate(file))
    finally:
        output.close()
finally:
    file.close()

That uses a generator expression which runs through the file, processing one line at a time.
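
For comparison, here's a minimal sketch of the same streaming approach using with blocks (assuming Python 2.6+, where with closes the files automatically):

# Same one-line-at-a-time idea; `with` handles the closing for us
with open('file.cas', 'r') as infile:
    with open('output.cas', 'w') as outfile:
        for i, line in enumerate(infile):
            outfile.write('%d %s' % (i, line))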

David Zaslavsky
Haha, thank you. Is there a way to read a file in Python in parts, or to work on one in a more low-level manner? This is why I was thinking about doing it in C. Thanks!
Alan FL
David's method does read the file by parts - it reads and writes one line at a time.
Andy Balaam
Yeah, Andy's right - though the fact that it reads one line at a time is deeply disguised in Python voodoo ;-) Or at least it looks like voodoo if you're not used to it.
David Zaslavsky
I feel a great disturbance in the force, as if a million RAM chips wept in horror, and were suddenly silenced.
Stefano Borini
Haha... wow, I'm always amazed by Python. I think I sort of get it. Any ideas how I could change it so the numbers have a fixed length? Thanks a lot, by the way.
Alan FL
I think for now I'll use your solution; it doesn't take too long, and now that I sort of understand how it works it might be more maintainable. Thanks a lot!
Alan FL
For a fixed length, I think it's `%12d` for 12 places (or `%9d` for 9 places, etc.).
David Zaslavsky
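
A quick illustration of the difference (the zero-padded `%012d` form is an assumption about what "fixed length" should mean here; `%12d` pads with spaces instead):

>>> '%12d' % 42
'          42'
>>> '%012d' % 42
'000000000042'

So, if leading zeros are what you want, the line in the answer above would become output.writelines('%012d %s' % tpl for tpl in enumerate(file)).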
+2  A: 

Why don't you try `cat -n`?

Stefano Borini
I will definitely try it, but for now I can't use cat.
Alan FL
OK, then the solution David proposes is good, although, as plafayette points out, it could be slower than cat.
Stefano Borini
I just finished running a few tests with a smaller sample (2M records), and both solutions (yours and David's) take exactly the same time to run. Tomorrow I'll test what happens with 12M, but I'm not expecting much of a difference. Thank you!
Alan FL
This is interesting... it means Python is as fast as C for this (it doesn't surprise me, since the task is I/O-bound, but it's nice to see it in practice).
Stefano Borini
+2  A: 

Stefano is right:

$ time cat -n file.cas > output.cas

Use time just so you can see how fast it is. It'll be faster than Python, since cat is pure C code.
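
One detail worth checking first (this is how GNU cat behaves, at least): cat -n right-aligns the line number in a six-character field followed by a tab, which may not match the fixed-length, zero-padded numbering the question seems to want:

$ printf 'first\nsecond\n' | cat -n
     1	first
     2	second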

Pierre-Antoine LaFayette