ansaurus

Question

Help removing items from a text file using python

Answer 1

A:

Your file is organized in an unfortunate manner for Pythonic processing.

Note that when you call reader.read(), you are reading the entire file into memory. Let's say this takes up X bytes.

Calling split will add roughly another X bytes of memory usage (plus per-object overhead), as it creates a new string for each whitespace-separated token in the file.

Then you call row[:x] and row[x:], which allocate two more lists (the slice operator makes a shallow copy: the strings themselves are shared, but every element still costs a new reference).

Then you call zip, and make a list comprehension, etc, etc. Strings and tuples are immutable data, which means you are always creating them from scratch.
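To see these costs on your own machine, here is a rough sketch. The token data is synthetic (a stand-in for the real file), and the sys.getsizeof figures are specific to CPython, so treat the numbers as indicative only. It also checks one nuance: slicing a list creates a new list, but the string objects inside it are shared rather than duplicated.

```python
import sys

# Synthetic stand-in for the file contents: 100,000 short tokens.
text = " ".join("name%d" % i for i in range(100_000))

tokens = text.split()                # one new str object per token
x = len(tokens) // 2
head, tail = tokens[:x], tokens[x:]  # two new lists, same str objects

print("text:        ", sys.getsizeof(text))
print("token strs:  ", sum(sys.getsizeof(t) for t in tokens))
print("token list:  ", sys.getsizeof(tokens))
print("slice lists: ", sys.getsizeof(head) + sys.getsizeof(tail))
print("shared strs: ", head[0] is tokens[0])
```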

I would approach this problem at a lower level. Open one file descriptor and point it to the beginning of the file. Open another and have it seek to the beginning of the (na/0/1/2) values (you will know where this is by counting the spaces). Now, read one name and one value at a time, and if the value is not "na" you can write the name to an output file. If you need to write the values to the output file also, hold them in memory and write them all at once when you are done.

Unfortunately this will be more difficult to code than just using the high-level functions that Python provides (you will need to write code that operates at the character level), but as you have seen there is a price to pay for those high-level functions.
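A minimal sketch of that two-handle approach follows. It assumes the file is a run of whitespace-separated names followed by the same number of whitespace-separated values, and that you already know how many names there are (n_names). The function names (read_tokens, token_offset, filter_names) are made up for illustration and are not part of any library.

```python
import itertools


def read_tokens(f, chunk_size=64 * 1024):
    """Yield whitespace-separated tokens from a binary file, one chunk at a time."""
    tail = b""
    while True:
        chunk = f.read(chunk_size)
        if not chunk:
            break
        buf = tail + chunk
        parts = buf.split()
        # If the buffer does not end in whitespace, the last token may
        # continue in the next chunk, so hold it back.
        if parts and not buf[-1:].isspace():
            tail = parts.pop()
        else:
            tail = b""
        yield from parts
    if tail:
        yield tail


def token_offset(path, n, chunk_size=64 * 1024):
    """Return the byte offset at which token number n (0-based) starts."""
    count = 0
    pos = 0
    prev_space = True
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                raise ValueError("file has fewer than n + 1 tokens")
            for i in range(len(chunk)):
                is_space = chunk[i:i + 1].isspace()
                if prev_space and not is_space:  # a token starts here
                    if count == n:
                        return pos + i
                    count += 1
                prev_space = is_space
            pos += len(chunk)


def filter_names(path, n_names, out_path):
    """Write every name whose matching value is not 'na', one per line."""
    values_start = token_offset(path, n_names)
    with open(path, "rb") as names_f, open(path, "rb") as values_f, \
            open(out_path, "wb") as out:
        values_f.seek(values_start)  # second handle starts at the values
        names = itertools.islice(read_tokens(names_f), n_names)
        values = read_tokens(values_f)
        for name, value in zip(names, values):
            if value != b"na":
                out.write(name + b"\n")
```

Calling filter_names("data.txt", 3, "kept.txt") on a file containing "a b c na 1 2" would write b and c to kept.txt (both file names here are placeholders). Memory use stays near the chunk size regardless of file size, at the cost of a slow byte-by-byte scan in token_offset.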

danben 2010-08-02 15:46:04

Answer 2

+1 A:

What you should do is break your file up into two separate files. Your logic should do something like this:

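1. Open the data file.
2. Open the name file.
3. Read the next item.
4. Is it a name? Go to 5. Otherwise go to 6.
5. Write the name to the name file, then go to 3.
6. Is it a number or na? Close the name file, open the number file, and write the item to it.
7. Read the next item.
8. Is it a number or na? Write it to the number file and go to 7. Otherwise, finish writing the file.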
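The steps above can be sketched roughly as follows. This assumes items are separated by whitespace, that a value is either "na" or a string of digits, and that no name looks like a value. The helper names (iter_tokens, is_value, split_file) are hypothetical.

```python
def iter_tokens(f, chunk_size=64 * 1024):
    """Yield whitespace-separated tokens from a text file, chunk by chunk."""
    tail = ""
    for chunk in iter(lambda: f.read(chunk_size), ""):
        buf = tail + chunk
        parts = buf.split()
        # Hold back a possibly incomplete final token until the next chunk.
        tail = parts.pop() if parts and not buf[-1].isspace() else ""
        yield from parts
    if tail:
        yield tail


def is_value(token):
    """Assumption: values are 'na' or plain digits; names are anything else."""
    return token == "na" or token.isdigit()


def split_file(data_path, names_path, numbers_path):
    """Write names and values to separate files, one item per line."""
    with open(data_path) as data, open(names_path, "w") as names, \
            open(numbers_path, "w") as numbers:
        in_values = False
        for token in iter_tokens(data):
            # Once the first value appears, everything after it is a value.
            if not in_values and is_value(token):
                in_values = True
            target = numbers if in_values else names
            target.write(token + "\n")
```

Writing one item per line is what lets the two output files be read back in step later.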

Once you have your files split into two pieces, you can iterate over them together:

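# 'output.txt' is a placeholder name for the result file
with open('names.txt') as names, open('numbers.txt') as numbers, \
        open('output.txt', 'w') as output:
    # each file holds one entry per line, so zip pairs them up
    for name, number in zip(names, numbers):
        name, number = name.strip(), number.strip()
        if number != 'na':  # skip names that have no value
            output.write(name + " " + number + "\n")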

or you could write to two different files and then join them together if that's what you need.

Wayne Werner 2010-08-02 15:50:16

Since it appears that his data is a huge list of names followed by a huge list of numbers, he could probably even do the splitting up in a good text editor. It is also worth noting that this approach requires the names and numbers files to hold one entry per line.

Wilduck 2010-08-02 15:54:02

Can you recommend a good text editor?

Robert A. Fettikowski 2010-08-02 16:38:16

Any of them? Notepad++ is a simple one for beginners. I personally use Vim (www.vim.org), which has a pretty steep learning curve but is incredibly useful once you get it down.

Wayne Werner 2010-08-02 17:48:05
