ansaurus

Question

Answer 1

+6 A:

Generators are a good idea, but you seem to miss the biggest problem:

(str for str in splitter.split(open(fn).read()) if str and str <> 'A' and str <> 'S')

You're reading the whole file in at once even if you only need to work with bits at a time. You're code is too complicated for me to fix, but you should be able to use file's iterator for your task:

(line for line in open(fn))

2009-06-11 01:39:01

Agreed. The code is pretty convoluted. Any chance you (Blair) could rewrite the code to parse the file a line at time using the file "readline()" function. That might alleviate some problems if "data.txt" or other files are large.

DoxaLogos 2009-06-11 03:33:26

Ah, ok--thank you! I didn't know the I could use the file's iterator. That is very helpful.

twneale 2009-06-11 22:16:26

Answer 2

+1 A:

Try to comment out add(record) to see if the problem is in your code or on the database side. All the records are added in one transaction (if supported) and maybe this leads to a problem if it get too many records. If commenting out add(record) helps, you could try to call commit() from time to time.

sth 2009-06-11 02:00:43

Terrific, thank you very much. I changed the code to commit after each add and the problem seems to be solved. Fantastico!

twneale 2009-06-11 22:17:56

Answer 3

+1 A:

This isn't a Python memory issue, but perhaps it's worth thinking about. The previous answers make me think you'll sort that issue out quickly.

I wonder about the rollback logs in MySQL. If a single transaction is too large, perhaps you can checkpoint chunks. Commit each chunk separately instead of trying to rollback a 15MB file's worth.

duffymo 2009-06-11 02:08:53

Thanks for your insight--this appears to have been a part of the problem. I changed my code to commit after adding each small record and the problem seemsto have disappeared.

twneale 2009-06-11 22:17:15

ansaurus

tags:

views:

answers:

Memory Error in python

related questions