views: 37
answers: 3
Question: How do you write data to the beginning of an already existing file, without overwriting what's already there and without reading the entire file into memory? (i.e. prepend)


Info:

I'm working on a project right now where the program frequently dumps data into a file. This file will very quickly balloon up to 3-4 GB. I'm running the simulation on a computer with only 768 MB of RAM. Pulling all that data into RAM over and over would be a great pain and a huge waste of time; the simulation already takes long enough to run as it is.

The file is structured so that the number of dumps it contains is listed at the beginning as a simple value, like 6. Each time the program makes a new dump I want that value incremented, so now it's 7. The problem lies with the 10th, 100th, 1000th, and so on dumps: the program writes the 10 just fine, but it overwrites the first character of the next line:

"9\n580,2995,2083,028\n..."

"10\n80,2995,2083,028\n..."

Obviously, the difference between 580 and 80 in this case is significant, and I can't lose these values. So I need a way to add a little space in there so that I can write the new counter without losing my data, and without having to pull the entire file up and rewrite it.
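A minimal sketch reproducing the clobbering, assuming the counter is rewritten in place at offset 0 (the file name is a placeholder):

    with open("dump.txt", "r+") as f:   # placeholder file name
        f.seek(0)           # jump back to the counter at the start of the file
        f.write("10\n")     # three bytes overwrite "9\n5", so the next
                            # line loses its leading 5: "10\n80,2995,..."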

Basically, what I'm looking for is a kind of prepend function: something to add data to the beginning of a file instead of the end.

Programmed in Python

~n

+2  A: 

See the answers to this question: http://stackoverflow.com/questions/125703/how-do-i-modify-a-text-file-in-python

Summary: you can't do it without reading the file back in. Filesystems only let you overwrite bytes in place or append to the end, never insert at the beginning, so this is a limitation of how the operating system works rather than a Python one.

msanders
A: 

You could quite easily create a new file, write the data you wish to prepend to it, then copy the contents of the existing file and append them to the new one, and finally rename the new file over the old one.

This would prevent having to read the whole file if that is the primary issue.
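A minimal sketch of that approach, copying in fixed-size chunks so that only one chunk is ever in memory at a time (the function name, temp-file suffix, and chunk size are all placeholders):

    import os

    def prepend(path, text, chunk_size=1024 * 1024):
        # Write the new text to a temporary file, stream the old file's
        # contents in after it, then swap the files. The old file is still
        # read once in full, but never held in memory all at once.
        tmp_path = path + ".tmp"
        with open(tmp_path, "w") as tmp, open(path) as src:
            tmp.write(text)                   # the data to prepend goes first
            while True:
                chunk = src.read(chunk_size)  # old contents, one chunk at a time
                if not chunk:
                    break
                tmp.write(chunk)
        os.replace(tmp_path, path)            # rename the new file over the old one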

thomasfedb
The fact that you said "copy" makes me wonder how exactly it works that I'm not reading the entire file. From what you said, it sounds like I may not be taking it into RAM, but if I can avoid as much reading as possible, I'd like to do so.
Narcolapser
Yes, I am still suggesting that you read the entire file, which, as the poster above notes, is the only possible way; however, this method prevents you from having the entire file in memory at once.
thomasfedb
+1  A: 

It's not addressing your original question, but here are some possible workarounds:

  • Use SQLite (it's bundled with your Python; see the sketch below)
  • Use a fancier database, either RDBMS or NoSQL
  • Just track the number of dumps in a different text file

The first couple of options are a little more work up front, but provide more flexibility. The last option is the easiest solution to your current problem.
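For instance, a minimal sketch of the SQLite route, assuming one row per dump (the database, table, and column names are made up); the dump count then comes for free as a row count instead of a counter at the head of a file:

    import sqlite3

    conn = sqlite3.connect("simulation.db")   # placeholder database name
    conn.execute(
        "CREATE TABLE IF NOT EXISTS dumps (id INTEGER PRIMARY KEY, data TEXT)"
    )

    def add_dump(data):
        # One row per dump; "with conn" commits on success, rolls back on error.
        with conn:
            conn.execute("INSERT INTO dumps (data) VALUES (?)", (data,))

    def dump_count():
        # No header to rewrite: just count rows.
        return conn.execute("SELECT COUNT(*) FROM dumps").fetchone()[0]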

Hank Gay
While you didn't exactly answer my question, you did give me the solution to my problem. The idea of using a database instead of a file never occurred to me, but it would provide a much faster, smaller, and in the long run easier system than having to parse all this data out of a text file. Thanks.
Narcolapser