My problem is quite simple: I have a 400MB file containing 10,000,000 lines of data. I need to iterate over each line, do something with it, and then drop the line from memory so that I don't fill up too much RAM.
Since my machine has several processors, my initial idea for speeding this up was to create two separate processes. One would read the file several lines at a time and gradually fill a list (each element of the list being one line of the file). The other would have access to this same list, pop() elements off it, and process them. This would effectively give a list that grows from one end and shrinks from the other.
In other words, this mechanism would implement a buffer that is constantly populated with lines for the second process to crunch. But maybe this is no faster than simply using:
for line in open('/data/workfile', 'r'):
    pass  # do something with the line
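To make the idea concrete, here is a rough sketch of what I had in mind, except using a multiprocessing.Queue as the shared buffer instead of a plain list (as far as I know, two processes cannot share an ordinary list directly); read_lines and handle_lines are just placeholder names, not code I actually have:

import multiprocessing as mp

def read_lines(path, queue):
    # Producer: push each line of the file onto the shared queue.
    with open(path, 'r') as f:
        for line in f:
            queue.put(line)
    queue.put(None)  # sentinel telling the consumer there is nothing left

def handle_lines(queue):
    # Consumer: pop lines off the queue and process them one by one.
    while True:
        line = queue.get()
        if line is None:
            break
        # do something with the line here

if __name__ == '__main__':
    # maxsize bounds the buffer so the reader cannot get too far ahead of the worker
    q = mp.Queue(maxsize=10000)
    producer = mp.Process(target=read_lines, args=('/data/workfile', q))
    consumer = mp.Process(target=handle_lines, args=(q,))
    producer.start()
    consumer.start()
    producer.join()
    consumer.join()

The maxsize argument is what would keep the buffer from growing without bound if the reader outpaces the worker. Is this kind of setup actually worth it here, or is the plain for loop just as fast?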