I have a Python program that processes fairly large NumPy arrays (in the hundreds of megabytes), which are stored on disk in pickle files (one ~100MB array per file). When I want to run a query on the data I load the entire array, via pickle, and then perform the query (so that from the perspective of the Python program the entire array is in memory, even if the OS is swapping it out). I did this mainly because I believed that being able to use vectorized operations on NumPy arrays would be substantially faster than using for loops through each item.
I'm running this on a web server which has memory limits that I quickly run up against. I have many different kinds of queries that I run on the data so writing "chunking" code which loads portions of the data from separate pickle files, processes them, and then proceeds to the next chunk would likely add a lot of complexity. It'd definitely be preferable to make this "chunking" transparent to any function that processes these large arrays.
It seems like the ideal solution would be something like a generator which periodically loaded a block of the data from the disk and then passed the array values out one by one. This would substantially reduce the amount of memory required by the program without requiring any extra work on the part of the individual query functions. Is it possible to do something like this?