I have an application that analyzes data from input files generated by our core system. Depending on the client, the file size can vary (the files contain online marketing metrics such as clicks, impressions, etc.). One of our clients has a website that gets a fairly large amount of traffic, and the metric files generated for it are around 3-4 MB in size. The application currently analyzes three files at a time, each file being a different time aggregate.
I'm reading in each file using a CSV iterator and storing the contents of the entire file in a multi-dimensional array. The array for one of the files is around 16,000 rows long, with 31 elements per subarray. The data-processor object that loads this data uses about 50 MB of memory. The PHP memory limit is currently set to 100 MB. Unfortunately, the server this application runs on is old and can't handle much of a memory increase.
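To make the setup concrete, here is a simplified sketch of what the loader effectively does (the file name and the `die()` error handling are just placeholders for the example, not my actual code):

```php
<?php
// Simplified version of the current loading step: the whole file
// ends up in $rows before any calculation runs.
$rows = [];
$handle = fopen('metrics_daily.csv', 'r'); // placeholder file name
if ($handle === false) {
    die('Could not open file');
}
while (($line = fgetcsv($handle)) !== false) {
    $rows[] = $line; // ~16,000 rows x 31 columns all held in memory at once
}
fclose($handle);

// ... analysis then iterates over the full $rows array ...
```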
So this brings me to the question: how can I optimize processing a file this size?
Could a possible optimization be to read in part of the file, calculate, store the results, and repeat, something like the sketch below?
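Roughly what I have in mind (the column indexes and metric names are assumptions just for illustration; my real files have 31 columns):

```php
<?php
// Sketch of the "read, calculate, discard, repeat" idea: keep only
// running totals instead of the full row set.
$totals = ['clicks' => 0, 'impressions' => 0, 'rows' => 0];

$handle = fopen('metrics_daily.csv', 'r'); // placeholder file name
if ($handle === false) {
    die('Could not open file');
}
while (($row = fgetcsv($handle)) !== false) {
    // Assumed layout for the example: column 0 = clicks, column 1 = impressions.
    $totals['clicks']      += (int) $row[0];
    $totals['impressions'] += (int) $row[1];
    $totals['rows']++;
    // $row is overwritten on the next iteration, so memory use stays
    // roughly constant no matter how large the file is.
}
fclose($handle);

printf("CTR: %.4f\n", $totals['clicks'] / max(1, $totals['impressions']));
```

Would restructuring the processing this way be the right direction, or is there a better approach for files this size?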