My program is experiencing a nasty drop in performance. It is basically a pair of nested for loops that performs an operation on a pair of data sets and then writes the result. The problem is, after about 500 of the 300,000 pairs, it slows from 0.07 seconds/pair to 5 seconds/pair, and CPU usage drops from nearly 100% to ~4%. All memory used throughout is allocated before the nested loops and freed after the loops.
Here's pseudocode so you can hopefully get the idea:
    for (i = 0; i < 759; i++) {
        read_binary_data(data_file_1, data_1);
        read_binary_header(header_file_1, header_1);
        for (j = i + 1; j < 760; j++) {
            read_binary_data(data_file_2, data_2);
            read_binary_header(header_file_2, header_2);
            do_operation(data_1, data_2, out_data);
            update_header_data(header_1, header_2, out_header);
            write_binary_data_and_header(out_data, out_header);
        }
    }
I've put timing flags at the beginning and end of the inner for loop to get the timings quoted above, but I was wondering if there might be better debugging options to show me why the operation is slowing down. The only thought I've had so far is file system blocking, but I only open 5-6 files on each run, and each is closed at the end of its subroutine.
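One thing I could do is time each phase of the inner loop separately, so the report shows whether the reads, the computation, or the writes are the part that degrades. Here's a minimal sketch of that idea, assuming POSIX clock_gettime is available; read_pair, do_op, and write_out are just stand-ins for my real subroutines:

    #include <stdio.h>
    #include <time.h>

    /* Monotonic wall-clock time in seconds. */
    static double now_sec(void) {
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return ts.tv_sec + ts.tv_nsec / 1e9;
    }

    /* Placeholder stubs for the real subroutines. */
    static void read_pair(void)  { /* read_binary_data + read_binary_header */ }
    static void do_op(void)      { /* do_operation + update_header_data */ }
    static void write_out(void)  { /* write_binary_data_and_header */ }

    int main(void) {
        double t_read = 0, t_op = 0, t_write = 0, t0;
        long pairs = 0;
        int i, j;

        for (i = 0; i < 759; i++) {
            for (j = i + 1; j < 760; j++) {
                t0 = now_sec(); read_pair();  t_read  += now_sec() - t0;
                t0 = now_sec(); do_op();      t_op    += now_sec() - t0;
                t0 = now_sec(); write_out();  t_write += now_sec() - t0;

                /* Report and reset every 100 pairs so the onset of the
                   slowdown shows up in whichever phase owns it. */
                if (++pairs % 100 == 0) {
                    printf("%ld pairs: read %.2fs  op %.2fs  write %.2fs\n",
                           pairs, t_read, t_op, t_write);
                    t_read = t_op = t_write = 0;
                }
            }
        }
        return 0;
    }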
Update at 10:15 P.M. Pacific time:
After various tests, I've found the culprit seems to be in the read_binary_data portion. It can take over 3 seconds for many files. I'm going to attempt to pack all of the binary data into one file and read it all at once so I only need the one read. I'm betting I'll run out of memory, but it's worth a shot, and if that happens, I'll just be less ambitious and try to fit fewer than 760 * 2 * 31 * 43201 floats in an array at a time (I figure that's about 2 billion floats, so around 8 GB at 4 bytes per float?).
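Something like this is what I have in mind; the file name all_data.bin and the hard-coded count are just placeholders, and the allocation may well fail at this size:

    #include <stdio.h>
    #include <stdlib.h>

    int main(void) {
        /* 760 * 2 * 31 * 43201 floats is about 2.04 billion values,
           or roughly 8 GB at 4 bytes each -- malloc may well fail. */
        const size_t n_floats = 760UL * 2 * 31 * 43201;
        float *all = malloc(n_floats * sizeof *all);
        if (!all) { perror("malloc"); return 1; }

        /* "all_data.bin" is a placeholder name for the packed file. */
        FILE *f = fopen("all_data.bin", "rb");
        if (!f) { perror("fopen"); free(all); return 1; }

        /* One sequential read replaces ~300,000 open/seek/read cycles. */
        if (fread(all, sizeof *all, n_floats, f) != n_floats)
            fprintf(stderr, "short read\n");
        fclose(f);

        /* ... the nested loops then index into `all`
           instead of re-reading files ... */

        free(all);
        return 0;
    }

If the allocation does fail, mmap()'ing the packed file instead would let the OS page the data in on demand, without holding all ~8 GB in memory at once.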