I have a program that converts files. It takes an input file name and an output file name, reads from one and writes to the other, with some modifications. I have the source, but it is huge and I don't have a clear picture of how it operates. I can tell that files are opened using _wsopen_s in read-write mode (_O_RDWR), and there does not seem to be any funny business going on when they are opened.

For some reason, when I try to use this program with a certain large input file, the following happens. First, the physical memory usage and the amount of memory used for the filesystem cache (as reported by the task manager) go up. It seems that, for every byte read and every byte written, memory usage goes up by 2 bytes. Note that this is memory owned by the operating system: the commit size and the working set size of the actual executable stay constant.

Then, as the filesystem cache size approaches the total available memory, the OS starts swapping out running programs into the pagefile, and eventually the system becomes unresponsive.

Authors of the program say that they haven't seen this type of behavior and they don't know why this might be happening.

Before I officially raise this as a bug and try to get them to fix it (which may take a long time, and I need the program working now):

  • Is there anything in the source code that I can look for, some kind of sneaky function call, that could trigger this type of behavior?
  • Can I tell the operating system not to do that?
  • And, more generally, what the heck?

This is happening on x64 Vista Home Premium SP2.

A: 

If the program's memory use is directly proportional to the size of your input file, then it looks like the program tries to read everything into a memory buffer all at once.

A typical way to improve this is to read into a buffer of limited size (200 KB or larger), process that chunk, and write the converted data out, appending it to the output file. This way, the program will not exhaust the system memory.
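A minimal sketch of that bounded-buffer loop in standard C might look like the following. The names `convert_file`, `transform_chunk`, and `CHUNK_SIZE` are hypothetical and not taken from the program in question, and the uppercase transformation is only a placeholder for whatever conversion the real program performs. It also assumes the conversion can be applied to each chunk independently.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical buffer size; any fixed size bounds the process's memory use. */
#define CHUNK_SIZE (256 * 1024)

/* Placeholder conversion (an assumption): uppercase ASCII letters in place. */
static void transform_chunk(unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (buf[i] >= 'a' && buf[i] <= 'z')
            buf[i] = (unsigned char)(buf[i] - 'a' + 'A');
    }
}

/* Copy in_path to out_path one chunk at a time.
 * Returns 0 on success, -1 on any I/O or allocation error. */
int convert_file(const char *in_path, const char *out_path)
{
    assert(in_path != NULL && out_path != NULL);

    FILE *in = fopen(in_path, "rb");
    if (in == NULL)
        return -1;

    FILE *out = fopen(out_path, "wb");
    if (out == NULL) {
        fclose(in);
        return -1;
    }

    unsigned char *buf = malloc(CHUNK_SIZE);
    if (buf == NULL) {
        fclose(in);
        fclose(out);
        return -1;
    }

    int status = 0;
    size_t n;
    /* Only CHUNK_SIZE bytes of file data are held by the process at a time. */
    while ((n = fread(buf, 1, CHUNK_SIZE, in)) > 0) {
        transform_chunk(buf, n);
        if (fwrite(buf, 1, n, out) != n) {
            status = -1;
            break;
        }
    }
    if (ferror(in))
        status = -1;

    free(buf);
    fclose(in);
    if (fclose(out) != 0)
        status = -1;
    return status;
}
```

Note that this only bounds the buffer the process itself allocates; how much of the file the operating system keeps in its own cache is a separate matter that this loop does not control.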

動靜能量
System file caching won't be affected by the size of the reads. As long as there is unused RAM, why not use it? It is strange, however, that caching took priority over the running programs.
ruslik
The program itself does not use memory, the operating system does!