views: 242
answers: 3

+2  Q:

Page File Usage

I run a script that does text manipulation on the file system.

The script runs on text files (.h, .cpp).

As the script runs, I see the PF usage increase until it reaches the amount of virtual memory allocated for the page file.

Is there a way to flush the VM during the run, or after it?

I have opened another question about it (I thought it was a different issue): http://stackoverflow.com/questions/1519948/sed-command-for-substitue

A: 

No, but maybe you can change the script to consume less memory.

Update: I have tried to reproduce the problem on Linux, using a Bash equivalent of the script listed in the other question:

while read -r fileName; do

    echo
    echo "-----------------------------------------------"
    echo "For file $fileName:"

    while read -r matchItem; do
        echo "Searching for $matchItem"
        echo
        # Quote $fileName so file names with spaces survive word splitting
        sed -i "s/$matchItem/XXXXXXXXX $matchItem XXXXXXXXXXXXXX/" "$fileName"
    done < allFilesWithH.txt

done < all.txt

For the test I used fragments of a protein sequence database (a large text file in FASTA format, up to 74 MB) and short peptide sequences as match items (such that there were at least 10 replacements per file). While the script is running, no process uses any significant memory (as I would expect), and CPU load is on the order of 50%. Thus I cannot reproduce the problem.
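The original script spawns one sed process per (file, match item) pair, so the process churn grows with the product of the two lists. A minimal lower-overhead sketch, assuming GNU sed and match items that contain no commas or sed metacharacters (replace.sed is just a scratch file name chosen for illustration): generate a single sed script from the match items once, then run sed once per file.

    # Turn each match item into one substitution command; for the item
    # "foo" this emits: s,foo,XXXXXXXXX foo XXXXXXXXXXXXXX,
    sed 's/.*/s,&,XXXXXXXXX & XXXXXXXXXXXXXX,/' allFilesWithH.txt > replace.sed

    # Apply every substitution in a single sed invocation per file
    while read -r fileName; do
        sed -i -f replace.sed "$fileName"
    done < all.txt

This performs the same replacements with one process per file instead of one per file per match item.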

Peter Mortensen
It consumes memory because it works on a large number of files. The script runs a single command (sed). Is there a memory leak problem in sed? P.S. I work with sed in the VxWorks development shell.
Asaf
@Asaf: No, sed was designed when systems were very constrained in memory. It might be your script, but it is difficult to tell without knowing more about it.
Peter Mortensen
I have added a link in the question
Asaf
The script is a batch file (BAT), and the problem was that the PF usage went too high during the run. It reached the maximum, and I wasn't able to run more sed commands.
Asaf
+1  A: 

Chunk or batch your operations so that you use memory more efficiently instead of loading everything at once. If none of your files is large, limit the number of threads that load text from these files into memory. If you are using large files, break them up so they can be processed within the memory you have.
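As a hedged sketch of one way to batch, assuming a GNU userland, file names without whitespace, and the replace.sed script generated as in the answer above: let xargs hand sed the file list in fixed-size batches, so each invocation does a bounded amount of work and then exits, returning its memory to the system.

    # Process 50 files per sed invocation; a fresh process is started
    # for each batch, so nothing accumulates over the whole run
    xargs -n 50 sed -i -f replace.sed < all.txt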

Chris Ballance
Please see the way I use the script in the comment above. I can't chunk or batch it. The reason it consumes memory is that I run it on a very large number of files.
Asaf
At some point you are not releasing the memory you are using. Break the process into manageable pieces and your memory problems will become manageable. Worst-case scenario, you can kill the process and hopefully the GC will flush the memory for you.
Chris Ballance
A: 

The pagefile is a system resource and cannot be manipulated by any user process. In this case the pagefile increasing in size is simply a symptom of an application problem: the application is exceeding the commit limit. You must deal with the problem, not the symptom.

Larry Miller