I have a Windows batch file that processes all the files in a given directory. I have 206,783 files I need to process:

for %%f in (*.xml) do call :PROCESS %%f
goto :STOP

:PROCESS
:: do something with the file
program.exe %1 > %1.new
set /a COUNTER=%COUNTER%+1
goto :EOF

:STOP
@echo %COUNTER% files processed

When I run the batch file, the following output is written:

65535 files processed

As part of the processing, an output file is created for each file processed, with a .new extension. When I run dir *.new, it reports that 65,535 files exist.

So, it appears my command environment has a hard limit on the number of files it can recognize, and that limit is 64K - 1.

  1. Is there a way to extend the command environment to manage more than 64K - 1 files?
  2. If not, would a VBScript or JavaScript be able to process all 206,783 files?

I'm running on Windows 2003 server, Enterprise Edition, 32-bit.


UPDATE

It looks like the root cause of my issue was with the built-in Windows "extract" command for ZIP files.

The files I have to process were copied from another system via a ZIP file. My server doesn't have a ZIP utility installed, just the native Windows commands. I right-clicked on the ZIP file, and did an "Extract all...", which apparently just extracted the first 65,535 files.

I downloaded and installed 7-zip onto my server, unzipped all the files, and my batch script worked as intended.
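For anyone hitting the same symptom: a quick way to check whether the extraction (rather than the batch loop) is the culprit is to count the files directly, using the standard line-counting idiom with find:

```batch
rem Count matching files without relying on dir's summary line;
rem find /c /v "" counts every line piped into it.
dir /b *.xml | find /c /v ""
```

If this number already falls short of the expected total, the files were never extracted in the first place.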

A: 
  1. If program.exe is your own program, you can refactor it to take in arguments, so that you can do away with the for loop.
  2. You can store your output files in different directories instead of creating them all in the same directory.
  3. You can group your outputs into categories, so you have fewer output files to deal with.
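A minimal sketch of suggestion 2, assuming an out subdirectory is an acceptable place for the .new files (the directory name is arbitrary):

```batch
@echo off
rem Create the output directory if it doesn't exist yet,
rem then write each result there instead of alongside the sources.
if not exist out mkdir out
for %%f in (*.xml) do program.exe "%%f" > "out\%%f.new"
```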
ghostdog74
Side note to 1: The program should accept wildcards in that case and not just a list of file names (which would end even earlier due to the 8190 character limit for command lines in batch files).
Joey
A: 

Two options:

1) I suggest adding a "move" after the .exe processing, so that your batch file can be relaunched and will process only the files still in the original directory. This is a good idea regardless of the actual size limit, so you don't risk having to reprocess everything if your batch is interrupted, the power goes off, etc.
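A sketch of option 1 applied to the question's :PROCESS routine, assuming a done subdirectory (a hypothetical name) has been created to hold the processed originals:

```batch
:PROCESS
rem Process the file, then move the original out of the way so a
rem relaunch of the script only sees files not yet handled.
program.exe %1 > %1.new
move %1 done\ >nul
set /a COUNTER=%COUNTER%+1
goto :EOF
```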

2) Use another scripting language, like a Windows Perl interpreter, or maybe WSH.

p.marino
+1  A: 

Another option might be to iterate over the output of dir instead of directly over the files. I usually hate it when people do this, but apparently there are limitations to standard iterating idioms.

for /f "delims=" %%f in ('dir /b *.xml') do call :PROCESS %%f 

I'm currently trying this out, but it might take a while; just filled a directory with 100k files.

But keep in mind that iterating over the output of a command has problems with Unicode if you're using raster fonts, so make sure that your console window has Lucida Console or another TrueType font set. Otherwise, Unicode characters get resolved to question marks or their closest equivalent in the current code page, and the program then won't find the file.

ETA: This can't be the issue, apparently. Both your code and my testing code which iterates over dir output process 300k files on both Windows Server 2k3 R2, 32 bit and Windows 7.

Joey
+1: I had the same idea and tested it for some 300K files. Works like expected.
Frank Bollack
You'll get even better performance, if you write the output of the `DIR` command to a file and then process its content.
Frank Bollack
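Frank's file-based variant might look like this sketch (filelist.txt is just a scratch file name):

```batch
rem Capture the directory listing once, then iterate over the file.
rem usebackq lets the file name in the IN clause be double-quoted.
dir /b *.xml > filelist.txt
for /f "usebackq delims=" %%f in ("filelist.txt") do call :PROCESS "%%f"
```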
@Frank and @Johannes; you're both right. It turns out the root cause is an issue with the built-in Windows "extract" command for ZIP files. The files I have to process are in a ZIP file; my server doesn't have a ZIP utility installed, just the native Windows commands. Apparently "Extract All" only extracts the first 65,535 files. I installed 7-zip, unzipped all the files, and my batch script worked as intended.
Patrick Cuff
I'm going to accept this as the answer; wish I could give @Frank some credit as well. Thanks for looking into this guys, you steered me in a direction that helped resolve my issue.
Patrick Cuff
@Patrick: Well, I once saw someone give upvotes on another person's questions/answers as thanks for a helpful answer given in a comment.
Joey