Our daily feed file averages 2 GB in size. At the end of each month, these files are archived into a single zip file and stored on a network share. From time to time, I need to search for certain records in those files. I do this by connecting to the shared server over Remote Desktop, unzipping the files to a temp folder, running a grep (or PowerShell) search, and then deleting the temp folder. Now that our server is running low on disk space, unzipping them all to a temp folder is no longer recommended. What is an efficient way to do a regex search on those zipped files with minimal impact on disk or network resources?

+1  A: 

There are some zip-related cmdlets in the PowerShell Community Extensions (PSCX). I don't think they would do what you want, however (I could be entirely wrong about that, though). Instead, I would use the .NET zip library DotNetZip, which lets you list the names of the files in an archive and then extract just the ones you want.
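For illustration only, here is a minimal sketch of the same "list the names, then read just the ones you want" idea. It uses Python's standard `zipfile` module rather than DotNetZip, so it is a swapped-in technique and not the library named above. The function name `search_zip`, the `name_glob` parameter, and the UTF-8 assumption are all hypothetical choices made for this example.

```python
import fnmatch
import io
import re
import zipfile


def search_zip(zip_path, pattern, name_glob="*"):
    """Yield (member_name, line_number, line) for each line matching pattern.

    Members are decompressed as a stream, so nothing is written to disk.
    """
    regex = re.compile(pattern)
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            # Skip directories and members whose names do not match the glob.
            if info.is_dir() or not fnmatch.fnmatch(info.filename, name_glob):
                continue
            with archive.open(info) as raw:
                # Assumption: the feed files are UTF-8 text. Undecodable
                # bytes are replaced instead of raising an error.
                text = io.TextIOWrapper(raw, encoding="utf-8", errors="replace")
                for line_number, line in enumerate(text, start=1):
                    if regex.search(line):
                        yield info.filename, line_number, line.rstrip("\r\n")


if __name__ == "__main__":
    # Hypothetical archive name and pattern, for illustration only.
    for name, number, line in search_zip("feeds.zip", r"myRegex", "*.txt"):
        print(f"{name}:{number}:{line}")
```

Because each member is read line by line, memory use should stay small even for 2 GB files. Running it on the server that holds the archive would presumably avoid pulling the compressed data across the network.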

EBGreen
+6  A: 

zgrep on Linux. If you're on Windows, you can download GnuWin, which contains a Windows port of zgrep.

Mark
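And just for clarity, it searches `gzip` files. It relies on `gzip` for decompression, which as far as I know reads a “regular” zip file only when the archive holds a single member, so a multi-file zip may need something like `unzip -p archive.zip | grep "myRegex"` instead.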
Nate
A: 

The PowerShell Community Extensions (PSCX) include Read-Archive and Expand-Archive cmdlets, but don't (yet?) include a navigation provider, which would make what you want very simple. That said, you could use Read-Archive and Expand-Archive. Something like this untested snippet

Read-Archive -Path foo.zip -Format Zip |
    Where-Object { $_.Name -like "*.txt" } |
    Expand-Archive -PassThru |
    Select-String "myRegex"

would let you search without extracting the entire archive.

Scott Weinstein