Hello,

I have a .gz file that contains an XML document. I was wondering if anyone knew how to use Zlib properly. So far, I have the following code:

require 'zlib'
Zlib::GzipReader.open('PRIDE_Exp_Complete_Ac_1015.xml.gz') { |gz|
    g = File.new("PRIDE_Exp_Complete_Ac_1015.xml", "w")
      g.write(gz)
      g.close()
}

But this creates a blank .xml document. Does anyone know how I can properly do this?

Thanks

+2  A: 

Zlib::GzipReader works like most IO-like classes do in Ruby. You have an open call, and when you pass a block to it, the block will receive the IO-like object. Think of it as a convenient way of doing something with a file or resource for the duration of the block.

But that means that in your example gz is an IO-like object, and not actually the contents of the gzip file, as you expect. You still need to read from it to get to that. The simplest fix would then be:

g.write(gz.read)

Note that this will read the entire uncompressed contents of the gzip file into memory.
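If the uncompressed XML is too large to hold in memory comfortably, one option is to read it in fixed-size pieces. The following is only a sketch: the helper name gunzip_in_chunks and the 16KB default chunk size are my own choices for illustration, not anything from your code.

```ruby
require 'zlib'

# Hypothetical helper (name and default chunk size are assumptions):
# decompress src_path into dst_path, holding at most chunk_size bytes
# of uncompressed data in memory at a time.
def gunzip_in_chunks(src_path, dst_path, chunk_size = 16 * 1024)
  Zlib::GzipReader.open(src_path) do |gz|
    File.open(dst_path, 'wb') do |out|
      # read(n) returns nil once the end of the stream is reached,
      # which ends the loop
      while (chunk = gz.read(chunk_size))
        out.write(chunk)
      end
    end
  end
end
```

You would call it as gunzip_in_chunks('PRIDE_Exp_Complete_Ac_1015.xml.gz', 'PRIDE_Exp_Complete_Ac_1015.xml').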

If all you're really doing is copying from one file to another, you can use the more efficient IO.copy_stream method. Your example might then look like:

require 'zlib'

Zlib::GzipReader.open('PRIDE_Exp_Complete_Ac_1015.xml.gz') do |gz|
  # File.open (not File.new) yields the file to the block and closes it afterwards
  File.open("PRIDE_Exp_Complete_Ac_1015.xml", "wb") do |g|
    IO.copy_stream(gz, g)
  end
end

Behind the scenes, IO.copy_stream will try to use the sendfile syscall, which is available in some specific situations on Linux. Otherwise, it does the copying in fast C code, 16KB at a time. I learned this from the Ruby 1.9.1 source code.
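If you need this in more than one place, the copy can be wrapped in a small method. This is a minimal sketch, and gunzip_file is a name I made up for illustration. It relies on IO.copy_stream returning the number of bytes it copied, and on the block forms of both open calls returning the block's value.

```ruby
require 'zlib'

# Hypothetical helper (the name is an assumption): stream-decompress
# src_path into dst_path and return the number of uncompressed bytes written.
def gunzip_file(src_path, dst_path)
  Zlib::GzipReader.open(src_path) do |gz|
    File.open(dst_path, 'wb') do |out|
      # IO.copy_stream returns the number of bytes copied; both blocks
      # pass that value back up as the method's return value
      IO.copy_stream(gz, out)
    end
  end
end
```

Calling gunzip_file('PRIDE_Exp_Complete_Ac_1015.xml.gz', 'PRIDE_Exp_Complete_Ac_1015.xml') would then return the size of the decompressed XML in bytes.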

Shtééf