I'm reading a .gz file from a slow source (such as an FTP server) and processing the received data as it arrives. It looks something like this:

FtpWebResponse response = ftpclientRequest.GetResponse() as FtpWebResponse;
using (Stream ftpStream = response.GetResponseStream())
using (GZipStream unzipped = new GZipStream(ftpStream, CompressionMode.Decompress))
using (StreamReader linereader = new StreamReader(unzipped))
{
  String l;
  while ((l = linereader.ReadLine()) != null)
  {
    ...
  }
}

My problem is showing an accurate progress bar. I can get the compressed .gz file size in advance, but I have no idea how large the content will be once uncompressed. Reading the file line by line, I know exactly how many uncompressed bytes I have read, but I don't know how that relates to the compressed file size.

So, is there any way to find out from GZipStream how far it has advanced into the compressed file? I only need the current position; the .gz file size I can fetch before reading the file.

A: 

I suggest you take a look at the following code:

public static readonly byte[] symbols = new byte[8 * 1024];

public static void Decompress(FileInfo inFile, FileInfo outFile)
{
    using (var inStream = inFile.OpenRead())
    {
        using (var zipStream = new GZipStream(inStream, CompressionMode.Decompress))
        {
            using (var outStream = outFile.OpenWrite())
            {
                var total = 0;
                do
                {
                    // A plain synchronous Read does the same as BeginRead immediately followed by EndRead.
                    total = zipStream.Read(symbols, 0, symbols.Length);
                    if (total != 0)
                    {
                        // Report progress here: 'total' decompressed bytes (at most 8 KB) were just read.
                        outStream.Write(symbols, 0, total);
                    }
                } while (total != 0);
            }
        }
    }
}
Petar Petrov
Excessive and unnecessary use of the var-keyword. Makes the code really unreadable.
BeowulfOF
'var' sure makes it easy to type out an example and let the compiler work it out.
kenny
I'm sorry, I don't understand how this would help me compute how far into the .gz file I am. 'total' contains the uncompressed progress, which does not help me, since I have no idea how big the file is when uncompressed. I need to know my position in **compressed** bytes.
Sam
A: 

I've revisited my code and run some tests. IMHO darin is right. However, I think it's possible to read only the header (size?) of the zipped stream and find out the resulting file size. (WinRAR "knows" the unzipped file size without unzipping the entire archive; it reads this information from the archive's header.) If you can find the resulting file size, this code will help you report precise progress.

public static readonly byte[] symbols = new byte[8 * 1024];

public static void Decompress(FileInfo inFile, FileInfo outFile, double size, Action<double> progress)
{
    // Last whole percent reported, so each value is passed to the callback at most once.
    var lastPercent = -1;

    using (var inStream = inFile.OpenRead())
    {
        using (var zipStream = new GZipStream(inStream, CompressionMode.Decompress))
        {
            using (var outStream = outFile.OpenWrite())
            {
                // long, so that more than 2 GB of output does not overflow the counter.
                long current = 0;

                var total = 0;
                while ((total = zipStream.Read(symbols, 0, symbols.Length)) != 0)
                {
                    outStream.Write(symbols, 0, total);
                    current += total;

                    var p = (int)Math.Round(current / size * 100);
                    if (p != lastPercent)
                    {
                        if (progress != null)
                        {
                            progress(p);
                        }
                        lastPercent = p;
                    }
                }
            }
        }
    }
}

I hope this helps.
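
On the idea of reading the size from the archive itself: as far as the gzip format (RFC 1952) goes, the uncompressed length is stored in the trailer rather than the header, as the last four bytes of the file (ISIZE, little-endian, modulo 2^32). The sketch below reads it from a seekable stream. The names GZipTrailer and ReadUncompressedSizeMod32 are made up for illustration. It assumes a complete, single-member .gz file whose content is smaller than 4 GB, and it needs the end of the file, so it would not help while the data is still streaming in from FTP.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the .NET class library.
public static class GZipTrailer
{
    // Returns ISIZE: the last four bytes of a gzip file, little-endian,
    // holding the uncompressed length modulo 2^32 (RFC 1952).
    // Assumption: the stream is seekable and holds one complete gzip member.
    public static uint ReadUncompressedSizeMod32(Stream seekableGzip)
    {
        if (!seekableGzip.CanSeek || seekableGzip.Length < 18)
        {
            throw new ArgumentException("Need a seekable stream holding a complete gzip file.");
        }

        seekableGzip.Seek(-4, SeekOrigin.End);

        var trailer = new byte[4];
        int read = 0;
        while (read < 4)
        {
            int n = seekableGzip.Read(trailer, read, 4 - read);
            if (n == 0)
            {
                throw new EndOfStreamException();
            }
            read += n;
        }

        // Assemble the little-endian value byte by byte.
        return (uint)trailer[0]
             | (uint)trailer[1] << 8
             | (uint)trailer[2] << 16
             | (uint)trailer[3] << 24;
    }
}
```

With a local file this could be called as GZipTrailer.ReadUncompressedSizeMod32(inFile.OpenRead()) and the result passed as the size argument of Decompress above.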

Petar Petrov
Petar, as in your first example, the current position in the uncompressed file is correct. Still, it is of no use to me, since I do not know the uncompressed file size. I don't think GZip stores the file size the way RAR does, so there is no way for me to get the uncompressed size.
Sam
+1  A: 

You can plug in a stream in between that counts how many bytes GZipStream has read.

  public class ProgressStream : Stream
  {
    public long BytesRead { get; set; }
    Stream _baseStream;
    public ProgressStream(Stream s)
    {
      _baseStream = s;
    }
    public override bool CanRead
    {
      get { return _baseStream.CanRead; }
    }
    public override bool CanSeek
    {
      get { return false; }
    }
    public override bool CanWrite
    {
      get { return false; }
    }
    public override void Flush()
    {
      _baseStream.Flush();
    }
    public override long Length
    {
      get { throw new NotImplementedException(); }
    }
    public override long Position
    {
      get
      {
        throw new NotImplementedException();
      }
      set
      {
        throw new NotImplementedException();
      }
    }
    public override int Read(byte[] buffer, int offset, int count)
    {
      int rc = _baseStream.Read(buffer, offset, count);
      BytesRead += rc;
      return rc;
    }
    public override long Seek(long offset, SeekOrigin origin)
    {
      throw new NotImplementedException();
    }
    public override void SetLength(long value)
    {
      throw new NotImplementedException();
    }
    public override void Write(byte[] buffer, int offset, int count)
    {
      throw new NotImplementedException();
    }
  }

// usage
FtpWebResponse response = ftpclientRequest.GetResponse() as FtpWebResponse;
using (Stream ftpStream = response.GetResponseStream())
using (ProgressStream progressStream = new ProgressStream(ftpStream))
using (GZipStream unzipped = new GZipStream(progressStream, CompressionMode.Decompress))
using (StreamReader linereader = new StreamReader(unzipped))
{
  String l;
  while ((l = linereader.ReadLine()) != null)
  {
    long compressedBytesRead = progressStream.BytesRead; // number of compressed bytes read from FTP so far
  }
}
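
To turn that counter into a progress bar value, something like the following sketch might do. CompressedProgress and Percent are hypothetical names, and the compressed length is assumed to be the .gz file size fetched before the download starts.

```csharp
using System;

// Hypothetical helper for illustration only.
public static class CompressedProgress
{
    // compressedBytesRead: for example ProgressStream.BytesRead.
    // compressedLength: size of the .gz file, obtained beforehand (assumption).
    // Returns a whole percentage clamped to the range 0..100.
    public static int Percent(long compressedBytesRead, long compressedLength)
    {
        if (compressedLength <= 0)
        {
            return 0; // unknown or empty length: nothing meaningful to report
        }

        long percent = compressedBytesRead * 100 / compressedLength;
        return (int)Math.Max(0L, Math.Min(100L, percent));
    }
}
```

Inside the read loop this could be called as CompressedProgress.Percent(progressStream.BytesRead, gzSize), where gzSize is a placeholder for the size you fetched. Since GZipStream and StreamReader both read ahead in buffered chunks, the counter runs slightly ahead of the line currently being processed, so the value is an approximation.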
Lars
Great, this is what I was looking for! Pity the FTP stream does not support returning the bytes already read!
Sam