views:

561

answers:

5

I've got a third-party component that does PDF file manipulation. Whenever I need to perform operations I retrieve the PDF documents from a document store (database, SharePoint, filesystem, etc.). To make things a little consistent I pass the PDF documents around as a byte[].

This 3rd party component expects a MemoryStream[] (MemoryStream array) as a parameter to one of the main methods I need to use.

I am trying to wrap this functionality in my own component so that I can use this functionality for a number of areas within my application. I have come up with essentially the following:

public class PdfDocumentManipulator : IDisposable
{
   List<MemoryStream> pdfDocumentStreams = new List<MemoryStream>();

   public void AddFileToManipulate(byte[] pdfDocument)
   {
      using (MemoryStream stream = new MemoryStream(pdfDocument))
      {
         pdfDocumentStreams.Add(stream);
      }
   }

   public byte[] ManipulatePdfDocuments()
   {
      byte[] outputBytes = null;

      using (MemoryStream outputStream = new MemoryStream())
      {
           ThirdPartyComponent component = new ThirdPartyComponent();
           component.Manipuate(this.pdfDocumentStreams.ToArray(), outputStream);

           //move to begining
           outputStream.Seek(0, SeekOrigin.Begin);

           //convert the memory stream to a byte array
           outputBytes = outputStream.ToArray();
      }

      return outputBytes;
   }

   #region IDisposable Members
   public void Dispose()
   {
       for (int i = this.pdfDocumentStreams.Count - 1; i >= 0; i--)
       {
          MemoryStream stream = this.pdfDocumentStreams[i];
          this.pdfDocumentStreams.RemoveAt(i);
          stream.Dispose();
       }
   }
   #endregion
}

The calling code to my "wrapper" looks like this:

    byte[] manipulatedResult = null;
    using (PdfDocumentManipulator manipulator = new PdfDocumentManipulator())
    {
        manipulator.AddFileToManipulate(file1bytes);
        manipulator.AddFileToManipulate(file2bytes);
        manipulatedResult = manipulator.Manipulate();
    }

A few questions about the above:

  1. Is the using clause in the AddFileToManipulate() method redundant and unnecessary?
  2. Am I cleaning up things OK in my object's Dispose() method?
  3. Is this an "acceptable" usage of MemoryStream? I am not anticipating very many files in memory at once...Likely 1-10 total PDF pages, each page about 200KB. App designed to run on server supporting an ASP.NET site.
  4. Any comments/suggestions?

Thanks for the code review :)

+2  A: 

It looks to me like you misunderstand what Using does.

It's just syntactic sugar to replace

MemoryStream ms;
try
{
ms = new MemoryStream();
}
finally
{
ms.Dispose();
}

Your usage in AddFileToManipulate is redundant. I'd set up the list of memorystreams in the constructor of PdfDocumentManipulator, then have PdfDocumentManipulator's dispose method call dispose on all the memorystreams.

WOPR
Thanks, I get what using is "used" for.
Brian
+2  A: 

Side note. This really seems like it calls for an extension method.

public static void DisposeAll<T>(this IEnumerable<T> enumerable)
  where T : IDisposable {
  foreach ( var cur in enumerable ) { 
    cur.Dispose();
  }
}

Now your Dispose method becomes

public void Dispose() { 
  pdfDocumentStreams.Reverse().DisposeAll();
  pdfDocumentStreams.Clear();
}

EDIT

You don't need the 3.5 framework in order to have extension methods. They will happily work on the 3.0 compiler down targeted to 2.0

http://blogs.msdn.com/jaredpar/archive/2007/11/16/extension-methods-without-3-5-framework.aspx

JaredPar
Sorry, forgot to mention .NET 2.0. No extension methods. :(
Brian
Also, the reason for reversing order is to pull items out.
Brian
Did not know that about Extension Methods...Will have to look at that a bit more.
Brian
+4  A: 

AddFileToManipulate scares me.

   public void AddFileToManipulate(byte[] pdfDocument)
   {
      using (MemoryStream stream = new MemoryStream(pdfDocument))
      {
         pdfDocumentStreams.Add(stream);
      }
   }

This code is adding a disposed stream to your pdfDocumentStream list. Instead you should simply add the stream using:

   pdfDocumentStreams.Add(new MemoryStream(pdfDocument));

And dispose of it in the Dispose method.

Also you should look at implementing a finalizer to ensure stuff gets disposed in case someone forgets to dispose the top level object.

Sam Saffron
Yeah, that's what I was thinking. But since it was added to the list, it's reference count should have been incremented, and as a result, not disposed? Or is Dispose ALWAYS called at the end of a using statement?
Brian
yerp see Kieran's answer
Sam Saffron
I don't think there's any point implementing a finalizer because MemoryStream will already implement its own if it needs it.
Dave
True there is no need, but ... just in case you change your implementation down the line you dont need to remember that fact about memory streams, if the finalizer is implemented properly it will have no performance impact
Sam Saffron
Brian: Dispose is always called at the end of a using statement. Dispose is distinct from garbage collection - calling dispose does NOT collect your object, it just gives you a change to clean up native resources.
Reed Copsey
+1  A: 
  1. Is the using clause in the AddFileToManipulate() method redundant and unnecessary?

Worse, it's destructive. You're basically closing your memory stream before it's added in. See the other answers for details, but basically, dispose at the end, but not any other time. Every using with an object causes a Dispose to happen at the end of the block, even if the object is "passed off" to other objects via methods.

  1. Am I cleaning up things OK in my object's Dispose() method?

Yes, but you're making life more difficult than it needs to be. Try this:

foreach (var stream in this.pdfDocumentStreams)
{
    stream.Dispose();
}
this.pdfDocumentStreams.Clear();

This works just as well, and is much simpler. Disposing an object does not delete it - it just tells it to free it's internal, unmanaged resources. Calling dispose on an object in this way is fine - the object stays uncollected, in the collection. You can do this and then clear the list in one shot.

  1. Is this an "acceptable" usage of MemoryStream? I am not anticipating very many files in memory at once...Likely 1-10 total PDF pages, each page about 200KB. App designed to run on server supporting an ASP.NET site.

This depends on your situation. Only you can determine whether the overhead of having these files in memory is going to cause you problems. This is going to be a fairly heavy-weight object, though, so I'd use it carefully.

  1. Any comments/suggestions?

Implement a finalizer. It's a good idea whenever you implement IDisposable. Also, you should rework your Dispose implementation to the standard one, or mark your class as sealed. For details on how this should be done, see this article. In particular, you should have a method declared as protected virtual void Dispose(bool disposing) that your Dispose method and your finalizer both call.

Reed Copsey
Thanks much. Good explanations and appreciate the time/effort taken to communicate them!!
Brian
Nice catch on sealed. Many of the devs on this project wouldn't even think to create an inheritance chain anyway. And yes, this object will be used with care. It only serves one purpose, which is to basically wrap functionality that would otherwise be written in a "procedural" way.
Brian
Also, have removed the using statement in the AddFileToManipulate() method.
Brian
Glad to hear I could help.
Reed Copsey
A: 

You only have to properly dispose objects that holds native resources. MemoryStream does not hold native resources.

AndreasN