I'm trying to embed a PDF file into a Word document using the OLE technique described here: http://blogs.msdn.com/brian_jones/archive/2009/07/21/embedding-any-file-type-like-pdf-in-an-open-xml-file.aspx

I've ported the C++ code provided there to C# so that the whole project is in one place, and I'm almost there except for one roadblock. When I try to feed the generated OLE object binary data into the Word document, I get an IOException.

IOException: The process cannot access the file 'C:\Wherever\Whatever.pdf.bin' because it is being used by another process.

There is a file handle open on the .bin file ("oleOutputFileName" below) and I don't know how to get rid of it. I don't know a huge amount about COM (I'm winging it here), and I don't know where the file handle is or how to release it.

Here's what my C#-ised code looks like. What am I missing?

    public void ExportOleFile(string oleOutputFileName, string emfOutputFileName)
    {
        OLE32.IStorage storage;
        var result = OLE32.StgCreateStorageEx(
            oleOutputFileName,
            OLE32.STGM.STGM_READWRITE | OLE32.STGM.STGM_SHARE_EXCLUSIVE | OLE32.STGM.STGM_CREATE | OLE32.STGM.STGM_TRANSACTED,
            OLE32.STGFMT.STGFMT_DOCFILE,
            0,
            IntPtr.Zero,
            IntPtr.Zero,
            ref OLE32.IID_IStorage,
            out storage
        );

        var CLSID_NULL = Guid.Empty;

        OLE32.IOleObject pOle;
        result = OLE32.OleCreateFromFile(
            ref CLSID_NULL,
            _inputFileName,
            ref OLE32.IID_IOleObject,
            OLE32.OLERENDER.OLERENDER_NONE,
            IntPtr.Zero,
            null,
            storage,
            out pOle
        );

        result = OLE32.OleRun(pOle);

        IntPtr unknownFromOle = Marshal.GetIUnknownForObject(pOle);
        IntPtr unknownForDataObj;
        Marshal.QueryInterface(unknownFromOle, ref OLE32.IID_IDataObject, out unknownForDataObj);
        var pdo = Marshal.GetObjectForIUnknown(unknownForDataObj) as IDataObject;

        var fetc = new FORMATETC();
        fetc.cfFormat = (short)OLE32.CLIPFORMAT.CF_ENHMETAFILE;
        fetc.dwAspect = DVASPECT.DVASPECT_CONTENT;
        fetc.lindex = -1;
        fetc.ptd = IntPtr.Zero;
        fetc.tymed = TYMED.TYMED_ENHMF;

        var stgm = new STGMEDIUM();
        stgm.unionmember = IntPtr.Zero;
        stgm.tymed = TYMED.TYMED_ENHMF;
        pdo.GetData(ref fetc, out stgm);

        var hemf = GDI32.CopyEnhMetaFile(stgm.unionmember, emfOutputFileName);
        storage.Commit((int)OLE32.STGC.STGC_DEFAULT);

        pOle.Close(0);
        GDI32.DeleteEnhMetaFile(stgm.unionmember);
        GDI32.DeleteEnhMetaFile(hemf);
    }

UPDATE 1: Clarified which file I meant by "the .bin file".
UPDATE 2: I'm not using "using" blocks because the things I want to get rid of aren't disposable. (And to be perfectly honest I'm not sure what I need to release to remove the file handle, COM being a foreign language to me.)

A: 

I see at least four potential refcount leaks in your code:

    OLE32.IStorage storage; // ref counted from OLE32.StgCreateStorageEx(
    IntPtr unknownFromOle = Marshal.GetIUnknownForObject(pOle); // ref counted
    IntPtr unknownForDataObj; // ref counted from Marshal.QueryInterface(unknownFromOle
    var pdo = Marshal.GetObjectForIUnknown(unknownForDataObj) as IDataObject; // ref counted

Note that all of these are pointers to COM objects. The GC does not release a COM object unless the .NET type holding the reference is an RCW (runtime callable wrapper), which releases its reference count in its finalizer.

IntPtr is not such a type, and your var is also an IntPtr (from the return type of the Marshal.GetObjectForIUnknown call), so that makes three.

You should call Marshal.Release on all your IntPtr variables.

I am not sure about OLE32.IStorage. This one might need either Marshal.Release or Marshal.ReleaseComObject.

Update: I just noticed that I missed at least one ref count. The var is not an IntPtr; it's an IDataObject. The as cast does an implicit QueryInterface and adds another ref count. Although GetObjectForIUnknown returns an RCW, its release is delayed until the GC kicks in. You might want to do this in a using block to activate the IDisposable on it.

Meanwhile, the STGMEDIUM struct also has one IUnknown pointer you are not releasing. You should call ReleaseStgMedium to properly dispose of the whole struct, including that pointer.
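
As a rough illustration of the releases described above, here is a minimal sketch. The class name ComCleanup, the ReleaseAll method, and the P/Invoke declaration for ReleaseStgMedium are assumptions made for illustration; they are not part of the question's OLE32 wrapper. For a TYMED_ENHMF medium with no pUnkForRelease, ReleaseStgMedium should delete the metafile handle itself, so it would replace the DeleteEnhMetaFile(stgm.unionmember) call rather than accompany it.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Runtime.InteropServices.ComTypes;

// Hypothetical helper, not part of the original code.
static class ComCleanup
{
    // ReleaseStgMedium is exported by ole32.dll. This declaration is an
    // assumption about how it would be exposed to managed code.
    [DllImport("ole32.dll")]
    private static extern void ReleaseStgMedium(ref STGMEDIUM medium);

    public static void ReleaseAll(ref STGMEDIUM medium, IntPtr[] rawPointers, object[] rcws)
    {
        // Frees the medium's payload and its pUnkForRelease pointer, if any.
        ReleaseStgMedium(ref medium);

        // Raw IUnknown pointers obtained through Marshal must be released by hand.
        foreach (IntPtr pointer in rawPointers)
        {
            if (pointer != IntPtr.Zero)
            {
                Marshal.Release(pointer);
            }
        }

        // RCWs can be released deterministically instead of waiting for the GC.
        foreach (object rcw in rcws)
        {
            if (rcw != null && Marshal.IsComObject(rcw))
            {
                Marshal.ReleaseComObject(rcw);
            }
        }
    }
}
```

At the end of ExportOleFile this could be called as ComCleanup.ReleaseAll(ref stgm, new[] { unknownForDataObj, unknownFromOle }, new object[] { pdo, pOle, storage }). Whether that alone frees the file handle depends on what else still holds references to the storage.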

I am too tired to continue looking through the code right now. I'll come back tomorrow and try to find other possible ref count leaks. Meanwhile, check the MSDN docs for all the interfaces, structs and APIs you are calling and try to figure out any other ref counts you might have missed.

Franci Penov
I have added the following to the end of the method but still get the IOException: `Marshal.ReleaseComObject(pdo); Marshal.Release(unknownForDataObj); Marshal.Release(unknownFromOle); Marshal.ReleaseComObject(storage);` Does the order matter? When this runs, the refcounts returned are not zero: they are 1 (for pdo), 4 (for unknownForDataObj), 3 (for unknownFromOle), and 0 (for storage).
Bernard Darnton
I have to ask why you didn't just do `var pdo = pOle as IDataObject;`? Typecasting for RCWs uses `IUnknown.QueryInterface()`.
Simon Buchan
A: 

I found the answer and it's pretty simple. (Probably too simple - it feels like a hack but since I know so little about COM programming I'm just going to go with it.)

The storage object had multiple references on it, so just keep going until they're all gone:

    var storagePointer = Marshal.GetIUnknownForObject(storage);
    int refCount;
    do
    {
        refCount = Marshal.Release(storagePointer);
    } while (refCount > 0);
Bernard Darnton