The Setup

I have a PDF API which has a native function that is defined below.

typedef void* PDF_DOCUMENT;
unsigned long PDF_GetMetaText(PDF_DOCUMENT document,
                              const char* tag,
                              void* buffer,
                              unsigned long bufferlen);

//Calling it "natively" in C++/CLI function to get the PDF Creator tag
WCHAR result[32];
void* pdoc = PDF_LoadDoc("C:\\test.pdf");
int numChars = PDF_GetMetaText(pdoc, "Creator", result, 32);
PDF_CloseDoc(pdoc);

If I call the above code in my C++/CLI wrapper function, it returns the correct string but throws an AccessViolationException when I call PDF_CloseDoc. Whoops: I forgot to pin_ptr the pointer to the document.

The Problem

When I pin_ptr pdoc, I can successfully call these native functions; however, the buffer no longer contains my string when PDF_GetMetaText returns.

String^ Wrapper::GetCreator(String^ filename)
{
   WCHAR buffer[32];
   void *pdoc = PDF_LoadDoc(SystemStringToCStr(filename));
   pin_ptr<void*> p = &pdoc; //added
   int numPages = PDF_GetMetaText(p, "Creator", buffer, 32);
   PDF_CloseDocument(p); //doesn't crash, but at this line buffer is an empty string

   return gcnew String(buffer);
}

I have also tried pinning buffer[0], but that causes an AccessViolationException at PDF_GetMetaText.

The Question

I can't say what is happening inside PDF_GetMetaText, so I am not sure what is happening to pdoc. Any suggestions for the above code?

A: 

This doesn't make any sense. You can only pin managed objects, and the return value of PDF_LoadDoc() sure doesn't look like a managed object to me. Same goes for result: it is not a managed array<WCHAR>, just a plain vanilla C array that gets allocated on the stack frame. Unfortunately, pin_ptr<> doesn't complain about this.
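
One way to see what the added pin_ptr line changes: pin_ptr<void*> p = &pdoc holds the address of the local variable pdoc, and a pointer to a handle converts implicitly to void*, so passing p compiles while handing the API something other than the handle. The sketch below is plain C++ (no C++/CLI) and assumes pin_ptr<void*> behaves like a void** at the call site. FakeDoc, DOC_HANDLE, Fake_LoadDoc and Fake_IsSameDoc are hypothetical stand-ins, not part of the PDF API.

```cpp
#include <cassert>

// Hypothetical stand-in for the library's opaque document object.
struct FakeDoc { int magic; };

typedef void* DOC_HANDLE;  // mirrors "typedef void* PDF_DOCUMENT"

static FakeDoc g_doc = { 0x50444621 };

// Hypothetical loader: returns an opaque handle to the document.
DOC_HANDLE Fake_LoadDoc() { return &g_doc; }

// Hypothetical API call: reports whether it received the real document handle.
bool Fake_IsSameDoc(DOC_HANDLE h) { return h == static_cast<void*>(&g_doc); }

// Passes the handle by value, as the native snippet does.
bool PassHandle() {
    DOC_HANDLE doc = Fake_LoadDoc();
    return Fake_IsSameDoc(doc);
}

// Passes the address of the local handle variable instead.
// void** converts to void* implicitly, so this compiles without complaint.
bool PassAddressOfHandle() {
    DOC_HANDLE doc = Fake_LoadDoc();
    DOC_HANDLE* p = &doc;  // roughly what "pin_ptr<void*> p = &pdoc" holds
    return Fake_IsSameDoc(p);
}
```

Here PassHandle() reports that the library saw the real document, while PassAddressOfHandle() reports that it saw a pointer into the caller's stack frame. If the real API reads or writes through that pointer, the caller's locals are what it touches.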

The result array could only become 'empty' if code is stomping on the stack frame, which you can diagnose by setting a data breakpoint on the first element. Fwiw, SystemStringToCStr() looks like a candidate: it cannot work without releasing the buffer for the native string somewhere. Another candidate is the PDF API function declarations. Pay attention to the value of the ESP register and make sure it doesn't change across the call. If it does, the stack is getting imbalanced because you don't have the proper calling convention, which is usually __stdcall for DLL exports.
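
The calling-convention point can be sketched as follows. When a function is reached through a pointer, for instance one obtained from GetProcAddress, the convention has to be spelled out in the pointer type and has to match how the DLL was built. The names PDF_CALL, PDF_GetMetaText_t, Stub_GetMetaText and ResolveGetMetaText are hypothetical, and __stdcall is only an assumption about the library that should be checked against its header. The stub's treatment of bufferlen as a byte count is likewise an assumption, not the real API's behavior.

```cpp
#include <cassert>
#include <cstring>

#if defined(_WIN32)
#  define PDF_CALL __stdcall   // assumption: verify against the library's header
#else
#  define PDF_CALL             // expands to nothing so the sketch compiles elsewhere
#endif

typedef void* PDF_DOCUMENT;

// The calling convention is part of the function pointer type.
typedef unsigned long (PDF_CALL *PDF_GetMetaText_t)(PDF_DOCUMENT document,
                                                    const char* tag,
                                                    void* buffer,
                                                    unsigned long bufferlen);

// Hypothetical stub standing in for the DLL export; it ignores the document
// and copies a fixed narrow string when asked for the "Creator" tag.
unsigned long PDF_CALL Stub_GetMetaText(PDF_DOCUMENT, const char* tag,
                                        void* buffer, unsigned long bufferlen) {
    const char text[] = "stub";
    if (std::strcmp(tag, "Creator") != 0 || bufferlen < sizeof(text)) return 0;
    std::memcpy(buffer, text, sizeof(text));
    return sizeof(text) - 1;  // characters written, excluding the terminator
}

// In real code this would be something like
// reinterpret_cast<PDF_GetMetaText_t>(GetProcAddress(module, "PDF_GetMetaText")).
PDF_GetMetaText_t ResolveGetMetaText() { return &Stub_GetMetaText; }
```

If the typedef and the export disagree on the convention in a 32-bit build, both caller and callee may clean up the arguments (or neither does), which is the ESP drift described above.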

Hans Passant
This is a great answer that I am about to accept, but I have one question. In Visual Studio 2008, how do I see what is in the ESP register when the function is called? PDF_GetMetaText is a function pointer that I assign using GetProcAddress. I explicitly declare it with __stdcall.
Tom Fobear
Debug + Windows + Registers. This is definitely quacking like a calling convention mismatch.
Hans Passant
What I think is the right answer: it's a bug in the API. When I try to parse the PDF in iTextSharp (an open source library), it throws "InvalidPdfException" when it tries to read the PDF's XRef section. When I put "good" PDFs through PDF_GetMetaText it works fine. When the PDF is "bad", it changes the address pdoc points to, but my WCHAR buffer gets filled out with a proper string. Also, pdoc's address changes after a bad call. I am going to mark this as answered; thank you for pushing me in the right direction.
Tom Fobear