I have tested a number of ways to capture the HTML content of a CWebBrowser2 control into a vector metafile. I can get either part of the web page as vector EMF, or all of the web page as a raster BitBlt wrapped in an EMF. What I want is the whole web page as vector records, with only the original bitmaps, Flash content, etc. represented as bitmaps. I'm doing this with a hidden web browser form.

Here is what I have tried: IViewObject::Draw() and OleDraw() -- with enhanced metafile handles. These give me all of the webpage but as a bitmap.

IHTMLElementRender::DrawToDC() - with enhanced metafile handles. This gives me only some of the web page, usually the non-transparent, non-Flash portions. The part I do get consists of vector EMF records.
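To tell the two outcomes above apart without opening each capture in a viewer, one option is to walk the EMF record stream and count the bitmap-carrying records. The following is only a minimal sketch, not part of any of the APIs mentioned: `EmfSummary`, `summarizeEmf`, and `appendRecord` are hypothetical names, and the only format facts it relies on are that every EMF record starts with a 32-bit type and a 32-bit size (little-endian) and that the record type numbers match the `EMR_*` constants in wingdi.h. A low ratio of bitmap records is a heuristic, since a genuinely vector page still contains a few bitmap records for its images.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Record type numbers, same values as the EMR_* constants in wingdi.h.
constexpr std::uint32_t kEmrHeader = 1;
constexpr std::uint32_t kEmrEof = 14;

// Hypothetical result type for this sketch.
struct EmfSummary {
    std::size_t totalRecords = 0;   // records seen, including header and EOF
    std::size_t bitmapRecords = 0;  // records that carry raster data
    bool wellFormed = false;        // true only if the walk reached EMR_EOF
};

static std::uint32_t readU32LE(const std::uint8_t* p) {
    return static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

static bool isBitmapRecord(std::uint32_t type) {
    switch (type) {
        case 76:   // EMR_BITBLT
        case 77:   // EMR_STRETCHBLT
        case 78:   // EMR_MASKBLT
        case 79:   // EMR_PLGBLT
        case 80:   // EMR_SETDIBITSTODEVICE
        case 81:   // EMR_STRETCHDIBITS
        case 114:  // EMR_ALPHABLEND
        case 116:  // EMR_TRANSPARENTBLT
            return true;
        default:
            return false;
    }
}

// Walks the raw bytes of an enhanced metafile and counts its records.
// Stops and reports wellFormed == false on any inconsistent record size.
EmfSummary summarizeEmf(const std::vector<std::uint8_t>& data) {
    EmfSummary s;
    std::size_t pos = 0;
    while (data.size() - pos >= 8) {
        const std::uint32_t type = readU32LE(&data[pos]);
        const std::uint32_t size = readU32LE(&data[pos + 4]);
        if (size < 8 || size % 4 != 0 || size > data.size() - pos) return s;
        if (s.totalRecords == 0 && type != kEmrHeader) return s;
        ++s.totalRecords;
        if (isBitmapRecord(type)) ++s.bitmapRecords;
        if (type == kEmrEof) {
            s.wellFormed = true;
            return s;
        }
        pos += size;
    }
    return s;
}

// Hypothetical helper for building synthetic input: appends a record of the
// given type and total size whose payload is all zero bytes.
void appendRecord(std::vector<std::uint8_t>& buf, std::uint32_t type,
                  std::uint32_t size) {
    assert(size >= 8 && size % 4 == 0);
    for (int shift = 0; shift < 32; shift += 8)
        buf.push_back(static_cast<std::uint8_t>((type >> shift) & 0xFF));
    for (int shift = 0; shift < 32; shift += 8)
        buf.push_back(static_cast<std::uint8_t>((size >> shift) & 0xFF));
    buf.insert(buf.end(), size - 8, 0);
}
```

On Windows the input bytes could come from GetEnhMetaFileBits() called on the HENHMETAFILE returned by CloseEnhMetaFile(). A capture whose only drawing record is one large bitmap record is the "raster in an EMF wrapper" case; a capture dominated by text, path, and polygon records is the vector case.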

Printing:
::SendMessage(m_pBrowserWnd, WM_PRINT, (WPARAM)(pd.hDC), PRF_NONCLIENT);
m_pBrowserWnd.Print(&tmpDc, PRF_CLIENT);
::PrintWindow(m_pBrowserWnd, hDC, 0);
IHTMLDocument2 *pHtmlDoc->execCommand(L"Print", false, var, NULL );

All of these (except the first) give me a nice vector EMF-based spool file that goes off to the printer. I don't want to create a virtual printer driver to capture and process the spool file. I want either all of the EMF records on one page or, preferably, a stream of EMFs that I can process further.

I'm thinking that a print processor might be useful here, but I don't know. I'm beginning to think that Mozilla's Gecko rendering engine is the way to go.

My preferred solution language is C++, but C# or VB would be acceptable too.

Test pages include: http://www.youtube.com http://www.alltop.com http://www.apple.com http://www.microsoft.com http://www.adobe.com http://www.canadiantire.com

-Jason