I am writing a Word add-in that is supposed to store its own XML data per document using the Word object model and its CustomXMLPart. The problem I am now facing is the lack of IStream-like functionality for reading/writing XML to/from a CustomXMLPart. It only provides a BSTR interface, and I am puzzled about how to handle UTF-8 XML with BSTRs. To my understanding, a UTF-8 XML file should never have to undergo this sort of Unicode conversion, and I am not sure what to expect as a result.

Is there another way of using Word automation interfaces to store arbitrary custom information inside a DOCX file?
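As a rough illustration of the conversion the question worries about: a BSTR carries UTF-16 code units, so UTF-8 XML has to be decoded before it is handed over and re-encoded when it comes back. The sketch below is plain Python with no Word or COM calls at all, and the sample XML and variable names are made up for the example. It only shows that UTF-8 to UTF-16 and back is lossless for well-formed text; it says nothing about how CustomXMLPart itself treats the string.

```python
# Sample payload (hypothetical): UTF-8 bytes containing non-ASCII characters,
# including one outside the Basic Multilingual Plane.
utf8_xml = "<note>Grüße, 世界 😀</note>".encode("utf-8")

# UTF-8 bytes -> text -> UTF-16 code units (the form a BSTR would carry).
text = utf8_xml.decode("utf-8")
bstr_payload = text.encode("utf-16-le")

# UTF-16 code units -> text -> UTF-8 bytes again.
round_tripped = bstr_payload.decode("utf-16-le").encode("utf-8")

# Both encodings cover the same set of characters, so nothing is lost.
lossless = round_tripped == utf8_xml
print(lossless)
```

The byte lengths of the two forms differ, which is worth keeping in mind when sizing buffers on the C++ side.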

+1  A: 

The "package" is an OPC document (Open Packaging Conventions), which is basically a structured zip folder with a different extension (e.g. .pptx, .docx, .xps, etc.). You can get that file as a stream and manipulate it any way you like, but not arbitrarily. It will not be recognized as a valid docx if you put things in the wrong places (not just XML elements, but also files in the folders inside the zip file). But if by "arbitrary" you just mean a CustomXMLPart, then that's okay.
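To make the "structured zip" point concrete, here is a minimal sketch using only Python's standard `zipfile` module. It builds a toy archive in memory and reads one part back as raw UTF-8 bytes. The part names imitate the layout of a .docx, but the contents are placeholders and the result is not a valid Word document; `build_toy_package` and `read_part` are hypothetical helper names for this example.

```python
import io
import zipfile


def build_toy_package() -> bytes:
    """Create an in-memory zip whose layout loosely imitates a .docx."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as pkg:
        # Placeholder parts only; a real package needs proper content
        # types and relationship parts to be accepted by Word.
        pkg.writestr("[Content_Types].xml", "<Types/>")
        pkg.writestr("word/document.xml", "<document/>")
        pkg.writestr("customXml/item1.xml",
                     "<myData>Grüße</myData>".encode("utf-8"))
    return buf.getvalue()


def read_part(package_bytes: bytes, part_name: str) -> bytes:
    """Return the raw bytes of one part inside the package."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as pkg:
        return pkg.read(part_name)


package = build_toy_package()
custom_xml = read_part(package, "customXml/item1.xml").decode("utf-8")
print(custom_xml)
```

At this level the custom XML is just bytes in an archive entry, so no string conversion is forced on it.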

This is a good starting page for learning more about the Open XML SDK, which, if you're up to it, allows for somewhat easier access to the file formats than using (.NET) System.IO.Packaging or a third-party zip library. To go deeper, grab the free eBook Open XML Explained.

With the Open XML SDK (again, this can all be done without the SDK) in .NET, this is what you'll want to do: How to: Insert Custom XML to an Office Open XML Package by Using the Open XML API.

Otaku
Thanks a lot! I will take a look at all that soon. In the meantime, there is just one issue. I have managed to make things work with `CustomXMLParts`, but it was a pain. Because of this `BSTR` issue I kept my XML plain ASCII and encoded non-ASCII content using `Base64` encoding. Will your approach with the Open XML SDK work from a Word add-in? Please bear in mind that the DOCX file is being held open by Word. Does the Open XML SDK allow for manipulation in this scenario? Thanks!
wpfwannabe
Ah, it wouldn't work if the document is currently open (unless you open it from SharePoint). I don't know much about `BSTR`, but I'd venture to guess that the byte sizes are different between your variable in C++ and what the Primary Interop Assembly is expecting.
Otaku
Yes, you can manipulate an open Word document using the OpenXML SDK via a VSTO add-in. It works well. There is an MSDN blog post somewhere which explains how to do it.
plutext
@plutext: please post the link on MSDN.
Otaku
http://blogs.msdn.com/ericwhite/archive/2010/02/02/increasing-performance-of-word-automation-for-large-amount-of-data-using-open-xml-sdk.aspx
plutext
@plutext: that's great, I knew about Flat OPC, I just didn't know it could be injected into an open document via the object model. Great stuff!
Otaku