Using C#, how should I go about extracting titles, subtitles, and paragraphs from a docx document?

I am thinking of doing this through VSTO, but I do not know the Word object model. I am only familiar with the Excel object model.

Should I take the unzip + LINQ to XML approach?

Using VSTO, I could build an add-in that could be used to edit the application where I would convert to and from docx.

Does anyone have prior experience with this kind of thing? Any leads will be greatly appreciated.

+1  A: 

Personally, I'd take the unzip + LINQ to XML approach. (You can unzip using the built-in support in the framework or, if you are using an old version, the zip library provided by icsharpcode.net.)

I'd take this approach because, for something as simple as this, I'd rather not depend on VSTO. This way the end user doesn't even need to have Office installed. (It also avoids any other licensing issues, though I don't know the details of those.)
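
To give an idea of what the unzip + LINQ to XML approach might look like, here is a rough sketch rather than a finished solution. It assumes .NET 4.5 or later (for `System.IO.Compression.ZipArchive`) and C# 6. `DocxReader` and `ReadParagraphs` are names made up for this example. It also assumes the document uses the built-in paragraph styles, whose IDs are typically `Title` and `Subtitle` in English templates, and it reports paragraphs without an explicit style as `Normal`. Only top-level body paragraphs are read, so text inside tables is skipped.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, named for this example only.
public static class DocxReader
{
    // Main WordprocessingML namespace used in word/document.xml.
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Returns one (style ID, text) pair per non-empty top-level paragraph.
    public static List<KeyValuePair<string, string>> ReadParagraphs(Stream docx)
    {
        using (var zip = new ZipArchive(docx, ZipArchiveMode.Read, leaveOpen: true))
        {
            // A docx file is a zip archive; the body text lives in this part.
            ZipArchiveEntry entry = zip.GetEntry("word/document.xml");
            if (entry == null)
                throw new InvalidDataException("word/document.xml not found.");

            XDocument xml;
            using (Stream part = entry.Open())
                xml = XDocument.Load(part);

            return xml.Descendants(W + "body")
                .Elements(W + "p")
                .Select(p => new KeyValuePair<string, string>(
                    // Style ID from w:pPr/w:pStyle/@w:val.
                    // Assumption: a missing style is treated as "Normal".
                    (string)p.Element(W + "pPr")
                             ?.Element(W + "pStyle")
                             ?.Attribute(W + "val") ?? "Normal",
                    // A paragraph's text is split across runs; join the w:t pieces.
                    string.Concat(p.Descendants(W + "t").Select(t => t.Value))))
                .Where(pair => pair.Value.Length > 0)
                .ToList();
        }
    }
}
```

You would call it with something like `DocxReader.ReadParagraphs(File.OpenRead(path))` and then pick out the entries whose key is `Title` or `Subtitle`. The style IDs can differ in localized or custom templates, so it is worth checking what your own documents actually contain.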

Just my opinion.

TimothyP