




I am using Word and OpenXml to provide mail merge functionality in a C# ASP.NET web application:

1) A document is uploaded with a number of pre-defined strings for substitution.

2) Using the OpenXML SDK 2.0 I open the Word document, get the mainDocumentPart as a string and perform the substitution using Regex.

3) I then create a new document using OpenXML, add a new mainDocumentPart and insert the string resulting from the substitution into this mainDocumentPart.

However, all formatting/styles etc. are lost in the new document.

I'm guessing I can copy and add the Style, Definitions, Comment parts, etc. individually to mimic the original document.

However, is there a way in Open XML to duplicate a document, so that I can perform the substitutions on the new copy?



If you change the extension of an Open XML document to .zip and open it, you will see that the word subfolder contains a _rels folder where all the relationships are listed. These relationships point to the parts you mentioned (styles, etc.). You need these parts because they contain the formatting definitions. If you do not copy them, the new document will use the formatting defined in the normal.dot file rather than the formatting defined in the original document. So I think you have to copy them.

not really answering the question. read up on how to do it before answering.
Anonymous Type

The OpenXML SDK doesn't provide a "SaveAs" type of method, if that's what you were expecting. The way around that is to make a copy of your original document, perform your modifications on that, and use it as a result. In other words if the original document has the styles you want and you're simply changing values then make a copy of it and use it as your "template" to retain the styles.

I'm a little unclear on what's really going on in step 2. Is it a complete string representation of the XML layout? A new document won't have the styles used in the original if they are customized. This follows what @crauscher mentioned. It would be more straightforward to manipulate a copy of the document using an XML approach with the help of the OpenXML SDK.
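The copy-then-edit approach described above could be sketched roughly as follows. This is only a minimal sketch: the class name, method name and both path parameters are hypothetical, and the actual substitution logic is left as a placeholder.

```csharp
using System.IO;
using DocumentFormat.OpenXml.Packaging;

public static class TemplateCopy
{
    // Hypothetical helper; templatePath and outputPath are illustrative names.
    public static void CopyAndEdit(string templatePath, string outputPath)
    {
        // Copy the whole package so that styles, numbering, comments
        // and every other part come along with the main document.
        File.Copy(templatePath, outputPath, true);

        // Open the copy for editing; the original file stays untouched.
        using (WordprocessingDocument doc = WordprocessingDocument.Open(outputPath, true))
        {
            // ... perform the substitutions on doc.MainDocumentPart here ...

            // Write the modified main document tree back to its part.
            doc.MainDocumentPart.Document.Save();
        }
    }
}
```

Because the file is copied as a whole, nothing has to be re-created part by part, which is why the formatting survives.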

Ahmad Mageed
There is actually a tool from MS that you can use to spit out C#/VB code that will auto-generate documents; breaking open the zip is unnecessary unless you're doing custom ribbon development.
Anonymous Type

I have done some very similar things, but instead of using text substitution strings, I use Word Content Controls. I have documented some of the details in the following blog post, SharePoint and Open Xml. The technique is not specific to SharePoint. You could reuse the pattern in pure ASP.NET or other applications.

Also, I would STRONGLY encourage you to review Eric White's Blog for tips, tricks and techniques regarding Open Xml. Specifically, check out the in-memory manipulation of Open Xml post, and the Word content controls posts. I think you'll find these much more helpful in the long run.

Hope this helps.

Pete Skelly

I second the recommendation to use Content Controls. Using them to mark up the areas of your document where you want to perform substitution is by far the easiest way to do it.

As for duplicating the document (and retaining the entire document contents, styles and all), it's relatively easy:

string documentPath = "full path to your document";
byte[] docAsArray = File.ReadAllBytes(documentPath);

using (MemoryStream stream = new MemoryStream())
{
    stream.Write(docAsArray, 0, docAsArray.Length);    // THIS performs doc copy
    using (WordprocessingDocument doc = WordprocessingDocument.Open(stream, true))
    {
        // perform content control substitution here, making sure to call .Save()
        // on every document part you changed.
    }
    // the package is closed at this point, so the stream holds the finished copy
    File.WriteAllBytes("full path of your new doc to save, including .docx", stream.ToArray());
}

Actually finding the content controls is a piece of cake using LINQ. The following example finds all the Simple Text content controls (which are typed as SdtRun):

using (WordprocessingDocument doc = WordprocessingDocument.Open(stream, true))
{
    var mainDocument = doc.MainDocumentPart.Document;
    var contentControls = from sdt in mainDocument.Descendants<SdtRun>() select sdt;

    foreach (var cc in contentControls)
    {
        // drill down through the containment hierarchy to get to
        // the contained <Text> object
        cc.SdtContentRun.GetFirstChild<Run>().GetFirstChild<Text>().Text = "my replacement string";
    }
}

The <Run> and <Text> elements may not already exist, but creating them is as simple as:

cc.SdtContentRun.Append(new Run(new Text("my replacement string")));

Hope that helps someone. :D


As an addendum to the above: what's perhaps more useful is finding content controls that have been tagged (using the Word GUI). I recently wrote some software that populated document templates containing content controls with tags attached. Finding them is just an extension of the above LINQ query:

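// W is the WordprocessingML main namespace, needed to read the w:val attribute
const string W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

var mainDocument = doc.MainDocumentPart.Document;
var taggedContentControls = from sdt in mainDocument.Descendants<SdtElement>()
                            let sdtPr = sdt.GetFirstChild<SdtProperties>()
                            let tag = (sdtPr == null ? null : sdtPr.GetFirstChild<Tag>())
                            where (tag != null)
                            select new
                            {
                                SdtElem = sdt,
                                TagName = tag.GetAttribute("val", W).Value
                            };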
I got this code from elsewhere but cannot remember where at the moment; full credit goes to them.

The query just creates an IEnumerable of an anonymous type that contains the content control and its associated tag as properties. Handy!
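To show how tagged controls might then be populated, here is a hedged sketch that looks up each tag in a dictionary and overwrites the first <Text> element inside the control. The class name, method name and the `values` dictionary are assumptions for illustration only; the sketch reads the tag through the typed `Tag.Val` property instead of `GetAttribute`, and it does not deal with controls whose text is split over several runs.

```csharp
using System.Collections.Generic;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

public static class TaggedFill
{
    // Hypothetical helper: "values" maps a content control tag to its replacement text.
    public static void Fill(WordprocessingDocument doc, IDictionary<string, string> values)
    {
        var document = doc.MainDocumentPart.Document;

        foreach (SdtElement sdt in document.Descendants<SdtElement>().ToList())
        {
            // Read the tag from the control's properties, if it has one.
            SdtProperties props = sdt.GetFirstChild<SdtProperties>();
            Tag tag = props == null ? null : props.GetFirstChild<Tag>();
            if (tag == null || tag.Val == null) continue;

            // Skip controls whose tag has no entry in the dictionary.
            string replacement;
            if (!values.TryGetValue(tag.Val.Value, out replacement)) continue;

            // Overwrite only the first <Text> element found inside the control.
            Text text = sdt.Descendants<Text>().FirstOrDefault();
            if (text != null) text.Text = replacement;
        }

        // Write the modified tree back to the main document part.
        document.Save();
    }
}
```

Controls that contain no <Text> element at all are left unchanged here; a run with the replacement text could be appended to them instead.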


This piece of code should copy all parts from an existing document to a new one.

using (var mainDoc = WordprocessingDocument.Open(@"c:\sourcedoc.docx", false))
using (var resultDoc = WordprocessingDocument.Create(@"c:\newdoc.docx",
  WordprocessingDocumentType.Document))
{
  // copy parts from source document to new document
  foreach (var part in mainDoc.Parts)
    resultDoc.AddPart(part.OpenXmlPart, part.RelationshipId);
  // perform replacements in resultDoc.MainDocumentPart
  // ...
}

Is there a way to copy an entire table from one page to another? I have a table with content controls set up, and I want to copy that table onto multiple pages so that I can fill in information coming from the database.

Thanks in anticipation of your help.


This isn't an answer; try posting a separate question if you want help.
Anonymous Type