How can I read .doc and .docx files in .NET with C#?

Please give me some advice on how to do this.

A: 

Aspose.Words for .NET is a commercial library that allows you to do exactly this. From the website:

Using Aspose.Words for .NET, developers can easily open and save DOC, OOXML, RTF, WordprocessingML, HTML, MHTML, TXT and OpenDocument documents.
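
A minimal sketch of what reading a document's text might look like. It assumes the Aspose.Words assembly is installed and referenced, and that the `Document(string)` constructor and `GetText()` method behave as described in the Aspose.Words API reference; the file path is a placeholder.

```csharp
using System;
using Aspose.Words;   // commercial library; must be installed and referenced separately

class ReadWithAspose
{
    static void Main()
    {
        // Hypothetical path, used only for illustration.
        Document doc = new Document(@"C:\SomeDoc.docx");

        // Text of the whole document as one string.
        string text = doc.GetText();
        Console.WriteLine(text);
    }
}
```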

Jørn Schou-Rode
+2  A: 

Generally, COM interop is used to interface with Office documents.

Here's an MSDN example that creates an Excel file; it should give you the idea.

http://msdn.microsoft.com/en-us/library/ms173186(VS.80).aspx

Also, Visual Studio 2010, along with .NET 4.0, will include more dynamic language features that lend themselves to Office COM interop. Read more here:

http://blogs.msdn.com/samng/archive/2009/06/16/com-interop-in-c-4-0.aspx

And here's a video:

http://msdn.microsoft.com/en-us/vcsharp/ee460939.aspx
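
To illustrate the C# 4.0 point: with named and optional arguments, and with `ref` no longer required on COM calls, opening a Word document does not need a row of placeholder arguments. A minimal sketch, assuming C# 4.0 or later, a reference to the Word interop assembly, Word installed on the machine, and a placeholder file path:

```csharp
using System;
using Word = Microsoft.Office.Interop.Word;

class OpenWithCSharp4
{
    static void Main()
    {
        // The underscore interfaces avoid the method/event name clash on Close and Quit.
        Word._Application app = new Word.Application();
        try
        {
            // Only the arguments we care about are passed; the rest use their defaults.
            Word._Document doc = app.Documents.Open(@"C:\SomeDoc.doc", ReadOnly: true);
            Console.WriteLine(doc.Content.Text);
            doc.Close(SaveChanges: false);
        }
        finally
        {
            app.Quit();   // otherwise WINWORD.EXE keeps running in the background
        }
    }
}
```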

TJB
+1  A: 

Microsoft provides a free set of interop assemblies for working with Office from .NET. The download location depends on which version of Office you are using, but a search for "Microsoft Office Primary Interop Assemblies" will turn up the MSDN links for the various versions, such as the one for Office 2007.

The following snippet shows how to open a Word document (.doc or .docx) using these interop assemblies:

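  // Requires a reference to the Word interop assembly and:
  // using Microsoft.Office.Interop.Word;
  _Application WordApp = new Microsoft.Office.Interop.Word.Application();

  object WordFile = "C:\\SomeDoc.doc";
  object RdOnly = false;
  object Visible = true;
  // Placeholder for every optional argument we don't want to set.
  object Missing = System.Reflection.Missing.Value;

  // Argument 1 is FileName, 3 is ReadOnly, 12 is Visible; the rest are left at their defaults.
  Document Doc = WordApp.Documents.Open(ref WordFile, ref Missing, ref RdOnly, ref Missing, ref Missing, 
                                        ref Missing, ref Missing, ref Missing, ref Missing, ref Missing, 
                                        ref Missing, ref Visible, ref Missing, ref Missing, ref Missing, 
                                        ref Missing);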

From there you can use Doc to access various parts of the document.
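
As one example of what "accessing parts of the document" can look like, here is a hedged sketch that prints every paragraph and then closes the document and shuts Word down. The `WordReader` class and `DumpAndClose` method are hypothetical names introduced for illustration; the interop calls (`Paragraphs`, `Range.Text`, `Close`, `Quit`) are from the Word object model.

```csharp
using System;
using Microsoft.Office.Interop.Word;

static class WordReader
{
    // Prints each paragraph of an open document, then closes it and quits Word.
    public static void DumpAndClose(_Application wordApp, _Document doc)
    {
        object missing = System.Reflection.Missing.Value;
        object dontSave = false;
        try
        {
            // doc.Content.Text would return the whole body as one string;
            // here each paragraph is visited separately.
            foreach (Paragraph paragraph in doc.Paragraphs)
            {
                // Paragraph text ends with a carriage return; trim it for display.
                Console.WriteLine(paragraph.Range.Text.TrimEnd('\r'));
            }
        }
        finally
        {
            // Close without saving, then end the Word process.
            doc.Close(ref dontSave, ref missing, ref missing);
            wordApp.Quit(ref dontSave, ref missing, ref missing);
        }
    }
}
```

With the variables from the snippet above, this would be called as `WordReader.DumpAndClose(WordApp, Doc);`. Quitting Word matters because each `new Application()` starts a WINWORD.EXE process that otherwise stays alive.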

lee-m
Oh god! There is nothing I hate more than "System.Reflection.Missing.Value"!
Yassir
A: 

You can simply use the RichTextBox control and its RichTextBox.LoadFile method to read .rtf files (and .doc files that were actually saved in RTF format).
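
A minimal sketch of that approach, assuming a project that references System.Windows.Forms and a placeholder file path. `LoadFile` with `RichTextBoxStreamType.RichText` expects RTF content and throws if the file is in another format.

```csharp
using System;
using System.Windows.Forms;

class ReadRtf
{
    [STAThread]
    static void Main()
    {
        using (RichTextBox box = new RichTextBox())
        {
            // Hypothetical path, used only for illustration.
            box.LoadFile(@"C:\SomeDoc.rtf", RichTextBoxStreamType.RichText);

            // Text returns the content with the formatting stripped.
            Console.WriteLine(box.Text);
        }
    }
}
```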

Microgen
A: 

I see you used the asp.net tag. You should not use the automation API (COM Interop) to run Microsoft Office products from ASP.NET or any other server application. The Office products are made to be run from the desktop - with a user interface. They don't work properly in a server scenario, and additionally, there are licensing issues.

Use Aspose.Words for .NET or some other such technology instead. They are designed to be used in a server environment.
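
For .docx files only, one dependency-free option that avoids Office automation is to read the file directly: a .docx is a ZIP package whose body is stored as XML. The sketch below is an illustration under stated assumptions, not a full parser: it needs .NET 4.5 or later for `ZipArchive`, it assumes the main part is at the conventional path `word/document.xml`, it ignores tabs, line breaks, headers, and footnotes, and it does not handle binary .doc files. `DocxText` and `Extract` are hypothetical names.

```csharp
using System;
using System.IO;
using System.IO.Compression;   // ZipArchive: .NET 4.5+, reference System.IO.Compression
using System.Linq;
using System.Xml.Linq;

static class DocxText
{
    static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Returns the text of each paragraph in the main document part, one per line.
    public static string Extract(Stream docx)
    {
        using (ZipArchive zip = new ZipArchive(docx, ZipArchiveMode.Read, true))
        {
            ZipArchiveEntry entry = zip.GetEntry("word/document.xml");
            if (entry == null)
                throw new InvalidDataException("Not a .docx file: word/document.xml is missing.");

            using (Stream part = entry.Open())
            {
                XDocument xml = XDocument.Load(part);
                // Each w:p element is a paragraph; its visible text lives in w:t elements.
                var paragraphs = xml.Descendants(W + "p")
                    .Select(p => string.Concat(p.Descendants(W + "t").Select(t => t.Value)));
                return string.Join(Environment.NewLine, paragraphs);
            }
        }
    }
}
```

A typical call would be `using (var fs = File.OpenRead(path)) { string text = DocxText.Extract(fs); }`, which works on an uploaded file stream in ASP.NET without starting Word.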

John Saunders