I would like to convert doc/docx documents to semantic HTML.
Some wishes/requirements:
• Semantic HTML, such that headings in the document become <h1>, <h2> etc., tables become <table>, and so forth.
• Should preferably handle headings, lists, tables and images. Graphs and math formulas are a nice extra.
• Doesn't have to be conver...
In my web project, I use DOCX files to hold report templates. I need to convert these DOCX files to PDF. Do you have any .NET managed code for doing that?
I know several ways to solve this, but none of them is both managed code and free. For example:
Word 12.0 Object Library To programmatically save a Word 2007 doc...
I have a "template" docx document which contains the desired layout, and wish to insert content using C#, but I cannot find a way to uniquely address specific sections of the document, such as paragraphs or tables.
What is the best way to uniquely identify elements in the document?
Thanks,
Matt Sharpe.
...
On September 28, 2009 the Apache POI project released version 3.5 which officially supports the OOXML formats introduced in Office 2007, like DOCX and XLSX.
Please provide a code sample for extracting a DOCX file's content in plain text, ignoring any styles or formatting.
I am asking this because I have been unable to find any Apache P...
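As a library-independent illustration of what such an extraction involves (this is not Apache POI code, and POI's own API is not shown here), a minimal sketch in Python using only the standard library might look like the following. It assumes the main body lives in word/document.xml inside the DOCX zip package, and it only collects the text of w:t elements paragraph by paragraph. The function name docx_to_text is hypothetical; headers, footers, footnotes, tabs and line breaks are ignored.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by word/document.xml
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_to_text(docx_path):
    """Return the body text of a DOCX file, one line per paragraph.

    Hypothetical helper: styles and formatting are ignored entirely.
    """
    with zipfile.ZipFile(docx_path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    paragraphs = []
    for para in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's text is split across runs; join every w:t inside it
        texts = [node.text or "" for node in para.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(texts))
    return "\n".join(paragraphs)
```

Calling docx_to_text("report.docx") (a placeholder file name) would return the document's paragraphs joined by newlines.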
I'm using Doxygen to build an API library from C# source code. Doxygen generates a library of TEX files.
My client has asked for a PDF version of this API library so I need to convert the TEX file library into a single PDF or DOCX.
I've been looking into tools such as LyX, OpenOffice, and proTeXt, but still haven't found a solution.
...
I'm using CVSNT. I added a Microsoft 2007 docx file "as text" to the repository. After committing and before updating I tried to open the file again but was unable to. It said it was corrupt.
I tried using Word's built-in document recovery, but it was unable to recover the document.
From what I understand I should've added the word doc a...
Good morning,
Does anyone know of a (native) .NET way to convert XPS documents to docx, or ultimately to a normal (non-WordML) .doc? That is, not using Office automation but rather a native (third-party) .NET library that might help me there?
Basically the xps > doc transformation will take place on a server with multiple concurrently runnin...
I have been trying to write a simple Markdown -> docx parser/writer, but am completely stuck with the last part, which should be the easiest: i.e. compressing the folder into a .docx that Word, or any other .docx reader, will recognize.
My parser-writer is irrelevant really: I have this problem if I simply unzip any old Word-produced...
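One common cause of this symptom is zipping the folder itself rather than its contents, so that every entry is prefixed with the folder name and [Content_Types].xml no longer sits at the archive root. Whether that is the cause here cannot be confirmed from the excerpt. A minimal sketch in Python, assuming an unpacked package directory and a hypothetical helper name zip_folder_as_docx:

```python
import zipfile
from pathlib import Path


def zip_folder_as_docx(folder, docx_path):
    """Pack the *contents* of `folder` into a .docx (zip) archive.

    Hypothetical helper. `docx_path` should live outside `folder`,
    otherwise the archive would try to include itself.
    """
    folder = Path(folder)
    with zipfile.ZipFile(docx_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(folder.rglob("*")):
            if path.is_file():
                # Entry names are relative to the folder and use forward
                # slashes, e.g. "word/document.xml", never "folder/word/..."
                archive.write(path, path.relative_to(folder).as_posix())
```

For example, zip_folder_as_docx("unpacked", "result.docx") would produce an archive whose top-level entries are [Content_Types].xml, _rels/, word/ and so on, assuming that is what the unpacked directory contains.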
Hello there,
I understand iTextSharp can be used for converting a document to pdf.
But first we have to create a document from scratch using iTextSharp.text.Document and then add elements to it.
What if I have an existing doc file? Is it possible to convert that document to PDF using iTextSharp?
Also, I want to use iTex...
Hi there,
I'm using JasperReports to generate a Word (docx) document, but I have a problem when I try to print it: the exporter messes up the page margins. Does anyone know how to prevent that from happening?
I know how to set the margin in iReport, but it just makes the data generate further from the page borders, b...
This is a bit more of a fun question than a serious one, but how does the Adobe PDF format make documents so... portable?
I just created a small Word document, 235kb in size, containing multiple color photos and a few textual phrases. A PDF created using CutePDF (which I understand isn't the most efficient method of PDF creation) is o...
I am looking to develop a server-side application that will process documents. The source documents are mostly MS Word 2003 and 2007 files, i.e. the MS version of docx. The server application should be able to run on both Linux and Windows.
I want to know the best tool or library for reading and writing MS Word files under Linux. Compatibility...
We have a 32-bit .NET application which makes use of the 32-bit version of DSOFile.dll 2.1 to read common properties from Office documents. This works on 32-bit versions of Windows, for both Office 2003 and Office 2007 documents. We are now examining our application’s behaviour in a 64-bit environment, and specifically in 64-bit Windows ...
Using the Open XML SDK 2.0 CTP, I am trying to programmatically create a Word document. In my document I have to insert a bulleted list, and some of the elements of the list must be underlined. How can I do this?
...
Hi,
What options do I have to convert .docx documents to .doc documents programmatically using C#? I'm looking to do this as cheaply as possible. Ideally I want to do this directly in code, via libraries within the .NET Framework or via a well-established downloadable DLL.
The one constraint we have is that we can't install Office onto our ...
The call center managers for my company use document libraries in a SharePoint 2007 site to post training material and information to our phone reps. These reps are given read-only access to the libraries so that they cannot change the documents posted by management; however, we find that if management uploads an Office 2007 document (either docx or...
Hi,
I am using dsoFramer to open docx files in WinForms. When a docx file is open in dsoFramer and the user opens the same document from Windows Explorer, the application becomes unstable. How can I prevent the file from being opened from Windows Explorer while it is open in dsoFramer?
Best regards.
...
Hello,
I'm trying to read a Word 2007 docx document.
The document looks fine inside Word, but when I try to read it using my code, all Run objects have RunProperties set to null.
The property that I'm most interested in is RunProperties.FontSize, but unfortunately it is null as well; the only property I can access is InnerText.
My Code l...
I have a docx template that I am saving as .xml and then parsing the content.
Then I generate a new, updated Word document. After the Word document is generated, I am unable to open it. It says "document corrupt". I press OK. Then it says "Press OK if Do you want to retrieve the document". I press OK. Then I get the updated docume...
We have a couple of third-party systems that give us PDFs. We would like to convert those PDFs for display on the web without using an Adobe product. Ideally we would like to use Silverlight to render the PDFs, but we are having trouble converting from PDF to XAML or using the docx format as a middleman. There are lots of libraries that give PD...