Hello all, I need to parse HTML + CSS files and convert them to something like an RTF layout in Java. I understand that I need some kind of HTML parser, but what do I do from there? How can I implement an HTML/CSS converter? Is there some kind of pattern or method for such jobs?

+1  A: 

You should check out HTMLEditorKit. It provides some support for CSS rendering. There is also an RTFEditorKit for writing, although it is not entirely reliable (last I checked, several years ago).
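A minimal sketch of that route, using only classes that ship with the JDK's Swing packages. The class name `HtmlToRtf`, the `.rtf` output naming, and the assumption that the input files are UTF-8 are mine, not part of either kit. Expect much of the CSS styling to be lost on the way out, in line with the reliability caveat above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import javax.swing.text.BadLocationException;
import javax.swing.text.Document;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.rtf.RTFEditorKit;

// Hypothetical helper class; the name is illustrative only.
final class HtmlToRtf {

    // Parses the HTML into a Swing document, then serializes that document as RTF bytes.
    static byte[] convert(String html) throws IOException, BadLocationException {
        HTMLEditorKit htmlKit = new HTMLEditorKit();
        Document doc = htmlKit.createDefaultDocument();
        // Avoids a ChangedCharSetException when the page carries a <meta> charset tag.
        doc.putProperty("IgnoreCharsetDirective", Boolean.TRUE);
        htmlKit.read(new StringReader(html), doc, 0);

        // RTFEditorKit only supports the OutputStream variant of write().
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        new RTFEditorKit().write(out, doc, 0, doc.getLength());
        return out.toByteArray();
    }

    // Batch use: every argument is an HTML file; the result is written next to it.
    public static void main(String[] args) throws Exception {
        for (String name : args) {
            Path in = Paths.get(name);
            // Assumption: input files are UTF-8 encoded.
            String html = new String(Files.readAllBytes(in), StandardCharsets.UTF_8);
            Files.write(Paths.get(name + ".rtf"), convert(html));
        }
    }
}
```

Run it as `java HtmlToRtf a.html b.html` to process several files in one go, without opening any window.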

Is there a reason you need to use Java instead of just loading the HTML in Word (or some other editor) and saving it as RTF? Also check this W3C link.

Kathy Van Stone
It has to be a batch converter, something that needs to process a lot of files.
The link points to a number of headless transformers -- you may want to check those out.
Kathy Van Stone
+1  A: 

I'd do the following:

  1. First, use JTidy to convert the HTML to valid XHTML
  2. Apply an XSLT to convert it to RTF, using an XSLT processor such as Saxon or Xalan

Note: although I didn't find an XSL file for that conversion directly, I'm sure there is one somewhere.
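To give an idea of what step 2 could look like, here is a rough sketch using the JDK's built-in `javax.xml.transform` API. The stylesheet is a toy I made up for illustration: it only handles `p`, `b`/`strong` and `i`/`em`, ignores CSS completely, and does not escape `\`, `{` or `}` in the text. The class name `XhtmlToRtfXslt` is also just an assumption. The input must already be well-formed XHTML (for example, the output of step 1).

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical example class; not a complete XHTML-to-RTF converter.
final class XhtmlToRtfXslt {

    // Toy stylesheet: maps a handful of XHTML elements to RTF control words.
    static final String XSLT =
          "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
        + " xmlns:h='http://www.w3.org/1999/xhtml'>"
        + "<xsl:output method='text'/>"
        // Wrap the body content in a minimal RTF group.
        + "<xsl:template match='/'>{\\rtf1\\ansi "
        + "<xsl:apply-templates select='//h:body'/>}</xsl:template>"
        // Paragraphs end with \par, bold and italic become nested groups.
        + "<xsl:template match='h:p'><xsl:apply-templates/>\\par </xsl:template>"
        + "<xsl:template match='h:b|h:strong'>{\\b <xsl:apply-templates/>}</xsl:template>"
        + "<xsl:template match='h:i|h:em'>{\\i <xsl:apply-templates/>}</xsl:template>"
        + "</xsl:stylesheet>";

    static String transform(String xhtml) throws TransformerException {
        Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(xhtml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xhtml = "<html xmlns='http://www.w3.org/1999/xhtml'>"
                + "<head><title>t</title></head>"
                + "<body><p>Hello <b>world</b></p></body></html>";
        // Prints: {\rtf1\ansi Hello {\b world}\par }
        System.out.println(transform(xhtml));
    }
}
```

Swapping in Saxon should only require putting it on the classpath, since `TransformerFactory.newInstance()` picks up the available processor. A real stylesheet would need many more templates (lists, tables, fonts, character escaping).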

dhiller
An XSLT won't take account of the HTML document's CSS styling. (Or will it? I may be wrong.)
Andrew Duffy
@Andrew Duffy: Well, if it does not, it's a bad XSLT. Although there may be a problem with external CSS, you can download that and insert it inline into the document to transform.
dhiller
@Andrew Duffy: Of course you are right, it won't, because CSS is not XML... Stupid me... I'll go get some rest ;-)
dhiller
A: 

There is the Flying Saucer project, which lets you render XHTML to PDF. Maybe that could be used instead of RTF, or the resulting PDF could be converted to RTF?

Tim Büthe