The title really says it all! I have a .rtf document (with an image; it's not just text). What Haskell libraries are out there to help me in my quest, or is it way easier than it appears?

A: 

The first tool I would turn to is pandoc; however, it looks like it can only write .rtf, not parse it. Similarly, txt2rtf supports writing .rtf, not reading it.

On the PDF side, HPDF has support for generating PDFs, and HsHaruPDF has some support for reading PDFs. line2pdf can generate PDF from ASCII input.

Is it possible to convert the .rtf into a form pandoc can recognize?

Don Stewart
It includes images, so I think not. I saw markdown et al in there, and they're all text-only.
Clark Gaebel
If I give pandoc markdown, does it generate PDFs? Last I checked, the backend only supports generating LaTeX.
Clark Gaebel
If you don't need to distribute this program, the best way to read RTF is to use Microsoft Word. I don't know if Haskell has COM interop (or whatever it's called) to be able to talk to Word, but this is the solution I found a few years ago in Scheme (or maybe it was Python).
Nathan Sanders
I need to run this on both Linux and Windows.
Clark Gaebel
I'm not sure what your purpose is. I usually read .doc documents that are emailed to me by uploading them to documents.google (convenient, since I usually get them at a Gmail account) and then downloading HTML; it also handles RTF, of course. The HTML can be reliably parsed by pandoc, with a few qualifications; you can go wherever you please from there with all the accompanying writers. The pandoc package comes with a markdown2pdf executable, which makes a PDF via pdflatex; you can adjust the LaTeX template as you please.
applicative
+1  A: 

Call a web service to do the work, such as the PDF Converter Services. It supports RTF.

I worked on this product so obviously I am biased. It works very well though, lots of happy users.

Muhimbi
Ah, a perfect answer! When you want Haskell to do something useful, use it to call something that really works! :-)
Vanya
That's Perl's job :-P
luqui
+3  A: 

Some years ago, I wrote a parser (in Perl) for a very limited and specialized subset of RTF, and even that was a huge project. It would be great if you want to write a general RTF parser in Haskell; but if you need to get work done, I recommend using an existing product.

Besides MS Word and web services suggested by others, here are a few other open source possibilities:

  • OpenOffice.Org has a good cross-platform RTF parser, though it might take some work to get it to run without human intervention.

  • GNU UnRtf

  • rtfreader, a port to Unix of Microsoft's reference parser.

  • rtf2latex2e

  • rtf2html

  • rtf2tex, rtf2latex, rtf2text, and rtf2troff for Unix from the early 1990s are still available; they might even still work on modern systems.

All except the last are available on MacPorts. Check your local Linux distribution for availability there. Follow the above links to see which of the above are available for Windows.

All of the above are in C, so it's possible to create Haskell bindings to them using FFI, with varying degrees of difficulty. The only one which I would expect to be really hard is OpenOffice.Org.
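If FFI bindings are more than you need, a lighter option is to run one of these tools as an external process and capture its output. Here is a minimal sketch of that idea, assuming GNU UnRTF is installed and its `unrtf` executable is on the PATH. The helper names `unrtfArgs` and `rtfToHtml` are made up for illustration, and this says nothing about how well UnRTF handles embedded images.

```haskell
import System.Exit (ExitCode (..))
import System.Process (readProcessWithExitCode)

-- | Command-line arguments for an HTML conversion.
-- `--html` selects UnRTF's HTML output; the input path is the caller's.
unrtfArgs :: FilePath -> [String]
unrtfArgs input = ["--html", input]

-- | Run the external `unrtf` program (assumed to be on PATH) and return
-- either an error message or the HTML it wrote to standard output.
-- If the executable cannot be found at all, readProcessWithExitCode
-- throws an IOException rather than returning a Left.
rtfToHtml :: FilePath -> IO (Either String String)
rtfToHtml input = do
  (code, out, err) <- readProcessWithExitCode "unrtf" (unrtfArgs input) ""
  pure $ case code of
    ExitSuccess   -> Right out
    ExitFailure n -> Left ("unrtf exited with code " ++ show n ++ ": " ++ err)
```

You could call it as `rtfToHtml "input.rtf" >>= either putStrLn (writeFile "output.html")`, where both file names are placeholders. It only needs the `process` package, which ships with GHC, so the same code should build on Linux and Windows as long as the converter itself is available there.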

Yitz