ansaurus

Question

How do I convert LaTeX to plain-text (ASCII)?

Answer 1

+4 A:

CatDVI can convert DVI to text and attempts to preserve the formatting.

Bearddo 2009-02-09 21:45:18

Do you know how to turn off "justified" alignment?

chuckg 2009-02-09 22:35:10

I sure don't, sorry.

Bearddo 2009-02-10 04:08:06

Try piping it through fmt(1) with the `-u` option.

Cirno de Bergerac 2010-01-20 19:36:35
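To see what `fmt -u` does to padded spacing without running catdvi first, here is a minimal sketch. It assumes GNU `fmt` (where `-u` means uniform spacing), and the sample line is made up for illustration.

```shell
# Made-up sample line with the padded spacing that full justification
# produces; real input would come from catdvi.
justified='Some   justified    text  here.'

# -u (uniform spacing, GNU fmt): one space between words.
unjustified=$(printf '%s\n' "$justified" | fmt -u)
printf '%s\n' "$unjustified"
```

Note that `fmt` also re-wraps long lines to its default width, which may or may not be what you want.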

Just remove the excess spacing, e.g. `catdvi foo.dvi | perl -pe 's/[ ]+/ /g'`, which gives me more reasonable output than `fmt`.

Frank 2010-05-13 18:44:32

Answer 2

+7 A:

You can try some of the proposed programs here:

TeX to ASCII

Diego Sevilla 2009-02-09 21:45:57

Answer 3

+1 A:

My usual strategy is to use hyperlatex to turn it into a web page, and then copy and paste from a web browser. I find that this gives the best formatting.

I usually then have to go through and manually fix some line-wrapping...

Brian Postow 2009-02-09 21:55:20

I tried this out, but unfortunately it doesn't support using an external `cls` file. I'm using a class file to handle repetitive formatting tasks, along with the enumitem package. Thanks though!

chuckg 2009-02-09 22:02:12

hmmm, I don't think I've had problems with that... but it's been a while since I've used it... and I don't have any of my files at work...

Brian Postow 2009-02-10 14:48:51

Answer 4

+2 A:

Another option is to use htlatex to create a web page from the LaTeX sources, then use links to convert to plain text. I used the command line

links -dump -no-numbering -no-references input.html > output.txt

in the past, which gave a rather nice result. The output will of course match the rendered HTML rather than the original PDF, so it may not be exactly what you want.

bluebrother 2009-02-09 23:44:47

Answer 5

A:

You can import the file into LyX and use LyX's export-to-text feature.

It's kind of silly if you don't use LyX, but if you already have it, it's a very quick and easy solution. I got good results, although to be fair my files are pretty simple. I'm not sure how more elaborate files get converted.

DDD 2009-11-01 19:09:25

Answer 6

A:

Try the steps here: http://zanedp.livejournal.com/201222.html

Here is a sequence that converts my LaTeX file to plain text:

$ latex file.tex
$ catdvi -e 1 -U file.dvi | sed -re "s/\[U\+2022\]/*/g" | sed -re "s/([^[:space:]])\s+/\1 /g" > file.txt

The -e 1 option to catdvi tells it to output ASCII. If you use 0 instead of 1, it will output Unicode. Unicode output will include all the special characters like bullets, em dashes, and Greek letters. It also includes ligatures for some letter combinations like "fi" and "fl", which you may not like. So, use -e 1 instead. Use the -U option to tell it to print out the Unicode value for unknown characters so that you can easily find and replace them.

The second part of the command finds the string [U+2022] which is used to designate bullet characters (•) and replaces them with an asterisk (*).

The third part eats up all the extra whitespace catdvi threw in to make the text full-justified while preserving spaces at the start of lines (indentation).
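The two sed stages can be tried on sample text without running catdvi. This is a minimal sketch, not the exact command above: the sample lines are made up, `-E` is used in place of `-r`, and `[[:space:]]` stands in for `\s`.

```shell
# Made-up sample imitating catdvi -e 1 -U output: a bullet marker, padded
# spacing from full justification, and an indented second line.
sample='[U+2022] First    item   in  a   list
    Indented   continuation    line'

# Stage 1 turns the bullet marker into an asterisk.
# Stage 2 collapses whitespace runs that follow a non-space character,
# so leading indentation is left alone.
cleaned=$(printf '%s\n' "$sample" |
  sed -E 's/\[U\+2022\]/*/g' |
  sed -E 's/([^[:space:]])[[:space:]]+/\1 /g')
printf '%s\n' "$cleaned"
```

Because the second pattern requires a non-space character before the run of whitespace, the four leading spaces on the second line are never matched.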

After running these commands, you would be wise to search the .txt file for the string [U+ to make sure no Unicode characters that can't be mapped to ASCII were left behind and fix them.
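One way to do that search is with grep. The sketch below uses a temporary file with made-up contents in place of file.txt; `[U+03B1]` is a hypothetical leftover marker chosen for illustration.

```shell
# Temporary file standing in for file.txt (contents are made up).
txt=$(mktemp)
printf '%s\n' 'Plain line' 'Angle [U+03B1] is small' > "$txt"

# -F treats the pattern as a fixed string rather than a regex;
# -n prefixes each match with its line number.
# "|| true" keeps the script going when nothing is found.
leftovers=$(grep -n -F '[U+' "$txt" || true)
printf '%s\n' "$leftovers"
rm -f "$txt"
```

An empty result means no `[U+` markers were left behind.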

2010-01-20 19:24:33
