Like csv.reader(), are there any other functions that can read .rtf, .txt, and .doc files in Python?

+3  A: 

You can read a text file with:

txt = open("file.txt").read()
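
A slightly more defensive variant of the one-liner above, as a minimal sketch assuming Python 3 and UTF-8 encoded input. The helper names read_text and read_lines and the demo file are made up for illustration; the with block makes sure the file is closed even if reading fails.

```python
import os
import tempfile


def read_text(path, encoding="utf-8"):
    """Return the whole file as one string; the file is closed on exit."""
    with open(path, encoding=encoding) as f:
        return f.read()


def read_lines(path, encoding="utf-8"):
    """Return the file's lines without their trailing newline characters."""
    with open(path, encoding=encoding) as f:
        return [line.rstrip("\n") for line in f]


# Demo on a throwaway file so the sketch runs anywhere (hypothetical content).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "file.txt")
    with open(demo_path, "w", encoding="utf-8") as f:
        f.write("first line\nsecond line\n")
    whole = read_text(demo_path)
    lines = read_lines(demo_path)
```

Iterating over the file object, as read_lines does, avoids loading a very large file into a single string.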

Try PyRTF for RTF files. I would think that reading MS Word .doc files is pretty unlikely to work unless you are on Windows, where you can use some of the native MS interfaces for reading those files. This article claims to show how to write scripts that interface with Word.

Jesse Dhillon
+2  A: 

CSV is a specific format, so you need a parser to read it; this is what the csv module provides, as you've mentioned. Plain text files (usually suffixed with .txt) have no fixed format, so you can simply read them after opening them (Jesse's answer gives the details). CSV files are themselves usually text files, so the distinction you draw is not very accurate.
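
To make the difference concrete, here is a small sketch (assuming Python 3; the sample data is invented) that treats the same content first as plain text and then as CSV. Splitting on commas by hand breaks a quoted field, while csv.reader handles the quoting rules.

```python
import csv
import io

# Invented sample: the second field of the data row contains a comma,
# so it is quoted, as the CSV format allows.
raw = 'name,comment\nalice,"hello, world"\n'

# Plain-text view: the content is just characters. A naive split on commas
# cuts the quoted field in two and leaves the quote characters in place.
naive = [line.split(",") for line in raw.splitlines()]

# CSV view: the parser understands quoting and returns clean fields.
rows = list(csv.reader(io.StringIO(raw)))

print(naive[1])
print(rows[1])
```

When reading from a real file rather than a string, the csv documentation recommends opening it with newline="" so that the parser sees line endings unmodified.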

As for RTF, there are a number of libraries. See this answer for details. PyRTF, which Jesse mentioned, seems to be the most popular though.

Microsoft Word document files (usually suffixed with .doc) are another beast, since the format is proprietary. I don't have much experience with Python converters, but there are a few command-line ones (like wvHTML) that do a somewhat decent job. This question discusses quite a few. There's also the option of having MS Word itself do the conversion for you via a COM interface, as Jesse mentioned.

Noufal Ibrahim