Hello,

I have what is probably a rather simple question. However, I am just starting to use Python, and it is driving me crazy. I am following the instructions in a book and would like to open a simple text file. The code I am using:

import sys
try:
 d = open("p0901aus.txt" , "W")
except:
 print("Unsucessfull")
 sys.exit(0)

I either get the message that the document could not be opened, or a pop-up appears saying:

(unicode error) 'unicodeescape' codec can't decode bytes in position 2-4: truncated \UXXXXXXXX escape

I have no clue what the problem is. I tried saving the document in different encodings and tried different paths, but the problem is always the same.

Does anybody know how to fix this?

Thank you very much in advance,

Georg

PS: I am using Windows Vista.

+2  A: 

Change that to

# for Python 2.5+
import sys
try:
   d = open("p0901aus.txt","w")
except Exception, ex:
   print "Unsuccessful."
   print ex
   sys.exit(0)

# for Python 3
import sys
import codecs
try:
  d = codecs.open("p0901aus.txt","w","utf-8")
except Exception as ex:
  print("Unsuccessful.")
  print(ex)
  sys.exit(0)

The mode string is case-sensitive: it must be "w", not "W". I do not want to hit you with all the Python syntax at once, but it will be useful for you to know how to display which exception was raised, and this is one way to do it.

Also, you are opening the file for writing, not reading. Is that what you wanted?
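To see what the interpreter actually complains about, a minimal Python 3 sketch like the following may help. The helper name try_open and the temporary directory are assumptions made for illustration; only the file name comes from the question.

```python
import os
import tempfile


def try_open(path, mode):
    """Try to open path with the given mode and report which exception was raised."""
    try:
        with open(path, mode, encoding="utf-8"):
            return "ok"
    except (ValueError, OSError) as ex:
        # ValueError: the mode string itself is invalid (for example "W").
        # OSError: the file could not be opened (missing, no permission, ...).
        return "failed: %s: %s" % (type(ex).__name__, ex)


# Work in a throwaway directory so that no real file is touched.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "p0901aus.txt")

upper_result = try_open(path, "W")  # upper-case mode is rejected
lower_result = try_open(path, "w")  # lower-case mode creates the file
missing_result = try_open(os.path.join(workdir, "missing.txt"), "r")

print(upper_result)
print(lower_result)
print(missing_result)
```

Catching only ValueError and OSError, rather than every Exception, keeps unrelated bugs visible while still showing why the open failed.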

If there is already a document named p0901aus.txt, and you want to read it, do this:

#for Python 2.5+
import sys
try:
   d = open("p0901aus.txt","r")
   print "Awesome, I opened p0901aus.txt.  Here is what I found there:"
   for l in d:
      print l
except Exception, ex:
   print "Unsuccessful."
   print ex
   sys.exit(0)

#for Python 3+
import sys
import codecs
try:
   d = codecs.open("p0901aus.txt","r","utf-8")
   print("Awesome, I opened p0901aus.txt.  Here is what I found there:")
   for l in d:
      print(l)
except Exception as ex:
   print("Unsuccessful.")
   print(ex)
   sys.exit(0)

You can of course use codecs in Python 2.5 as well, and your code will be higher quality ("correct") if you do. Python 3 appears to treat the Byte Order Mark as something between a curiosity and line noise, which is a bummer.
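If the Byte Order Mark gets in the way, one option in Python 3 is the "utf-8-sig" encoding, which strips a leading BOM when reading. The following is only a sketch; the file name bom_example.txt and the temporary directory are made up for illustration.

```python
import codecs
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "bom_example.txt")  # hypothetical file name

# Write a UTF-8 file that starts with a byte order mark, as some Windows editors do.
with open(path, "wb") as f:
    f.write(codecs.BOM_UTF8 + "first line\n".encode("utf-8"))

# Plain "utf-8" keeps the BOM as the character U+FEFF at the start of the text.
with open(path, "r", encoding="utf-8") as f:
    with_bom = f.read()

# "utf-8-sig" strips a leading BOM if one is present.
with open(path, "r", encoding="utf-8-sig") as f:
    without_bom = f.read()

print(repr(with_bom))
print(repr(without_bom))
```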

Thomas L Holaday
Unfortunately that does not help; the program seems to be unable to open the document. Is there some other code I may try? Cheers
I am getting a syntax error message. It seems to have a problem with the comma after Exception.
What version of Python are you running? At the command line, type "python --version" to find out.
Thomas L Holaday
I am testing my code on Python 2.5.4.
Thomas L Holaday
I am using 3.0. I tried it with an older version and your example worked, but now the rest of my book's examples fail. Do you, by any chance, know how to solve the problem in Python 3.0, or do you recommend I use an older version? Thank you for your help.
I added the 3.0 version, with the revised syntax for print and Exception. I included the codecs stuff from sleske's answer, too.
Thomas L Holaday
+3  A: 

(unicode error) 'unicodeescape' codec can't decode bytes in position 2-4: truncated \UXXXXXXXX escape

This probably means that the file you are trying to read is not in the encoding that open() expects. Apparently open() expects some Unicode encoding (most likely UTF-8 or UTF-16), but your file is not encoded like that.

You should not normally use plain open() for reading text files, as it is impossible to correctly read a text file (unless it's pure ASCII) without specifying an encoding.

Use codecs instead:

import codecs
fileObj = codecs.open( "someFile", "r", "utf-8" )
u = fileObj.read() # Returns a Unicode string from the UTF-8 bytes in the file
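
In Python 3 the built-in open() also accepts an encoding argument, so the same idea can be sketched without codecs. The example below is an illustration only: the file name latin1_example.txt and the sample text are assumptions. It shows that decoding with the wrong encoding fails, while naming the right one works.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "latin1_example.txt")  # hypothetical file name

# Store text containing non-ASCII characters (u-umlaut, sharp s) as Latin-1 bytes.
with open(path, "wb") as f:
    f.write("Gr\u00fc\u00dfe\n".encode("latin-1"))

# Reading those bytes as UTF-8 fails, because they are not valid UTF-8.
try:
    with open(path, "r", encoding="utf-8") as f:
        f.read()
    utf8_outcome = "decoded"
except UnicodeDecodeError:
    utf8_outcome = "UnicodeDecodeError"

# Naming the encoding the file was really written in gives the original text back.
with open(path, "r", encoding="latin-1") as f:
    text = f.read()

print(utf8_outcome)
print(text)
```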
sleske
Does open() not understand the encoding cookie?
Thomas L Holaday
If I try your code, nothing happens at all, but at least no error message is returned.
tlholaday: what is an "encoding cookie"?
hop
georg: the code as it is does read the file but doesn't actually do anything with it!
hop
hop: the little nub at the beginning of a file. See http://unicode.org/faq/utf_bom.html#BOM
Thomas L Holaday
@tlholaday: that's called a "BOM". ;-)
Alan Moore
A: 

import csv

data = csv.reader(open('c:\x\list.csv' ))

for row in data:
    print(row)

print('ready')

Brings up "(unicode error)'unicodeescape' codec can't decode bytes in position 2-4: truncated \xXX escape"

Try c:\\x\\list.csv instead of c:\x\list.csv

This is Python 3 code.
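
As a rough sketch of why this happens: in an ordinary Python 3 string literal, \x starts a hexadecimal escape (and \U a Unicode escape, which would match a path such as one under C:\Users), so a Windows path written with single backslashes can fail before the program even runs. The variable names below are made up for illustration; the path is the one from this answer.

```python
# Source text of a statement containing the problematic literal. It is held in a
# raw string so that this example itself can be parsed.
bad_source = r"path = 'c:\x\list.csv'"

try:
    compile(bad_source, "<example>", "exec")
    outcome = "compiled"
except SyntaxError as ex:
    # The literal is rejected at compile time, before any file is opened.
    outcome = "SyntaxError: %s" % ex.msg

print(outcome)

# Three spellings that avoid the problem:
escaped = 'c:\\x\\list.csv'  # double every backslash
raw = r'c:\x\list.csv'       # raw string: backslashes are taken literally
forward = 'c:/x/list.csv'    # forward slashes are also accepted by Windows

print(escaped == raw)
print(forward)
```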

DP