Hi,
I have read many similar questions, apologies if this is considered a duplicate.
Suppose I am reading a file containing 3 comma separated numbers. The file was saved with with an unknown encoding, so far I am dealing with ANSI and UTF-8. If the file was in UTF-8 and it had 1 row with values 115,113,12 then:
with open(file) as f:
a,b,c=map(int,f.readline().split(','))
would throw this:
invalid literal for int() with base 10: '\xef\xbb\xbf115'
The first number is always mangled with these '\xef\xbb\xbf' characters. For the rest 2 numbers the conversion works fine. If I manually replace '\xef\xbb\xbf' with '' and then do the int conversion it will work.
Is there a better way of doing this for any type of encoded file?