ansaurus

Question

Python: concatenating bytes with a string

Answer 1

+1 A:

The problem is that "%s:%s:%s" became a unicode string once you imported unicode_literals, while the output of the hash is a "regular" (byte) string. When the two are combined, Python tries to decode the byte string into a unicode string and fails. That is expected: the hash output is supposed to look like noise, so it is not valid text in the default encoding. Change your code to this:

a1 = a1 + str(':') + str(challenge["nonce"]) + str(':') + str(cnonce)

I'm assuming cnonce and challenge["nonce"] are regular strings. To have more control over their conversion to strings (if needed), use:

a1 += str(':') + challenge["nonce"].encode('UTF-8') + str(':') + cnonce.encode('UTF-8')
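As a rough illustration of the failure and of the fix, here is a minimal sketch written for Python 3, where the decode has to be spelled out explicitly. The helper name implicit_decode and the nonce values are made up for the example, and the digest of the empty input is used only because its value (d41d8cd9...) is well known.

```python
import hashlib

# MD5 of the empty input; the first byte is 0xd4, which is not ASCII.
digest = hashlib.md5(b"").digest()

def implicit_decode(raw, encoding="ascii"):
    """Mimic the decode Python 2 attempts when bytes meet a unicode string."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError as exc:
        return "failed: %s" % exc.reason

decode_result = implicit_decode(digest)

# Staying in bytes avoids the decode altogether.
nonce = "abc123"      # hypothetical text values, not from the question
cnonce = "0a4f113b"
a1 = digest + b":" + nonce.encode("UTF-8") + b":" + cnonce.encode("UTF-8")
```

The point is the same as in the lines above: every piece is turned into a byte string before concatenation, so nothing ever needs to be decoded.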

Tal Weiss 2010-07-01 12:46:17

This solution and explanation also work. Thank you.

Macdiesel 2010-07-01 13:38:31

Answer 2

+1 A:

The reason for the behaviour you observed is that from __future__ import unicode_literals switches the way Python works with strings:

In the 2.x series, strings without the u prefix are treated as sequences of bytes, each of which may be in the range \x00-\xff (inclusive). Strings with the u prefix are unicode strings, stored internally as UCS-2 or UCS-4 (depending on the compiler flag used when compiling Python).
In Python 3.x, as well as with the unicode_literals future import, strings without the u prefix are unicode strings, stored as either UCS-2 or UCS-4 (again depending on the compiler flag). Strings with the b prefix are literals of the bytes data type, which is rather similar to the pre-3.x non-unicode strings.

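In Python 2, combining a byte-string with a unicode-string triggers an implicit conversion. The conversion performed by default depends on Python's default encoding (sys.getdefaultencoding()); in your case this is UTF-8. Without setting anything, it should be ascii, which rejects all bytes above \x7f. Python 3 performs no implicit conversion at all and raises a TypeError, so the conversion must be written explicitly.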
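For comparison, this is a small sketch of how Python 3 treats the same mix of types. The helper name try_concat and the sample values are assumptions made for the illustration.

```python
def try_concat(left, right):
    """Concatenate two values, reporting the exception name on a type clash."""
    try:
        return left + right
    except TypeError as exc:
        return type(exc).__name__

# bytes + str: Python 3 refuses instead of guessing an encoding.
mixed = try_concat(b"\xd4\x1d", ":nonce")

# bytes + bytes: works once the text has been encoded explicitly.
explicit = try_concat(b"\xd4\x1d", ":nonce".encode("ascii"))
```

Here mixed ends up holding the string "TypeError", while explicit is an ordinary byte string.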

The message digest returned by hashlib.md5(...).digest() is a byte-string, and I suppose you want the result of the whole operation to be a byte-string as well. If so, convert the nonce and cnonce strings to byte-strings:

a1 = hashlib.md5("%s:%s:%s" % (self.username, self.domain, self.password)).digest()
# note that UTF-8 may not be the encoding required by your counterpart, please check
a1 = b"%s:%s:%s" % (a1, challenge["nonce"].encode("UTF-8"), cnonce.encode("UTF-8"))
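A self-contained variant of the same idea, as a sketch for Python 3: the function name build_a1, its parameters, and the sample credentials are hypothetical, and UTF-8 is only an assumed encoding. It uses b":".join(...) instead of bytes formatting, since the % operator on bytes is only available from Python 3.5.

```python
import hashlib

def build_a1(username, domain, password, nonce, cnonce, encoding="UTF-8"):
    """Return the raw MD5 digest, nonce and cnonce joined by b':' as bytes."""
    credentials = "%s:%s:%s" % (username, domain, password)
    # hashlib works on bytes, so the text is encoded before hashing.
    digest = hashlib.md5(credentials.encode(encoding)).digest()
    # Encode the remaining text parts so that everything joined is bytes.
    return b":".join([digest, nonce.encode(encoding), cnonce.encode(encoding)])

# Sample values, invented for the example.
a1_bytes = build_a1("user", "example.org", "secret", "abc", "def")
```

The result is 16 digest bytes followed by b":abc:def"; whether UTF-8 is the right encoding still depends on what the other side expects.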

Alternatively, you can convert the byte-string coming from the call to digest() to a unicode string (not recommended). As the first 256 Unicode code points coincide with ISO-8859-1, every byte value maps to exactly one character when decoded with that encoding, so this might serve your needs:

a1 = hashlib.md5("%s:%s:%s" % (self.username, self.domain, self.password)).digest()
a1 = "%s:%s:%s" % (a1.decode("ISO-8859-1"), challenge["nonce"], cnonce)
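The ISO-8859-1 trick can be checked with a short sketch (Python 3; the variable names are chosen for the example). It decodes every possible byte value and confirms that the round trip is lossless as long as the same encoding is used on the way back.

```python
# Every possible byte value, 0x00 through 0xff.
all_bytes = bytes(range(256))

# ISO-8859-1 maps byte N to the Unicode code point N, so this never fails.
as_text = all_bytes.decode("ISO-8859-1")
code_points = [ord(ch) for ch in as_text]

# Encoding with the same codec restores the original bytes exactly.
round_trip = as_text.encode("ISO-8859-1")

# Encoding with a different codec does not: UTF-8 needs two bytes
# for each character above 0x7f.
utf8_form = as_text.encode("UTF-8")
```

This also shows why the approach is fragile: if the resulting string is later encoded as UTF-8 rather than ISO-8859-1, the bytes that reach the other side are no longer the digest.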

nd 2010-07-01 13:08:29

The first solution worked with the code. Thank you for your insightful answer.

Macdiesel 2010-07-01 13:35:57
