I have a list that I need to send through a URL to a third party vendor. I don't know what language they are using.

The list prints out like this:

[u'1', u'6', u'5']

I know that the u encodes the string in utf-8 right? So a couple of questions.

Can I send a list through a URL? Will the u's show up on the other end when going through the URL? If so, how do I remove them?

I am not sure what keywords to search to help me out, so any resources would be helpful too.

+1  A: 

u'' is not UTF-8; it marks a Python unicode string in Python 2.x.

To send it through a URL, you need to encode the strings as UTF-8 with .encode('utf-8') and then URL-encode the result. A list cannot be sent through a URL as-is; you need to turn it into a string first.

Basically, you need to do it in the following steps:

Python list -> unicode string -> UTF-8 string -> URL-encode -> send it through the proper urllib API
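As a rough sketch of those steps, assuming the vendor accepts a single comma-separated value: the parameter name ids, the comma separator, and the example.com URL are all made up for illustration, and the real format has to be agreed with the vendor. The import fallback is only there so the snippet runs on both Python 2 and Python 3.

```python
try:
    from urllib.parse import quote  # Python 3
except ImportError:
    from urllib import quote        # Python 2

values = [u'1', u'6', u'5']

# list -> one unicode string (comma separator is an assumption)
joined = u','.join(values)

# unicode string -> UTF-8 bytes
utf8_bytes = joined.encode('utf-8')

# UTF-8 bytes -> percent-encoded text that is safe inside a query string
encoded = quote(utf8_bytes, safe='')

# Hypothetical endpoint and parameter name
url = 'http://www.example.com/script?ids=' + encoded
print(url)
```

The resulting URL can then be passed to whichever urllib call you use to make the request.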

S.Mark
*sigh* thanks for the help, this would be so straightforward if I was using PHP
KacieHouser
Your intentions aren't very clear, since sending a list "through a URL" doesn't make much sense without more context. Show us how you'd do it in PHP, and we might be able to help you more easily.
Will McCutchen
Indeed, the lack of modern Unicode support is one of PHP's *weakest* points.
bobince
A: 

Incorrect. Unicode literals use Python's internal encoding, which is decided when Python is compiled.

You can't send anything "through" URLs. Pick a protocol instead. And encode before sending, probably to UTF-8.

Ignacio Vazquez-Abrams
+3  A: 

Can I send a list through a URL?

No. A URL is just text. If you want a way to package structured information in it, you'll have to agree that with the provider you're talking to.

One standard encoding for structure in URLs, which might or might not be what you need, is the use of multiple parameters with the same name in a query string. This format comes from HTML form submissions:

http://www.example.com/script?par=1&par=6&par=5

might be considered to represent a parameter par with a three-item list as its value. Or maybe not; it's up to the receiver to decide. For example, in a PHP application you would have to name the parameter par[] for it to be accepted as an array value.
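If the repeated-parameter format turns out to be what the receiver wants, urlencode can build that query string when called with doseq=True. This is only a sketch: the parameter name par comes from the example above, and whether the receiver wants par or par[] is something you would have to find out. The import fallback lets it run on Python 2 or Python 3.

```python
try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode        # Python 2

values = [u'1', u'6', u'5']

# Encode each item to UTF-8 bytes before URL-encoding
utf8_values = [v.encode('utf-8') for v in values]

# doseq=True emits one name=value pair per list item
query = urlencode({'par': utf8_values}, doseq=True)

# PHP-style variant: the brackets in the name get percent-encoded
php_query = urlencode({'par[]': utf8_values}, doseq=True)

url = 'http://www.example.com/script?' + query
print(url)
print(php_query)
```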

I know that the u encodes the string in utf-8 right?

No. A u'...' string is a native Unicode string, where each index represents a whole character and not a byte in any particular encoding. If you want UTF-8 bytes, call u'...'.encode('utf-8') before URL-encoding. UTF-8 is a good default choice, but again: what encoding the receiving side wants is up to that application.
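The difference between characters and bytes only becomes visible with non-ASCII text. A small sketch, using a made-up sample string rather than anything from the question:

```python
try:
    from urllib.parse import quote  # Python 3
except ImportError:
    from urllib import quote        # Python 2

text = u'caf\u00e9'                 # four characters, the last one is e-acute

utf8_bytes = text.encode('utf-8')   # the e-acute becomes two bytes in UTF-8

char_count = len(text)
byte_count = len(utf8_bytes)

# URL-encoding works on the bytes, so the e-acute becomes two %XX escapes
encoded = quote(utf8_bytes)
print(char_count, byte_count, encoded)
```

For the all-digit list in the question the UTF-8 bytes are identical to the ASCII characters, so the encoding step changes nothing visible there.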

Will the u's show up on the other end when going through the URL?

The u is part of the literal representation of the string, just the same as the ' quotes themselves. Neither is part of the string value, and neither would be echoed by print or when the string is joined into other strings, unless you deliberately asked for the literal representation by calling repr.
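A quick way to convince yourself, using the list from the question. The comma separator is just for illustration. Note that the repr output depends on the Python version: Python 2 includes the u prefix, Python 3 does not.

```python
values = [u'1', u'6', u'5']

# Joining uses the string values, so no u and no quotes appear
joined = u','.join(values)
print(joined)

# repr asks for the literal representation, quotes included
# (and the u prefix as well on Python 2)
literal = repr(values[0])
print(literal)
```

Printing the whole list shows the u's only because a list displays its items using repr.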

bobince
+1 for concise explanations
S.Mark