I have a script which tracks visits and referrers to a website.

I send document.referrer to the server (encoded with escape() in JavaScript) and store the string in the database after decoding it with HttpUtility.HtmlDecode (C#).

In most cases I can parse the referrer string and show Hebrew characters, but there are a few cases where I cannot.

I found that the two kinds of strings are different (the one that displays correctly and the one that doesn't).

The one that displays correctly contains characters like these: http://www.google.co.il/search?hl=iw&source=hp&q=%D7%99%D7%91%D7%95%D7%90%D7%A0%D7%99%D7%9D %D7%9C%D7%9E%D7%AA%D7%A0%D7%95%D7%AA &meta=&aq=f&oq=

The ones that don't display properly (unless I use Microsoft.JScript.GlobalObject.unescape) look like this: http://www.google.co.il/custom?q=%FA%EE%E9%F8 - %F6%E9%E9 %F8%EB%E1&client=pub-0385896995839253&forid=1

I understand that the second string appears to contain ISO-8859-1 characters and works properly when unescaped on the server side, but a URL carries no encoding information.

So I cannot distinguish between these two formats. Or can I? Should I?

A note: when I copy and paste those URLs into the browser address bar, the browser detects the first one as "Unicode (UTF-8)" and the other one as "Windows-1255".

Thanks, Yaron

A: 

Use the encodeURIComponent function instead of the escape function.
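To illustrate the difference, here is a small sketch comparing the two functions on a Hebrew letter and on an already percent-encoded referrer. The variable names are only for illustration, and the referrer is a shortened form of the second URL from the question.

```javascript
// Sample value: the Hebrew letter yod (U+05D9), written as an escape sequence.
const yod = "\u05D9";

// escape() emits a non-standard %uXXXX sequence for code units above 0xFF.
const legacy = escape(yod);

// encodeURIComponent() emits the UTF-8 bytes of the character, percent-encoded.
const utf8 = encodeURIComponent(yod);

// A referrer that is already percent-encoded survives a round trip:
// each "%" is re-encoded as "%25", so one decode restores the original string.
const referrer = "http://www.google.co.il/custom?q=%FA%EE%E9%F8";
const wrapped = encodeURIComponent(referrer);
const restored = decodeURIComponent(wrapped);

console.log(legacy, utf8, restored === referrer);
```

With encodeURIComponent the server only ever has to deal with UTF-8 for the outer layer; the bytes inside the referrer itself stay exactly as the referring site produced them.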

If you are reading the value from the Request.QueryString collection it's already decoded, so you should not use the HtmlDecode method.
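This does not change how the referring site encoded its own query string, as with the two Google URLs in the question. One common heuristic, sketched below, is to try a strict UTF-8 decode first and fall back to Windows-1255 only when that fails, since Windows-1255 Hebrew text is rarely valid UTF-8. The function name decodeHebrewQuery is hypothetical, the fallback maps only the 27 Hebrew letters (bytes 0xE0 to 0xFA), and the same logic would have to be ported to the server-side language.

```javascript
// Hypothetical helper: decode a percent-encoded value whose charset is unknown.
function decodeHebrewQuery(value) {
  try {
    // decodeURIComponent throws URIError when the bytes are not valid UTF-8.
    return { text: decodeURIComponent(value), charset: "utf-8" };
  } catch (e) {
    if (!(e instanceof URIError)) throw e;
  }

  // Fallback: turn each %XX into a single byte-valued character...
  const bytes = value.replace(/%([0-9A-Fa-f]{2})/g, (match, hex) =>
    String.fromCharCode(parseInt(hex, 16))
  );

  // ...then map Windows-1255 bytes 0xE0-0xFA to Hebrew letters U+05D0-U+05EA.
  // Other high bytes (vowel points, punctuation) are left as-is in this sketch.
  const text = bytes.replace(/[\u00E0-\u00FA]/g, (ch) =>
    String.fromCharCode(ch.charCodeAt(0) - 0xe0 + 0x05d0)
  );

  return { text, charset: "windows-1255" };
}

// Fragments taken from the two URLs in the question.
const first = decodeHebrewQuery("%D7%99%D7%91%D7%95%D7%90%D7%A0%D7%99%D7%9D");
const second = decodeHebrewQuery("%FA%EE%E9%F8");
console.log(first.charset, second.charset);
```

This is a guess rather than a detection: a URL carries no charset, so a value that happens to be valid UTF-8 will always be treated as UTF-8.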

Guffa