encoding

Reading UTF-16 (or UTF-8) values from XML and displaying result with PHP

Hi, I'm having a lot of trouble with Unicode (UTF-16) values and PHP/XML. I want to read a set of Unicode values from XML and output the correct glyphs to the browser. I've tried with UTF-8 and I get the same problem. This is a simple working example I used for my first test: $text = "\x00\x41"; $text = mb_convert_encoding($text, "AS...

Am I passing the string correctly to the python library?

I'm using a python library called Guess Language: http://pypi.python.org/pypi/guess-language/0.1 "justwords" is a string with unicode text. I stick it in the package, but it always returns English, even though the web page is in Japanese. Does anyone know why? Am I not encoding correctly? §ç©ºéå ¶ä»æ¡å°±æ²æéç¨®å¾ ...

Server.UrlEncode apostrophe(') in Firefox

So I have a Hyperlink called lnkTwitter: And I'm trying to set the url in the code behind: lnkTwitter.NavigateUrl = string.Format("http://www.twitter.com/home?status={0}", Server.UrlEncode("I'm Steven")); When I do that and hover over the link, the url displays correctly in the status bar as "http://www.twitter.com/home?status=I'm+Ste...

Recommend Server Side Flash Encoding Components with .NET SDK

I'm looking for some testimonials for components that take video with a wide range of formats (.avi, .mov, .mpeg) and encode them into .flv. We'll want the following: Be able to encode from a large list of common video/audio formats. Create thumbs for the media that it encodes. Have a good SDK (.NET preferred) that we can use to even...

Get rid of ASCII characters in the output of HTML parsed by DOMdocument

Let's say I have this code, adapted from Adam Backstrom's answer to a previous question: $term = 'example'; // word I need to replace $replacement = '<strong>example</strong>'; // this will replace the $term $d = new DOMDocument; @$d->loadHTML($body); // specifically, drupal's $node->content['body']['#value'] in hook_nodeapi when $op='...

Handling unicode values in GET parameters with PHP

I have the following test script on my server: <?php echo "Test is: " . $_GET['test']; ?> If I call it with a URL like example.com/script.php?test=ɿ (ɿ being a multibyte character), the resulting page looks like this: Test is: É¿ If I try to do anything with the value in $_GET['test'], such as save it to a MySQL database, I have th...

PHP encoding "les validations d'entr\u00e9es"

I have text in this format les validations d'entr\u00e9es instead of les validations d'entrées. This is from Twitter .json API and I would like to translate the \u00e9 to the é but can't find a way to do it. I suppose it's unicode so how can I translate those characters in PHP? Sample of code that I already have: $this->jsonArray =...

Encrypting Scripts for Embedding in Text Files

I'm working on a closed-source game that uses a scripting language for automation. Almost all of the game logic is handled by scripts. Scripts can be compiled to a bytecode format, but due to the nature of the language, identifiers must be preserved. Compiled scripts can be embedded in other text-based resource formats using a binary-to-...

How to properly serve a PDF file

I am using .NET 3.5 ASP.NET. Currently my web site serves a PDF file in the following manner: context.Response.WriteFile(@"c:\blah\blah.pdf"); This works great. However, I'd like to serve it via the context.Response.Write(char [], int, int) method. So I tried sending out the file via byte [] byteContent = File.ReadAllBytes(ReportP...

Why does HTTP::Response::decoded_content sometimes return undef even when content() returns data?

I've used LWP capability to handle gzip encoded content as described here, but in some cases I randomly get unexpected results at least for the one website I've tested: $response->decoded_content could become undefined while $response->content still returns original gzip encoded response. Tried even without internal charset decoding (dec...

Safe Data serialization for Plain HTTP GET & POST communication

Hello Friends, I'm using the client's browser to submit HTTP requests. For report generation the securityToken is submitted via POST; for report download the same token needs to be submitted by the user's browser, this time using GET. What encoding would you recommend for the securityToken, which actually represents encrypted data? I've t...

Should source code be saved in UTF-8 format

How important is it to save your source code in UTF-8 format? Eclipse on Windows uses CP1252 character encoding by default. The CP1252 format means non-UTF-8 characters can be saved, and I have seen this happen if you copy and paste from a Word document for a comment. The reason I ask is because out of habit I set up Maven encoding to b...

Change Background Image of Div When Modal Popup Window Opens

Hi Everyone, I'm trying to dynamically change the background image of a div inside of a modal popup window. I first tried it with simplemodal and now I am trying it with the jqueryui dialog box. I can't get it to work on either one. Here is my code so far: //Jquery Dialog Attempt: //I have also tried it in the open event ...

Will you need to declare a page's encoding in PHP 6?

I recently saw this as the very first line of some PHP scripts in some framework, it said that it is added in because PHP 6 will support it. I am curious if anyone knows anything about this? If it does support it when PHP 6 comes out will it be optional? Any benefits of using it vs not using it? declare(ENCODING = 'utf-8'); ...

Java File parsing toolkit design, quick file encoding sanity check

(Disclaimer: I looked at a number of posts on here before asking, I found this one particularly helpful, I was just looking for a bit of a sanity check from you folks if possible) Hi All, I have an internal Java product that I have built for processing data files for loading into a database (AKA an ETL tool). I have pre-rolled stages ...

Can you encode to fewer bits when you don't need to preserve order?

Say you have a List of 32-bit Integers and the same collection of 32-bit Integers in a Multiset (a set that allows duplicate members). Since Sets don't preserve order but Lists do, does this mean we can encode a Multiset in fewer bits than the List? If so, how would you encode the Multiset? If this is true, what other examples are there wh...
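
A rough way to check the intuition, assuming both containers hold exactly n values of b bits each: a list has 2^(b*n) possible states, while the number of distinct multisets is the binomial coefficient C(2^b + n - 1, n), so an ideal encoding of the multiset can need fewer bits. The function names below are hypothetical, and the sketch only counts bits; it does not implement an actual encoder.

```python
import math


def list_bits(n, b=32):
    """Bits needed to store n ordered b-bit values."""
    return n * b


def multiset_bits(n, b=32):
    """Minimum whole bits needed to tell apart every multiset of n b-bit values."""
    # Number of multisets of size n drawn from 2**b possible values.
    count = math.comb(2**b + n - 1, n)
    # ceil(log2(count)), computed exactly on the big integer.
    return (count - 1).bit_length()


for n in (1, 3, 1000):
    print(n, list_bits(n), multiset_bits(n))
```

When duplicates are rare, the saving is roughly log2(n!) bits, which is the information carried by the order of the list.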

How do I inject a script URL containing an ampersand with ASP.NET?

I have a server control that needs to programmatically inject a JavaScript reference into the page. It is to reference Microsoft's Bing map control which requires &s=1 to be appended to the script URL for use over SSL. The problem is that the .NET Framework encodes the attributes and changes the & to an &amp; (verified with Reflector). A...

How to correct the misencoded string?

I used mutagen to read the MP3 metadata. The ID3 tag is read in as Unicode, but in fact it is GBK encoded. How can I correct this in Python? audio = EasyID3(name) title = audio["title"][0] print title print repr(title) produces µ±Äã¹Âµ¥Äã»áÏëÆðË­ u'\xb5\xb1\xc4\xe3\xb9\xc2\xb5\xa5\xc4\xe3\xbb\xe1\xcf\xeb\xc6\xf0\xcb\xad' but in ...
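
One common way to repair text like this, sketched in Python 3 under the assumption that the GBK bytes were decoded as Latin-1 (which is what the repr above suggests): re-encode the string to recover the original bytes, then decode those bytes as GBK. The helper name fix_mojibake is hypothetical and is not part of mutagen.

```python
def fix_mojibake(text, wrong="latin-1", actual="gbk"):
    """Undo a wrong decode: recover the raw bytes, then decode them properly."""
    return text.encode(wrong).decode(actual)


# The title exactly as shown in the repr above (only code points below U+0100).
title = "\xb5\xb1\xc4\xe3\xb9\xc2\xb5\xa5\xc4\xe3\xbb\xe1\xcf\xeb\xc6\xf0\xcb\xad"

fixed = fix_mojibake(title)
print(fixed)
```

If the tag contains characters that Latin-1 cannot represent, the encode step raises UnicodeEncodeError, which is a hint that the text was not mis-decoded in this particular way.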

Java string encoding conversion within a webpage

Hi, I have a webpage that is encoded (through its header) as WIN-1255. A Java program creates text strings that are automatically embedded in the page. The problem is that the original strings are encoded in UTF-8, thus creating a gibberish text field in the page. Unfortunately, I cannot change the page encoding - it's required by a c...

Load XMLDocument from byte array (optionally containing BOM characters)

I've seen several posts here on SO about loading XML documents from some data source where the data has Microsoft's proprietary UTF-8 preamble (for instance, this one). However, I can't find an elegant (and working!) solution which does not involve stripping out BOM characters manually. For instance, there is this example: byte[] b = Sy...