questions about character-encoding | ansaurus

character-encoding

CakePHP is truncating a text field, probably encoding related

Here's what I'm trying to do: I'm parsing incoming email and using it to create posts in the system. This works almost completely, but there are a few bugs to work out. The one that's currently giving me fits comes up when an email contains certain characters (for example, ® – “ ”): the email body is truncated at the special cha...

character-encoding

Using Unicode with PHP

How do I use Unicode with PHP? I want to store a Unicode value in a PHP variable, but it outputs question marks instead. What is the solution? ...

character-encoding

Python - letter frequency count and translation.

Hi, I am using Python 3.1, but I can downgrade if needed. I have an ASCII file containing a short story written in a language whose alphabet can be represented with upper and/or lower ASCII. I wish to: 1) Detect an encoding to the best of my abilities, get some sort of confidence metric (would vary depending on the len...

character-encoding

Applying utf8_encode to ob_end_flush()

I have a script which produces text output. That script grabs content from a MySQL database encoded as latin1_general_ci. Including that script in an HTML page marked as iso-8859-1 works fine. How do I capture the output of this script and include it in an HTML page encoded in utf-8? I have attempted to capture the output of the script u...

character-encoding

Special Characters in JavaScript not displaying properly on website

Hi, my IE and Chrome browsers are not displaying the French phrases correctly when I go from a French phrase (onload function) to an English phrase (onmousedown function) and back to a French phrase (onmouseup function). When I let up on the mouse on a particular phrase it goes back to French, but the special characters for ô and é (which...

character-encoding

special-characters

Lexers/tokenizers and character sets

When constructing a lexer/tokenizer, is it a mistake to rely on functions (in C) such as isdigit/isalpha/...? They are dependent on locale as far as I know. Should I pick a character set, concentrate on it, and build a character mapping myself from which I look up classifications? Then the problem becomes being able to lex multiple chara...

character-encoding

Python returning the wrong length of string when using special characters

I have a string ë́aúlt that I want to get the length of and manipulate based on character positions and so on. The problem is that the first ë́ is being counted twice, or I guess ë is in position 0 and ´ is in position 1. Is there any possible way in Python to have a character like ë́ be represented as 1? I'm using UTF-8 encoding for the...

character-encoding

How to detect which character set encoding in Java?

Does anybody know if there is a simple way to detect character set encoding in Java? It seems to me that some programs have the ability to detect which character set a given piece of data uses, or at least make an approximation. I suppose the underlying mechanism would have to decode the data in each character set and pick whichever one...

character-encoding

Strange occurrence with string and special character

#include <iostream> #include <string> using namespace std; string mystring1, mystring2, mystring3 = "grové"; int main(){ mystring1 = "grové"; getline( cin, mystring2 ); //Here I type "grové" (without "") cout << "mystring1= " << mystring1 << endl; cout << "mystring2= " << mystring2 << endl; cout << "mystring3= " << mystring3...

character-encoding

special-characters

How to write Cyrillic text in C++ console?

For example, if I write: cout << "Привет!" << endl; //it's hello in Russian in the console it shows something like "╧ЁштхЄ!" OK, I know that we can use: setlocale(LC_ALL, "Russian"); but after that, command-line arguments in Russian stop working (if I start my program through a BAT file): StartProgram.bat chcp 1251 MyProgram.exe -use...

character-encoding

Struts character encoding problem in response HTML

Hi, please consider the following scenario. I have a form with a property: class MyForm extends ActionForm{ String myProperty; ... // getter & setters here } I set this property in the action class: class MyAction extends Action{ ... // execute method begins here myForm.setMyProperty("<b>Hello World</b>"); ... // execute...

character-encoding

How do I verify that a string is in English?

I read a string from the console. How do I make sure it only contains English characters and digits? ...
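One way to read this requirement is "only ASCII letters and digits". The sketch below shows that interpretation in Python; the question does not name a language, so the language choice, the helper name `is_english_alnum`, and the decision to reject spaces, punctuation, and the empty string are all assumptions made for illustration.

```python
import string

# Assumption: "English characters" means the 52 ASCII letters A-Z and a-z.
ALLOWED = set(string.ascii_letters + string.digits)


def is_english_alnum(text: str) -> bool:
    """Return True if text is non-empty and uses only ASCII letters and digits."""
    return text != "" and all(ch in ALLOWED for ch in text)


print(is_english_alnum("Hello123"))  # True
print(is_english_alnum("grové"))     # False: "é" is outside the ASCII range
```

If spaces or punctuation should be accepted, they would need to be added to `ALLOWED`. Note that this checks the alphabet only, not whether the words are actually English.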

character-encoding

Haskell: Parsing escape characters in single quotes

I'm currently making a scanner for a basic compiler I'm writing in Haskell. One of the requirements is that any character enclosed in single quotes (') is translated into a character literal token (type T_Char), and this includes escape sequences such as '\n' and '\t'. I've defined this part of the scanner function which works okay for m...

character-encoding

Encoding Problems With ID3 Tags

I have an ID3v1 tag that shows up in iTunes like: "It's Been A While". But when I read the tags with the libtag library, "It¹s Been A While" comes out. Now when I open the file with a hex editor, I can see that it actually is 0xB9, which is ¹ in Latin-1 and UTF-8/16. So how does iTunes get a ’ from 0xB9? Any ideas? Is there any character en...

character-encoding

Is a character encoding issue causing my Perl output to look like gibberish?

I'm running a Perl script (both with 5.8.4) on two different machines (one Solaris 5.10, the other OpenSolaris 5.11). The output of the two scripts differs in the following way: Solaris 5.10 $ perl myscript.pl is' £ ä º <ä ¼ sa ... ³ ä º žÃ ... ¬ å ¸ ç ¬ ¬ ä º ¤ § œâ is œâ ¡ä ¸ ‡ å ... æœ ¬ æœ ¬ å ¸ È, ¡ä »½ çš" å ... ¬ ...

character-encoding

How to read the parameters and values from the query string using Java

Hi, I am using Ciui from Google Code, and all the requests are GET requests, not POST. The calls are made via Ajax (I am not sure). I need to know how to read the "searchstring" parameter from this URL. When I read this in my servlet using the getQueryString() method, I am not able to properly form the actual text. This unicod...

character-encoding

Java UTF-8 encoding problem

I am using an HTML parser called HTMLCLEANER to parse HTML pages. The problem is that each page has a different encoding from the others. My question: can I convert from any character encoding to UTF-8? ...

character-encoding

What's the difference between encoding and charset?

I am confused about text encoding and charset. For many reasons, I have to learn non-Unicode, non-UTF8 stuff in my upcoming work. I find the word "charset" in email headers, as in "ISO-2022-JP", but there's no such encoding in text editors. (I looked around the different text editors.) What's the difference between text encoding a...

character-encoding

ASP.NET requestEncoding and responseEncoding UTF-8 or ISO-8859-1

In a Microsoft Security Document, in the Code Review section ( http://msdn.microsoft.com/en-us/library/aa302437.aspx ), it suggests setting globalization.requestEncoding and globalization.responseEncoding to "ISO-8859-1" as opposed to "UTF-8" or another Unicode format. What are the downsides to using "ISO-8859-1"? In the past I've set ...

character-encoding

Problem encoding string to ISO8859-1

Hi, I'm using this code to convert a string to ISO8859-1: baseurl = "http://myurl.com/mypage.php" client = New WebClient client.Headers.Add("user-agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.0.3705;)") client.QueryString.Add("usuario", user) client.Qu...

character-encoding
