Hi Guys,
Can someone who is way smarter than I tell me what I'm doing wrong? Shouldn't this simply process...
# encoding: utf-8
from email.MIMEText import MIMEText
msg = MIMEText("hi")
msg.set_charset('utf-8')
print msg.as_string()
a = 'Ho\xcc\x82tel Ste\xcc\x81phane '
b = unicode(a, "utf-8")
print b
msg = MIMEText(b)
msg.set_cha...
This question is very similar to that one, but I need to do the same thing in C, not python. Here are some examples of what the function should do:
input output
&lt; <
&gt; >
&auml; ä
&szlig; ß
The function should have the signature char *html2str(char *html) or similar. I'm not reading byte by byte from a stream.
Is t...
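The question asks for C, but the expected behavior can be illustrated with a short Python sketch. It relies on the standard library's `html.unescape`; the name `html2str` is borrowed from the signature in the question, and the four sample inputs are assumed to be the HTML entities for the characters in the table.

```python
import html


def html2str(text: str) -> str:
    """Decode HTML character references into the characters they name."""
    return html.unescape(text)


# Assumed inputs: the entity spellings of the characters in the example table.
samples = ["&lt;", "&gt;", "&auml;", "&szlig;"]
decoded = [html2str(s) for s in samples]
print(decoded)
```

A C version would need its own entity table or a library that ships one; this sketch only pins down what the function is supposed to return.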
Until recently, my blog used mismatched character encoding settings for PHP and MySQL. I have since fixed the underlying problem, but I still have a ton of text that is filled with garbage. For instance, ï has become Ã¯.
Is there software that can use pattern recognition and statistics to automatically discover broken text and fix it?
...
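For the common case where UTF-8 bytes were decoded once as Latin-1, the damage can often be reversed by re-encoding and decoding again. The following is a minimal sketch under that assumption; `fix_mojibake` is a hypothetical helper name, and text that was mangled through Windows-1252 or more than once would need a different approach.

```python
def fix_mojibake(text: str) -> str:
    """Undo one round of UTF-8 bytes having been read as Latin-1.

    Assumption: the text was damaged exactly once, via Latin-1.
    Text that does not fit that pattern is returned unchanged.
    """
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


broken = "na\u00c3\u00afve"  # the mangled form of "naive" with a diaeresis on the i
repaired = fix_mojibake(broken)
print(repaired)
```

Because the round trip fails on text that is already clean, the `except` branch acts as a crude detector: strings that cannot be reinterpreted are left alone.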
Which widely used programming languages were designed ground-up with Unicode support?
A lot of programming languages have added Unicode support as an afterthought in later versions, but which widely used languages were released with Unicode support from day one?
...
From what I understand, when MySQL compares a string stored in utf8_general collation, it first converts its characters to their ASCII equivalents. In other words ḩ = h, ţ = t, ā = a, í = i, etc...
Is there a mapping table which I could use to implement a similar comparison function in PHP or JavaScript? I know there are alternatives in p...
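One way to approximate this kind of folding without a hand-written table is to decompose each character (NFD) and drop the combining marks. The sketch below is in Python and is only an approximation: it is not MySQL's actual collation table, and `fold_accents` is a hypothetical helper name.

```python
import unicodedata


def fold_accents(text: str) -> str:
    """Strip combining marks after canonical decomposition.

    This approximates accent-insensitive comparison; it is not the
    mapping MySQL uses internally.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


folded = fold_accents("\u1e29\u0163\u0101\u00ed")  # the four examples from the question
print(folded)
```

Letters without a canonical decomposition (ß, ø, and so on) pass through unchanged, so a real comparison function would still need extra rules for them.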
In Perl, I can say
my $s = "r\x{e9}sum\x{e9}";
to assign "résumé" to $s. I want to do something similar in C. Specifically, I want to say
sometype_that_can_hold_utf8 c = get_utf8_char();
if (c < '\x{e9}') {
/* do something */
}
...
Heya guys,
I'm in desperate need of help.
I have a Java servlet that is accessed by a HTTP Get URL with eight parameters in it.
The problem is that the parameters are not exclusive to English.
Any other language can be in those parameters, like Hebrew, for example.
Now, when I send the data - either from the class that is supposed to...
I'm looking for the best option to store my application settings. I decided to write my own class that inherits from TPersistent which would store all the config options available. Currently I'm looking for the best way to save it - and I found JvAppStorage which looked very promising (as I'm using JVCL in my project anyway...) but it doesn...
Hello, apologies if this is silly. How do I print a Unicode character, say \u20ac using an integer? So, instead of Console.WriteLine("\u20ac");, I would like to pass the integer 8364.
Thanks.
...
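The idea of going from an integer to a character can be shown in a couple of lines. This is a Python sketch rather than C#, using the built-in `chr`; the variable names are illustrative only.

```python
code_point = 8364          # the integer from the question, 0x20AC in hex
euro = chr(code_point)     # build a one-character string from the code point
print(euro)
```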
Please see here for a related question.
However, char goes to 0xffff (or 65535). I need to write 0xd800df46 (or 66374), Gothic letter Faihu, so casting that int to char will not work. I do the conversion ok, that is, I get the correct integer, meaning I calculate the surrogate pairs ok, but I don't know how to "render" it, convert it t...
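The surrogate arithmetic described here can be checked with a small Python sketch. `to_surrogates` is a hypothetical helper; it splits a code point above U+FFFF into its UTF-16 high and low halves, and the last step decodes those two units back into a string to show that the pair really is the intended character.

```python
def to_surrogates(code_point: int) -> tuple:
    """Return the (high, low) UTF-16 surrogate pair for a supplementary code point."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is not in a supplementary plane")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits of the offset
    return high, low


high, low = to_surrogates(66374)  # U+10346
# Treat the two 16-bit units as big-endian UTF-16 and decode them.
faihu = bytes.fromhex(f"{high:04x}{low:04x}").decode("utf-16-be")
print(hex(high), hex(low), faihu)
```

In a language whose strings are UTF-16 code units, "rendering" the character amounts to putting both units into the same string, in that order.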
We are having trouble getting a Unicode string to convert to a UTF-8 string to send over the wire:
// Start with our unicode string.
string unicode = "Convert: \u10A0";
// Get an array of bytes representing the unicode string, two for each character.
byte[] source = Encoding.Unicode.GetBytes(unicode);
// Convert the Unicode bytes to U...
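The same conversion can be sketched in Python to show which bytes to expect on the wire. The assumption here is that "Unicode bytes" means UTF-16 little-endian, two bytes per character for this string; the variable names are illustrative.

```python
text = "Convert: \u10A0"

# Two bytes per character for this string (no characters outside the BMP).
utf16_bytes = text.encode("utf-16-le")

# Decode back to text, then encode as UTF-8 for sending.
utf8_bytes = utf16_bytes.decode("utf-16-le").encode("utf-8")

print(len(utf16_bytes), utf8_bytes)
```

U+10A0 takes three bytes in UTF-8, so the UTF-8 form is shorter than the UTF-16 form for this mostly-ASCII string. Note that the result must stay a byte array: turning it back into a string with the wrong decoder is what usually produces garbage.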
I'm having a problem emailing unicode characters using smtplib in Python 3. This fails in 3.1.1, but works in 2.5.4:
import smtplib
from email.mime.text import MIMEText
sender = to = '[email protected]'
server = 'smtp.DEF.com'
msg = MIMEText('€10')
msg['Subject'] = 'Hello'
msg['From'] = sender
msg['To'] = to
s = smtplib.SM...
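One thing worth trying in Python 3 is to give `MIMEText` an explicit charset so the message serializes to pure ASCII before it reaches `smtplib`. The sketch below builds only the message and does not connect to any server; whether this resolves the failure on 3.1.1 specifically is an assumption, not something verified here.

```python
from email.mime.text import MIMEText

# Pass the charset explicitly; the body is then transfer-encoded.
msg = MIMEText("\u20ac10", "plain", "utf-8")
msg["Subject"] = "Hello"

wire_text = msg.as_string()                          # what would be sent
body = msg.get_payload(decode=True).decode("utf-8")  # round-trip check

print(wire_text)
```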
For example, in one Unicode normal form á is always represented as an unaccented letter a and a combining accent mark, in another it must be a single pre-combined Unicode character. How would I convert between these forms in PHP?
...
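The two forms in question are NFD (decomposed) and NFC (composed). A short Python sketch with the standard `unicodedata` module shows the round trip; PHP's intl extension provides a `Normalizer` class for the same job, though the code below does not use it.

```python
import unicodedata

composed = "\u00e1"  # a with acute accent as one pre-combined character

# NFD: base letter followed by a combining acute accent.
decomposed = unicodedata.normalize("NFD", composed)

# NFC: back to the single pre-combined character.
recomposed = unicodedata.normalize("NFC", decomposed)

print(len(composed), len(decomposed), len(recomposed))
```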
I just came across something like this:
String sample = "somejunk+%3cfoobar%3e+morestuff";
Printed out, sample looks like this:
somejunk+<foobar>+morestuff
How does that work? U+003c and U+003e are the Unicode codes for the less than and greater than signs, respectively, which seems like more than a coincidence, but I've never ...
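What is described here is URL percent-encoding: `%3c` is the byte 0x3C written in hex, and 0x3C is `<` in ASCII (and therefore also U+003C). A Python sketch with the standard `urllib.parse` reproduces the output shown; the second call shows the form-encoding variant that also turns `+` into a space.

```python
from urllib.parse import unquote, unquote_plus

sample = "somejunk+%3cfoobar%3e+morestuff"

decoded = unquote(sample)            # only %XX sequences are decoded
form_decoded = unquote_plus(sample)  # additionally maps '+' to a space

print(decoded)
print(form_decoded)
```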
I have Delphi 2007 code that looks like this:
procedure WriteString(Stream: TFileStream; var SourceBuffer: PChar; s: string);
begin
StrPCopy(SourceBuffer,s);
Stream.Write(SourceBuffer[0], StrLen(SourceBuffer));
end;
I call it like this:
var
SourceBuffer : PChar;
MyFile: TFileStream;
....
SourceBuffer := StrAlloc(1024);
MyFi...
I have a RESTful WCF service which accepts GET verbs with Unicode encoded URLs. Strangely, the Unicode characters show up as little boxes when I get the data on the server.
Is there something I have to tell the service contract to do in order to get Unicode UrlEncoded Gets to translate into nice strings?
Here's my contract:
[Ope...
The JDK's String.trim() method is pretty naive, and only removes ASCII control characters.
Apache Commons' StringUtils.strip() is slightly better, but uses the JDK's Character.isWhitespace(), which doesn't recognize non-breaking space as whitespace.
So what would be the most complete, Unicode-compatible, safe and proper way to trim a s...
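One possible approach is to trim by Unicode general category instead of a fixed character list. The sketch below is in Python, not Java, and makes an assumption about what counts as trimmable: space separators (Zs), line and paragraph separators (Zl, Zp), and control characters (Cc). `unicode_trim` is a hypothetical helper name.

```python
import unicodedata

# Assumed definition of "whitespace" for trimming purposes.
_TRIM_CATEGORIES = {"Zs", "Zl", "Zp", "Cc"}


def unicode_trim(text: str) -> str:
    """Remove leading and trailing characters whose category is trimmable."""
    start, end = 0, len(text)
    while start < end and unicodedata.category(text[start]) in _TRIM_CATEGORIES:
        start += 1
    while end > start and unicodedata.category(text[end - 1]) in _TRIM_CATEGORIES:
        end -= 1
    return text[start:end]


# NBSP, em space, ideographic space and a newline around the payload.
trimmed = unicode_trim("\u00a0\u2003 hello\tworld \u3000\n")
print(repr(trimmed))
```

Format characters such as the zero-width space (category Cf) are deliberately left alone here; whether they should be trimmed depends on the application.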
I'm getting a rather odd error message when attempting to wcout a wstring in vc++ 2008 express:
error C2679: binary '<<' : no operator found which takes a right-hand operand of type 'std::wstring' (or there is no acceptable conversion)
If I understand this correctly it's reporting that wcout does not accept a wstring? I ask someone...
I have a third party font with support for Japanese characters which I need to use for an application. Whenever a character is not supported by this font, the often seen rectangle ("default character") is drawn. Obviously not all Japanese characters are supported, because if I try to draw the translations that our translation office gave...
C#, ASP.NET 3.5
I create a simple URL with an encoded querystring:
string url = "http://localhost/test.aspx?a=" +
Microsoft.JScript.GlobalObject.escape("áíóú");
which becomes nicely: http://localhost/test.aspx?a=%E1%ED%F3%FA (that is good)
When I debug test.aspx I get strange decoding:
string badDecode = Request.QueryString[...
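A likely explanation is an encoding mismatch: `%E1%ED%F3%FA` is the Latin-1 percent-encoding of áíóú, while a server that decodes query strings as UTF-8 sees four invalid byte sequences. The Python sketch below demonstrates the mismatch with `urllib.parse`; that the ASP.NET side decodes as UTF-8 is an assumption about the configuration in the question.

```python
from urllib.parse import quote, unquote

escaped = "%E1%ED%F3%FA"  # the value from the URL in the question

# Decoding the bytes as Latin-1 recovers the original text.
as_latin1 = unquote(escaped, encoding="latin-1")

# Decoding the same bytes as UTF-8 fails; invalid bytes become U+FFFD.
as_utf8 = unquote(escaped, encoding="utf-8", errors="replace")

# What a UTF-8 percent-encoding of the same text looks like.
utf8_escaped = quote("\u00e1\u00ed\u00f3\u00fa")

print(as_latin1, as_utf8, utf8_escaped)
```

If the mismatch is the cause, the fix is to make both sides agree, either by encoding the query string as UTF-8 on the way out or by decoding it as Latin-1 on the way in.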