Basically, I have a database from which I get $lastname, $firstname, $rid, $since, $times and $ip.
Using a Perl script, I format the data to send it via e-mail. Since the $lastname and $firstname can contain special chars (for instance ä, ü, ß, é,...) I first decode the strings.
my $fullname = decode("utf8", $lastname) . ', ' . decode("...
Is there a Python convention for when you should implement __str__() versus __unicode__()? I've seen classes override __unicode__() more frequently than __str__(), but it doesn't appear to be consistent. Are there specific rules for when it is better to implement one versus the other? Is it necessary or good practice to implement both?
...
Emacs 23 is running on a remote Linux box. It displays its frame on this local Windows box, using Cygwin's X server. I used to be able to copy-paste any text from Emacs to any Windows application. Since I upgraded from release 22 to 23, combining diacritics no longer come through.
Non-combined characters pass unharmed. Fo...
I got a .vcf file with parts encoded as UTF-8:
CATEGORIES;CHARSET=UTF-8:Straße & –dienste
Now "–" should be a "-" and "Straße" should convert to "Straße".
I tried
utf8_decode()
iconv()
mb_convert_encoding()
And have been playing with several output encoding options like
header('content-type: text/html; charset=utf-8');
mb...
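For what it's worth, garbling of this shape ("Straße" for "Straße") is what typically results when UTF-8 bytes are read as Windows-1252. Below is a minimal Python sketch of reversing that, assuming this is really what happened to the file. The helper name `fix_mojibake` is made up for illustration, and Python stands in here for the PHP functions listed in the question.

```python
def fix_mojibake(garbled, wrong_encoding="cp1252"):
    """Undo one round of 'UTF-8 bytes decoded as wrong_encoding'.

    Assumption: the text was garbled exactly once, in exactly this way.
    """
    # Re-encode to recover the original bytes, then decode them as UTF-8.
    return garbled.encode(wrong_encoding).decode("utf-8")


# "Stra\u00c3\u0178e" is the garbled spelling of "Straße" from the question.
fixed_word = fix_mojibake("Stra\u00c3\u0178e")

# "\u00e2\u20ac\u201c" is the garbled three-character sequence before "dienste".
fixed_dash = fix_mojibake("\u00e2\u20ac\u201c")
```

If the input contains characters that Windows-1252 cannot encode, `encode` raises `UnicodeEncodeError`, which is a useful hint that the text was damaged some other way.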
I have an Expando model kind in my App Engine datastore and I'm setting many arbitrary property names. I didn't consider that I couldn't store Unicode property names, and now I'm in a troubling situation where any attempt to fetch entities of this kind, or even to delete them to get rid of the offender, produces the following error:
Traceback ...
We have a Japanese client that has source code in COBOL on a mainframe. He claims the code on the mainframe is represented in Shift-JIS2 (and we think we understand that pretty well). When that code is transferred to a PC, what is the most common encoding used?
We've sent him a program to process that COBOL code and it seems to cho...
I have been looking around on Google and Stack Overflow but haven't found what I needed, even though my question seems quite simple. Anyhow:
What is the way to convert a string of RTF special characters such as "\'d3\'d6" (In this case Russian) to unicode chars or string using C#?
...
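To make the question concrete: each `\'hh` escape in RTF is one byte in the document's code page, so the conversion amounts to collecting those bytes and decoding them. Here is a rough Python sketch (not C#, and not a full RTF parser). The function name `decode_rtf_hex_escapes` is hypothetical, and Windows-1251 is assumed as the code page because the example is Russian; a real document declares its code page in its header.

```python
import re

# One \'hh escape, and a run of one or more of them.
_HEX_ESCAPE = re.compile(r"\\'([0-9a-fA-F]{2})")
_HEX_RUN = re.compile(r"(?:\\'[0-9a-fA-F]{2})+")


def decode_rtf_hex_escapes(fragment, codepage="cp1251"):
    """Replace runs of \\'hh escapes with the text they encode.

    Assumption: every escape in the fragment uses the same code page.
    """
    def replace_run(match):
        # Gather the whole run first so multi-byte code pages decode correctly.
        raw = bytes(int(pair, 16) for pair in _HEX_ESCAPE.findall(match.group(0)))
        return raw.decode(codepage)

    return _HEX_RUN.sub(replace_run, fragment)


# The escapes from the question decode to two Cyrillic capital letters.
decoded = decode_rtf_hex_escapes(r"\'d3\'d6")
```

Plain text between escapes is left untouched, so the same call works on a fragment that mixes ASCII and escaped characters.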
Hi all,
I'm coding up a new (personal hobby) app for Windows in C++.
In previous low-level Windows stuff I've used _TCHAR (or just TCHAR) arrays/basic_strings for string manipulation.
Is there any advantage to using _TCHAR over straight up Unicode with wchar_t, if I don't care about Windows platforms pre Win2k?
edit: after submitti...
I would like to make sure that everything I know about UTF-8 is correct. I have been trying to use UTF-8 for a while now but I keep stumbling across more and more bugs and other weird things that make it seem almost impossible to have a 100% UTF-8 site. There is always a gotcha somewhere that I seem to miss. Perhaps someone here can corr...
We are migrating our C++ COM application to Unicode, and as part of this migration we want to migrate the constant strings in our IDL to Unicode as well.
The problem is that at the moment, we still compile it both in ANSI and in UNICODE, which means that we can't use the L"String" construct to declare wide chars.
At the moment, our...
I have a VC++ project in Visual Studio 2008.
It is defining the symbols for unicode on the compiler command line (/D "_UNICODE" /D "UNICODE"), even though I do not have this symbol turned on in the preprocessor section for the project.
As a result I am compiling against the Unicode versions of all the Win32 library functions, as o...
I have a Unicode (UTF-8 without BOM) text file within a jar, that's loaded as a resource.
URL resource = MyClass.class.getResource("datafile.csv");
InputStream stream = resource.openStream();
BufferedReader reader = new BufferedReader(
new InputStreamReader(stream, Charset.forName("UTF-8")));
This works fine on Windows, but on Lin...
I have a large MFC application that I am extending to allow for multi-lingual input. At the moment I need to allow the user to enter Unicode data in edit boxes on a single dialog.
Is there a way to do this without turning UNICODE or MBCS on for the entire application? I only need a small part of the application converted at the moment...
Unfortunately, the Unicode 0.1 gem (sudo gem install unicode) doesn't work on Ruby 1.9. I have the following snippet:
require "rubygems"
require "unicode"
str = "áéíóúç"
Unicode.normalize_KD(str).gsub(/[^\x00-\x7F]/n, "")
#=> aeiouc
I use it to convert titles to permalinks, without removing accented characters.
Is there a way of converti...
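For comparison, the same decompose-then-strip idea can be written with only a standard library. This is a Python sketch rather than Ruby 1.9 code, and the function name `strip_accents` is made up for illustration. It assumes that dropping every non-ASCII character left after NFKD decomposition is acceptable for permalinks.

```python
import unicodedata


def strip_accents(text):
    """Decompose accented letters (NFKD) and keep only the ASCII part."""
    decomposed = unicodedata.normalize("NFKD", text)
    # Combining marks and any other non-ASCII characters are dropped here.
    return decomposed.encode("ascii", "ignore").decode("ascii")


# Same input as the Ruby snippet, written with escapes to stay encoding-safe.
slug_part = strip_accents("\u00e1\u00e9\u00ed\u00f3\u00fa\u00e7")
```

Characters with no ASCII decomposition (for example "ß" or "ø") disappear completely with this approach, so a transliteration table may still be needed for some languages.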
I have the following code:
import string
def translate_non_alphanumerics(to_translate, translate_to='_'):
not_letters_or_digits = u'!"#%\'()*+,-./:;<=>?@[\]^_`{|}~'
translate_table = string.maketrans(not_letters_or_digits,
translate_to
*len(not_lette...
The perldoc page for length() tells me that I should use bytes::length(EXPR) to find the length of a Unicode string in bytes, and the bytes page echoes this.
use bytes;
$ascii = 'Lorem ipsum dolor sit amet';
$unicode = 'Lørëm ípsüm dölör sît åmét';
print "ASCII: " . length($ascii) . "\n";
print "ASCII bytes: " . bytes::length($ascii) . "\n";
prin...
I am trying to sync up a SQL Server table with a Lotus Notes database. I have set up the NotesSQL ODBC driver and have been able to insert, update and select from the Notes database form using the ActiveX Script Task in DTS. Everything works well until I try to insert Chinese characters into a Text field in the Notes database. After insert...
Hi, I have a problem in Python. I'll try to explain it with an example.
I have this string:
>>> string = 'ÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿÀÁÂÃ'
>>> print string
ÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿÀÁÂÃ
and I want, for example, to replace every character other than Ñ, Ã and ï with "".
I have tried:
>>> rePat =...
Searching a file written in Hindi (Devanagari, UTF-16) gave rise to the following problem.
The file contains:
त्रास ततत
जुग नींद ना हा बु
Note that the first character 'त्र' is made up of multiple code points: त + ् + र
Now while searching for 'त' I get 4 matches including the त of the first char. I am using Java.
How can I go abo...
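One way to read the requirement is "match त only when it is not immediately followed by a combining mark such as the virama". Under that assumption, a small Python sketch (Python rather than the Java used in the question; the function name `standalone_matches` is hypothetical) could look like this. It is a simplification of full grapheme cluster segmentation, which would also have to look at what comes after the virama.

```python
import unicodedata


def standalone_matches(text, target):
    """Return start indexes where target is not followed by a combining mark."""
    positions = []
    start = text.find(target)
    while start != -1:
        end = start + len(target)
        # Unicode general categories Mn, Mc and Me all start with "M".
        followed_by_mark = (
            end < len(text) and unicodedata.category(text[end]).startswith("M")
        )
        if not followed_by_mark:
            positions.append(start)
        start = text.find(target, start + 1)
    return positions


# First line of the sample file, spelled with escapes:
# TA + VIRAMA + RA + AA + SA, a space, then TA three times.
line = "\u0924\u094d\u0930\u093e\u0938 \u0924\u0924\u0924"
naive_count = line.count("\u0924")
filtered = standalone_matches(line, "\u0924")
```

On this line the naive count is 4, while the filtered search keeps only the three standalone occurrences at the end.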
Hello,
I have an NSString that is then used to set a UILabel. It contains Unicode such as...
E = MC Hammer\U00ac\U2264
and complete ones such as
\U2013\U00ee\U2013\U00e6\U2013\U2202\U2013\U220f\U2013\U03c0 \U2013\U00ee\U2013\U220f\U2013\U03c0\U2013\U00aa\U2013\U221e\U2014\U00c5
These are not displaying correctly, is there anythi...