python-re: How do I match an alpha character?
How can I match an alpha character with a regular expression? I want a character that is in \w but is not in \d. It needs to be Unicode compatible, which is why I cannot use [a-zA-Z]. ...
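One idiom often suggested for "in \w but not in \d" is the negated class `[^\W\d_]`. The sketch below assumes Python 3, where `str` patterns are Unicode-aware by default. It also assumes the underscore should be excluded, which the question does not state. The names `ALPHA` and `alpha_chars` are made up for illustration.

```python
import re

# [^\W\d_] reads as "not a non-word character, not a digit, not an underscore",
# i.e. whatever \w matches, minus \d and minus "_".
# Dropping "_" is an assumption; use [^\W\d] to keep it.
ALPHA = re.compile(r"[^\W\d_]")


def alpha_chars(text):
    """Return every character of text that ALPHA matches, in order."""
    return ALPHA.findall(text)


if __name__ == "__main__":
    # Latin, accented Latin, Arabic and CJK letters mixed with digits.
    print(alpha_chars("abc_123 \u00e9 \u0639 \u4e2d"))
```

Because the class is built by subtraction from \w, it needs no hard-coded alphabet ranges.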
When I paste in some upper Unicode, or even ANSI text like العربية, I get gibberish in MonoDevelop. I am using the MonoTouch framework. Any idea how to get it to allow me to paste in Arabic, Chinese, etc.? ian ...
I'm using the following regex basically to search for and delete these characters: invalid_unicode = re.compile(ur'(Û|²|°|±|É|¹|Í)') My source code is ASCII encoded, and whenever I try to run the script it spits out: SyntaxError: Non-ASCII character '\xdb' in file ./release.py on line 273, but no encoding declared; see http://www.pyt...
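One way to keep the source file pure ASCII is to write the characters as \u escapes. The sketch below assumes Python 3 (the question's `ur''` prefix is Python 2 syntax) and swaps the alternation for an equivalent character class. `strip_invalid` is a hypothetical helper name. In Python 2 the other usual route is a PEP 263 encoding declaration such as `# -*- coding: utf-8 -*-` at the top of the file.

```python
import re

# Code points for the seven characters in the question, written as escapes
# so the file itself contains only ASCII:
#   U+00DB, U+00B2, U+00B0, U+00B1, U+00C9, U+00B9, U+00CD
INVALID_UNICODE = re.compile("[\u00db\u00b2\u00b0\u00b1\u00c9\u00b9\u00cd]")


def strip_invalid(text):
    """Delete every occurrence of the listed characters from text."""
    return INVALID_UNICODE.sub("", text)


if __name__ == "__main__":
    print(strip_invalid("25\u00b0C \u00b1 1"))
```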
Hi, I'm a beginner in programming and network development. I have a question regarding ASCII and Unicode encoding. MSDN and other web examples do the following: byte[] byteData = Encoding.ASCII.GetBytes(data); Is this because these code samples are old? Shouldn't it be: byte[] byteData = Encoding.Unicode.GetBytes(data); thanks fo...
ParamText() is a really old way of replacing parameters in a string that is based on Pascal strings. Also, StandardAlert is not quite Unicode ready. The new (not so new) message box replacement is CFUserNotificationDisplayNotice, but this one expects a CFString, and I found out that if I'm about to switch to using CFString I'm not able to u...
Hi, I am trying to convert this into readable UTF-8 text in PHP: Tel Aviv-Yafo (Hebrew: \u05ea\u05b5\u05bc\u05dc\u05be\u05d0\u05b8\u05d1\u05b4\u05d9\u05d1-\u05d9\u05b8\u05e4\u05d5\u05b9; Arabic: \u062a\u0644 \u0623\u0628\u064a\u0628\u200e, Tall \u02bcAb\u012bb), usually called Tel Aviv. Any ideas on how to do so? Tried several methods...
Hi, I'm developing a Word add-in in Delphi 7, but soon I'll upgrade it to Delphi 2010. As you know, since version 2009 Delphi has the new string type UnicodeString, which is what the keyword string now refers to. On the other hand, according to this thread we need to use WideString to communicate with COM. My question is, what should I do in...
Hi, we have an application written in Java which reads some text generated by a VB6 application. The problem is that this VB6 application generates its output using some special characters, like ç, ã, á, and we don't know which charset they are in. So the question is: is there a default charset used by VB6? Which is it? ...
Hello guys, I use UIWebView to load Arabic HTML encoded as UTF-8, but the rendering is deadly slow, and so is the scrolling. By contrast, with English HTML everything performs more reasonably. Any advice on how to render Unicode HTML in UIWebView? Appreciate your help! Thanks. ...
I'm just getting started on some programming to handle filenames with non-English names on a WinXP system. I've done some recommended reading on Unicode and I think I get the basic idea, but some parts are still not very clear to me. Specifically, what encoding (UTF-8, UTF-16LE/BE) are the file names (not the content, but the actual nam...
I have a table that has some Unicode in it. I know that the Unicode data is fine as it comes out as JSON on our webserver just fine. But for some reason the CSV that I'm generating is ending up mangled. Here's our current code: my $csv = Text::CSV->new ({ eol => "\015\012" }); open my $fh, '>:encoding(utf8)', 'Foo.csv'; my $sth...
I see that Visual Studio 2008 and later now start off a new solution with the Character Set set to Unicode. My old C++ code deals with only English ASCII text and is full of: literal strings like "Hello World"; the char type; char * pointers to allocated C strings; the STL string type; conversions from STL string to C string and vice versa using S...
Strings are usually enumerated by character. But, particularly when working with Unicode and non-English languages, sometimes I need to enumerate a string by grapheme. That is, combining marks and diacritics should be kept with the base character they modify. What is the best way to do this in .NET? Use case: Count the distinct phonetic ...
I need a regular expression that matches UTF-8 letters and digits and the dash sign (-), but doesn't match underscores (_). I tried these silly attempts without success: ([\w-^_])+ ([\w^_]-?)+ (\w[^_]-?)+ The \w is shorthand for [A-Za-z0-9_], but it also matches UTF-8 chars if I have the u modifier set. Can anyone help me out with this ...
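The attempts above fail because `^` only negates when it is the first character inside the brackets; anywhere else it is a literal caret. A minimal sketch of one way around that, written with Python 3's `re` module rather than the PCRE flavour the question uses: subtract the underscore from \w with a negated class, then allow the dash by alternation. `TOKEN` and `is_valid` are hypothetical names.

```python
import re

# [^\W_] is "\w without the underscore"; the alternation then adds "-".
TOKEN = re.compile(r"(?:[^\W_]|-)+")


def is_valid(text):
    """True if text consists only of letters, digits and dashes."""
    return TOKEN.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ("h\u00e9llo-w\u00f6rld-42", "foo_bar"):
        print(sample, is_valid(sample))
```

In PCRE with the u modifier, a class built from Unicode properties such as `[\p{L}\p{N}-]+` is the other usual approach.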
I feel lost with the regex Unicode properties presented by RegexBuddy. I cannot distinguish between any of the Number properties, and the Math symbol property only seems to match + but not -, *, /, ^ for instance. Is there any documentation / reference with examples on regular expression Unicode properties? ...
Is there a way of searching for Unicode characters inside a text file under Windows XP? For example, suppose I wish to find text documents with the euro symbol. Although the standard XP search allows me to search for the euro symbol, it does not produce any matches when I know there should be at least a few. Wingrep has the same issue. I...
I am slicing a Unicode string with diacritics using the mb_substr function, but it behaves as if I were using the plain substr function. It splits Unicode characters in half, displaying a question-mark diamond. E.g. echo mb_substr('ááááá', 0, 5); //Displays áá� What might be wrong? ...
I'm working on a FOSS project at http://unicode.codeplex.com. In this project we try to collect some information about standard keyboard layouts. What we want to know is whether there is a place or document or ... that states the standard keyboard layout for a given language. I mean, if you are a German or American or Arab or ... , what's t...
In my Django application, a user has uploaded a file with a Unicode character in the name. When I'm downloading files, I'm calling: os.path.exists(media) to test that the file is there. This, in turn, seems to call st = os.stat(path), which then blows up with the error: UnicodeEncodeError: 'ascii' codec can't encode character u'...
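The excerpt does not say why the ASCII codec is involved. One common cause is a process running under a C/POSIX locale, so the path gets implicitly encoded as ASCII. Under that assumption, a possible workaround is to encode the path to bytes yourself before it reaches the os functions. The sketch below is Python 3 and assumes filenames are stored on disk as UTF-8; `media_exists` is a hypothetical helper, not part of Django.

```python
import os


def media_exists(path, encoding="utf-8"):
    """Check for a file after explicitly encoding a text path to bytes.

    The UTF-8 default is an assumption about how names are stored on disk.
    """
    if isinstance(path, str):
        # Encode up front so no implicit ASCII conversion can happen later.
        path = path.encode(encoding)
    return os.path.exists(path)


if __name__ == "__main__":
    print(media_exists("/tmp/r\u00e9sum\u00e9.txt"))
```

Fixing the locale of the server process (for example, setting a UTF-8 LANG) addresses the same problem at its source, if that is indeed the cause.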
Hi experts, I am developing a Flex-based window application in which I have used a textArea. Now when I type some key combinations like ctrl+b, ctrl+e or ctrl+q, it shows some square characters in the text area. I think these are some Unicode characters, but why are they being entered? Unlike in simple textArea control on adobe example when I pr...