Hi all,
I've got the following regular expression that works fine on my testing server, but just returns an empty string on my hosted server.
$text = preg_replace('~[^\\pL\d]+~u', $use, $text);
Now I'm pretty sure this comes down to the hosting server version of PCRE not being compiled with Unicode property support enabled. The diffe...
Using the following code, I can download the HTML of a file from the internet:
WebClient wc = new WebClient();
// ....
string downloadedFile = wc.DownloadString("http://www.myurl.com/");
However, sometimes the file contains "interesting" characters, and these come out garbled: é becomes Ã©, ← becomes â† and フシギダネ becomes ãƒ•ã‚·ã‚®ãƒ€ãƒ.
I think it may be something to do...
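The symptom described above is consistent with UTF-8 bytes being decoded with a single-byte Windows code page. Here is a minimal Python sketch that reproduces it. It assumes Windows-1252 is the wrong decoder, which the question does not confirm, and the function names are made up for illustration.

```python
def mojibake(text: str, wrong_encoding: str = "cp1252") -> str:
    """Encode text as UTF-8, then decode those bytes with the wrong encoding."""
    # errors="replace" keeps going when a byte has no mapping in the code page.
    return text.encode("utf-8").decode(wrong_encoding, errors="replace")


def repair(garbled: str, wrong_encoding: str = "cp1252") -> str:
    """Reverse the mistake; this only works if no bytes were lost on the way."""
    return garbled.encode(wrong_encoding).decode("utf-8")


print(mojibake("é"))          # Ã©
print(repair(mojibake("é")))  # é
```

If this is indeed the cause, the usual fix is to decode the downloaded bytes as UTF-8 in the first place rather than repairing the string afterwards.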
I am trying to learn Python and can't figure out how to translate the following Perl script to Python:
#!/usr/bin/perl -w
use open qw(:std :utf8);
while(<>) {
s/\x{00E4}/ae/;
s/\x{00F6}/oe/;
s/\x{00FC}/ue/;
print;
}
The script just changes unicode umlauts to alternative ascii output. (So the complete ...
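One possible Python 3 translation is sketched below. It keeps the Perl behaviour of replacing only the first occurrence of each umlaut per line, because the `s///` calls have no `/g` flag. The names `replace_umlauts` and `convert` are made up for this example, and the demo reads from an in-memory stream rather than from files or stdin.

```python
import io

# The same three substitutions as the Perl script: ä, ö, ü.
UMLAUTS = (("\u00e4", "ae"), ("\u00f6", "oe"), ("\u00fc", "ue"))


def replace_umlauts(line: str) -> str:
    """Replace the first occurrence of each umlaut, like s/// without /g."""
    for umlaut, ascii_form in UMLAUTS:
        line = line.replace(umlaut, ascii_form, 1)
    return line


def convert(src, dst) -> None:
    """Copy text lines from src to dst, transliterating each line."""
    for line in src:
        dst.write(replace_umlauts(line))


# Demo on an in-memory stream instead of real input.
demo_out = io.StringIO()
convert(io.StringIO("Käse\nöffnen\nfür\n"), demo_out)
print(demo_out.getvalue(), end="")
```

In a real script you would probably call `convert(sys.stdin, sys.stdout)` with both streams set to UTF-8, which is roughly what `use open qw(:std :utf8)` arranges in Perl.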
I'm looking for an html/ascii character which is a triangle up and down so that I can use it as a toggle switch.
I found ↑ and ↓, but those have an arrow "stem". I'm looking just for the html arrow "head".
...
Which character encoding (or combinations of encodings) represents the character ö (U+00F6, LATIN SMALL LETTER O WITH DIAERESIS or simply put chr(246) in ISO-8859-1) as the four octets combination chr(195) . chr(63) . chr(194) . chr(164)?
...
I am aware that any Unicode character can be inserted into an HTML document via the following format:
&#0000;
...where 0000 is the character code of the desired character
My question is: which of these characters has the most widespread availability when it comes to the client's browser being able to display the character?
In other...
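For reference, the numeric format mentioned above can be generated and decoded mechanically. This is a small Python sketch; `numeric_reference` is a made-up helper, while `html.unescape` is from the standard library. It says nothing about which characters a given browser font can actually display.

```python
import html


def numeric_reference(char: str, hexadecimal: bool = False) -> str:
    """Return the HTML numeric character reference for a single character."""
    code_point = ord(char)
    return f"&#x{code_point:X};" if hexadecimal else f"&#{code_point};"


print(numeric_reference("ö"))        # &#246;
print(numeric_reference("ö", True))  # &#xF6;
print(html.unescape("&#246;"))       # ö
```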
I have a piece of code that looks like this:
Dir.new(path).each do |entry|
puts entry
end
The problem comes when I have a file named こんにちは世界.txt in the directory that I list.
On a Windows 7 machine I get the output:
???????.txt
From googling around, properly reading this filename on Windows seems to be an impossible task. Any s...
Our publishing workflow includes Windows and Linux machines (there are some Macs too, but not in the critical-path workflow). Many texts include both English and Khmer and are marked-up in XML.
XML Copy Editor is the best cross-platform open-source XML editor I've discovered. It utilizes the Scintilla editing component, which is general...
If you are doing automation on Windows and redirecting the output of different commands (internal to cmd.exe or external), you'll discover that your log files contain combined Unicode and ANSI output (meaning that they are invalid and will not load well in viewers/editors).
Is it possible to make cmd.exe work with UTF-8? This qu...
Is there any solution for Unicode in dompdf?
...
Hi,
Kangxi radicals in the range 2F00-2FDF (see link text) are not displayed correctly on the iPhone device. They appear as a crossed-out box. In the simulator they display correctly.
I tried the system font and also the
[UIFont fontWithName:@"STHeitiTC-Medium" size:24];
... Is the unicode codepoint coverage limited on the iphone...
Given a java.lang.String instance, I want to verify that it doesn't contain any unicode characters that are not ASCII alphanumerics. e.g. The string should be limited to [A-Za-z0-9.]. What I'm doing now is something very inefficient:
import org.apache.commons.lang.CharUtils;
String s = ...;
char[] ch = s.toCharArray();
for( int i=0; i<...
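The check itself can be expressed as a single whole-string pattern match instead of a per-character loop. Below is a sketch in Python using the character class from the question, which also allows the dot; `is_allowed` is a made-up name. Java's `String.matches` also matches against the whole string, so the same pattern should carry over, but that is not tested here.

```python
import re

# Character class taken from the question; "*" means the empty string passes.
ALLOWED = re.compile(r"[A-Za-z0-9.]*")


def is_allowed(s: str) -> bool:
    """True if every character of s is an ASCII letter, digit, or dot."""
    return ALLOWED.fullmatch(s) is not None


print(is_allowed("abc.123"))  # True
print(is_allowed("naïve"))    # False
```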
For one of my open-source projects, I need to compute the decimal equivalent of a given Unicode character.
For example, if the Tamil character L'அ' is given, the output should be 2949.
I am using C++ in a Qt environment. I googled and could not find a solution for this. Please help if you know one.
...
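The "decimal equivalent" here is simply the character's code point written in base 10. This quick Python sketch checks the example value. It is not Qt code, and `code_point` is a made-up helper.

```python
def code_point(char: str) -> int:
    """Return the Unicode code point of a single character as an integer."""
    if len(char) != 1:
        raise ValueError("expected exactly one character")
    return ord(char)


print(code_point("அ"))       # 2949
print(hex(code_point("அ")))  # 0xb85
```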
I'm trying to take a string from a GET or POST parameter in JSP with some accents in UTF-8:
<%@ page contentType="text/html; charset=UTF-8" pageEncoding="UTF-8" %>
<%
request.setCharacterEncoding("UTF-8");
String value = request.getParameter("q");
out.print(value+" | aáa");
%>
The encoding of the hardcoded string is co...
I've got a MySQL database table with an ISO-8859-1 encoded text field containing user names. When I export that to a text file using PHP I get a normal text file saved on the client computer. When I open it in Word or Excel on a Windows system, it looks good. When I open it on Mac using Word or Excel, the high-ascii characters are wro...
If I try to paste a unicode character such as the middle dot:
·
in my Python interpreter, it does nothing. I'm using Terminal.app on Mac OS X, and when I'm simply in bash I have no trouble:
:~$ ·
But in the interpreter:
:~$ python
Python 2.6.1 (r261:67515, Feb 11 2010, 00:51:29)
[GCC 4.2.1 (Apple Inc. build 5646)] on darwin
Type...
I have a large mysql table that I think might be using the wrong character set. If so I'll need to change it using
ALTER TABLE mytable CONVERT TO CHARACTER SET utf8
But since this is a very large table, I'd rather not run this command unless I have to. So my question is, how can I ask mysql what the character set is on a particular t...
Scenario:
Flex application utilizing an @font-face declaration for embedding the font. (Embedded fonts are required to be able to rotate text.)
The application was originally developed as an English application, but during localization it became necessary to locate a unicode font capable of displaying Asian characters. The original im...
#coding: utf-8
str2 = "asdfМикимаус"
p str2.encoding #<Encoding:UTF-8>
p str2.scan /\p{Cyrillic}/ #finds all Cyrillic characters
str2.gsub!(/\w/u,'') #removes only latin characters
puts str2
The question is: why does \w ignore Cyrillic characters?
I have installed the latest Ruby package from http://rubyinstaller.org/.
Here is my output of r...
I can't find a wiki page or anything :(. It's an encoding like Unicode, right? So it has its own mapping of code points to characters?
...