accents

What do these Unicode characters (codepoints) mean in this regex?

I have the following regular expression: ValidationExpression="^[\u0020\u0027\u002C\u002D\u0030-\u0039\u0041-\u005A\u005F\u0061-\u007A\u00C0-\u00FF°./]{1,256}$" I figured out most of it, as follows: \u0020: SPACE; \u0027: APOSTROPHE; \u002C: COMMA; \u002D: HYPHEN / MINUS; \u0030-\u0039: 0-9; \u0041-\u005A: A-Z; \u005F: UN...
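
As a rough way to check what the character class accepts, the pattern from the excerpt can be tried in Python. This is only a sketch: the original is a .NET-style ValidationExpression, the helper name `is_valid` is made up for illustration, and the sample strings are assumptions. The range \u00C0-\u00FF covers the Latin-1 accented letters (À through ÿ), and it also includes the multiplication sign (U+00D7) and division sign (U+00F7).

```python
import re

# Pattern copied from the question. A raw string is used so that the re
# module interprets the \uXXXX escapes itself; in a normal string literal
# \u002D would become a bare "-" and silently create an unintended range.
PATTERN = re.compile(
    r"^[\u0020\u0027\u002C\u002D\u0030-\u0039\u0041-\u005A"
    r"\u005F\u0061-\u007A\u00C0-\u00FF°./]{1,256}$"
)


def is_valid(value: str) -> bool:
    """Hypothetical helper: True if value is 1-256 characters from the class."""
    return PATTERN.match(value) is not None


print(is_valid("Caf\u00e9 12/3"))  # accented letter, space, digits, slash
print(is_valid("a+b"))             # "+" is not in the class
```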

UTF-8 URI explodes Apache & mod_rewrite

I have Apache with mod_rewrite, and whenever I enter a URI with an accented character in it, Apache gives me a "Page Not Found" error. The URI is: /places/tags/Café My page encoding is UTF-8. My database connection & tables are UTF-8. My Apache DefaultCharacterSet = UTF-8. Yes, Apache has language packs, but I believe they're there for...

How can I ignore accents when comparing strings in Perl?

I have a quiz application where I match what people type against the right answer. For now, what I do is basically this: if ($input =~ /$answer/i) { print "you won"; } This is nice because if the answer is "fish", the user can type "a fish" and it counts as a correct answer. The problem I'm facing is that my users, like me, are French, an...

What is the best way to remove accents in a python unicode string?

I have a Unicode string in Python, and I would like to remove all the accents (diacritics). I found on the Web an elegant way to do this in Java: convert the Unicode string to its long normalized form (with a separate character for letters and diacritics), then remove all the characters whose Unicode type is "diacritic". Do I need to inst...
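
The two steps described in the excerpt (decompose, then drop the diacritics) can be sketched with Python's standard `unicodedata` module. This is a minimal sketch rather than a complete answer: the function name `strip_accents` is made up here, "long normalized form" is assumed to mean NFD, and "diacritic" is approximated by the Unicode category `Mn` (nonspacing mark). Letters with no decomposition, such as æ or ø, are left unchanged.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Remove combining marks after canonical decomposition (NFD)."""
    # Step 1: split each accented letter into base letter + combining mark(s).
    decomposed = unicodedata.normalize("NFD", text)
    # Step 2: drop every nonspacing mark (category "Mn"), keep the rest.
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


print(strip_accents("\u00e9cole"))  # prints "ecole"
```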

SQL Query with accents from Foreign Languages

Hello, I have a simple column filled with words, many from foreign languages. I need to query based on the "English" letters, i.e. E, e, é, è, etc. should all be returned for a query of "E", so école, which exists in the database, should be returned when I query for "E". I can't really find a way to Google this, so help would be gre...

Replacing accents w/ their counterparts in AS3

How would I go about changing éëíïñÑ (etc) to their counterparts, i.e. eeiinN? I was thinking about doing regex matching against é -> &eacute; and replacing both & and acute/grave; with empty strings, but I can't seem to find an AS3 function that encodes accents to their non-numerical entities (&ecirc; and the like). I've already tried...

Unique constraint on table column

Hi, I have a table (an existing table with data in it) with a column UserName, and I want UserName to be unique. So I add a constraint like this: ALTER TABLE Users ADD CONSTRAINT [IX_UniqueUserUserName] UNIQUE NONCLUSTERED ([UserName]) Now I keep getting an error saying that duplicate users exist in this table. But I ha...

How can I do an accent-insensitive search in Postgres 8.3.x with a DB in UTF-8?

Tried select to_ascii('capo','LATIN1'), to_ascii('çapo','LATIN1') and the results are different.... ...

Accents in file name using Java on Solaris

I have a problem where I can't write files with accents in the file name on Solaris. Given the following code: public static void main(String[] args) { System.out.println("Charset = "+ Charset.defaultCharset().toString()); System.out.println("testéörtkuoë"); FileWriter fw = null; try { fw = new FileWriter("testéörtk...

Get CSV Data from Clipboard (pasted from Excel) that contains accented characters

SCENARIO: My users will copy cells from Excel (thus placing them into the clipboard), and my application will retrieve those cells from the clipboard. THE PROBLEM: My code retrieves the CSV format from the clipboard. However, if the original Excel content contains characters like ä (a with umlaut), then the retrieved CSV string doesn't hav...

javascript : remove accents in strings

How do I remove accented characters from a string? Especially in IE6, I had something like this: accentsTidy = function(s){ var r=s.toLowerCase(); r = r.replace(new RegExp(/\s/g),""); r = r.replace(new RegExp(/[àáâãäå]/g),"a"); r = r.replace(new RegExp(/æ/g),"ae"); ...

YUI Compression? GUI? And compatible with European letters?

Hi there, I am trying to find some sort of GUI or batch utility where I can YUI-compress a JS file I have. I have a utility that consolidates all my JS into a single JS file, and it works great, but I need to compress this file. I was using something similar before to compress, but it failed on European characters, i.e. character with ...

Converting Symbols, Accent Letters to English Alphabet.

Dear friends, The problem is that, as you know, there are thousands of characters in the Unicode chart and I want to convert all the similar characters to the letters which are in English alphabet. For instance here are a few conversions: ҥ->H Ѷ->V Ȳ->Y Ǭ->O Ƈ->C tђє Ŧค๓เℓy --> the Family ... and I saw that there are more than 20 v...

Accent Insensitive ordering in Sphinx

I am using Sphinx with the Thinking Sphinx plugin to search my data. I am using MySQL. My data contains accented chars ("á", "é", "ã") and I want them to be equivalent to their non-accented counterparts ("a", "e", "a", for example) when searching and ordering. I got the search working using a charset table (pastie.org/204316), and a se...

Weird problem generating an XLS file and MAC

Hi, we have a strange problem with a web application that generates some reports in CSV format. The report contains words with non-English chars (á, ñ, etc). The strange thing is that when the report has 200 rows or more, everything works OK on a Mac, but when we generate a filtered report with, for example, 20 rows, all the non-English chars are...

mediawiki API & encoding

I'm using the MediaWiki API to update some pages with an experimental robot. This robot uses the Java Apache http-client library to update the pages. (...) PostMethod postMethod = new PostMethod("http://mymediawikiinstallation/w/api.php"); postMethod.addParameter("action","edit"); postMethod.addParameter("title",page.replace(' ', '_'));...

Tomcat : French accents in a solaris directory

One of our clients bought an advertisement in a newspaper and gave his URL as http://www.website.com/publicité instead of "publicite" (without the accent)... I'm trying to make the corresponding directory under Solaris and it doesn't seem to work. I grabbed the "get" request and it looks like the "real" request is /publicit%C3%A9 We tried...
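
The %C3%A9 in the captured request is the percent-encoded UTF-8 form of é (bytes C3 A9). A small Python sketch can reproduce the mapping in both directions. It only illustrates the encoding and is not a fix for the Tomcat/Solaris setup; the Latin-1 line is included purely for contrast and is an assumption about what a non-UTF-8 client might send.

```python
from urllib.parse import quote, unquote

path = "/publicit\u00e9"  # "/publicité", written with an escape for clarity

# quote() encodes non-ASCII characters as UTF-8 by default and keeps "/".
print(quote(path))                       # prints "/publicit%C3%A9"

# For contrast: the same path percent-encoded as Latin-1 is a single byte.
print(quote(path, encoding="latin-1"))   # prints "/publicit%E9"

# unquote() reverses the UTF-8 form back to the accented path.
print(unquote("/publicit%C3%A9") == path)  # prints "True"
```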

Actuate Report issue with accent characters

Hi, I have a report which is in French. When I run the Actuate report from my workstation (erdpro designer), the data values with accented characters appear just fine when viewed in the roi. When the report is run from the iServer, labels appear fine, as I have a resource file defined to display the labels. Howe...

sIFR 3 cuts off uppercase characters - workaround?

Hello everybody, I have been looking for help for hours now and I cannot solve my problem. I have implemented sIFR3 in a website for headlines, but uppercase characters such as "S" and "C", or special characters like "Ä" or "Ü", are cut off. I found a workaround tutorial here: http://blog.unity.fr/articles/sifr-accented-c...

Why does string.Compare seem to handle accented characters inconsistently?

If I execute the following statement: string.Compare("mun", "mün", true, CultureInfo.InvariantCulture) The result is '-1', indicating that 'mun' has a lower numeric value than 'mün'. However, if I execute this statement: string.Compare("Muntelier, Schweiz", "München, Deutschland", true, CultureInfo.InvariantCulture) I get '1', ind...