views:

301

answers:

5

Hello,

I want to accept only letters as input: letters from the English alphabet, but also from other alphabets, such as German (öäü), French (áê), or Chinese (...).

How can I configure my regex so that it accepts all alphabetic characters from international alphabets?

+1  A: 

With PCRE it would be \w, a "word" character. It also accepts Unicode when configured properly.
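
For illustration, here is a minimal sketch of the same idea in Python rather than PCRE (Python is my choice here, not the asker's stated language). In Python 3, \w on str patterns is Unicode-aware by default, but it also accepts digits and the underscore, so a common idiom is to subtract those with a negated class. The helper names are made up for this example, and the idiom is only close to "letters only": a few numeric characters that \d does not cover may still pass.

```python
import re

# \w is Unicode-aware for str patterns in Python 3, but it also
# matches digits and the underscore.
WORD_CHARS = re.compile(r"\w+")

# "Not a non-word char, not a digit, not an underscore":
# an approximation of "letters only" built from \w.
LETTERS_ONLY = re.compile(r"[^\W\d_]+")


def is_word_chars(text):
    """True if the whole string consists of \\w characters."""
    return WORD_CHARS.fullmatch(text) is not None


def is_letters_only(text):
    """True if the whole string passes the [^\\W\\d_] idiom."""
    return LETTERS_ONLY.fullmatch(text) is not None
```

With this, is_letters_only("Müller") and is_letters_only("中文") are true, while "Müller_42" passes only the plain \w check.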

WoLpH
`\w` is not a boundary but the character class of word characters.
Gumbo
... and `\b` is the word boundary.
KennyTM
Indeed, I have modified my original answer. My explanation was incorrect.
WoLpH
A: 

It varies. Some languages have a "Unicode" flag which extends \d, \w, etc. Some support equivalence classes in a bracket expression, e.g. [[=e=]] matches e, é, ê, etc. The regex documentation for your language or library will explain what options are available.
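
As one concrete case of such a flag, here is a small sketch using Python's re module (again, Python is only an assumed example language). There the Unicode behaviour is the default for str patterns and the re.ASCII flag switches it off; other flavors work the other way round. The function name and its ascii_only parameter are hypothetical.

```python
import re


def matches_word(text, ascii_only=False):
    """Check whether the whole string consists of \\w characters.

    With ascii_only=True the re.ASCII flag restricts \\w to
    [a-zA-Z0-9_]; otherwise \\w also covers non-ASCII letters.
    """
    flags = re.ASCII if ascii_only else 0
    return re.fullmatch(r"\w+", text, flags) is not None
```

So matches_word("Größe") succeeds, while matches_word("Größe", ascii_only=True) does not.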

Ignacio Vazquez-Abrams
+1  A: 

These may be good places to start:

Unicode:
http://www.regular-expressions.info/unicode.html

Regex language flavors:
http://www.regular-expressions.info/refflavors.html

leson
+2  A: 

Since you specifically ask for Unicode, \p{L} is the shorthand for a Unicode letter. Not all regex flavors support this syntax, though: .NET, Perl, Java, and the JGSoft regex engine do, but Python's built-in re module, for example, does not.

So, for example, \b\p{L}+\b will match an entire word of Unicode letters.
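
Where \p{L} is not available, the same test can be approximated without a regex. The following is a rough Python sketch that checks each character's Unicode general category via the standard unicodedata module; the function name is invented for this example, and the NFC normalization step is my own assumption about how decomposed accents should be handled.

```python
import unicodedata


def is_unicode_letters(text):
    """True if text is non-empty and every character is in a
    Unicode letter category (Lu, Ll, Lt, Lm, Lo), i.e. roughly
    what ^\\p{L}+$ would accept."""
    # Compose "e" + combining accent into a single letter where
    # possible; a bare combining mark is category Mn, not L.
    text = unicodedata.normalize("NFC", text)
    return bool(text) and all(
        unicodedata.category(ch).startswith("L") for ch in text
    )
```

This checks a whole string rather than finding words inside a longer text, so it covers the input validation case from the question but not the \b...\b search.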

Tim Pietzcker
A: 

In a lot of languages, you can simply put the Unicode characters directly into the character class: [a-zäöüß], etc.
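
A short sketch of this approach in Python (an assumed example language; the names are hypothetical). It only accepts the letters that are spelled out in the class, so it suits a fixed, known alphabet such as German.

```python
import re

# Explicit class: ASCII letters plus the German umlauts and ß.
# re.IGNORECASE lets the same class cover upper-case input.
GERMAN_LETTERS = re.compile(r"[a-zäöüß]+", re.IGNORECASE)


def is_german_word(text):
    """True if the whole string uses only the listed letters."""
    return GERMAN_LETTERS.fullmatch(text) is not None
```

Here is_german_word("Größe") is true, but is_german_word("café") is false because é is not in the class.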

poke
That won't help much when he wants to match **all** letters.
Joachim Sauer