views:

125

answers:

5

Could somebody help me create a regex that matches only characters?

Thanks in advance!

+3  A: 

Use a character set: [a-zA-Z] matches one letter from A–Z in lowercase or uppercase. [a-zA-Z]+ matches one or more letters, and ^[a-zA-Z]+$ matches only strings that consist entirely of one or more letters (^ and $ mark the beginning and end of the string, respectively).

If you want to match letters other than A–Z, you can either add them to the character set, as in [a-zA-ZäöüßÄÖÜ], or use a predefined character class such as the Unicode character property class \p{L}, which covers every Unicode character that is a letter.
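
As a rough illustration of the difference between the unanchored and anchored patterns, here is a small Python sketch using the standard `re` module. The variable names and sample strings are made up for the example; other regex engines should behave similarly for these simple patterns, but that is an assumption worth checking in your own flavour.

```python
import re

# Unanchored: succeeds if the string contains at least one ASCII letter anywhere.
contains_letter = re.compile(r"[a-zA-Z]+")

# Anchored: succeeds only if the string is made up of ASCII letters.
# (In Python, $ also tolerates a single trailing newline.)
only_letters = re.compile(r"^[a-zA-Z]+$")

# Character set extended with the German letters mentioned in the answer.
only_german_letters = re.compile(r"^[a-zA-ZäöüßÄÖÜ]+$")

# Hypothetical sample inputs.
for text in ["Hello", "Hello123", "Größe"]:
    print(
        text,
        bool(contains_letter.search(text)),
        bool(only_letters.search(text)),
        bool(only_german_letters.search(text)),
    )
```

"Hello123" passes the unanchored pattern but fails both anchored ones, while "Größe" passes only the anchored pattern that lists the extra letters.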

Gumbo
That's a very ASCII-centric solution. This will break on pretty much any non-English text.
Joachim Sauer
@Joachim Sauer: Rather, it will break on languages that use non-Latin characters.
Gumbo
Already breaks on 90% of German text, don't even mention French or Spanish. Italian might still do pretty well though.
Ivo Wetzel
@Gumbo: That depends on which definition of "Latin character" you choose. J, U, Ö, Ä can all be argued to be Latin characters or not, based on your definition. But they are all used in languages that use the "Latin alphabet" for writing.
Joachim Sauer
+3  A: 

\p{L} matches anything that is a Unicode letter, if you're interested in alphabets beyond the Latin one.
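
Python's standard `re` module does not accept \p{L}, so the sketch below uses a stand-in rather than the property class itself: `[^\W\d_]`, meaning "a word character that is neither a decimal digit nor an underscore". This is an approximation chosen for illustration, not an exact equivalent of \p{L}; it may differ for some characters (for example certain non-decimal numeric characters). The function name `is_all_letters` is hypothetical.

```python
import re

# Assumption: [^\W\d_] stands in for \p{L}, which the stdlib `re` module
# does not support. It is an approximation, not an exact equivalent.
unicode_letters = re.compile(r"[^\W\d_]+")


def is_all_letters(text: str) -> bool:
    """Return True if every character of text is matched by the approximation."""
    return unicode_letters.fullmatch(text) is not None


# Hypothetical sample inputs from several scripts.
for sample in ["Grüße", "Привет", "日本語", "abc1", "a_b"]:
    print(sample, is_all_letters(sample))
```

In an engine that does support Unicode properties, the pattern would simply be ^\p{L}+$.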

RobV
Not in all regex flavours. For example, vim regexes treat `\p` as "printable character".
Philip Potter
Well, in any regex engine that supports Unicode regexes, then.
RobV
[This page](http://www.regular-expressions.info/refflavors.html) suggests that only Java, .NET, Perl, JGsoft, XML, and XPath regexes support \p{L}. Major omissions: Python and Ruby (though Python has the regex module).
Philip Potter
@Philip Potter: Ruby supports Unicode character properties using that exact same syntax.
Jörg W Mittag
+1  A: 
/[a-zA-Z]+/

Super simple example. Regular expressions are extremely easy to find online.

http://www.regular-expressions.info/reference.html

Scott Radcliff
A: 

Depending on your meaning of "character":

  • [A-Za-z] --> all letters (uppercase and lowercase)
  • [^0-9] --> all non-digit characters
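
The two classes above select quite different things, which a short Python sketch can make concrete. The sample string is invented for the example.

```python
import re

sample = "Route 66, exit 4!"  # hypothetical input

# [A-Za-z] keeps ASCII letters only.
letters = re.findall(r"[A-Za-z]", sample)

# [^0-9] keeps everything that is not an ASCII digit:
# letters, but also spaces and punctuation.
non_digits = re.findall(r"[^0-9]", sample)

print("".join(letters))     # Routeexit
print("".join(non_digits))  # Route , exit !
```

So [^0-9] is only a fit when "character" really means "anything except a digit".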
Molske
I meant letters. It doesn't appear to be working, though: preg_match('/[a-zA-Z]+/', $name);
Nike
[A-Za-z] only declares which characters are allowed. You still need to declare how many times it may repeat: [A-Za-z]{1,2} (to match 1 or 2 letters) or [A-Za-z]{1,} (to match 1 or more letters).
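
A minimal Python sketch of how these quantifiers behave. The brace form of "one or more" is written {1,}, which is equivalent to +. The input value is hypothetical, and the last line rests on the assumption that the goal is to validate the whole string, which the thread does not state explicitly.

```python
import re

name = "Nike42"  # hypothetical input

# {1,2} allows one or two letters; {1,} and + both mean "one or more".
one_or_two = re.compile(r"[A-Za-z]{1,2}")
one_or_more_braces = re.compile(r"[A-Za-z]{1,}")
one_or_more_plus = re.compile(r"[A-Za-z]+")

# match() tries only at the start of the string; the quantifier is greedy,
# so it takes as many letters as it is allowed to.
print(one_or_two.match(name).group())          # Ni
print(one_or_more_braces.match(name).group())  # Nike
print(one_or_more_plus.match(name).group())    # Nike

# A quantifier alone does not validate the whole string; fullmatch() does.
print(one_or_more_plus.fullmatch(name) is None)  # True
```

PHP's preg_match likewise succeeds when the pattern matches anywhere in the subject, so anchoring with ^ and $ is the usual way to require that the entire value consists of letters.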
Molske
A: 

[\w]* matches whatever the regex engine thinks is a word character 0 or more times
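
To see what this pattern accepts in practice, here is a short Python sketch. What counts as a word character varies between engines, so treat the results as specific to Python's `re` module; the sample strings are made up.

```python
import re

word_chars = re.compile(r"[\w]*")

# \w accepts letters, digits, and the underscore.
print(word_chars.fullmatch("user_42") is not None)    # True

# Because * means "zero or more", the empty string also matches.
print(word_chars.fullmatch("") is not None)           # True

# A space is not a word character, so the whole-string match fails.
print(word_chars.fullmatch("two words") is not None)  # False
```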

Mike Cheel
\w matches [A-Za-z0-9_], so it also matches 0-9 and "_", not only letters.
Molske
Well, technically 0-9 and _ are considered characters when talking about text. What I typed is technically correct. He didn't say just the alphabet. He said characters.
Mike Cheel