views:

71

answers:

6

Hi,

I have a word list, but it contains some words like East's.

I need to find the words in the list that contain only a-z and A-Z. How can I do that?

I am using grep. What should I put after grep?

grep *** myfile.txt

Thanks!

+2  A: 

The regexp you want is ^[a-zA-Z]+$

For grep:

vinko@parrot:~$ more a.txt
Hi
Hi Dude
Hi's
vinko@parrot:~$ egrep '^[a-zA-Z]+$' a.txt
Hi

In pseudocode:

 regexp = "^[a-zA-Z]+$";
 foreach word in list
      if regexp.matches(word)
          do_something_with(word)
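
The pseudocode above could be sketched in plain shell roughly as follows. This is only an illustration: `filter_words` is a name made up here, and `do_something_with` is the same placeholder as in the pseudocode (it simply prints the word). For a whole file, the single egrep call shown earlier is simpler and faster.

```shell
#!/bin/sh
# Placeholder action from the pseudocode; here it just prints the word.
do_something_with() {
    printf 'keep: %s\n' "$1"
}

# Hypothetical helper: reads words on stdin, one per line.
filter_words() {
    # IFS= and -r keep spaces and backslashes in each line intact.
    while IFS= read -r word; do
        # grep -q prints nothing and reports the match via its exit status.
        if printf '%s\n' "$word" | grep -Eq '^[a-zA-Z]+$'; then
            do_something_with "$word"
        fi
    done
}

# Same sample data as a.txt in the transcript above.
result=$(printf '%s\n' "Hi" "Hi Dude" "Hi's" | filter_words)
printf '%s\n' "$result"
```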
Vinko Vrsalovic
I am sorry I forgot I am using grep
skydoor
A: 
[a-z]+

using the case insensitive option, or

[A-Za-z]+

without the case insensitive option.

Post the data and the language for more help.

For grep:

egrep -i '^[a-z]+$' wordlist.dat

I can't remember which metacharacters need escaping. If it doesn't work with plain grep instead of egrep, try escaping the plus, as in [a-z]\+ (the brackets themselves should not be escaped).
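
To see why the ^ and $ anchors matter, here is a small sketch on made-up sample words (the sample data and variable names are assumptions for illustration). It uses `grep -E`, which is the equivalent of egrep.

```shell
#!/bin/sh
# Made-up sample list, one word per line.
sample=$(printf '%s\n' "East" "East's" "west")

# Unanchored: a line matches as soon as any run of letters appears in it,
# so East's is still printed.
unanchored=$(printf '%s\n' "$sample" | grep -E -i '[a-z]+')

# Anchored: the whole line, from start (^) to end ($), must be letters.
anchored=$(printf '%s\n' "$sample" | grep -E -i '^[a-z]+$')

printf 'unanchored:\n%s\n' "$unanchored"
printf 'anchored:\n%s\n' "$anchored"
```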

Paul Creasey
I am sorry I forgot I am using grep
skydoor
I tried egrep -i '[a-z]+' wordlist.dat, but it still outputs the words with the ' sign.
skydoor
Hmm, of course it would, now that I think about it. Just add ^ and $: `^[a-z]+$`. This asserts that every character on the line matches the class; ^ is the start-of-line anchor and $ is the end-of-line anchor.
Paul Creasey
A: 

Use fgrep if you want to match against a word list.

fgrep -f word_list_file myfile.txt
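
A minimal sketch of matching against a pattern file, assuming the word list is passed with -f (without -f, grep treats its first argument as the pattern itself). The file contents are made up for illustration, and `grep -F` is used as the equivalent of fgrep. Note that fixed-string matching is substring matching, so it does not by itself drop words like East's.

```shell
#!/bin/sh
dir=$(mktemp -d)

# Made-up contents: fixed strings to look for, and a file to search.
printf '%s\n' "East" "West" > "$dir/word_list_file"
printf '%s\n' "East's side" "north" "Western" > "$dir/myfile.txt"

# -F: treat every pattern as a fixed string, not a regexp.
# -f FILE: read the patterns from FILE, one per line.
matches=$(grep -F -f "$dir/word_list_file" "$dir/myfile.txt")
printf '%s\n' "$matches"

rm -r "$dir"
```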
KennyTM
+1  A: 

The grep syntax is:

grep '^[[:alpha:]]\+$' input.txt

Documentation for grep's pattern syntax is here.
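
As a rough comparison, the same pattern can be spelled three ways. The `\+` form relies on a GNU extension to basic regular expressions, `\{1,\}` is the POSIX basic-regexp spelling of "one or more", and `-E` switches to extended syntax where `+` needs no backslash. The sample data and variable names are assumptions for illustration.

```shell
#!/bin/sh
sample=$(printf '%s\n' "Hi" "Hi Dude" "Hi's")

# Basic regexp with the GNU \+ extension, as in the answer above.
gnu_bre=$(printf '%s\n' "$sample" | grep '^[[:alpha:]]\+$')

# Basic regexp using the POSIX interval \{1,\} for "one or more".
posix_bre=$(printf '%s\n' "$sample" | grep '^[[:alpha:]]\{1,\}$')

# Extended regexp: a bare + already means "one or more".
ere=$(printf '%s\n' "$sample" | grep -E '^[[:alpha:]]+$')

printf '%s\n' "$gnu_bre" "$posix_bre" "$ere"
```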

Mark Byers
A: 

GNU grep

grep -wEo "[[:alpha:]]+" file
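
One thing to be aware of with -w and -o, sketched here on made-up input and assuming GNU grep: -o prints each matched piece rather than the whole line, so a word like East's is split into its alphabetic parts instead of being dropped. The anchored whole-line pattern is shown alongside for contrast.

```shell
#!/bin/sh
sample=$(printf '%s\n' "Hi" "East's")

# -o prints only the matched text; -w requires each match to be a whole word.
# The apostrophe is not a word character, so East and s both qualify.
pieces=$(printf '%s\n' "$sample" | grep -wEo '[[:alpha:]]+')

# Whole-line match: East's is rejected outright.
whole=$(printf '%s\n' "$sample" | grep -E '^[[:alpha:]]+$')

printf 'pieces:\n%s\n' "$pieces"
printf 'whole:\n%s\n' "$whole"
```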
ghostdog74
A: 

Or filter out all words that contain funnies

grep -v '[^a-zA-Z]'
Is there a prize for the shortest answer? :)
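
A small sketch of how the inverted filter behaves, on made-up input. One side effect worth knowing: an empty line contains no "funny" character either, so it passes through. Piping through `grep .` is one way to drop empty lines afterwards.

```shell
#!/bin/sh
# Made-up sample including an empty line.
sample=$(printf '%s\n' "Hi" "Hi Dude" "Hi's" "" "ok")

# -v inverts the match: keep lines with no character outside a-zA-Z.
filtered=$(printf '%s\n' "$sample" | grep -v '[^a-zA-Z]')

# Same filter, then keep only lines that have at least one character.
nonempty=$(printf '%s\n' "$sample" | grep -v '[^a-zA-Z]' | grep .)

printf 'filtered:\n%s\n' "$filtered"
printf 'nonempty:\n%s\n' "$nonempty"
```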

Note that there are portability differences between [[:alpha:]] and [A-Za-z]. [A-Za-z] works in more versions of grep, but [[:alpha:]] takes wide-character environments and internationalization into account (for example, accented characters when the locale includes them).

martinwguy