I have a text file in which each line contains a set of random strings. I need to select only the lines that consist entirely of these characters: 1234567890qwertyuiopasdfghjklzxcvbnm-._~

I'm reading the file line by line and checking each line character by character. I don't think that's the best way to do it, and I think a regex would be perfect.

Can anybody help me with a pattern for this?

thanks!

+8  A: 
 /^[-0-9a-z._~]*$/

 ^       :: matches the start of a string/start of a line
 [       :: start of a character class, means match any one of the contained characters
 -       :: dash has a special meaning in character classes (it denotes a range), so list it first to have it treated as a literal dash
 0-9     :: shorthand for 0123456789 in a character class
 a-z     :: shorthand for abcdefghijklmnopqrstuvwxyz in a character class
 ._~     :: means exactly these characters in a character class
 ]       :: end of the character class 
 *       :: match zero or more of the previous atom (in this case, the character class)
 $       :: matches the end of a string/end of a line
rampion
+2  A: 

If I understand you correctly, you could go with this:

/^([-0-9a-zA-Z._~]\n)+$/

It accepts both uppercase and lowercase letters, checks for the end of the line, and doesn't match an empty line.

eKek0
If he's processing the data line by line, there shouldn't be any linefeeds. Also, your regex will match an empty line; the OP didn't say anything about that, but I would have used '+' instead of '*' as a first cut.
Alan Moore
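
To make the '+' versus '*' point in the comment concrete, here is a small Python sketch (the variable names are only for illustration). It assumes the newline has already been stripped from each line, as it would be when processing line by line.

```python
import re

star = re.compile(r"^[-0-9a-z._~]*$")  # zero or more allowed characters
plus = re.compile(r"^[-0-9a-z._~]+$")  # one or more allowed characters

for text in ["", "abc", "a b"]:
    # Show whether each quantifier accepts the candidate line.
    print(repr(text), bool(star.match(text)), bool(plus.match(text)))
```

Only the empty string is treated differently: the '*' version accepts it and the '+' version rejects it.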
A: 

It doesn't take much to realize that the regular expression should be equivalent to ^[1234567890qwertyuiopasdfghjklzxcvbnm\-._~]*$. From there you can narrow it down trivially by replacing the long runs with 0-9, a-z, and so on. You should learn the basics of regular expressions before using answers you get from others.

Devin Jeanpierre