ansaurus

Question

Answer 1

+4 A:

You can match all the characters you want, and then join them together, like this:

original = "aøbæcå"
stripped = original.scan(/[a-zA-Z]/).join
puts stripped

which outputs "abc"

Magnar 2009-04-10 12:37:59

Answer 2

+8 A:

First of all, I think it might be easier to define what constitutes "correct input" and remove everything else. For example:

input = input.gsub(/[^0-9A-Za-z]/, '')

If that's not what you want (you want to support non-Latin alphabets, etc.), then I think you should make a list of the glyphs you want to remove (like ™ or ☻) and remove them one by one, since it's hard to tell programmatically whether a character is a letter from Chinese, Arabic, or another script, or a pictograph.
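
The blacklist approach described above might look something like the following. This is only a sketch: the constant names, the helper method `strip_unwanted`, and the two glyphs in the list are placeholders for illustration, and it assumes Ruby 1.9 or later with UTF-8 source encoding.

```ruby
# Hypothetical blacklist; extend it with whatever glyphs you want to drop.
UNWANTED_GLYPHS = ["™", "☻"]

# Regexp.union builds a single pattern that matches any listed glyph.
UNWANTED_PATTERN = Regexp.union(UNWANTED_GLYPHS)

# Remove every blacklisted glyph and leave all other characters alone.
def strip_unwanted(text)
  text.gsub(UNWANTED_PATTERN, '')
end

puts strip_unwanted("Ruby™ 日本語☻")   # prints "Ruby 日本語"
```

Characters from other scripts pass through untouched, because only the glyphs named in the list are removed.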

Finally, you might want to normalize your input by converting to or from HTML escape sequences.
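
As a rough illustration of the normalization step, one option is to decode HTML escape sequences before filtering, so that an entity such as &amp; does not leave stray letters behind. The method name `normalize_and_strip` is made up for this example. `CGI.unescapeHTML` comes from Ruby's standard `cgi` library and only decodes the basic entities (&amp;, &lt;, &gt;, &quot;) and numeric references, not every named entity.

```ruby
require 'cgi'

# Decode HTML escapes first, then keep only the allowed characters.
def normalize_and_strip(input)
  decoded = CGI.unescapeHTML(input)
  decoded.gsub(/[^0-9A-Za-z]/, '')
end

puts normalize_and_strip("AT&amp;T")        # prints "ATT"
puts "AT&amp;T".gsub(/[^0-9A-Za-z]/, '')    # prints "ATampT" when the decoding step is skipped
```

Whether to decode or encode depends on where the input comes from and where it goes next, so treat the order shown here as one possibility rather than a rule.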

Can Berk Güder 2009-04-10 12:40:13

Thanks, I think it is easier to create a list of allowed characters.

Yud 2009-04-10 12:45:51

Answer 3

+2 A:

If you just wanted ASCII characters, then you can use:

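original = "aøbauhrhræoeuacå"
cleaned = ""
# Keep only bytes in the ASCII range (0-127); every byte of a multibyte UTF-8 character is above 127.
original.each_byte { |x| cleaned << x unless x > 127 }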
cleaned   # => "abauhrhroeuac"

Matthew Schinckel 2009-04-10 13:28:47

How can I delete special characters?