I have an array of user logins that was loaded from the database. What's the simplest and most efficient way to keep only the logins that contain non-ASCII characters?

logins = Users.find(:all).map{|user|user.login}
logins_with_non_ascii_characters = logins.select{ |login| ...??? }

Thanks

Edit: if you have a SQL solution (I use MySQL, but a generic solution would be better) to filter out the logins directly on the first line, with a :conditions clause, I'm ok with that too. In fact, it would be way more efficient:

logins = Users.find(:all, :conditions => "...???").map{|user|user.login}
A: 

All I have found so far is this:

def is_ascii(str)
    str.each_byte {|c| return false if c>=128}
    true
end

logins = Users.find(:all).map{|user|user.login}
logins_with_non_ascii_characters = logins.select{ |login| not is_ascii(login) }

It's a bit disappointing, and certainly not efficient. Anyone got a better idea?
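For what it's worth, on Ruby 1.9 and later the byte loop above can probably be replaced by the built-in String#ascii_only?. A minimal sketch, assuming the logins are ordinary UTF-8 strings; the sample array is made up and stands in for the database query:

```ruby
# Hypothetical sample data standing in for Users.find(:all).map { |user| user.login }
logins = ["alice", "bob42", "rené", "日本語", "tab\tuser"]

# String#ascii_only? (Ruby 1.9+) is true when every character is 7-bit ASCII,
# so rejecting those strings keeps only logins with at least one non-ASCII character.
logins_with_non_ascii_characters = logins.reject { |login| login.ascii_only? }

puts logins_with_non_ascii_characters.inspect
```

This still loads every login into memory first, so it is no more efficient on the database side than the original.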

MiniQuark
Does it have to be efficient? It sounds like this is a one-time operation.
John Topley
@John: Good point, it doesn't *have* to be efficient. I just prefer efficient solutions, and I'm sure it would help me understand Ruby a bit better. But in this case, this solution worked fine.
MiniQuark
+2  A: 

You can abuse Ruby's built-in regular expression character classes for this:

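[:print:] contains all printable ASCII characters. It doesn't contain ASCII control characters like BEL (the beep) or, importantly, multibyte characters. Note that this relies on byte-oriented matching as in Ruby 1.8; on Ruby 1.9 and later, [[:print:]] also matches printable non-ASCII characters in Unicode strings.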

Working on the assumption that your users are unlikely to have ASCII BEEP as a character in their login,

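# reject users whose login contains a non-printable or non-ASCII character
# (reject rather than reject!, because reject! returns nil when nothing is removed)
valid_users = users.reject { |user| user.login =~ /[^[:print:]]/ }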

should do it for you.
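If the goal is the one in the question, keeping only the logins that do contain non-ASCII characters, the same idea can be sketched with select and an explicit range, which does not depend on how a given Ruby version defines [:print:]. The sample logins and the variable name non_ascii are made up for illustration:

```ruby
# Hypothetical sample logins; in the question these come from the database.
logins = ["alice", "rené", "beep\a", "müller"]

# Matches any single character outside the 7-bit ASCII range (U+0000 to U+007F).
non_ascii = /[^\x00-\x7F]/

# Keep only the logins that contain at least one non-ASCII character.
logins_with_non_ascii_characters = logins.select { |login| login =~ non_ascii }

puts logins_with_non_ascii_characters.inspect
```

Unlike the [:print:] version, a login containing an ASCII control character such as BEL counts as plain ASCII here and is not selected.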

Patrick McKenzie