Could you tell me how to test regex-code correctly?

I test my user login attribute with the following code:

# user.rb
class User < ActiveRecord::Base
  #...
  validates_format_of :login, :with => /^[a-zA-z0-9_.]{3,18}$/
end

# user_spec.rb
describe User do
  before(:each) do 
    @user = Factory.build(:user)
    @user.save
  end

  subject { @user }
  it { should be_valid }

  it { should_not allow_value("b lah").for(:login) }
  it { should_not allow_value("bälah").for(:login) }
  it { should_not allow_value("b@lah").for(:login) }
  it { should_not allow_value("bülah").for(:login) }
  it { should_not allow_value("bßlah").for(:login) }
  it { should_not allow_value("b!lah").for(:login) }
  it { should_not allow_value("b%lah").for(:login) }
  it { should_not allow_value("b)lah").for(:login) }
  # ....
  # Shall I test here every special sign????
end

But this seems very redundant and not secure. Is there a best practice? Thanks!

+2  A: 

You're not really testing your model here; you're testing your regex, which is not quite the same thing. You're also testing the same aspect of your regex, that it only allows [a-zA-z0-9_.], again and again. If you want to apply different tests, test different aspects of it, e.g. with "lo" (fewer than 3 characters) or "12345678901234567890" (more than 18 characters).

Also, if you want to DRY it up, you could do something like:

invalid_logins = ["b lah","bälah","b@lah","bülah","bßlah","b!lah","b%lah","b)lah"]
invalid_logins.each do |s|
  it { should_not allow_value(s).for(:login) }
end
Max Williams
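
To make the "test different aspects" advice concrete, here is a minimal sketch in plain Ruby (no Rails or shoulda needed). The pattern is copied unchanged from the question; the LOGIN_FORMAT, CASES and RESULTS names are hypothetical and exist only for this illustration. Each entry covers one aspect of the rule (the length boundaries, one disallowed character, the allowed punctuation) instead of many variations of the same aspect.

```ruby
# Pattern copied unchanged from the question's validates_format_of call.
LOGIN_FORMAT = /^[a-zA-z0-9_.]{3,18}$/

# Hypothetical table: login => whether it is expected to match.
CASES = {
  "lo"                   => false, # 2 characters, below the minimum of 3
  "abc"                  => true,  # exactly 3 characters
  "a" * 18               => true,  # exactly 18 characters
  "12345678901234567890" => false, # 20 characters, above the maximum of 18
  "b lah"                => false, # one representative disallowed character
  "b_l.ah9"              => true   # underscore, dot and digit are allowed
}

# Collect [login, actual, expected] so mismatches are easy to spot.
RESULTS = CASES.map do |login, expected|
  [login, LOGIN_FORMAT.match?(login), expected]
end

RESULTS.each do |login, actual, expected|
  status = actual == expected ? "ok" : "MISMATCH"
  puts "#{status}: #{login.inspect} matched=#{actual}"
end
```

In the spec itself, each row would presumably become one allow_value line, with should for the true rows and should_not for the false ones.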
Yeah, but there are thousands of different special signs... And I don't think it's secure to whitelist such signs...?
Lichtamberg
Like Max said, after a certain point you're just testing that "[a-ZA-Z0-9_.]" does what it says it does, which it always will. Test it a few times if you're *really* paranoid and then call it good.
rspeicher
Yeah, regexes themselves are very reliable. When regexes produce unexpected results, it's because the person writing them didn't understand the rules. In this case you can rely on [a-ZA-Z0-9_.] allowing only those characters. It's not going to suddenly let some new character through because it's unusual or something.
Max Williams
And, as an example of misusing regexes, we've all done it! You had [a-zA-z0-9_.] which should have been [a-zA-Z0-9_.], and rspeicher had [a-ZA-Z0-9_.] which also should have been [a-zA-Z0-9_.]. I blindly repeated rspeicher's mistake in my last comment. All of these mistakes will produce unexpected results.
Max Williams
Damned shift key.
rspeicher
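
The two character-class typos discussed in the comments above can be checked directly. This is a small plain-Ruby sketch, not part of the original thread; the constant names are hypothetical. It lists the ASCII characters that the A-z class accepts but the intended A-Z class does not, and shows that the reversed a-Z range is rejected when the regex is compiled.

```ruby
# Character class as written in the question (note the lowercase z in A-z).
TYPO_CLASS  = /[a-zA-z0-9_.]/
# Character class as intended.
FIXED_CLASS = /[a-zA-Z0-9_.]/

# ASCII characters accepted only because of the typo.
# A-z spans codes 65..122, which includes the punctuation between Z and a.
LEAKED = (0..127).map(&:chr).select do |c|
  TYPO_CLASS.match?(c) && !FIXED_CLASS.match?(c)
end
p LEAKED # => ["[", "\\", "]", "^", "`"]

# The a-Z variant runs from a higher code (97) to a lower one (90),
# so Ruby refuses to compile it instead of silently matching something odd.
REVERSED_ERROR =
  begin
    Regexp.new("[a-ZA-Z0-9_.]")
    nil
  rescue RegexpError => e
    e
  end
puts "a-Z raised: #{REVERSED_ERROR.class}"
```

So a login such as "b^lah" would pass the validation as posted, which is exactly the kind of case a spec with one or two well-chosen extra characters can catch.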