Hello all,

I'm not a Ruby programmer, but as I was reading through the extensive RoR security guide I noticed this section:

http://guides.rubyonrails.org/security.html#regular-expressions

A common pitfall in Ruby’s regular expressions is to match the string’s beginning and end by ^ and $, instead of \A and \z.

Does anyone know if this is just a matter of aesthetics or something else? I ask because I've only used languages that use ^ and $.

Anyway...thanks in advance.

+9  A: 

This isn't specific to Ruby; \A and \Z are not the same thing as ^ and $. ^ and $ are the start and end of line anchors, whereas \A and \Z are the start and end of string anchors.

Ruby differs from other languages in that ^ and $ always match at the start and end of each line, which is the behaviour most other flavours only enable through a "multiline mode" flag that you have to turn on yourself. (Ruby's own /m flag does something different: it makes . match newlines.) That is probably why that article contains the warning: a pattern anchored with ^ and $ can be satisfied by a single line of a multi-line string.
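To make the difference concrete, here is a minimal sketch. The input string and the URL-like pattern are made up for illustration and are not taken from the Rails guide; the point is only that the ^/$ version accepts a string whose first line looks valid, while the \A/\z version does not.

```ruby
# Hypothetical input: a valid-looking first line followed by an injected second line.
input = "http://example.com\n<script>alert(1)</script>"

# ^ and $ anchor at line boundaries in Ruby, so one matching line is enough.
line_anchored = %r{^https?://[^\n]+$}

# \A and \z anchor at the start and end of the whole string.
string_anchored = %r{\Ahttps?://[^\n]+\z}

line_match   = line_anchored.match?(input)    # true: the first line matches on its own
string_match = string_anchored.match?(input)  # false: the string does not end after the URL

puts "line anchors:   #{line_match}"
puts "string anchors: #{string_match}"
```

If a check like this is used for validation, the \A/\z form is the one that actually constrains the whole value.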

Reference: http://www.regular-expressions.info/anchors.html

Daniel Vandersluis
Hi Daniel...thanks for the response! After reading it, I searched around a bit for more info on the difference between end of line and end of string anchors and found those sources to also use \A and \Z. This leaves me with one last point of confusion...does it matter if it's an upper or lowercase Z? The RoR site seems to be using \z. Thanks again!
treeface
@treeface: They are different. `\z` anchors at the end of the string, `\Z` anchors at the end of the string or before the last newline, if the string ends with a newline. So, if the string ends with a newline, `\Z` anchors before that last newline and `\z` anchors after.
Jörg W Mittag
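A small sketch of the `\Z` versus `\z` distinction described above, using made-up strings. It assumes nothing beyond Ruby's built-in Regexp; `$` is included for comparison because it also matches before a newline, but at every line rather than only the last one.

```ruby
# Hypothetical sample strings.
with_newline = "foo\n"
two_lines    = "foo\nbar"

# \Z matches at the end of the string or just before a final trailing newline.
upper_z_trailing = /foo\Z/.match?(with_newline)  # true
# \z matches only at the very end of the string.
lower_z_trailing = /foo\z/.match?(with_newline)  # false

# With more text after the newline, \Z no longer matches, but $ still does.
upper_z_two_lines = /foo\Z/.match?(two_lines)    # false
dollar_two_lines  = /foo$/.match?(two_lines)     # true

puts [upper_z_trailing, lower_z_trailing, upper_z_two_lines, dollar_two_lines].inspect
```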
@Jörg I wasn't aware of that, thanks for the info :)
Daniel Vandersluis
Thanks a lot, guys! Very useful information.
treeface