Hi, I currently have a ColdFusion regex that checks whether a string is alphanumeric. I would like to open that up a bit to also allow period and underscore characters. How would I modify it to allow that?

<cfset isValid= true/>
<cfif REFind("[^[:alnum:]]", arguments.stringToCheck, 1) GT 0>
 <cfset isValid= false />
</cfif>

Thanks

A: 

Would this work for you?

refind("[\w\d._]","1234abcd._")
Masterbuddha
The `\w` already includes `\d` and `_`, and you've missed that Cheeky is negating the group (in order to identify invalid strings).
Peter Boughton
Ah you are right Peter. Thanks for noticing that.
Masterbuddha
+1  A: 

This should do it.

<cfset isValidString= true/>
<cfif REFind("[^[:alnum:]_\.]", arguments.stringToCheck, 1) GT 0>
    <cfset isValidString= false />
</cfif>

Also, using "isValid" as a variable name is not great practice. It is the name of a built-in function in ColdFusion and could cause you issues someday.

Jason Dean
Thanks Jason - that works like a bomb. At least I can also add additional characters now if need be using the \ character. Regarding 'isValid' - you are 100% correct. It was originally 'isAlphaNumeric', but I thought that wouldn't be in context anymore for the requirements of this example, so I changed it for posting online without thinking!
Cheeky
You don't need `\` before `.` inside a character class, and `\w` covers `[A-Za-z0-9_]` and is more common than the `[[:alnum:]]` stuff, so consider simply `[^\w.]` instead.
Peter Boughton
Hi Peter. And including underscores? Would it then be [^\w._] ?
Cheeky
The `\w` includes underscore. (It'd still work if you specified it explicitly, it's just not necessary.)
Peter Boughton
+2  A: 

No need for cfif - here's a nice concise way of doing it:

<cfset isValidString = NOT refind( '[^\w.]' , Arguments.StringToCheck )/>


Alternatively, you can do it this way:

<cfset isValidString = refind( '^[\w.]*$' , Arguments.StringToCheck ) />

(To prevent empty string, change * to +)

This method can make it easier to apply other constraints (e.g. must start with a letter, etc.), and is a slightly more straightforward way of expressing the original check anyway.

Note that the ^ here is an anchor meaning "start of line/string" (with $ being the corresponding end), more information here.
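
As a rough sketch of the "must start with a letter" idea mentioned above: the extra constraint is only an illustration, not something the question asked for, and the pattern below is one possible way to write it. It reuses the `isValidString` and `Arguments.StringToCheck` names from the earlier snippets.

```cfml
<!--- Sketch only: the "first character must be a letter" rule is an assumed example constraint. --->
<!--- [A-Za-z] requires a leading letter; [\w.]* then allows letters, digits, underscore and period. --->
<cfset isValidString = refind( '^[A-Za-z][\w.]*$' , Arguments.StringToCheck ) GT 0 />
```

With this pattern a value such as abc.def_1 passes, while 1abc fails because it starts with a digit. An empty string also fails, since the leading letter is mandatory.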

Peter Boughton
Personally, I prefer to write out a-z0-9 instead of using \w because it is a more visual representation, but good explanation.
Ben Doom
Shouldn't it be `<cfset isValidString = ( refind( '[^\w.]' , Arguments.StringToCheck ) EQ 0 ) />` otherwise you are saying it is valid if it contains a character **other** than alphanumeric or period.
Jordan Reiter
Thanks Jordan, I knew something wasn't quite right, but clearly wasn't thinking straight - yes, it should have been `( x EQ 0 )` - or simply prefix the lot with `NOT` which again is simpler/clearer.
Peter Boughton
Ben, if you specifically just want lowercase and numbers, then `[a-z0-9]` is fine, but it's important to note that `\w` is `[A-Za-z0-9_]` - and when it gets to that long I'd say the `\w` gives more instant clarity.
Peter Boughton
Oh and it's useful to know that in *some* regex implementations (but not CF's), a `\w` can also include accented characters such as `áèñôü...` - if these are desired then `\w` is definitely easier, but of course if they are not then spelling out as above is required. (Except with Python, where there's a flag to go either way.)
Peter Boughton