Could someone point me toward a regular expression that blocks the words in a comma-separated list of spam words I already have?

The regular expression needs to match any string that contains a word from that list.

Not that it matters, but I am using PHP.

+2  A: 

You could generate a regular expression that matches any text containing a spam word from your list by replacing the commas with |, wrapping the result in round brackets, and adding a word boundary (\b) on each side.

If your spamlist is "spam1,spam2,spam3", your regular expression would be "\b(spam1|spam2|spam3)\b".
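
The conversion described above can be sketched in a few lines. This is only an illustration in JavaScript, matching the snippet elsewhere in this thread. The helper name buildSpamRegex, the case-insensitive "i" flag, and the escaping step are assumptions added here, not part of the original answer. Escaping matters if a list entry contains characters such as . or +, which would otherwise be read as regex syntax.

```javascript
// Hypothetical helper: build one regex from a comma-separated spam list.
function buildSpamRegex(spamList) {
  const words = spamList
    .split(",")
    .map((w) => w.trim())
    .filter((w) => w.length > 0)
    // Escape regex metacharacters so each entry is matched literally.
    .map((w) => w.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"));

  // An empty list should match nothing rather than every word boundary.
  if (words.length === 0) {
    return /(?!)/;
  }

  // "spam1,spam2,spam3" becomes \b(spam1|spam2|spam3)\b
  // The "i" flag (case-insensitive) is an assumption, not from the answer.
  return new RegExp("\\b(" + words.join("|") + ")\\b", "i");
}

const spamRegex = buildSpamRegex("spam1,spam2,spam3");
console.log(spamRegex.test("buy spam2 now")); // true
console.log(spamRegex.test("myspam2x"));      // false: no word boundary
```

The same pattern string should carry over to PHP's preg_match, since PCRE also supports \b and alternation; there, preg_quote would play the role of the escaping step.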

Jens
Be careful with this, as it's easy to accidentally block words such as *chardonnay* if you don't specify matching word boundaries.
Greg Hewgill
True... I'll edit in word boundaries.
Jens
A: 

You can use JavaScript to stop the user from submitting spam in the first place, for example:

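// List of blocked words (placeholders).
var spam_words = ["word1", "word2", "word3"];
// \b on both sides so the words don't match inside longer words.
var regex = new RegExp("\\b(" + spam_words.join("|") + ")\\b");

if (regex.test(form_data_you_wanna_test)) {
    // stop submit
} else {
    // submit
}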
Yousui
That is useless. Client-side checks can easily be bypassed; always check on the server side.
Qtax
+6  A: 

Try this:

 \b(word1|word2|...)\b

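The \b matches at a word boundary: the position between a word character (a letter, digit, or underscore) and a non-word character, or at the start or end of the string. That way the expression won't match when one of the words appears as part of a longer word.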
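
To make the effect of \b concrete, here is a small sketch comparing the same alternation with and without boundaries. The word list ("cash", "win") and the sample strings are made up for illustration. Note the last sample: an underscore counts as a word character, so "cash_only" is not caught by the bounded pattern.

```javascript
// Hypothetical word list, chosen only to show the boundary behavior.
const withBoundary = /\b(cash|win)\b/;
const withoutBoundary = /(cash|win)/;

const samples = ["win cash now", "cashew nuts", "winter", "cash", "cash_only"];

const results = samples.map((text) => ({
  text: text,
  bounded: withBoundary.test(text),
  unbounded: withoutBoundary.test(text),
}));

for (const r of results) {
  console.log(r.text, "| bounded:", r.bounded, "| unbounded:", r.unbounded);
}
// win cash now | bounded: true  | unbounded: true
// cashew nuts  | bounded: false | unbounded: true
// winter       | bounded: false | unbounded: true
// cash         | bounded: true  | unbounded: true
// cash_only    | bounded: false | unbounded: true
```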

Aaron Digulla
Worked for me :) Glad an explanation for '\b' was included.
Immanuel