if(strpos($string, "PENIS") != false){
 echo 'This word is not allowed';
}
if(strpos($string, "VAGINA") != false){
 echo 'This word is not allowed';
}

Okay, so I am trying to check the submitted data to see if it contains inappropriate words. Instead of writing five separate checks like these, is there a more efficient way?

A: 

You could combine them into a single regular expression and then use preg_match() to check whether any of them occurs (preg_grep() does the same job when the input is an array of strings).

Brian Barnes
+6  A: 

I'm sure there's a more clever way to do this in general.

If you just want to be more concise, then it's probably best to loop over some bad words, instead of adding repetitive, almost identical, conditionals (ifs):

<?PHP
$banned = array('bad','words','like','these');

$looksLikeSpam = false;
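foreach($banned as $naughty){
    // !== false, because strpos() returns 0 for a match at the very start
    if (strpos($string,$naughty) !== false){
        $looksLikeSpam=true;
        break; // one match is enough, no need to check the rest
    }
}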

if ($looksLikeSpam){
   echo "You're GROSS!  Just... ew!";
   die();
}

Edit: Also, note that in your question code, you test strpos != false. You really want !==, since strpos() returns 0 when the string starts with, say, PENIS, and 0 == false under loose comparison. See where I'm going here?

Also, you probably want to use stripos(), to be case-insensitive (unless you only care if people SHOUT offensive words) :-)

timdev
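To make the `!=` versus `!==` point concrete, here is a minimal sketch. The input string is a made-up example, and 'bad' is borrowed from the placeholder list above.

```php
<?php
// Hypothetical input: the banned word sits at the very start of the string.
$string = 'bad start to a sentence';
$pos = strpos($string, 'bad');     // int 0 (found at offset 0), not false

$looseCheck  = ($pos != false);    // false: 0 == false under loose comparison
$strictCheck = ($pos !== false);   // true: int 0 is not identical to bool false

// stripos() performs the same search but ignores case.
$caseInsensitive = (stripos('nothing BAD here', 'bad') !== false); // true
```

With the loose check, a banned word at the start of the input slips through unnoticed; the strict check catches it.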
Also, note the difference between `!== false` and `!= false` in the original question.
Matthew Scharley
Yup, thanks matt -- was just adding some clarification about that!
timdev
+1, but should it be mentioned that this is a wildly naive way to go about preventing spam?
Terry Lorber
I've got a question about this, though: what if I wanted to show a different message for each word? A loop wouldn't work, right? E.g., if the word is PENIS, the message would say "What's that?" If the word was VAGINA, the message would say "Who is?"
Doug
@Terry what would be a better way?
Doug
@Doug - then use an associative array, the keys would be the bad words, the values would hold the data, and adjust the code accordingly.
timdev
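The associative-array idea could look roughly like this. The function name `messageFor` and the word/message pairs are hypothetical placeholders, and the sketch assumes the first matching word decides the message.

```php
<?php
// Hypothetical map: banned word => message to show when that word is found.
$messages = array(
    'bad'   => "What's that?",
    'words' => 'Who is?',
);

// Returns the message for the first banned word found in $string,
// or null if none of the words occurs.
function messageFor($string, array $messages)
{
    foreach ($messages as $word => $message) {
        if (stripos($string, $word) !== false) {
            return $message;
        }
    }
    return null;
}

$reply = messageFor('Some BAD input', $messages); // "What's that?"
```

So a loop still works; the array keys drive the search and the values carry the per-word data.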
@Doug - Better ways include using a CAPTCHA (reCAPTCHA is nice), or possibly tying into something like Akismet.
timdev
@Doug: Word based blacklists are fraught with danger. What if I wanted to make a post about Great Tits? http://wikipedia.com/wiki/Great_Tits Ooops. +1 for recaptcha though.
Matthew Scharley
+2  A: 

Yes, you could make an array of badwords and build a regex out of it. This would also make handling case-insensitivity easy.

$badwords = array('staircase', 'tuna', 'pillow');
$badwords_regex = '/' . implode('|', $badwords) . '/i';

$contains_badwords = preg_match($badwords_regex, $text);
yjerem
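One caveat when building a regex from a word list: if a word contains regex metacharacters or the delimiter, it should be escaped first. A minimal sketch, assuming a made-up list in which 'c++' is included purely to show the escaping:

```php
<?php
// Hypothetical list; 'c++' contains regex metacharacters on purpose.
$badwords = array('staircase', 'tuna', 'c++');

// Escape each word (and the '/' delimiter) before joining with '|'.
$escaped = array();
foreach ($badwords as $word) {
    $escaped[] = preg_quote($word, '/');
}
$badwords_regex = '/' . implode('|', $escaped) . '/i';

// preg_match() returns 1 on a match, 0 on no match, and false on error,
// so compare against 1 explicitly.
$contains_badwords = preg_match($badwords_regex, 'I like C++ a lot') === 1;
```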
A: 

Use an array of values and iterate over the array, checking the submitted word each time. If a match is found, break out of the loop and return true.

Chazadanga
A: 

You might use PHP's in_array() function rather than a loop, if you're checking a single word. A regex would be better if you're checking a whole sentence, though.

http://us2.php.net/manual/en/function.in-array.php

$bad_word_array=array('weenis','dolt','wanker');

$passed=in_array($suspected_word,$bad_word_array);

Alex JL
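Two details are worth keeping in mind with in_array(): string comparison is case-sensitive, and the comparison is loose unless the third argument is true. A small sketch, assuming a single hypothetical submitted word and the name `$is_banned` chosen here for clarity:

```php
<?php
$bad_word_array = array('weenis', 'dolt', 'wanker');

// Hypothetical single submitted word, lowercased first because
// in_array() compares strings case-sensitively.
$suspected_word = strtolower('Dolt');

// The third argument switches in_array() to strict (===) comparison.
$is_banned = in_array($suspected_word, $bad_word_array, true); // true
```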
+1  A: 

You need to be careful with word boundaries, or else people will complain about not being able to enter words like "shuttlecock".

I hope you (or your client) realises that automatic "naughty word" filtering does not remove the need for moderating. There are lots of ways to be offensive without using any of the supposedly naughty words. Even deciding what is or is not offensive depends on the cultural context.

Stephen C
Ah, the classic Scunthorpe problem. [http://en.wikipedia.org/wiki/Scunthorpe_problem]
Rob
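One way to respect word boundaries is to wrap the alternation in `\b` anchors. This is only a sketch: the list reuses the harmless placeholder words from an earlier answer, and it relies on 'fortunate' happening to contain 'tuna'.

```php
<?php
$badwords = array('staircase', 'tuna', 'pillow');

// Escape each word so it is treated literally inside the pattern.
$escaped = array_map(function ($w) { return preg_quote($w, '/'); }, $badwords);

// \b anchors the alternation to word boundaries, so 'tuna' inside
// 'fortunate' is ignored while a standalone 'Tuna' is still caught.
$regex = '/\b(?:' . implode('|', $escaped) . ')\b/i';

$hit_plain  = preg_match($regex, 'I had Tuna for lunch') === 1; // true
$hit_inside = preg_match($regex, 'How fortunate') === 1;        // false
```

This avoids the embedded-substring false positives, but it does nothing about deliberate misspellings or offensive text that uses no listed word.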
+2  A: 

No, it's crap. There is a whole branch of computing science concerning string searching algorithms. Heck, Knuth even dedicated half of TAOCP Volume 3 to it.

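Boyer-Moore is a good algorithm for finding a single needle in a haystack and is used in many applications; for searching for multiple needles at once, a multi-pattern algorithm such as Aho-Corasick is the usual choice.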

Ether
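For a feel of how this family of algorithms skips ahead, here is a sketch of Boyer-Moore-Horspool, a simplified variant of Boyer-Moore, for a single needle. The function name `horspoolSearch` is hypothetical, and this is for illustration only: PHP's built-in strpos() is native code and will normally be the faster choice in practice.

```php
<?php
// Boyer-Moore-Horspool sketch: returns the byte offset of the first
// occurrence of $needle in $haystack, or false if it does not occur.
function horspoolSearch($haystack, $needle)
{
    $n = strlen($haystack);
    $m = strlen($needle);
    if ($m === 0) {
        return 0;
    }
    if ($m > $n) {
        return false;
    }

    // Shift table: for every needle character except the last one,
    // the distance from its last occurrence to the end of the needle.
    $shift = array();
    for ($i = 0; $i < $m - 1; $i++) {
        $shift[$needle[$i]] = $m - 1 - $i;
    }

    $pos = 0;
    while ($pos <= $n - $m) {
        // Compare the window right to left.
        $j = $m - 1;
        while ($j >= 0 && $haystack[$pos + $j] === $needle[$j]) {
            $j--;
        }
        if ($j < 0) {
            return $pos;
        }
        // Skip ahead based on the haystack character under the needle's last slot.
        $last = $haystack[$pos + $m - 1];
        $pos += isset($shift[$last]) ? $shift[$last] : $m;
    }
    return false;
}

$offset = horspoolSearch('a tuna sandwich', 'tuna'); // 2
```

The gain comes from the shift table: when the character under the needle's last position does not occur in the needle at all, the search jumps forward by the whole needle length instead of one character.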
+1 for actually addressing the "efficiency" guts of the question
Rob
Thanks, I'm trying to learn beyond my scopes.
Doug