Hi everyone, I've got the function below, which should return true if the input is a bad word:

public bool isAdultKeyword(string input)
{
    if (input == null || input.Length == 0)
    {
        return false;
    }
    else
    {
        Regex regex = new Regex(@"\b(badword1|badword2|anotherbadword)\b");
        return regex.IsMatch(input);
    }
}

The function above only matches whole words: an input of badword1 matches, but a bad word embedded in a longer word does not.

What I'm trying to do is get a match whenever any part of the input contains one of the bad words.

+2  A: 

So under your logic, would you match "as" to "ass"?

Also, remember the classic place Scunthorpe - your adult filter needs to be able to allow this word through.
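
One common way to handle this is to check an allow-list before running the substring test. The following is only a minimal sketch under stated assumptions: the class name SubdomainFilter, the method IsBlocked, and both word lists are hypothetical and chosen purely for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class SubdomainFilter
{
    // Hypothetical lists for illustration only; real lists would be much longer.
    private static readonly string[] BadWords = { "ass" };

    // Names that contain a bad word as a substring but should still be allowed.
    private static readonly HashSet<string> AllowedNames =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "class", "assets", "passport" };

    public static bool IsBlocked(string name)
    {
        // Empty input and explicitly allowed names are never blocked.
        if (string.IsNullOrEmpty(name) || AllowedNames.Contains(name))
        {
            return false;
        }

        // Otherwise block the name if any bad word occurs anywhere inside it.
        foreach (string word in BadWords)
        {
            if (name.IndexOf(word, StringComparison.OrdinalIgnoreCase) >= 0)
            {
                return true;
            }
        }
        return false;
    }
}
```

An allow-list only covers the false positives you already know about, so it reduces the Scunthorpe problem rather than solving it.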

ck
Good one, "Scunthorpe". Do you live there, or how did you know about that town? :P
Petoj
Inputs will be subdomain names (single words). I assumed "as" wouldn't match, because the bad word list won't contain "as", only "ass".
@Petoj - http://en.wikipedia.org/wiki/Scunthorpe_Problem
Alan Moore
+1  A: 

You probably don't have to do it in such a complex way, but you could try implementing Knuth-Morris-Pratt. I tried using it in one of my failed (totally my fault) OCR enhancer modules.
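
For reference, a minimal Knuth-Morris-Pratt substring search might look like the sketch below. The class name Kmp and its method names are hypothetical, and the search is case-sensitive. For a short list of bad words, the built-in string.IndexOf is probably sufficient.

```csharp
using System;

public static class Kmp
{
    // Builds the failure table: fail[i] is the length of the longest proper
    // prefix of pattern[0..i] that is also a suffix of it.
    private static int[] BuildFailureTable(string pattern)
    {
        int[] fail = new int[pattern.Length];
        int k = 0;
        for (int i = 1; i < pattern.Length; i++)
        {
            while (k > 0 && pattern[i] != pattern[k]) k = fail[k - 1];
            if (pattern[i] == pattern[k]) k++;
            fail[i] = k;
        }
        return fail;
    }

    // Returns the index of the first occurrence of pattern in text, or -1.
    public static int IndexOf(string text, string pattern)
    {
        if (text == null || pattern == null) return -1;
        if (pattern.Length == 0) return 0;

        int[] fail = BuildFailureTable(pattern);
        int k = 0; // number of pattern characters matched so far
        for (int i = 0; i < text.Length; i++)
        {
            // On a mismatch, fall back in the pattern without re-reading text.
            while (k > 0 && text[i] != pattern[k]) k = fail[k - 1];
            if (text[i] == pattern[k]) k++;
            if (k == pattern.Length) return i - k + 1;
        }
        return -1;
    }
}
```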

renegadeMind
A: 

Is \b the word boundary in a regular expression?

In that case your regular expression is only looking for entire words. Removing the \b anchors will match any occurrence of the bad words, including where one appears as part of a larger word.

Regex regex = new Regex(@"(bad|awful|worse)", RegexOptions.IgnoreCase);
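
Put together, a version of the question's function without the \b anchors could look like the sketch below. The names BadWordFilter and ContainsBadWord are hypothetical, and the word list is the placeholder list from the question. Regex.Escape is used on the assumption that real list entries might contain regex metacharacters.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class BadWordFilter
{
    // Placeholder list taken from the question; replace with the real terms.
    private static readonly string[] BadWords = { "badword1", "badword2", "anotherbadword" };

    // No \b anchors, so a bad word is found anywhere inside the input.
    // The regex is built once and reused rather than rebuilt on every call.
    private static readonly Regex Pattern = new Regex(
        string.Join("|", BadWords.Select(w => Regex.Escape(w))),
        RegexOptions.IgnoreCase | RegexOptions.Compiled);

    public static bool ContainsBadWord(string input)
    {
        return !string.IsNullOrEmpty(input) && Pattern.IsMatch(input);
    }
}
```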
Richard Lennox
A: 

Try:

Regex regex = new Regex(@"(\bbadword1\b|\bbadword2\b|\banotherbadword\b)"); 
return regex.IsMatch(input);
Mags
A: 

SilentGhost, your method seems to be working fine. Can you clarify what is wrong with it? My tester program below shows it passing a number of tests with no failures.

using System;
using System.Text.RegularExpressions;

namespace CSharpConsoleSandbox {
  class Program {
    public static bool isAdultKeyword(string input) {
      if (input == null || input.Length == 0) {
        return false;
      } else {
        Regex regex = new Regex(@"\b(badword1|badword2|anotherbadword)\b");
        return regex.IsMatch(input);
      }
    }

    private static void test(string input) {
      string matchMsg = "NO : ";
      if (isAdultKeyword(input)) {
        matchMsg = "YES: ";
      }
      Console.WriteLine(matchMsg + input);
    }

    static void Main(string[] args) {
      // These cases should match
      test("YES badword1");
      test("YES this input should match badword2 ok");
      test("YES this input should match anotherbadword. ok");

      // These cases should not match
      test("NO badword5");
      test("NO this input will not matchbadword1 ok");
    }
  }
}

Output:

YES: YES badword1
YES: YES this input should match badword2 ok
YES: YES this input should match anotherbadword. ok
NO : NO badword5
NO : NO this input will not matchbadword1 ok
Mike Clark