I'm trying to create a Regex usable in C# that will allow me to take a list of single letters and/or letter groups and ensure that a word is made up only of items from that list. For instance:
- 'a' would match 'a', 'aa', 'aaa', but not 'ab'
- 'a b' would match 'a', 'ab', 'abba', 'b', but not 'abc'
- 'a b abc' would match 'a', 'ab', 'abc', 'aabc', 'baabc', but not 'ababac'
I thought something of the form
(a|b|abc)*
would work, but it also matches the last example, 'ababac', which I expect to be rejected. Here's the code I'm testing with:
[Fact]
public void TestRegex()
{
    Regex regex = new Regex("(a|b|abc)*");

    regex.IsMatch("a").ShouldBeTrue();
    regex.IsMatch("b").ShouldBeTrue();
    regex.IsMatch("abc").ShouldBeTrue();
    regex.IsMatch("aabc").ShouldBeTrue();
    regex.IsMatch("baabc").ShouldBeTrue();

    // This should not match ... I don't think anyway
    regex.IsMatch("ababac").ShouldBeFalse();
}
I have a pretty basic understanding of regex, so apologies if I'm missing something obvious here :)
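One way to see what is happening with the last assertion is to print what the pattern actually matched rather than only calling IsMatch. The following is a diagnostic sketch, not part of the test above: the class name UnanchoredDemo and the helper Describe are made up for illustration, and the anchored variant uses \A and \z on the assumption that the whole word has to be consumed.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UnanchoredDemo
{
    // Hypothetical helper: reports the first match the pattern finds in the input.
    public static string Describe(string pattern, string input)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success
            ? $"matched '{m.Value}' at index {m.Index}"
            : "no match";
    }

    public static void Main()
    {
        // Unanchored: the match stops before the trailing 'c' but still counts as a match.
        Console.WriteLine(Describe("(a|b|abc)*", "ababac"));

        // Anchored at both ends: the whole input has to be consumed.
        Console.WriteLine(Describe(@"\A(?:a|b|abc)*\z", "ababac"));
    }
}
```

IsMatch only asks whether the pattern matches somewhere in the input, and because of the * the pattern can also match an empty string, so the unanchored version succeeds on any input.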
Update: "I don't understand why your counter-example is a counter-example: ababac = a b a bac. Could you clarify?"
I only want to use 'a', 'b', and 'abc' - 'bac' would be a completely different term.
Let me give another example: Using 'ba' and 't', I could match the word 'bat', but not 'tab'. The order of the letters inside the letter groups is important.
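To make the requirement concrete, here is a minimal sketch of building such a pattern from a space-separated list. The class name TokenRegex and the method Build are hypothetical, and the sketch assumes the list is separated by single spaces and that the entire word must be covered by the listed items, each used as an ordered unit.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class TokenRegex
{
    // Hypothetical helper: turns "a b abc" into an anchored pattern like \A(?:a|b|abc)*\z.
    public static Regex Build(string tokenList)
    {
        string[] tokens = tokenList.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        // Regex.Escape keeps characters such as '.' or '+' literal if they ever appear in a token.
        string alternation = string.Join("|", tokens.Select(Regex.Escape));

        // \A and \z force the match to span the whole input; (?: ) groups without capturing.
        return new Regex(@"\A(?:" + alternation + @")*\z");
    }

    public static void Main()
    {
        Regex regex = Build("ba t");
        Console.WriteLine(regex.IsMatch("bat")); // True: "ba" followed by "t"
        Console.WriteLine(regex.IsMatch("tab")); // False: "ab" is not in the list
    }
}
```

Because of the *, this sketch also accepts the empty string; replacing * with + would require at least one item.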
(Tests with Diadistis' solution)
[Fact]
public void TestRegex()
{
    Regex regex = new Regex(@"\A(?:(e|l|ho)*)\Z");

    regex.IsMatch("e").ShouldBeTrue();
    regex.IsMatch("l").ShouldBeTrue();
    regex.IsMatch("ho").ShouldBeTrue();
    regex.IsMatch("elho").ShouldBeTrue();
    regex.IsMatch("hole").ShouldBeTrue();
    regex.IsMatch("holle").ShouldBeTrue();

    regex.IsMatch("hello").ShouldBeFalse();
    regex.IsMatch("hotel").ShouldBeFalse();
}