ansaurus

Question

Regex which ensures no character is repeated

Answer 1

+2 A:

This cannot be done via regular expressions, because they are context-free. You need at least context-sensitive grammar language, so only way how to achieve this is by writing the function by hand.

See formal grammar for background theory.

Yossarian 2010-08-25 11:59:21

At least not with a single pattern.

ApoY2k 2010-08-25 12:00:40

It can, with an infinitely long regular expression :)

BoltClock 2010-08-25 12:01:03

yes since the length cannot be >26, the resulting regex that way would be awfully long i guess assuming I will be backreferencing all previous groups :P

mishal153 2010-08-25 12:04:16

@Mark, then they are not regular expressions.

Yossarian 2010-08-25 12:06:06

@BoltClock with finite alphabet (as upper case characters are)it's finite. For example: we have 26 uppercase characters, and each one can appear only once. Then we have 26! possible 26-character-long words, a bit more 25-char-long words etc. At the end we can just construct long (but finite) regexp by union of all words.So it can be done with regexp, but not practical in this case.

phadej 2010-08-25 12:08:09

@phadej: yeah, I wrote that in an exaggerative way ;)

BoltClock 2010-08-25 12:10:30

Answer 2

+2 A:

Why not check for a character which is repeated or not in uppercase instead ? With something like ([A-Z])?.*?([^A-Z]|\1)

Colin Hebert 2010-08-25 12:00:32

I guess that would also serve the purpose. How do i do that using regex ?

mishal153 2010-08-25 12:02:52

Check with this regex. If it doesn't match, then your condition is true.

Colin Hebert 2010-08-25 12:07:34

ok. I tried it and its working fine. But i have used a non-regex approach eventually.

mishal153 2010-08-25 14:11:49

You're right, it will be faster.

Colin Hebert 2010-08-25 14:13:37

Answer 3

+4 A:

You can do this with .NET regular expressions although I would advise against it:

string s = "ABCD";
bool result = Regex.IsMatch(s, @"^(?:([A-Z])(?!.*\1))*$");

Instead I'd advise checking that the length of the string is the same as the number of distinct characters, and checking the A-Z requirement separately:

bool result = s.Cast<char>().Distinct().Count() == s.Length;

Alteranatively, if performance is a critical issue, iterate over the characters one by one and keep a record of which you have seen.

Mark Byers 2010-08-25 12:03:04

Marking this a right answer since this is what i have used eventually :)

mishal153 2010-08-25 14:10:54

Answer 4

+1 A:

Use negative lookahead and backreference.

  string pattern = @"^(?!.*(.).*\1)[A-Z]+$";
  string s1 = "ABCDEF";
  string s2 = "ABCDAEF";
  string s3 = "ABCDEBF";
  Console.WriteLine(Regex.IsMatch(s1, pattern));//True
  Console.WriteLine(Regex.IsMatch(s2, pattern));//False
  Console.WriteLine(Regex.IsMatch(s3, pattern));//False

\1 matches the first captured group. Thus the negative lookahead fails if any character is repeated.

Amarghosh 2010-08-25 12:05:06

Answer 5

A:

This isn't regex, and would be slow, but You could create an array of the contents of the string, and then iterate through the array comparing n to n++

=Waldo

gWaldo 2010-08-25 12:10:17

Answer 6

A:

It can be done using what is call backreference.

I am a Java program so I will show you how it is done in Java (for C#, see here).

final Pattern aPattern = Pattern.compile("([A-Z]).*\\1");
final Matcher aMatcher1 = aPattern.matcher("ABCDA");
System.out.println(aMatcher1.find());
final Matcher aMatcher2 = aPattern.matcher("ABCDA");
System.out.println(aMatcher2.find());

The regular express is ([A-Z]).*\\1 which translate to anything between 'A' to 'Z' as group 1 ('([A-Z])') anything else (.*) and group 1.

Use $1 for C#.

Hope this helps.

NawaMan 2010-08-25 12:11:20

ansaurus

tags:

views:

answers:

Regex which ensures no character is repeated

related questions