views:

113

answers:

3

I found all sorts of really close answers already, but not quite.

I need to look at a string and find any character that is used more than 3 times. Basically, I want to limit passwords by disallowing something like "mississippi", since it has more than 3 s's in it. I think it only needs to cover alphabetic characters, but it should be Unicode-aware. So I guess [[:alpha:]] is the character class to match on.

I found (\w)\1+{4,} which finds consecutive characters, like ssss or missssippi but not if they are not consecutive.

Working my way through the other regex questions to see if someone has answered it but there are lots, and no joy yet.

+1  A: 
(\w)(.*\1){2,}

Match a "word character", then 2 or more copies of "anything, then the first thing again". Thus at least 3 copies of the first thing, with anything in between. To reject only characters used more than 3 times, change {2,} to {3,}.
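For anyone who wants to try this, here is a minimal sketch in Python (my choice of language, the question doesn't name one). The names THREE_OR_MORE, MORE_THAN_THREE and repeats are made up for illustration. In Python 3, \w on a str is Unicode-aware, but it also matches digits and underscore.

```python
import re

# Hypothetical names for illustration; only the patterns relate to the answer.
THREE_OR_MORE = re.compile(r"(\w)(.*\1){2,}")    # a character plus 2 or more repeats
MORE_THAN_THREE = re.compile(r"(\w)(.*\1){3,}")  # a character plus 3 or more repeats

def repeats(pattern, text):
    """Return the first repeated character the pattern finds, or None."""
    m = pattern.search(text)
    return m.group(1) if m else None

print(repeats(THREE_OR_MORE, "banana"))         # 'a' occurs 3 times
print(repeats(MORE_THAN_THREE, "banana"))       # no character occurs 4 times
print(repeats(MORE_THAN_THREE, "mississippi"))  # 'i' and 's' both occur 4 times
```

search() scans from left to right, so when several characters qualify it reports the one whose first occurrence comes earliest.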

ephemient
Use `[^\W\d_]` to get letters only.
Greg Bacon
+2  A: 

This should do it:

/(.)(.*\1){3}/

It doesn't make any sense to try to combine this with checking for allowable characters. You should first test that all characters are allowable characters and then run this test afterwards. This is why it's OK to use '.' here.

It will be slow though. It would be faster to iterate once over the string and count the characters. Although for your purpose I doubt it makes much difference since the strings are so short.
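A rough sketch of the counting approach, in Python since the question doesn't name a language. The function name overused_chars and the limit parameter are made up for illustration, and restricting the count to letters via str.isalpha() is an assumption based on the question.

```python
from collections import Counter

def overused_chars(password, limit=3):
    """Return the letters that occur more than `limit` times, sorted."""
    # Assumption: only letters count; str.isalpha() is Unicode-aware.
    counts = Counter(ch for ch in password if ch.isalpha())
    return sorted(ch for ch, n in counts.items() if n > limit)

print(overused_chars("mississippi"))  # letters used more than 3 times
print(overused_chars("banana"))       # nothing exceeds the limit
```

This makes a single pass over the string, and an empty result means the password passes the check.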

Mark Byers
Neither of these makes sure that the character being searched for is not repeated consecutively...
Franz
@Franz: Unless I'm reading the question wrong, he doesn't care about them being consecutive or not.
Mark Byers
Hmmmm... re-reading it again I thought the same, but the title keeps me thinking...
Franz
I think he means 'not necessarily consecutively'. But I could be wrong.
Mark Byers
Yes, they do. But the one in this answer does not work if there is any text after the last occurrence of the character in question. See my solution in the other post. Remember, .* will match any number (0..n) of characters, including 0. So the pattern above will match both "aaaa" and "aasafjafa".
Erik A. Brandstadmoen
Erik: Did you test it? It will work. This is a *searching* regex. Notice I don't have ^ or $, so it can match anywhere in the string.
Mark Byers
+1  A: 
.*(\w).*\1.*\1.*\1.*

This will match on a string which has any number of characters, then a certain character, and the same character repeated three times after that (total of four), with any number of characters (0..n) in between. That's what you want, right?

Test it on e.g. http://www.regexplanet.com/simple/index.html

This regex matches e.g. "mississippi" (> 3 s's) and "twinkle twinkle little star" (> 3 t's).
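Whether the leading and trailing .* are needed probably depends on how the regex is applied: as a search anywhere in the string, or as a match against the whole string. Here is a small sketch in Python (my choice of language) comparing the two. SEARCHING and PADDED are made-up names, and the unpadded pattern is the one from the earlier answer.

```python
import re

# Hypothetical names for illustration.
SEARCHING = re.compile(r"(.)(.*\1){3}")       # no padding, meant for searching
PADDED = re.compile(r".*(\w).*\1.*\1.*\1.*")  # padded, can match the whole string

for text in ["aaaa", "aasafjafa", "aaaaxyz", "abcabc"]:
    print(
        text,
        bool(SEARCHING.search(text)),     # search anywhere in the string
        bool(SEARCHING.fullmatch(text)),  # must cover the whole string
        bool(PADDED.fullmatch(text)),     # padding absorbs the rest
    )
```

With search(), the unpadded pattern handles trailing text like "aaaaxyz" fine. It only fails there when the whole string has to match, and that is the case the padded version covers.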

Erik A. Brandstadmoen