I was thinking of providing the following regex as an answer to this question, but I can't seem to write the regular expression I was looking for:

w?o?r?d?p?r?e?s?s?

This should match an ordered abbreviation of the word wordpress, but it can also match nothing at all.

How can I modify the above regex so that it matches at least 4 characters, in order? For example:

  • word
  • wrdp
  • press
  • wordp
  • wpress
  • wordpress

I'd like to know the best way to do this... =)

+2  A: 

What about PHP's string similarity functions, such as similar_text() or levenshtein()?

erenon
Nice one, thanks!
Alix Axel
+5  A: 

You could use a lookahead assertion:

^(?=.{4})w?o?r?d?p?r?e?s?s?$
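
To see how the lookahead behaves, here is a minimal sketch that runs the same pattern through Python's `re` module instead of PHP. The helper name `matches_abbreviation` is made up for illustration. The `(?=.{4})` part only asserts that at least 4 characters follow the start of the string, and the optional letters then have to consume the whole input in order.

```python
import re

# Same pattern as above: a lookahead requiring at least 4 characters,
# followed by the optional letters of "wordpress" in order.
PATTERN = re.compile(r"^(?=.{4})w?o?r?d?p?r?e?s?s?$")

def matches_abbreviation(candidate):
    """Hypothetical helper: True if candidate is an ordered abbreviation
    of 'wordpress' that is at least 4 characters long."""
    return PATTERN.match(candidate) is not None

for candidate in ["word", "wrdp", "press", "wordp", "wpress", "wordpress",
                  "wodr", "wodw", "wrd", ""]:
    print(candidate or "(empty)", matches_abbreviation(candidate))
```

Note that `wodr` matches because it can be read as w, o, d plus the second `r` of wordpress, while `wodw` fails because there is no second `w`.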
Gumbo
It seems to find abbreviations out of order like `wodr`.
Alix Axel
Alix, it's not exactly clear which abbreviations are OK and which are not: why `wrdp` yes and `wodr` no?
kemp
@kemp: I'm sorry, they should both be okay because the `r` appears twice in wordpress. `wodw`, on the other hand, should not be matched. Sorry for the confusion.
Alix Axel
@Alix Axel: Add anchors for the start and the end of the string (`^` and `$`) and it works.
Gumbo
@Gumbo: Thanks, you're a genius! =P
Alix Axel
This is cool...
Dyno Fu
+2  A: 
if ( strlen($string) >= 4 && preg_match('#^w?o?r?d?p?r?e?s?s?$#', $string) ) {
    // abbreviation ok
}

This won't even run the regexp unless the string is at least 4 chars long.
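
For comparison, the same idea (cheap length check first, regex second) could look roughly like this in Python. This is only a sketch: the name `is_abbreviation` and the `min_len` parameter are assumptions for illustration, and `re.fullmatch` stands in for the `^...$` anchors used in the PHP version.

```python
import re

# Optional letters of "wordpress" in order; no anchors needed with fullmatch.
ABBREV = re.compile(r"w?o?r?d?p?r?e?s?s?")

def is_abbreviation(candidate, min_len=4):
    """Hypothetical helper: check the length first, so the regex only
    runs on strings that are long enough."""
    return len(candidate) >= min_len and ABBREV.fullmatch(candidate) is not None

for candidate in ["wd", "wdps", "wsdp", "wordpressr"]:
    print(candidate, is_abbreviation(candidate))
```

Because `and` short-circuits, the pattern is never evaluated for strings shorter than `min_len`.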

kemp
A: 

I know this is not a regex; it's just for fun...

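#!/usr/bin/env python3

# The word from the question (the original snippet used "wordprocess").
FULLWORD = "wordpress"

def check_word(word):
    # i walks through the candidate, j walks through FULLWORD.
    i, j = 0, 0
    while i < len(word) and j < len(FULLWORD):
        if word[i] == FULLWORD[j]:
            i += 1
        j += 1

    # Fail if some character of the candidate was never matched in order,
    # or if the candidate is shorter than 4 characters.
    if i < len(word) or i < 4:
        return "%s: FAIL" % word
    return "%s: SUCC" % word

print(check_word("wd"))
print(check_word("wdps"))
print(check_word("wsdp"))
print(check_word("wordpressr"))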
Dyno Fu