Let's say using Javascript, I want to match a string that ends with [abcde]* but not with abc.

So the regex should match xxxa, xxxbc, xxxabd but not xxxabc.

I am utterly confused.

Edit: I have to use a regex for some reason; I cannot do something like if (str.endsWith("abc")).

A: 

Hmm ...

var regex = /(ab[abde]|[abcde])$/; // wrong

maybe? wait no; hold on:

var regex = /(ab[abde]|([^a].|.[^b])[abcde]|\b.?[abcde])$/;

So that's "ab" followed by "a", "b", "d", or "e"; or any two-character sequence where the first character isn't "a" or the second character isn't "b", followed by "a" through "e"; or a word boundary followed by an optional single character and then "a" through "e". The last clause is there to deal with strings of only one or two characters; it's sort-of cheating but in this case it works.
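A quick way to sanity-check the second pattern is to run it over the examples from the question. This is only a sketch: the `cases` table and the short inputs ("a", "bc", "abc") are assumed test data, not part of the original question.

```javascript
// The second pattern from above, unchanged.
var regex = /(ab[abde]|([^a].|.[^b])[abcde]|\b.?[abcde])$/;

// [input, expected result]; the short strings are assumed extra cases.
var cases = [
  ["xxxa", true],
  ["xxxbc", true],
  ["xxxabd", true],
  ["xxxabc", false],
  ["a", true],
  ["bc", true],
  ["abc", false]
];

// Run every case and record what the regex actually returns.
var results = cases.map(function (c) {
  return { input: c[0], expected: c[1], actual: regex.test(c[0]) };
});

results.forEach(function (r) {
  console.log(r.input, r.actual === r.expected ? "ok" : "MISMATCH");
});
```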

Pointy
A: 

First, note that every string ends with [abcde]*, since that pattern can match zero characters. Thus we're really just looking for a regex that matches strings that don't end in abc. Easy.

([^c]|[^b]c|[^a]bc)$

That's a character that's not c, or a character that's not b followed by c, or a character that's not a followed by bc; whichever alternative matches must then be followed by the end of the string.
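As a rough check, the snippet below runs this pattern over the question's examples. One caveat to verify for yourself: every alternative needs at least one character before the end of the string that is not part of "abc", so very short strings such as "c" and "bc" are not matched even though they don't end in abc. The variable name `notAbc` is just for illustration.

```javascript
var notAbc = /([^c]|[^b]c|[^a]bc)$/;

console.log(notAbc.test("xxxa"));   // true
console.log(notAbc.test("xxxbc"));  // true
console.log(notAbc.test("xxxabd")); // true
console.log(notAbc.test("xxxabc")); // false

// Short strings: neither ends in abc, but no alternative fits.
console.log(notAbc.test("bc"));     // false
console.log(notAbc.test("c"));      // false
```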

me_and
+2  A: 

The solution is simple: use a negative lookahead:

(?!.*abc$)

Applied at the start of the string, this asserts that the string doesn't end with abc.

You mentioned that you also need the string to end with [abcde]*, but the * means that it's optional, so xxx matches. I assume you really want [abcde]+, which also simply means that it needs to end with [abcde]. In that case, the assertions are:

(?=.*[abcde]$)(?!.*abc$)
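One thing to watch when trying these assertions in Javascript: `test` looks for a match starting at any position, so lookaheads without a leading ^ can succeed partway through the string. A minimal sketch of the difference (the variable names are only for illustration):

```javascript
var unanchored = /(?=.*[abcde]$)(?!.*abc$)/;
var anchored = /^(?=.*[abcde]$)(?!.*abc$)/;

// At index 3 only "bc" remains, so the negative lookahead succeeds there.
console.log(unanchored.test("xxabc")); // true
// With ^ the assertions are only tried at the start of the string.
console.log(anchored.test("xxabc"));   // false
console.log(anchored.test("xxabd"));   // true
```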

See regular-expressions.info for tutorials on positive and negative lookarounds.


I was reluctant to give the actual Javascript regex since I'm not familiar with the language (though I was confident that the assertions, if supported, would work -- according to regular-expressions.info, Javascript supports positive and negative lookahead). Thanks to Pointy and Alan Moore's comments, I think the proper Javascript regex is this:

var regex = /^(?!.*abc$).*[abcde]$/;

Note that this version (with credit to Alan Moore) no longer needs the positive lookahead. It simply matches .*[abcde]$ after first asserting ^(?!.*abc$).
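A short sketch of how this could be tried out, using the strings from the question plus "xxx" (which should be rejected because it doesn't end in one of a through e). The `samples` and `matched` names are my own:

```javascript
var regex = /^(?!.*abc$).*[abcde]$/;

var samples = ["xxxa", "xxxbc", "xxxabd", "xxxabc", "xxx"];

// Keep only the strings the regex accepts.
var matched = samples.filter(function (s) {
  return regex.test(s);
});

console.log(matched.join(", ")); // xxxa, xxxbc, xxxabd
```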

polygenelubricants
That doesn't work. Try it in Javascript - the second regex matches "xxabc".
Pointy
@Pointy: `"xxabc".matches("(?=.*[abcde]$)(?!.*abc$).*")` is `false` in Java; I admit that I'm not familiar with any particular Javascript flavor, but if it supports positive and negative lookahead, then the regex should work.
polygenelubricants
@Pointy: `"xxabc".match(/^(?=.*[abcde]$)(?!.*abc$).*$/)` is `null` in Javascript (though I'm not sure if this is the proper way to regex in this language).
polygenelubricants
Try regex.test("xxabc") - returns boolean
Pointy
I think it works in the form you give in that comment, with the "$" at the end of the pattern outside the groups.
Pointy
I don't think the *positive* lookahead is necessary: `^(?!.*abc$).*[abcde]$`
Alan Moore
@Alan: yep, you're right =) Suggestion incorporated with full credit.
polygenelubricants
+1  A: 

Either the question is not properly defined, or everyone is overlooking a simple answer.

var re = /abc$/;

!re.test("xxxa");    // pass
!re.test("xxxbc");   // pass
!re.test("xxxabd");  // pass
!re.test("xxxabc");  // fail

All of these end in /[abcde]*/
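If negating the result in code is acceptable, the check could be wrapped in a small helper. This is only a sketch: `doesNotEndWithAbc` is a made-up name, and it does not require the string to end in one of a through e.

```javascript
var re = /abc$/;

// Hypothetical helper: true when the string does not end with "abc".
function doesNotEndWithAbc(str) {
  return !re.test(str);
}

console.log(doesNotEndWithAbc("xxxa"));   // true
console.log(doesNotEndWithAbc("xxxbc"));  // true
console.log(doesNotEndWithAbc("xxxabd")); // true
console.log(doesNotEndWithAbc("xxxabc")); // false
```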

macek