Hi,

I have this SQL condition that is supposed to retrieve all rows that satisfy the given regexp condition:

country REGEXP ('^(USA|Italy|France)$')

However, I need to add a pattern for retrieving all blank country values. Currently I am using this condition:

country REGEXP ('^(USA|Italy|France)$') OR country = ""

How can I achieve the same effect without having to include the OR clause?

Thanks, Erwin

+1  A: 

You could try:

country REGEXP ('^(USA|Italy|France|)$')

I just added another | after France, which should tell it to also match ^$, which is the same as country = ''.

Update: since this method doesn't work, I would recommend you use this regex:

country REGEXP ('^(USA|Italy|France)$|^$')

Note that you can't use the regex ^(USA|Italy|France|.{0})$ because it will complain that there is an empty subexpression, although ^(USA|Italy|France)$|^.{0}$ would work.

Here are some examples of the return value of this regex:

select '' regexp '^(USA|Italy|France)$|^$'
> 1
select 'abc' regexp '^(USA|Italy|France)$|^$'
> 0
select 'France' regexp '^(USA|Italy|France)$|^$'
> 1
select ' ' regexp '^(USA|Italy|France)$|^$'
> 0

As you can see, it returns exactly what you want.

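If you want to treat all blank values the same (e.g. 0 spaces and 5 spaces both count as blank), you should use the regex:

country REGEXP ('^(USA|Italy|France|[[:space:]]*)$')

The POSIX class [[:space:]] is used here rather than \s because MySQL treats a backslash inside a string literal as an escape character, so an unescaped \s reaches the regex engine as a plain s.

This will cause the last row in the previous example to behave differently, i.e.:

select ' ' regexp '^(USA|Italy|France|[[:space:]]*)$'
> 1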
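
For anyone who wants to sanity-check the two variants outside the database, here is a small Python sketch. It is only an approximation: Python's `re` module is not MySQL's regex engine, and the names `STRICT`, `LOOSE` and `matches` are made up for this example. In MySQL itself a backslash inside a string literal is an escape character, so a whitespace class is more safely written as `[[:space:]]` there. Python does not understand that POSIX class, which is why the sketch uses `\s` in a raw string.

```python
import re

# Hypothetical names for illustration; the patterns mirror the two variants
# discussed in this answer, written in Python's regex dialect.
STRICT = re.compile(r"^(USA|Italy|France)$|^$")   # only the empty string counts as blank
LOOSE = re.compile(r"^(USA|Italy|France|\s*)$")   # any run of whitespace counts as blank

def matches(pattern, value):
    """Return 1 or 0, mimicking the result of a REGEXP comparison."""
    return 1 if pattern.search(value) else 0

# Print each sample with the verdict of the strict and the loose pattern.
for value in ["", "abc", "France", " ", "     "]:
    print(repr(value), matches(STRICT, value), matches(LOOSE, value))
```

The two patterns should only disagree on values that consist purely of whitespace, which is exactly the decision you have to make for your own data.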
Senseful
You know, you could have just used `^$` instead of `^.{0}$` on the Right Hand Side of the outer `|`
gnarf
@gnarf: woops, I was trying so hard to avoid using look-arounds that I overlooked these simple solutions you provided. Thanks, I updated the answer.
Senseful
+4  A: 

This should work:

country REGEXP ('^(USA|Italy|France|)$')

However, from a performance point of view, you may want to use the IN syntax:

country IN ('USA','Italy','France', '')

The latter should be faster, as REGEXP can be quite slow.
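
To illustrate why a list lookup can stand in for the regex when every alternative is a literal value, here is a rough Python sketch. It is not MySQL and says nothing about actual database performance; `ALLOWED`, `PATTERN`, `in_list` and `by_regex` are hypothetical names, and the regex is written in a form that does not rely on an empty alternative.

```python
import re

# Hypothetical stand-ins: a set plays the role of IN (...),
# a compiled pattern plays the role of the anchored REGEXP.
ALLOWED = {"USA", "Italy", "France", ""}
PATTERN = re.compile(r"^(USA|Italy|France)$|^$")

def in_list(value):
    """Exact membership test, analogous to country IN (...)."""
    return value in ALLOWED

def by_regex(value):
    """Regex test, analogous to country REGEXP '...'."""
    return PATTERN.search(value) is not None

samples = ["USA", "Italy", "France", "", " ", "Germany"]
for value in samples:
    print(repr(value), in_list(value), by_regex(value))
```

The equivalence only holds while the alternatives are plain literals; once one of them is a real pattern, the list lookup no longer applies.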

Ben Rowe
I have already tried adding the extra |, but it gave me this error: "Got error 'empty (sub)expression' from regexp"
Erwin Paglinawan
When I tried changing the expression to "country REGEXP ('^(USA|Italy|France|[\s]*)$')", it worked. Any issue with this approach? @Ben btw I need to use REGEXP for this, because there are times when I need to use a pattern instead of the actual country name.
Erwin Paglinawan
@Erwin: The only issue with `\s*` is that it will match 'blank' entries such as 3 spaces. If you don't have entries like this in your database, then it should be fine.
Senseful
@eagle I'm not sure what the data in the database will look like. If it isn't cleaned up properly, there will probably be blank data with 2 or more spaces. So if that's the case, maybe it would be safer to use the \s, right?
Erwin Paglinawan
@Erwin: it all depends on you and your specific case if you want to consider an empty string and a string with spaces as the same or not. See my updated answer below for a solution for both of these situations, and choose the one that's right for you.
Senseful
+2  A: 

There's no reason you can't use the $ (match end of string) to fill in your "empty subexpression" issue...

It looks a little weird, but country REGEXP ('^(USA|Italy|France|$)$') will actually work.
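
As a rough check that putting `$` inside the group behaves like a separate `^$` branch, here is a Python sketch comparing the two forms on a few sample values. Python's `re` is a different engine from MySQL's, so treat this as an illustration of the idea (the `$` anchor is zero-width, so matching it twice is harmless) rather than proof for your server; `INSIDE`, `OUTSIDE` and `agree` are made-up names.

```python
import re

# Hypothetical names; both patterns come from answers in this thread.
INSIDE = re.compile(r"^(USA|Italy|France|$)$")    # "$" as the last alternative
OUTSIDE = re.compile(r"^(USA|Italy|France)$|^$")  # "^$" as a separate branch

def agree(value):
    """True when both patterns give the same verdict for value."""
    return bool(INSIDE.search(value)) == bool(OUTSIDE.search(value))

samples = ["", " ", "USA", "Italy", "France", "Francee", "abc"]
for value in samples:
    print(repr(value), bool(INSIDE.search(value)), agree(value))
```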

gnarf
+1: This is a good solution.
Senseful