I'm trying to write an SQL query that would search within a CSV (or similar) array in a column. Here's an example:

INSERT INTO properties SET bedrooms = '1,2,3', title = 'nice property', price = 500; -- bedrooms could also be stored as '1-3'

I'd then like to search for properties with 2 or more bedrooms. Is this even possible?

+3  A: 

The correct way to handle this in SQL is to add another table for a multi-valued property. It's against the relational model to store multiple discrete values in a single column. Since it's intended to be a no-no, there's little support for it in the SQL language.
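As a rough illustration of the second-table approach, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The table name property_bedrooms, its columns, and the sample rows are assumptions made up for this example; they do not come from the question.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE properties (
        id    INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        price INTEGER NOT NULL
    );
    -- Hypothetical child table: one row per (property, bedroom count)
    -- instead of a single '1,2,3' string.
    CREATE TABLE property_bedrooms (
        property_id INTEGER NOT NULL REFERENCES properties(id),
        bedrooms    INTEGER NOT NULL,
        PRIMARY KEY (property_id, bedrooms)
    );
    INSERT INTO properties VALUES (1, 'nice property', 500), (2, 'studio', 300);
    INSERT INTO property_bedrooms VALUES (1, 1), (1, 2), (1, 3), (2, 1);
""")


def titles_with_at_least(min_bedrooms):
    """Return titles of properties that have an option with >= min_bedrooms."""
    rows = conn.execute(
        """
        SELECT DISTINCT p.title
        FROM properties AS p
        JOIN property_bedrooms AS b ON b.property_id = p.id
        WHERE b.bedrooms >= ?
        ORDER BY p.title
        """,
        (min_bedrooms,),
    ).fetchall()
    return [title for (title,) in rows]
```

With one value per row, the "2 or more" search becomes an ordinary numeric comparison, and the database can use an index on the bedrooms column.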

The usual workaround for finding a given value in a comma-separated list is to use regular expressions, which are in general ugly and slow (MySQL also has FIND_IN_SET() for exact matches against such a list, but it cannot use an index either). You have to deal with edge cases like when a value may or may not be at the start or end of the string, as well as next to a comma.

SELECT * FROM properties WHERE bedrooms RLIKE '[[:<:]]2[[:>:]]';
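To make the edge cases concrete, here is a small sketch of the same word-boundary idea in Python's re module. The markers [[:<:]] and [[:>:]] are MySQL's start-of-word and end-of-word boundaries; as far as I know, MySQL 8.0.4 and later use a different regex library in which \b plays that role instead. The helper name list_contains is hypothetical.

```python
import re


def list_contains(csv_list, value):
    """Return True if value appears as a whole item in a comma-separated string."""
    # \b matches between a digit and a comma, and at either end of the string,
    # so the value is found at the start, middle, or end of the list.
    pattern = r"\b" + re.escape(str(value)) + r"\b"
    return re.search(pattern, csv_list) is not None
```

The boundaries are what stop 2 from matching inside 12 or 20. Note that this only tests membership: it does nothing for a range stored as '1-3', and it cannot answer "2 or more".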

There are other types of queries that are easy when you have a normalized table, but hard with the comma-separated list. The example you give, of searching for a value that is equal to or greater than the search criterion, is one such case. Also consider:

  • How do I delete one element from a comma-separated list?
  • How do I ensure the list is in sorted order?
  • What is the average number of rooms?
  • How do I ensure the values in the list are even valid entries? E.g. what's to prevent me from entering "1,2,banana"?
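For comparison, here is a sketch of how those four questions might look against a normalized table, again using Python's sqlite3 as a stand-in for MySQL. The table property_bedrooms, the helper names, and the CHECK rule are assumptions for illustration only; the typeof() check is SQLite-specific, since MySQL's stricter column typing would reject a non-numeric value on its own terms.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table: one row per bedroom count offered by a property.
    CREATE TABLE property_bedrooms (
        property_id INTEGER NOT NULL,
        bedrooms    INTEGER NOT NULL
            CHECK (typeof(bedrooms) = 'integer' AND bedrooms > 0),
        PRIMARY KEY (property_id, bedrooms)
    );
    INSERT INTO property_bedrooms VALUES (1, 3), (1, 1), (1, 2);
""")


def bedrooms_for(property_id):
    """Sorted order is just an ORDER BY, whatever the insertion order was."""
    rows = conn.execute(
        "SELECT bedrooms FROM property_bedrooms "
        "WHERE property_id = ? ORDER BY bedrooms",
        (property_id,),
    ).fetchall()
    return [n for (n,) in rows]


def average_bedrooms(property_id):
    """The average is a plain aggregate over the rows."""
    (avg,) = conn.execute(
        "SELECT AVG(bedrooms) FROM property_bedrooms WHERE property_id = ?",
        (property_id,),
    ).fetchone()
    return avg


def remove_bedroom_option(property_id, bedrooms):
    """Deleting one element is a single-row DELETE, with no string surgery."""
    conn.execute(
        "DELETE FROM property_bedrooms WHERE property_id = ? AND bedrooms = ?",
        (property_id, bedrooms),
    )


def add_bedroom_option(property_id, bedrooms):
    """Return False if the CHECK or PRIMARY KEY constraint rejects the value."""
    try:
        conn.execute(
            "INSERT INTO property_bedrooms VALUES (?, ?)",
            (property_id, bedrooms),
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

In this sketch the database itself rejects "banana", so validity no longer depends on every application that writes to the column parsing the string correctly.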

If you don't want to create a second table, then come up with a way to represent your data with a single value.


More accurately, I should say I recommend that you represent your data with a single value per column, and Mike Atlas' solution accomplishes that.

Bill Karwin
Thanks for the thorough response! One thing I should've been more clear about in the question is that what I really need to search against is a numerical range, meaning, I don't really need to know what's between 1 and 5 because I know it'll be 2, 3 and 4. So really the question is, is there a way to build some sort of low/high query against a range stored as 1-3?
Ian Silber
Just have two columns. MinBedroom and MaxBedroom. WHERE MinBedroom > 1 AND MaxBedroom < 3.
Mike Atlas
Yes, +1 to Mike Atlas. It's like when you design a table for temporal ranges, you have a `start` column and a `finish` column and then you can do queries easily.
Bill Karwin
+2  A: 

Generally, this isn't how you should be storing data in a relational database.

Perhaps you should have a MinBedroom and a MaxBedroom column. E.g.:

SELECT * FROM properties WHERE MinBedroom > 1 AND MaxBedroom < 3;
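Which comparisons to use depends on what the search should mean. Strict bounds such as MinBedroom > 1 AND MaxBedroom < 3 select only ranges that lie entirely inside the search window, so a property stored as 1-3 would not match. The sketch below, using Python's sqlite3 as a stand-in for MySQL, shows two other readings that may be closer to the "2+" search in the question. The column names come from this answer; the sample rows and function names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE properties (
        title      TEXT NOT NULL,
        price      INTEGER NOT NULL,
        MinBedroom INTEGER NOT NULL,
        MaxBedroom INTEGER NOT NULL,
        CHECK (MinBedroom <= MaxBedroom)
    );
    -- Hypothetical sample data: ranges 1-3, 1-1 and 4-5.
    INSERT INTO properties VALUES
        ('nice property', 500, 1, 3),
        ('studio',        300, 1, 1),
        ('family house',  900, 4, 5);
""")


def titles_offering_at_least(n):
    """Properties whose range reaches n or more, i.e. upper bound >= n."""
    rows = conn.execute(
        "SELECT title FROM properties WHERE MaxBedroom >= ? ORDER BY title",
        (n,),
    ).fetchall()
    return [title for (title,) in rows]


def titles_offering_exactly(n):
    """Properties whose range contains n: lower bound <= n <= upper bound."""
    rows = conn.execute(
        "SELECT title FROM properties "
        "WHERE MinBedroom <= ? AND MaxBedroom >= ? ORDER BY title",
        (n, n),
    ).fetchall()
    return [title for (title,) in rows]
```

Either way, the low/high search is a pair of plain integer comparisons, with no string parsing of '1-3' involved.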
Mike Atlas
+1 for the suggestion in your comment for `MinBedroom` and `MaxBedroom` columns. Please copy that into this answer! :)
Bill Karwin
Thanks Bill :) Edited to contain this as the answer.
Mike Atlas