views:

273

answers:

5

Consider searching a table that contains Apartment Rental Information: A client using the interface selects a number of criteria that are represented as bit fields in the DB, for instance:

  • AllowsPets
  • HasParking
  • HasDeck
  • ModernKitchen

etc..

We are facing a situation where each new client of our software has additional fields they want to allow their end users to search on. The number of bit fields could reach into the hundreds.

I have three approaches that I'm considering and hoping for input and/or a different approach.

  • Current approach: Add more bit fields, sql queries are built dynamically and executed using EXEC: SET @SQL = @SQL + 'l.[NumUnits],' exec(@SQL))

Continue to add more bit fields. (table with 300 columns?)

  • Represent the data as a series of bits in one field. I'm unclear on if this approach will work, consider the 4 sample bit fields I offered above. The field could look like this: 1011 which would indicate false for 'hasparking' but true for all others. What I'm unclear on is how you would structure a query where you didn't care if it was false or true, for instance 1?11 where the person searching needs 1,3 and 4 to be true but doesn't care if 'HasParking' is true or false.

  • Move to an Attribute based approach where you have a table 'AttributeTypeID' and a table PropertyAttributes, which joins the PropertyID to the AttributeTypeId, new bit fields are simply a row in the AttributeTypeID table.

some other approach? Is this a well known SQL design pattern?

Thanks for any help

KM- EDIT PER COMMENTS


attribute table has a few other rows in it and is called listingattributes

CREATE TABLE [dbo].[ListingAttributes](
    [ListingID] [bigint] NOT NULL,
    [AttributeID] [int] IDENTITY(1,1) NOT NULL,
    [AttributeType] [smallint] NOT NULL,
    [BoardID] [int] NOT NULL,
    [ListingMLS] [varchar](30) NOT NULL,
    [PropertyTypeID] [char](3) NOT NULL,
    [StatusID] [varchar](2) NOT NULL,
PRIMARY KEY CLUSTERED 
(
    [AttributeID] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON, FILLFACTOR = 80) ON [PRIMARY]
) ON [PRIMARY]





;WITH GetMatchingAttributes AS
(
SELECT
    ListingID,COUNT(AttributeID) AS CountOfMatches
    FROM ListingAttributes

    WHERE 
    BoardID = 1
    AND
    StatusID IN ('A')
    AND
    --PropertyTypeID in (select * from @PropertyType)
    --AND
    AttributeType IN (2,3,6)
    GROUP BY ListingID
    HAVING COUNT(AttributeID)=(3)
)

SELECT 
    count(l.listingid)  
    FROM Listing l
        INNER JOIN GetMatchingAttributes m ON l.ListingID=m.ListingID
    --   where
    --   StatusID IN (select * from @Status)
    --AND
    --PropertyTypeID in (select * from @PropertyType)     1           1           0           NULL                           NULL                           1                                                                                                                                                                                                                                                                                                                                                                                                                                                                          NULL                                                                                                                                                                                                         1             NULL          NULL          NULL        0.1934759        NULL                                                                                                                                                                                                         NULL     SELECT                                                           0        NULL
  |--Compute Scalar(DEFINE:([Expr1006]=CONVERT_IMPLICIT(int,[Expr1012],0)))                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     1           2           1           Compute Scalar                 Compute Scalar                 DEFINE:([Expr1006]=CONVERT_IMPLICIT(int,[Expr1012],0))                                                                                                                                                                                                                                                                                                                                                                                                                     [Expr1006]=CONVERT_IMPLICIT(int,[Expr1012],0)                                                                                                                                                                1             0             0.001483165   11          0.1934759        [Expr1006]                                                                                                                                                                                                   NULL     PLAN_ROW                                                         0        1
       |--Stream Aggregate(DEFINE:([Expr1012]=Count(*)))                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        1           3           2           Stream Aggregate               Aggregate                      NULL                                                                                                                                                                                                                                                                                                                                                                                                                                                                       [Expr1012]=Count(*)                                                                                                                                                                                          1             0             0.001483165   11          0.1934759        [Expr1012]                                                                                                                                                                                                   NULL     PLAN_ROW                                                         0        1
            |--Filter(WHERE:([Expr1005]=(3)))                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   1           4           3           Filter                         Filter                         WHERE:([Expr1005]=(3))                                                                                                                                                                                                                                                                                                                                                                                                                                                     NULL                                                                                                                                                                                                         2471.109      0             0.00440886    9           0.1919928        NULL                                                                                                                                                                                                         NULL     PLAN_ROW                                                         0        1
                 |--Compute Scalar(DEFINE:([Expr1005]=CONVERT_IMPLICIT(int,[Expr1011],0)))                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      1           5           4           Compute Scalar                 Compute Scalar                 DEFINE:([Expr1005]=CONVERT_IMPLICIT(int,[Expr1011],0))                                                                                                                                                                                                                                                                                                                                                                                                                     [Expr1005]=CONVERT_IMPLICIT(int,[Expr1011],0)                                                                                                                                                                9185.126      0             0.01422281    11          0.1875839        [Expr1005]                                                                                                                                                                                                   NULL     PLAN_ROW                                                         0        1
                      |--Stream Aggregate(GROUP BY:(.[dbo].[ListingAttributes].[ListingID]) DEFINE:([Expr1011]=Count(*)))                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             1           6           5           Stream Aggregate               Aggregate                      GROUP BY:(.[dbo].[ListingAttributes].[ListingID])                                                                                                                                                                                                                                                                                                                                                                                                                [Expr1011]=Count(*)                                                                                                                                                                                          9185.126      0             0.01422281    11          0.1875839        [Expr1011]                                                                                                                                                                                                   NULL     PLAN_ROW                                                         0        1
                           |--Index Seek(OBJECT:(.[dbo].[ListingAttributes].[_dta_index_ListingAttributes_BoardID_ListingID__AttributeType_PropertyTypeID_StatusID_6_7]), SEEK:(.[dbo].[ListingAttributes].[BoardID]=(1)),  WHERE:(.[dbo].[ListingAttributes].[StatusID]='A' AND (.[dbo].[ListingAttributes].[AttributeType]=(2) OR .[dbo].[ListingAttributes].[AttributeType]=(3) OR .[dbo].[ListingAttributes].[AttributeType]=(6))) ORDERED FORWARD)                                                                                                                             1           7           6           Index Seek                     Index Seek                     OBJECT:(.[dbo].[ListingAttributes].[_dta_index_ListingAttributes_BoardID_ListingID__AttributeType_PropertyTypeID_StatusID_6_7]), SEEK:(.[dbo].[ListingAttributes].[BoardID]=(1)),  WHERE:(.[dbo].[ListingAttributes].[StatusID]='A' AND (.[dbo].[ListingAttributes].[AttributeType]=(2) OR .[dbo].[ListingAttributes].[AttributeType]=(3) OR .[dbo].[ListingAttributes].[AttributeType]=(6))) ORDERED FORWARD  .[dbo].[ListingAttributes].[ListingID], .[dbo].[ListingAttributes].[AttributeID], .[dbo].[ListingAttributes].[AttributeType], .[dbo].[ListingAttributes].[StatusID]  16050.41      0.09677318    0.0315279     26          0.1283011        .[dbo].[ListingAttributes].[ListingID], .[dbo].[ListingAttributes].[AttributeID], .[dbo].[ListingAttributes].[AttributeType], .[dbo].[ListingAttributes].[StatusID]  NULL     PLAN_ROW                                                         0        1

(7 row(s) affected)



+2  A: 

I suggest you go with the second approach, known as Entity-attribute-value model. It's probably the only approach that will scale as you need.

You could also have two searches, the basic and the advanced. You keep the attributes for the basic search in one table, and all the advanced attributes in the other table. This way at least the basic search will remain rapid as the number of attributes will grow with time.

Developer Art
@Tom: Otherwise, you'll be back to ask us questions like this: http://stackoverflow.com/questions/1494535/parsing-a-dash-delimited-value-in-a-sql-table-and-doing-lookups
OMG Ponies
I just did some quick research on EAV and it has a fairly bad reputation from what I found over the past 20 minutes.. The queries are brutal, 20,30 inner joins....
Tom DeMille
@Tom, the joins on those queries are typically due to a misunderstanding of how to write the sql.
Chris Lively
so how would one write a query to select properties where hasparking = true, hasdeck = true, allowspets = false using EAV approach?
Tom DeMille
@Tom, to clarify, when using an EAV model your search queries should dynamically create pivot tables to run regular select's against. This eliminates all of the joins. If necessary, you can store the dynamically created pivot table code as a view. When a new attribute is added, just invalidate and recreate the view.
Chris Lively
this seems really hopeful, does anyone have any links to articles on creating pivot tables from EAV, I have been searching and can't find anything particularly helpful
Tom DeMille
Tom - I give an example of how to handle queries in the nested EAV approach below. PIVOTs are one way around this but keep in mind that PIVOTs are pretty expensive as well compared with normal queries. As I mention, I still use them in certain circumstances but, having been there, I use approach #1.
Mark Brittingham
Also, I strongly disagree with the assertion that an EAV model is the only one that will "scale as you need." This makes no sense at all to me given that subqueries, refining queries and pivots are all much more expensive than simple record searches. BTW - if you doubt my experience with PIVOTS, please see http://stackoverflow.com/questions/651936/how-can-i-improve-this-mailing-address-sql-server-select-statement/652065#652065
Mark Brittingham
Mark = seems you are right with the speed, see my comments below, dynamic sql with EXEC is a lot faster (with all the rows in one table)
Tom DeMille
A: 

Create a table that stores attributes or search columns based on the apartment. Definitely do not keep adding more bit field columns..nightmare maintenance and nightmare coding. And definitely don't please don't dynamically generate where statements and use exec.

JonH
A: 

I've never tested this, but what if you were to create a varchar(256) fields that stored all of your flags as one long string of 0's and 1's.

For example,

  • AllowsPets = 1
  • HasParking = 0
  • HasDeck = 1
  • ModernKitchen = 1

would be:

  • PropertyFlags = 1011

and if you were looking for something that AllowsPets and HasDeck, then the search query would look something like this:

WHERE PropertyFlags LIKE '1_1_' (the underscore represents a single wildcard character in the like clause)

this would solve your issues with adding additional columns to the search in the future, but I'm not sure how this would do performance-wise.

has anyone out there tried anything similar to this?

The problem here is that it's pretty easy to lose the relationship between the property flag and the positioning of the value. Also, it makes extending it for each client a nightmare.
Chris Lively
A: 

I've walked down this path a few times trying to store health status markers!

When I first started (in 2000?) I tried a character position approach (your #2) and found that it quickly became pretty unwieldy as I wrestled with the same questions over and over: "which position held 'Allows Pets' again?" or, worse yet, "how long is this string now? / which position am I on?" Can you work around this problem - developing objects to manage things for you? Well, yes, to an extent. But I really didn't appreciate how much extra work it cost compared to having the field identities managed for me by the database.

The second time around, I used an attribute/value pair approach similar to your solution #3. This basically worked and, for specialty needs, I still generate attribute/value pairs using a PIVOT. Also, my background is in AI and we used attribute/value pairs all the time in mechanical theorem proving so this was very natural for me.

However, there is a huge problem with this approach: pulling any one fact out ("Show me the apartments that allow pets") is easy but pulling all of the records meeting multiple constraints quickly gets very, very ugly (see my example below).

SO...I ended up adding fields to a table. I understand the theoretical reasons that Jon and 'Unknown' and 'New In Town' give for preferring other approaches and I'd have agreed with either or both at one point. But experience is a pretty harsh teacher...

A Couple More Things

First, I disagree that adding more bit fields is a nightmare of maintenance - at least compared with a character-bit approach (your #2). That is, having a distinct field for each attribute ensures that there is no 'management' necessary to figure out which slot belongs to which attribute.

Second, having 300 fields isn't really the problem - any decent database can do that without problem.

Third, your real issue and the source of pain is really the matter of dynamically generating your queries. If you are like me, this question is really all about "Do I really have to have this massive, grody and inelegant chain of "IF" statements to construct a query?"

The answer, unfortunately, is Yes. All three of the approaches you suggest will still boil down to a chain of IF statements.

In a database bit-field approach, you'll end up with a series of IF statements where all of your columns have to be added like so:

string SQL = "Select X,Y,Z Where ";

if (AllowsPets == 0)
  SQL += "(AllowsPets = 0) AND ";
else if (AllowsPets == )
  SQL += "(AllowsPets = 1) AND ";  // Else AllowsPets not in query
.
.
.
SQL = SQL.Substring(SQL.Length - 4);  // Get rid of trailing 'AND' / alternatively append '(1=1)'

In a character-position approach, you'll do the same thing but your "Appends" will add "0", "1" or "_" to your SQL. You'll also, of course, run into the maintenance issues deciding which one is which that I discussed above (enums help but don't completely solve the problem).

As mentioned above, the Attribute-Value approach is actually the worst. You'll have to either create a nasty chain of sub-queries (which surely will cause a stack overflow of some sort with 300 clauses) or you need to have an IF-THEN like this:

// Kill any previously stored selections.
SQLObject.Execute("Delete From SelectedApts Where SessionKey=X");
// Start with your first *known* attr/value and fill a table with the results.
.
.
Logic to pick first known attr/value pair
.
.
SQLObject.Execute("Insert Into SelectedApts Select X as SessionKey, AptID From AttrValue Where AllowsPets=1");

// Now you have the widest set that meets your criteria. Time to whittle it down.
if (HasParking == 1)
  SQLObject.Execute("Delete From SelectedApts Where AptID not in (Select AptID From AttrValue Where AllowsChildren=1));
if (AllowsChildren == 0)
  SQLObject.Execute("Delete From SelectedApts Where AptID not in (Select AptID From AttrValue Where AllowsChildren=0));
.
.
.
// Perform 2-300 more queries to keep whittling down your set to the actual match.

Now, you may be able to optimize this a bit so you run fewer queries (a PIVOT, sets of subqueries or using the UNION operator) but the fact is that this gets VERY expensive compared to the single query that you can use (but have to build) using the other approaches.

Thus, this is a painful kind of problem no matter what approach you take - there really is no magic that helps you to avoid it. But, having been there before, I would absolutely recommend approach #1.

Update: If you are really focused on pulling straight criteria matches ("All Apartments That Have A, B and C") and don't need other queries (like "...Sum(AllowsPets), Sum(AllowsChildren)..." or "...(AllowsPets=1) OR (AllowsChildren=1)...") then I really like KM's answer the more I look at it. It is very clever and looks likely to be acceptably fast.

Mark Brittingham
thanks for your lengthy, helpful and well informed answer Mark. I have been following the eav/pivot trail all afternoon and the more I look into this the more I start to feel like adding 100 specific bit fields and 100 more if statements in my one query (we only have one search query) is the way to go
Tom DeMille
lol - boy, I just wish someone had told me that back in 2000! As I say, though, this is a kind of ugly problem. And whenever good developers are faced with an ugly problem, they cannot help but think that there has to be a better way. So your search is quite understandable. At this point, most of the internals are hidden away in various classes that I've written but the pain of going through it stays with me!
Mark Brittingham
@Mark Brittingham, what do you think about my answer? I'm certainly no fan of EAV table design, not in the least. I just thought I'd give it a shot.
KM
Tom - see KM's answer though. It is pretty ingenious and gets around my objections to an attribute/value pair. I still cannot use it as I need to report statistics across record sets and rather than finding matching sets to a specific criteria set. However, it may work for you. Of course, my advice above may still hold - you might still find it simplest to create a table with many fields. At least SO has given you a lot to think about!
Mark Brittingham
KM - see my comments in your answer. I just wish that I'd come up with it! As I wrote Tom, though, I'm not specifically after "Objects that have Attribute X" in my project. So...I'll cut myself just enough slack to say that I was not tuned in to that specific application. Still, I admire what you've done from a purely technical perspective.
Mark Brittingham
@Mark, I think you ignore the potential in using views to access this data in combination with a pivot table. Also, you don't need a chain of IF statements. You can certainly go that route, but it's not necessary.
Chris Lively
Chris, are you saying that if you lay the table out with a pivot you don't need a chain of IF statements to query it? How would you do this, then, if you want to leave out unspecified criteria? That is, if you really only want to search on three criteria, I'd think that you'd dynamically build the query (using a chain of IF statements in C#, for example, or some TSQL) rather than arbitrarily creating a long query with many attributes that don't apply (e.g. Where...((@Param="NA") or (AllowsChildren=@Param))...). I may not quite understand why you think the Pivot takes care of the IF?
Mark Brittingham
Chris, still thinking about your question...if you are saying that you don't need the If statement I show above under the discussion of an EAV, you are right (you don't need to whittle the set down by hand - the Pivot/View takes care of creating a record-oriented view). However, you are THEN back at the need to query the table. THAT is where you'll then need the IF - to create the query on the view.
Mark Brittingham
@Mark, the "Where...((@Param is null) or (AllowsChildren=@Param))...)" is actually quite effective and easy to append to without needing IF statements. Further, it can be put into a stored proc if you so desire. Also, I'm actually thinking CROSS TAB, not pivot (my mistake in previous comment). The pivot construct is too dense and under performing compared to cross tab. The cross tab view would be generated at the time the attribute list changed. Which would allow a search screen to be dynamically generated based on the client settings.
Chris Lively
Chris...yes, I like the fact that you can test the parameter to take it out of the query although I can't help but think that it'd be more efficient not to have it. Since it doesn't enter into the actual search maybe it is no big deal - sounds like some performance testing time to me!
Mark Brittingham
Chris (continued) Anyway, I have to admit to being unfamiliar with Cross Tabs in SQL Server. Do you have an example? Maybe enter a new answer on this thread to illustrate how it would apply. I don't really care about votes one way or another (hell, I'll vote for your answer even if it doesn't really apply just to see you illustrate the idea). I just enjoy the back and forth on SQL.
Mark Brittingham
so as to being acceptably fast, it really isn't, we can't go backwards in speed.Anyone know if using CLR .dll based queries is faster than a normal sp?
Tom DeMille
+8  A: 

something like this may work for you:

define tables:

CREATE TABLE #Apartments
(
     ApartmentID    int          not null primary key identity(1,1)
    ,ApartmentName  varchar(500) not null
    ,Status         char(1)      not null default ('A') 
    --....
)

CREATE TABLE #AttributeTypes
(
    AttributeType         smallint     not null primary key
    ,AttributeDescription varchar(500) not null
)

CREATE TABLE #Attributes  --boolean attributes, if row exists apartment has this attribute 
(
     ApartmentID     int not null --FK to Apartments.ApartmentID    
    ,AttributeID     int not null primary key identity(1,1)
    ,AttributeType   smallint  not null --fk to AttributeTypes
)

insert sample data:

SET NO COUNT ON
INSERT INTO #Apartments VALUES ('one','A')
INSERT INTO #Apartments VALUES ('two','A')
INSERT INTO #Apartments VALUES ('three','I')
INSERT INTO #Apartments VALUES ('four','I')

INSERT INTO #AttributeTypes VALUES (1,'dishwasher')
INSERT INTO #AttributeTypes VALUES (2,'deck')
INSERT INTO #AttributeTypes VALUES (3,'pool')
INSERT INTO #AttributeTypes VALUES (4,'pets allowed')
INSERT INTO #AttributeTypes VALUES (5,'washer/dryer')
INSERT INTO #AttributeTypes VALUES (6,'Pets Alowed')
INSERT INTO #AttributeTypes VALUES (7,'No Pets')

INSERT INTO #Attributes (ApartmentID, AttributeType) VALUES (1,1)
INSERT INTO #Attributes (ApartmentID, AttributeType) VALUES (1,2)
INSERT INTO #Attributes (ApartmentID, AttributeType) VALUES (1,3)
INSERT INTO #Attributes (ApartmentID, AttributeType) VALUES (1,4)
INSERT INTO #Attributes (ApartmentID, AttributeType) VALUES (1,5)
INSERT INTO #Attributes (ApartmentID, AttributeType) VALUES (1,6)

INSERT INTO #Attributes (ApartmentID, AttributeType) VALUES (2,1)
INSERT INTO #Attributes (ApartmentID, AttributeType) VALUES (2,2)
INSERT INTO #Attributes (ApartmentID, AttributeType) VALUES (2,3)
INSERT INTO #Attributes (ApartmentID, AttributeType) VALUES (2,4)
INSERT INTO #Attributes (ApartmentID, AttributeType) VALUES (2,7)

INSERT INTO #Attributes (ApartmentID, AttributeType) VALUES (3,1)
INSERT INTO #Attributes (ApartmentID, AttributeType) VALUES (3,2)
INSERT INTO #Attributes (ApartmentID, AttributeType) VALUES (3,3)
INSERT INTO #Attributes (ApartmentID, AttributeType) VALUES (3,4)

INSERT INTO #Attributes (ApartmentID, AttributeType) VALUES (4,1)
INSERT INTO #Attributes (ApartmentID, AttributeType) VALUES (4,2)
SET NOCOUNT OFF

sample search query:

;WITH GetMatchingAttributes AS
(
SELECT
    ApartmentID,COUNT(AttributeID) AS CountOfMatches
    FROM #Attributes
    WHERE AttributeType IN (1,2,3)  --<<change dynamically or split a CSV string and join in
    GROUP BY ApartmentID
    HAVING COUNT(AttributeID)=3--<<change dynamically or split a CSV string and use COUNT(*) from resulting table
)
SELECT
    a.*
    FROM #Apartments                      a
        INNER JOIN GetMatchingAttributes m ON a.ApartmentID=m.ApartmentID
    WHERE a.Status='A'
    ORDER BY m.CountOfMatches DESC

OUTPUT:

ApartmentID ApartmentName 
----------- --------------
1           one           
2           two           

(2 row(s) affected)

In the search query above, I just included a CSV string of atribute IDs to search for. In reality, you could create a Search stored procedure where you pass in a CSV parameter containing the IDs to search on. You can look at this answer to learn about loop free splitting of that CSV strings into table which you can join to. This would result in not needing to use any dynamic SQL.

EDIT based on the many comments:

if you add a few columns to the #AttributeTypes table you could dynamically build the search page. Here are a few suggestions:

  • Status: "A"ctive "I"nactive
  • ListOrder: can use this to sort by to build the screen
  • ColumnNumber: can help organize fields on the same screen row
  • AttributeGroupID: to group fields, see below
  • etc.

You could make all the fields checkboxes, or add another table called #AttributesGroups, and group some together and use radio buttons. For example, since "Pets Allowed" and "No Pets" are exclusive, add a row in the #AttributesGroups table "Pets". The application would group the attributes in the interface. Attributes in Groups would work the same as regular ungrouped attributes, just collect the selected IDs and pass it in to the search procedure. However, for each group you'll need to have the application include a "no preference" radio button and default it on. This option will not have an attribute ID and is not passed in, since you don't want to consider the attribute.

In my example, I do show an example of a "super attribute" that is in the #Apartments table, "Status". You should only consider major attributes for this table. If you start using these, you may want to alter the CTE to be FROM #Apartments with filtering on these fields and then join to #Attributes. However you will run into issues of Dynamic Search Conditions, so read this article by Erland Sommarskog.

EDIT on latest comments:

here is code to have a list of exclude attributes:

;WITH GetMatchingAttributes AS
(
SELECT
    ApartmentID,COUNT(AttributeID) AS CountOfMatches
    FROM #Attributes
    WHERE AttributeType IN (1,2,3)  --<<change dynamically or split an include CSV string and join in
    GROUP BY ApartmentID
    HAVING COUNT(AttributeID)=3--<<change dynamically or split a CSV string and use COUNT(*) from resulting include table
)
, SomeRemoved AS
(
SELECT
    m.ApartmentID
    FROM GetMatchingAttributes      m
        LEFT OUTER JOIN #Attributes a ON m.ApartmentID=a.ApartmentID 
            AND a.AttributeType IN (5,6)   --<<change dynamically or split an exclude CSV string and join in
    WHERE a.ApartmentID IS NULL
)
SELECT
    a.*
    FROM #Apartments           a
        INNER JOIN SomeRemoved m ON a.ApartmentID=m.ApartmentID
    WHERE a.Status='A'

I don't think I would go this way though. I'd go with the approach I outlined in my previous EDIT above. When include/exclude of an attribute is necessary, I'd just add an attribute for each: "Pets allowed" and "No Pets".

I updated the sample data from the original post to show this.

Run the original query with:

  • (..,..,6,..) to find apartments that allow pets
  • (..,..,7,..) to find apartments where no pets are allowed
  • (..,..,..) if there is no preference.

I think this is the better approach. When combined with the grouping idea and dynamically built search page described in the last edit, I think this would be better and would run faster.

KM
+1 - This is very ingenious. I hadn't thought of doing the count across matches and then finding those with all X matches. It won't work for my problem but it may for Tom. One caveat though - it seems to me that you'll have to store both Attributes and "Not-Attributes" if you want to handle a case where some attribute will rule out an apartment (e.g. someone who does NOT want a place where pets are allowed). Or you'll have to change the GetMatchingAttributes so that both IN and NOT IN clauses are included. Good work, though!
Mark Brittingham
Also, TOM might want to consider having core attributes (those often requested) in his apartments table and then use the Attributes table just for the more obscure items to optimize performance.
Mark Brittingham
this looks awesome, thanks for the input, I'll post more after I check it out
Tom DeMille
this is awesome! I think the loop free splitting function from the linked in post is perhaps the most useful SQL utility I have ever seen! Thanks again!
Tom DeMille
I just wanted to throw one thing in here about this. SQL 2008 supports table-value parameters. This means you can pass the entire search criteria in as a table, and simply do a join on that table to accomplish the same goal. Which is faster because it doesn't rely on a parsing function call to work.
Chris Lively
@Chris Lively, table variables are a good option if on SQL Server 2008 and using a client application language that supports passing them in. However, OP never mentions which they are using.
KM
How could I change the query to exclude attribute as well?something likeWHERE AttributeType IN (select * from dbo.FN_ListToTable(',',@i)) and AttributeType Not IN (select * from dbo.FN_ListToTable(',',@o)) above doesn't seem to work though, the result set gets larger when adding the 'not in' clause
Tom DeMille
I have been doing some testing on the speed of this approach. As slick as this approach is, it seems to be a lot slower than just building a select statement and executing it dynamically with all the rows in one table... a lot slower actually. Using the same query and running it in a loop 200 times, the 'EXEC(@SQL)' method returned the count of listings (402 in this case) 200 times in 4 seconds, the attribute based approach returne the same 200 result count in 28 seconds, so in my limited testing the old school dynamic sql approach is 7 times faster. (indexes in place for both approaches)
Tom DeMille
update on this, by moving some 'super' attributes from the GetMatchingAttributes query to the apartments query am able to speed things up, still slower though than dynamic sql though by about 50%
Tom DeMille
@Tom, is that using the 1st query with only include attributes, or the second query using include+exclude attributes?
KM
KM - just the includes, first query
Tom DeMille
the funny thing is the query I compared it to does about 10 other joins as well, which I'm not doing in the attribute approach. I'm really suprised by the results
Tom DeMille
run SET SHOWPLAN_ALL ON and then this query, edit the question with the output in it, and the query you are running.
KM