views:

176

answers:

2

Here's my problem: a user searches for products by size. The result should show all products of the desired size (if any) plus products progressively larger and smaller until there are at least 50 undersized and 50 oversized products displayed in addition to the correctly-sized products.

The result should always show all products of a certain size; in other words, if moving to the next size up or down will result in more than 50 products, show them all - don't stop at 50.

Example: Imagine there are 25 distinct sizes with 20 products of each size. The user asks for size 12. We need to go three sizes down and three sizes up to get at least 50 in each direction. The query should return all size-12 products, plus the size 9, 10, 11, 13, 14, and 15 products. The query would return 140 products total (the 20 size-12 plus 60 above and 60 below.) Unfortunately the sizes are not nice integers like my example. They are arbitrary decimal values.

A Linq to SQL query to do this would be really cool, but plain SQL or C# is welcome, too. (My environment is C#, SQL Server 2005) Thanks

+1  A: 

Here's a sample SQL statement (for mysql) that should do what you want. But depending on what else your procedure is doing, you may find it faster to do some of the processing in the C# code:

SELECT 
    *
FROM
    products
WHERE 
    size = [[desired_size]] OR 
    size IN (
         SELECT DISTINCT 
             size 
         FROM 
             products
         WHERE
             size > [[desired_size]]
         ORDER BY 
             size
         LIMIT 50
    )
    OR
    size IN (
         SELECT DISTINCT 
             size 
         FROM 
             products
         WHERE
             size < [[desired_size]]
         ORDER BY 
             size DESC
         LIMIT 50
    )

I'll explain by starting at the beginning (and use your example for the values)...

Firstly we need to generate a list of the next 50 larger (or smaller) items. The following query should do this:

SELECT * FROM products WHERE size > 12 ORDER BY size LIMIT 50

So, right now, we're grabbing everything from the products table that's larger than the desired size. We order it by size, then limit it to only the first 50. So, in this case, it should return (in this order) 20 products of size 13, 20 products of size 14 and 10 products of size 15. You can try this in the Visual Studio SQL editor, and see which rows it returns.

But for our purposes, we only want a list of sizes, so we can limit the query further by changing the SELECT clause to:

SELECT DISTINCT size...

so, now we're only looking at the "size" column, and we use the DISTINCT keyword to avoid duplicate values

so now, the query returns just the list: (13, 14, 15)

We make a similar query to get the next 50 smaller items:

SELECT DISTINCT size FROM products WHERE size < 12 ORDER BY size DESC LIMIT 50

this is just the same as the above query, but we limit to only sizes that are smaller, and we reverse the ordering, so we get the 50 biggest items that are smaller than the desired size.

in this case this query will return the list (11, 10, 9)

If we put it all together in the outer query using these two lists, we get:

SELECT
  *
FROM
  products
WHERE
  size = 12 OR
  size IN (13, 14, 15) OR
  size IN (11, 10, 9)

So we pull all the products that have a size of 9 to 15

I hope this makes sense :-)

Chris Baxter
That's close but doesn't meet the spec. I've edited my original post to be clearer.
Sisiutl
A: 

A little late so you may have already solved this...

It seems the problem you're hitting is that you can't easily define the boundary conditions for your result set.

You can obviously easily do:

Get me everything of size X
Get me 50 of size >X
Get me 50 of size <X

However, since size is linear not discrete and we can't predict how many of each size exist, we can't group/count sizes which makes it difficult to determine which size will be at the outside of the +/-50 records - hence we need to read those values before we can get a final recordset. It may be possible to roll this into one with sub-queries but it wouldn't surprise me if LINQ executed in multiple requests,,,,

Something like:

Dim ExactSize = 1.1

Dim MaxSizeRecord = MyRepository.Get(function(x) x.Size > ExactSize).OrderBy(function(y) y.Size).Skip(50).First
Dim MinSizeRecord = MyRepository.Get(function(x) x.Size < ExactSize).OrderByDescending(function(y) y.Size).Skip(50).First

Dim FinalResults = MyRepository.Get(function(x) x.Size >= MinSizeRecord.Size and X.Size <= MaxSizeRecord.Size)

I haven't played around with the guts of LINQ enough to know if ...

Dim MaxSize = MyRepository.Get(function(x) x.Size > ExactSize).OrderBy(function(y) y.Size).Skip(50).First.Size
Dim MinSize = MyRepository.Get(function(x) x.Size < ExactSize).OrderByDescending(function(y) y.Size).Skip(50).First.Size

Dim FinalResults = MyRepository.Get(function(x) x.Size >= MinSize and X.Size <= MaxSize)

(ie getting the exact size of the boundaries rather than a record whose size happens to be at the boundary)

...would generate the same SQL - It's certainly more readable but LINQ may potentially execute the Min/Max queries immediately as the value is being stored in a Double - It's possible that by defining the type of MinSize/MaxSize implicitly it would treat it as an IQueryable(of double).

Using records instead of doubles whilst being slightly less readable should guarantee it's treated as a query not a value.

Basiclife