I have a query that returns tens of thousands of records that are used as plot points on a map. In an effort to reduce load and increase app speed, we're trying to implement what basically amounts to Level of Detail logic: when zoomed out, display 50% of the points; when zoomed in, display 100% of the points.

This is ultimately what I need the final SQL to look like:

SELECT *
FROM
    (SELECT
        [t0].[RecordName],
        [t0].[LastMaintenanceDate],
        ROW_NUMBER() OVER (ORDER BY [t0].[Ticker]) AS RowNumber
     FROM
        [dbo].[TableName] AS [t0]
    ) AS [t1]
WHERE RowNumber % 2 = 0

In LINQ I can use .Skip() and .Take() to get the ROW_NUMBER() OVER part (example here), but when doing this, the generated WHERE clause uses BETWEEN instead of the "WHERE RowNumber % 2 = 0" that I need.
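
For illustration (the entity and column names are placeholders), the Skip/Take approach looks something like this, and LINQ to SQL translates it into a ROW_NUMBER() query filtered with BETWEEN, i.e. a contiguous page of rows rather than every other row:

// Placeholder names; this is the paging pattern, not the modulo filter I need.
var page = dc.GetTable<TableName>()
             .OrderBy(t => t.Ticker)
             .Skip(100)   // becomes ROW_NUMBER() OVER (...) with a BETWEEN-style predicate
             .Take(50);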

Am I approaching this correctly? To get the performance gains we're looking for here, this exclusion really needs to happen on the SQL Server side.

A: 

This is a very interesting problem, and unfortunately I don't know the answer using pure LINQ to SQL.

The only thing I can suggest is that LINQ to SQL does support ad-hoc raw SQL queries. Your problem may be a valid case for doing this.
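
For example, DataContext.ExecuteQuery<T> can run the SQL from your question directly and still give you mapped objects back. A rough sketch, where MapPoint is a placeholder class whose properties match the selected columns:

// Sketch only: run the ROW_NUMBER()/modulo query as raw SQL through LINQ to SQL.
// The {0} placeholder is turned into a SQL parameter by ExecuteQuery.
var points = dc.ExecuteQuery<MapPoint>(@"
    SELECT [t1].[RecordName], [t1].[LastMaintenanceDate]
    FROM (SELECT [t0].[RecordName], [t0].[LastMaintenanceDate],
                 ROW_NUMBER() OVER (ORDER BY [t0].[Ticker]) AS RowNumber
          FROM [dbo].[TableName] AS [t0]) AS [t1]
    WHERE [t1].[RowNumber] % {0} = 0", 2).ToList();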

Randolpho
Yeah, going this route is really a last resort IMO. If I do go this route (I've never had to yet in the LINQ to SQL world), is there a perf difference between stringing together a query vs using a proc? I assume the proc route is ideal.
levi rosol
Yeah, stored procedure is probably better. With respect to LINQ to SQL, there's no difference, although the proc itself may buy you some performance. In truth, I should have said proc the first time, but I was so stymied by your question that I just jumped on the raw SQL option. :)
Randolpho
A: 

The morelinq project has a TakeEvery extension method that does this, but it only works on IEnumerable<T>.

It should be possible to create an extension method to solve this (I do not have my development environment available at the moment, so I cannot test it right now).

In your specific example, I am guessing an extension method on Table<T> would suffice.

If you wish to try this, there is an example of making an extension method for a table here.
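
As a point of reference, a minimal client-side version of that idea looks roughly like the sketch below (essentially what morelinq's TakeEvery does). Note that because it runs on IEnumerable<T>, the filtering happens after the rows have already been returned from SQL Server, so on its own it does not give the server-side reduction asked for:

using System;
using System.Collections.Generic;

public static class EnumerableExtensions
{
    // Yields every step-th element of the sequence, starting with the first.
    public static IEnumerable<T> TakeEvery<T>(this IEnumerable<T> source, int step)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (step <= 0) throw new ArgumentOutOfRangeException("step");

        int i = 0;
        foreach (var item in source)
        {
            if (i % step == 0)
                yield return item;
            i++;
        }
    }
}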

aanund
+2  A: 

What columns do you have in your table? If you, for example, have an int identity primary key column, you could use that...

from mt in dc.MyTable
where mt.ID % 2 == 0
select mt

...or...

where mt.SomeDateTime.Millisecond % 2 == 0

...that said, where are you trying to reduce load?

The T-SQL in your post, as well as the two solutions I have mentioned, will all force full table scans, so if your table is large-ish it would be better if you can reduce records based on something indexed (and where the WHERE clause predicate can actually use the index)...
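
For example, if the visible map extent (or an ID range) can be expressed against indexed columns, a query along these lines narrows the scan before the modulo thinning is applied (the Latitude/Longitude columns here are hypothetical):

// Hypothetical indexed columns: the range predicates can use an index,
// and the modulo only thins whatever is left inside that range.
var points = from mt in dc.MyTable
             where mt.Latitude >= south && mt.Latitude <= north
                && mt.Longitude >= west && mt.Longitude <= east
                && mt.ID % 2 == 0
             select mt;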

KristoferA - Huagati.com
I do have an ID-type column that I can use this logic on, so "where mt.Ticker % 2 == 0" worked perfectly for what I am trying to do. I apparently need to learn more about the % operator and how it works; I had never used it until yesterday when I found it in the example link I posted. Thanks again!
levi rosol
BTW, the comment about the full table scan is also helpful. Fortunately, in this case indexing the column in question should not be an issue.
levi rosol
This is a good idea, provided the results are not filtered before taking every other row. Add any sort of filter on this, however, and the distribution will change. You might not evenly skip rows, and it's unlikely (depending on the ultimate distribution created by the filter) that you'll cut your row volume in half. Just warning you. If you want to filter the results before skipping every other row, you'll need to use a stored procedure.
Randolpho
@Randolpho well, this is based on the assumption that the query in the question reflects the real-world query. :) Other than that, the millisecond should generally give a decent distribution on filtered results if there is a column like that. ... That said, I would probably try to use geographic distribution to filter out the most overlapping 50% if, as per the OP, these are coordinates that will be plotted on a map. So self-join, calculate distance, and skip the first 50% of the rows... but that's another story...
KristoferA - Huagati.com
You're right, any sort of filter prior to this could throw things off. Fortunately that is not the case here, as each record is a data point and a collection of points represents an overall area. Additionally, each point has an int ID, and they are in order, so reducing by 2, 3, or more does a great job of evenly thinning my data. Users are much happier waiting for 1,000 points to render vs. 40,000...
levi rosol