Hi folks, I have what I think is a simple SQL Server spatial query:

Grab all the USA states that intersect some four-sided polygon (i.e. the viewport/bounding box of a web page's Google/Bing map).

SELECT CAST(2 AS TINYINT) AS LocationType, a.Name AS FullName, 
    StateId, a.Name, Boundary.STAsText() AS Boundary, 
    CentrePoint.STAsText() AS CentrePoint
FROM [dbo].[States] a
WHERE @BoundingBox.STIntersects(a.Boundary) = 1

It takes 6 seconds to run :(

Here's the execution plan: [image]

And the stats on the Filter operation: [image]

Now, I'm just not sure how to debug this, i.e. how to figure out what I need to fine-tune. Do I have any spatial indexes? I believe so:

/****** Object:  Index [SPATIAL_States_Boundary]    
        Script Date: 07/28/2010 18:03:17 ******/
CREATE SPATIAL INDEX [SPATIAL_States_Boundary] ON [dbo].[States] 
(
    [Boundary]
)USING  GEOGRAPHY_GRID 
WITH (
    GRIDS =(LEVEL_1 = HIGH,LEVEL_2 = HIGH,LEVEL_3 = HIGH,LEVEL_4 = HIGH), 
    CELLS_PER_OBJECT = 1024, PAD_INDEX  = OFF, SORT_IN_TEMPDB = OFF, 
    DROP_EXISTING = OFF, ALLOW_ROW_LOCKS  = ON, 
    ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO

Do I need to provide more information on the GEOGRAPHY data which is returned, e.g. number of points? Or do I need to run Profiler and give some stats from there?
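
For reference, a minimal sketch of how those figures could be gathered. It assumes only the table and column names already shown in this post; the output aliases (NumPoints, BoundaryBytes) are made up for illustration.

```sql
-- Sketch: show how complex each state's boundary is.
-- STNumPoints() returns the number of points in the geography instance;
-- DATALENGTH() returns its stored size in bytes.
SELECT  a.StateId,
        a.Name,
        a.Boundary.STNumPoints() AS NumPoints,      -- alias is illustrative
        DATALENGTH(a.Boundary)   AS BoundaryBytes   -- alias is illustrative
FROM    [dbo].[States] a
ORDER BY NumPoints DESC;
```

Boundaries with very many points make each STIntersects call more expensive, so this is a reasonable first thing to look at.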

Or are my CELLS_PER_OBJECT / GRIDS settings incorrect? (I really have no idea what I should be setting those values to, TBH.)

Can anyone please help? Please?

UPDATE/EDIT:

After the first reply from @Bobs below, which explained that the spatial index was not being used because the optimizer judged the primary key (clustered index) to be faster than a non-clustered index on a table with 50-odd rows, I tried forcing the spatial index, just for fun:

SELECT CAST(2 AS TINYINT) AS LocationType, a.Name AS FullName, 
    StateId, a.Name, Boundary.STAsText() AS Boundary, 
    CentrePoint.STAsText() AS CentrePoint
FROM [dbo].[States] a WITH (INDEX(SPATIAL_States_Boundary))
WHERE @BoundingBox.STIntersects(a.Boundary) = 1

... and guess what: the query runs instantly.

WTF? Does anyone know why? Do I need to post a query plan for that as well, to help explain it?

+1  A: 

It appears that you have an optimal plan for running the query. It’s going to be tough to improve on that. Here are some observations.

The query is doing a Clustered Index Scan on the PK_States index. It’s not using the spatial index. This is because the query optimizer thinks it will be better to use the clustered index instead of any other index. Why? Probably because there are few rows in the States table (50 plus maybe a few others for Washington, D.C., Puerto Rico, etc.).

SQL Server stores and retrieves data on 8KB pages. The row size (see Estimated Row Size) for the filter operation is 8052 bytes, which means there is one row per page and about 50 pages in the entire table. The query plan estimates that it will process about 18 of those rows (see Estimated Number of Rows). This is not a significant number of rows to process. My explanation doesn't address extra pages that are part of the table, but the point is that the number is around 50 pages, not 50,000.
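
If you want to check the page counts rather than estimate them from the plan, something along these lines should work. This is only a sketch: it uses the standard `sys.dm_db_partition_stats` and `sys.indexes` views, and assumes nothing beyond the `dbo.States` table name from the question.

```sql
-- Sketch: page and row counts for each index on dbo.States.
-- Large geography values may be stored off-row, so the LOB page
-- count is shown alongside the in-row page count.
SELECT  i.name                     AS IndexName,
        ps.row_count,
        ps.in_row_data_page_count,
        ps.lob_used_page_count,
        ps.used_page_count
FROM    sys.dm_db_partition_stats AS ps
JOIN    sys.indexes AS i
        ON  i.object_id = ps.object_id
        AND i.index_id  = ps.index_id
WHERE   ps.object_id = OBJECT_ID(N'dbo.States');
```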

So, back to why it uses the PK_States index instead of the SPATIAL_States_Boundary index. The clustered index, by definition, contains the actual data for the table. A non-clustered index points to the page where the data exists, so there are more pages to retrieve. So, the non-clustered index becomes useful only when there are larger amounts of data.

There may be things you can do to reduce the number of pages processed (e.g., use a covering index), but your current query is already well optimized and you won't see much performance improvement.

bobs
@Bobs Cheers for the detailed reply. I really, really appreciate it :) A few more questions: if the query optimizer decides to use the clustered index, cool. And you're right, there are only around 56 or so rows in that table. So why would it take so long? Is there something else I can look at to see where throughput is getting hit? :( If it's not the query, what else can it be? (The server is not running at 100%, btw.)
Pure.Krome
@Bobs - I've also updated the initial post with some more info, down the bottom under UPDATE/EDIT. Can you please have a re-read?
Pure.Krome
That's very interesting that the spatial index works so well. I suspect that evaluating the spatial condition via the clustered index is not as fast as using the spatial index. For testing purposes, you might try a non-clustered index on a non-spatial column and run a query that uses this column. Then, see how the query compares to your previous queries. It would be interesting to see how a query without spatial logic performs.
bobs
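
A rough sketch of that test, under some assumptions: the index name `IX_States_Name` and the literal `N'Texas'` are made up for illustration, and it assumes `Name` is a string column that can be indexed.

```sql
-- Sketch: baseline timing for a query with no spatial logic.
-- IX_States_Name is a hypothetical index name.
CREATE NONCLUSTERED INDEX [IX_States_Name]
    ON [dbo].[States] ([Name]);
GO

SET STATISTICS TIME ON;   -- report CPU and elapsed time per statement

-- N'Texas' is only an example value.
SELECT  StateId, Name
FROM    [dbo].[States]
WHERE   Name = N'Texas';

SET STATISTICS TIME OFF;
GO
```

Comparing the elapsed time here with the spatial query gives a feel for how much of the 6 seconds comes from the spatial predicate itself rather than from reading the table.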
+1  A: 

Try this, without the index hint:

EXEC sp_executesql N'
  SELECT CAST(2 AS TINYINT) AS LocationType, a.Name AS FullName, 
      StateId, a.Name, Boundary.STAsText() AS Boundary, 
      CentrePoint.STAsText() AS CentrePoint
  FROM [dbo].[States] a
  WHERE @BoundingBox.STIntersects(a.Boundary) = 1'
, N'@BoundingBox GEOGRAPHY', @BoundingBox

If that makes any difference, see here for more details:

If you're running the code in SSMS, use sp_executesql around the spatial query (or use your own stored procedure with the spatial value as a parameter) to ensure the query coster "knows" the parameter value at the time it's creating the query plan, that is, at the beginning of the batch or on entry to a stored procedure or sp_executesql.

Peter
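
For the stored procedure route mentioned in the quote, a minimal sketch could look like the following. The procedure name `dbo.GetStatesInBoundingBox` is hypothetical; the query body is the one from the question.

```sql
-- Sketch: wrap the query in a stored procedure so the spatial value
-- arrives as a parameter and is known when the plan is compiled.
-- dbo.GetStatesInBoundingBox is a made-up name.
CREATE PROCEDURE [dbo].[GetStatesInBoundingBox]
    @BoundingBox GEOGRAPHY
AS
BEGIN
    SET NOCOUNT ON;

    SELECT CAST(2 AS TINYINT) AS LocationType, a.Name AS FullName,
        StateId, a.Name, Boundary.STAsText() AS Boundary,
        CentrePoint.STAsText() AS CentrePoint
    FROM [dbo].[States] a
    WHERE @BoundingBox.STIntersects(a.Boundary) = 1;
END
GO
```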
@Peter interesting ... So I moved it into a stored proc, with the BoundingBox variable getting passed in. It's now 9 seconds with no index getting used. This is on SQL 2008 R2 / 10.50 RTM.
Pure.Krome
@Peter also, running it (as above) in SSMS returns in 9 seconds with no index being used, and putting that code into a stored proc with the BoundingBox variable passed in does the same. :(
Pure.Krome
That's pretty funny. You might also play around with `.Filter()`, if you haven't already. I often find myself materializing the results of `.Filter()` in a temp table, and then doing more precise operations off of that.
Peter
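
A sketch of that two-step pattern, with assumptions spelled out: the bounding box coordinates and SRID 4326 are invented for illustration (the real values come from the map viewport and must match the SRID stored in `Boundary`), and `#Candidates` is a made-up temp table name.

```sql
-- Hypothetical bounding box; the ring is listed counter-clockwise so the
-- enclosed area is the interior. SRID 4326 is an assumption.
DECLARE @BoundingBox GEOGRAPHY = geography::STGeomFromText(
    'POLYGON((-125 25, -65 25, -65 50, -125 50, -125 25))', 4326);

-- Step 1: coarse pass. Filter() can use the spatial index and may
-- return some false positives.
SELECT  a.StateId, a.Name, a.Boundary, a.CentrePoint
INTO    #Candidates
FROM    [dbo].[States] a
WHERE   a.Boundary.Filter(@BoundingBox) = 1;

-- Step 2: exact pass over the (hopefully much smaller) candidate set.
SELECT  CAST(2 AS TINYINT) AS LocationType, c.Name AS FullName,
        c.StateId, c.Name, c.Boundary.STAsText() AS Boundary,
        c.CentrePoint.STAsText() AS CentrePoint
FROM    #Candidates c
WHERE   @BoundingBox.STIntersects(c.Boundary) = 1;

DROP TABLE #Candidates;
```

Note that when no spatial index is used, `Filter()` gives the same results as `STIntersects()`, so the first step only helps if the index actually gets picked.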
Also, I recommend cross-posting at http://social.msdn.microsoft.com/Forums/en-US/sqlspatial. The readership there is much better than SO for SQL Server spatial questions.
Peter
