views:

24

answers:

2

Hi,

I've got a MySQL query that pulls lat longs from a database based on a criterion, tests whether these points are within a polygon, and returns the points that are within the polygon.

Everything works fine. The problem is that the query takes approx. 20 seconds to return a result. Is there a way to optimize this query so that query speed is faster?

SELECT latitude, longitude
FROM myTable
WHERE offense = 'green' AND myWithin(
POINTFROMTEXT( CONCAT( 'POINT(', latitude, ' ', longitude, ')' ) ) , POLYFROMTEXT( 'POLYGON(( ...bunch of lat longs...))' )
) = 1;

I ran an EXPLAIN SELECT... which produced

id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra

1 SIMPLE myTable ALL NULL NULL NULL NULL 137003 Using where

Is there a way to optimize a query that is run on every latitude and longitude in the db or is this as good as it gets?

I'm thinking about doing a select into another table and then querying the results table, but I was hoping that there would be a way to improve the performance of this query.

If anyone has any suggestions or ideas, I'd love to hear them.

Thanks,

Laxmidi

A: 

I see two obvious avenues for optimization:

  • Reduce the result set more before you run your function O(n) times. Right now you're running the function 137003 times - there's little way to avoid that if you can't filter the result set any further.

  • Make the function faster, such that you're still running it 137k times, but each invocation takes less time, thus reducing your total runtime.

Right now your function is taking 0.1459 milliseconds per row to run, which really isn't bad. You probably want to try to find some way to further reduce the number of rows you have to run it on. Reducing the result set through clever use of WHERE also has the side benefit of allowing your database to do some optimization for you, which is how you want to be using it.

Nick Bastin
Hi Nick,Thank you for the message. I'm glad that my function is running reasonably fast. I'll have to work on reducing the number of times the function has to run. I really appreciate your help.
Laxmidi
+1  A: 

How big are the polygons? You could define a "bounding rectangle" around the whole polygon and then do:

SELECT latitude, longitude
FROM myTable
WHERE
  offense = 'green' AND
  latitude BETWEEN rect_left AND rect_right AND
  longitude BETWEEN rect_top AND rect_bottom AND
  myWithin(
    POINTFROMTEXT( CONCAT( 'POINT(', latitude, ' ', longitude, ')' ) ),
    POLYFROMTEXT( 'POLYGON(( ...bunch of lat longs...))' )) = 1;

That way, it could use an index on latitude and longitude to narrow down the number of points that it has to run the complex stuff on.

Dean Harding
Hi codeka,The polygons are neighborhoods. And they are irregularly shaped. Is there a way to run MBRContains and the myWithin functions? As I understand it, MBRContains gives false positives. Does it give false negatives? If not, maybe that would be a solution. If there is a way to use MBRContains in concert with the myWithin function, then I could avoid having to add columns for bounding rectangles for each neighborhood. Love to hear your thoughts. I think that the "bounding rectangle" is a good idea.
Laxmidi
You shouldn't have to add columns for the bounding rectangle, it's easily calculated. Just take your "bunch of lat longs" and calculate the minimum and maximum latitude - that's 'rect_left' and 'rect_right' respectively. The minimum and maximum longitude from "bunch of lat longs" is 'rect_bottom' and 'rect_top' respectively. You can simply calculate that before running your query.
Dean Harding
Hi Dean,Thanks for the message. Very good point. I didn't think of that. Problem solved. Thanks so much.
Laxmidi