I have a table storing Lat/Long coordinates, and I want to query for all the records that are within a given distance of a certain point.
This table has about 10 million records, and there's an index on the Lat/Long fields.
This does not need to be precise. Among other things, I'm treating 1 degree of Long as equal to 1 degree of Lat, which I know is not true, but the ellipse I'm getting is good enough for this purpose.
For my examples below, let's say the point in question is [40, 140], and my radius is 2 degrees.
I've tried this 2 ways:
1) I created a UDF to calculate the Square of the Distance between 2 points, and I'm running that UDF in a query.
    SELECT Lat, Long FROM Table
    WHERE (Lat BETWEEN 38 AND 42)
    AND (Long BETWEEN 138 AND 142)
    AND dbo.SquareDistance(Lat, Long, 40, 140) < 4
I'm filtering by a square first, to speed up the query and let SQL use the index, and then refining that to match only the records that fall within the circle with my UDF.
2) Run the query to get the square (same as before, but without the last line), feed ALL those records to my ASP.Net code, and calculate the circle on the ASP.Net side (same idea: calculate the square of the distance to save the Sqrt call, and compare it to the square of my radius).
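For reference, the query for this second approach is just the bounding-box filter from the first one with the UDF predicate dropped. This is a sketch using the same placeholder names (Table, Lat, Long) and the same example point and radius as above:

```sql
-- Bounding-box filter only: returns every row inside the 4x4 degree square
-- around [40, 140]. The circle test is then applied in the ASP.Net code.
SELECT Lat, Long FROM Table
WHERE (Lat BETWEEN 38 AND 42)
AND (Long BETWEEN 138 AND 142)
```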
To my surprise, calculating the circle on the .Net side is about 10 times faster than using the UDF, which leads me to believe that I'm doing something horribly wrong with that UDF...
This is the code I'm using:
    CREATE FUNCTION [dbo].[SquareDistance]
    (@Lat1 float, @Long1 float, @Lat2 float, @Long2 float)
    RETURNS float
    AS
    BEGIN
        -- Declare the return variable here
        DECLARE @Result float
        DECLARE @LatDiff float, @LongDiff float
        SELECT @LatDiff = @Lat1 - @Lat2
        SELECT @LongDiff = @Long1 - @Long2
        SELECT @Result = (@LatDiff * @LatDiff) + (@LongDiff * @LongDiff)
        -- Return the result of the function
        RETURN @Result
    END
Am I missing something here?
Shouldn't using a UDF within SQL Server be much faster than feeding about 25% more records than necessary to .Net, with the overhead of the DataReader, the communication between processes and whatnot?
Is there something I'm doing horribly wrong in that UDF that makes it run slow?
Is there any way to improve it?
Thank you very much!