Sunday, January 10, 2010

Ugly queries

I've never been a fan of ORMs in general. I love the concept. I think it's awesome but since the early days of hibernate, I've always had problems with automatically generated SQL statements. It's not as bad as it used to be but sometimes you get "gems" in the wild.

Today's gem comes courtesy of the Geokit plugin for Rails. Our newest application at AJC uses Geokit for finding properties within a 10-mile radius of the entered address. We've got a table with about 500k rows of geocoded metro Atlanta property data in MySQL.

Enter the query:


SELECT
*,
(ACOS(least(1,COS(0.593807786352424)
*COS(-1.47255067854929)
*COS(RADIANS(parcels.latitude))
*COS(RADIANS(parcels.longitude))+
COS(0.593807786352424)
*SIN(-1.47255067854929)
*COS(RADIANS(parcels.latitude))
*SIN(RADIANS(parcels.longitude))+
SIN(0.593807786352424)*SIN(RADIANS(parcels.latitude))))*3963.19)
AS distance FROM `parcels`
WHERE (((parcels.latitude>32.5769814714641
AND parcels.latitude<35.4683785285359>-86.1150637195914
AND parcels.longitude<-82.6268142804086))
AND ((ACOS(least(1,COS(0.593807786352424)
*COS(-1.47255067854929)*COS(RADIANS(parcels.latitude))
*COS(RADIANS(parcels.longitude))+ COS(0.593807786352424)
*SIN(-1.47255067854929)
*COS(RADIANS(parcels.latitude))
*SIN(RADIANS(parcels.longitude))+
SIN(0.593807786352424)
*SIN(RADIANS(parcels.latitude))))*3963.19)<= 100))
ORDER BY distance asc
LIMIT 21;


This bad boy is a DOOZY. It's got all the makings of exploiting everything in MySQL that can't be optimized - function results for comparison, ordering by function results and even a LIMIT thrown in for good measure.

And guess what? There's currently not much that can be done to optimize this query. Indexes won't help. The query cache is useless because A) the queries are unique to each source address and B) it'll just get pushed out anyway.

I'm looking around for some tips on this and the most likely solution is using spatial indexes in MySQL but that's a whole other issue.

No comments: