For the 30x speedup it offers!
I've been playing around with MySQL Spatial Extensions this morning, attempting to optimize some zip code search features. I was surprised to find the difference between using MySQL spatial indexes and using regular arithmetic greater than/less than comparisons in a query.
With spatial indexes enabled on a table of approximately 80,000 zip codes:
SELECT * FROM zip_codes WHERE
MBRContains(GeomFromText('Polygon(
(38.1922003089928 -122.206571969555,
38.1922003089928 -121.283430030445,
38.9170856910072 -122.206571969555,
38.9170856910072 -121.283430030445,
38.1922003089928 -122.206571969555))'), pt);
...
198 rows in set (0.01 sec)
Without spatial indexes on the same table:
SELECT * FROM zip_codes WHERE
((latitude > 38.1922003089928 AND
longitude > -122.206571969555 AND
latitude < 38.9170856910072 AND
longitude < -121.283430030445 ));
...
198 rows in set (0.33 sec)
Although my ZipCodeSearch Plugin uses the slower of these two methods in order to be database-agnostic, you can easily convert it to use MySQL's spatial indexing. Here's how:
- First, open a new migration. [The code in the following three steps should be placed within this migration.]
- Then, make sure your engine is of a type that supports the spatial extensions:
execute("ALTER TABLE zip_codes ENGINE=MyISAM;") - Add the column that will hold your lat/lon POINTs and populate it:
execute("ALTER TABLE zip_codes ADD pt POINT;") ZipCode.find_all.each do |z| lat = z.latitude lon = z.longitude execute("UPDATE zip_codes SET zip_codes.pt = (GeomFromText('POINT(#{lat} #{lon})')) WHERE zip_codes.id = #{z.id};"); end - Add the index. Note that the column has to be non-null in order for the index to be created:
execute("ALTER TABLE zip_codes MODIFY pt Point NOT NULL;") execute("ALTER TABLE zip_codes ADD SPATIAL INDEX(pt);") - Finally, modify the code in that calls "find_objects_within_radius" to pass in a finder_block that uses the spatial extension functions:
@zip_code.find_objects_within_radius(radius.to_i) do |min_lat, min_lon, max_lat, max_lon| p1 = "#{max_lat} #{max_lon}" p2 = "#{max_lat} #{min_lon}" p3 = "#{min_lat} #{min_lon}" p4 = "#{min_lat} #{max_lon}" sql = %{SELECT * FROM zip_codes WHERE \ MBRContains( \ GeomFromText('Polygon((#{p1},#{p2},\ #{p3},#{p4},#{p1}))'),pt);} zips_to_search_for = ZipCode.find_by_sql(sql) end
Some things to note about the query in the last step:
- The Polygon must be closed (see how p1 is the first and last point in the Polygon definition?).
- The Polygon is comprised a set of LineStrings. That is why there are two parentheses wrapping the set of points in our Polygon definition. If you wrapped those points with just one set of parentheses, this query would fail.
There you have it. Now go enjoy your spatial indexes!
Leave a Reply