views: 498
answers: 2

I have a MySQL database table that will store locations for buildings, events, and a few other things. All locations are stored in one table and linked to buildings, events, etc. through their own many-to-many tables. That way I can just display dots on a map and also allow filtering and so on.

However, the problem comes with some things having a single location, so one lat/long pair, while something like a track has a number of lat/long positions, and something like a large stadium might have a polygon over it. These are also stored as a list of lat/long pairs, with the first and last being the same.

I'm wondering how I should store this in the MySQL DB, though. Originally I just had columns for lat, long, and an id in the lookup table. Should I have ANOTHER lookup table for the coordinates, serialise the data in some way before putting it into the DB, or just store the whole string in one field, e.g. lat1,long1 for a single point and lat1,long1;lat2,long2;lat1,long1 for a polygon?
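For context, here is a rough sketch of what I have now (table and column names here are just illustrative, not my real schema):

    CREATE TABLE locations (
        location_id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
        lat         DECIMAL(9,6) NOT NULL,
        lng         DECIMAL(9,6) NOT NULL
    );

    -- One linking table per entity type, e.g. buildings:
    CREATE TABLE building_locations (
        building_id INT UNSIGNED NOT NULL,
        location_id INT UNSIGNED NOT NULL,
        PRIMARY KEY (building_id, location_id)
    );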

Any suggestions?

+1  A: 

Since you're not doing any lookups on the locations, and you're using (I'm assuming) the Google Maps API, the simplest solution would probably be to encode the list of lat/long pairs as JSON and store it in a varchar column.

You can output the JSON straight from the database for your Google Maps API code to use. I would suggest a simple JSON structure like ["point",1.23456,2.34567] or ["line",1.23456,2.34567,3.45678,4.56789], and so on.
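A minimal sketch of that approach, assuming a single locations table (the table and column names are mine, purely for illustration):

    CREATE TABLE locations (
        location_id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
        name        VARCHAR(100) NOT NULL,
        -- whole geometry serialized as JSON text; opaque to SQL queries:
        geometry    VARCHAR(4096) NOT NULL
    );

    INSERT INTO locations (name, geometry) VALUES
        ('Clock Tower',   '["point",1.23456,2.34567]'),
        ('Running Track', '["line",1.23456,2.34567,3.45678,4.56789]');

Your page code can then hand the geometry column's value straight to the JavaScript that draws markers and polylines.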

Blixt
Assuming you are not doing any queries on the locations (polygon near x, etc.), this is definitely the best approach.
Cannonade
Thanks for your answer and for the feedback on the other answer. The data will be used mainly for display on Google Maps. We may wish to cluster, but I can't see us needing to query for locations. It will be show all or nothing!
Paul M
Thanks for your help and insight, Blixt -- it really helped, and I have used this approach on another project I am working on. But Alex answers my question for the example I gave, so I am awarding him the right answer.
Paul M
+2  A: 

I wouldn't denormalize the data from the start by pushing a whole "serialized" polygon into a single field.

Rather, I'd have a Polygons table (with a polygon ID and possibly auxiliary per-polygon info, such as whether it's an actual closed polygon or just a polyline -- though that might alternatively be represented in the following table by having the last point equal the first one for a given polygon), and a PointsInPolygon table (with the coordinates of each point, a polygon-ID foreign key, and the vertex number within the polygon -- the latter two jointly being unique).
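In MySQL DDL, that could look something like this (column names and types are illustrative assumptions):

    CREATE TABLE Polygons (
        polygon_id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
        is_closed  BOOLEAN NOT NULL DEFAULT FALSE  -- closed polygon vs. open polyline
    );

    CREATE TABLE PointsInPolygon (
        polygon_id    INT UNSIGNED NOT NULL,
        vertex_number INT UNSIGNED NOT NULL,
        lat           DECIMAL(9,6) NOT NULL,
        lng           DECIMAL(9,6) NOT NULL,
        PRIMARY KEY (polygon_id, vertex_number),  -- the two are jointly unique
        FOREIGN KEY (polygon_id) REFERENCES Polygons (polygon_id)
    );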

Normalization will (as usual) make your life much simpler for ad-hoc queries (including, in this case, "polygons near X", point-in-polygon tests, etc.). Again as usual, you can later add redundant denormalized values if and when you determine that some specific query really needs to be optimized (at some cost to table updates, integrity checks, etc.). Geodata are not all that different from other kinds of data in this regard.
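For instance, a crude "polygons near X" query via a bounding box stays a one-liner against the normalized tables (the coordinates here are arbitrary):

    -- Find all polygons with at least one vertex inside the box:
    SELECT DISTINCT polygon_id
    FROM PointsInPolygon
    WHERE lat BETWEEN 51.50 AND 51.52
      AND lng BETWEEN -0.13 AND -0.11;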

Alex Martelli
While that is the "proper, allow anything" approach, I believe there is such a thing as over-normalizing. He appears to want the additional location data to visualize locations on the client, not to query for nearby locations or similar. Even if he did want to query for locations, I would still recommend that he store polygons as serialized data, with a single latitude/longitude pair for geo-queries.
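Something like this, to sketch the hybrid (names and types are mine, for illustration only):

    CREATE TABLE locations_hybrid (
        location_id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
        center_lat  DECIMAL(9,6) NOT NULL,  -- one representative, indexable point
        center_lng  DECIMAL(9,6) NOT NULL,
        geometry    TEXT NOT NULL           -- full point/line/polygon as serialized JSON
    );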
Blixt
Alex, thanks for your answer. I need to sit down and decide if we are going to be querying for nearest location etc., or if Blixt's solution is best for us. Choices, choices!
Paul M
@Paul M, yeah, it IS hard -- a normal-form DB lets you delay the decisions and choices, but denormalizing can sure offer performance advantages. I live by Knuth's (and Hoare's) "premature optimization is the root of all evil in programming" - I only optimize once I have hard data showing I _need_ to optimize. @Blixt, I respect your pragmatism, but allow that my stance's pragmatic too: too often in 30 years of my career I've had "rock-solid specs" shifted out from under me -- so I keep my architecture as flexible as I can afford it to be, and normalization==max flexibility!-)
Alex Martelli
This is the option I went for so I could have max flexibility; this app is going to be huge, so we want it to scale.
Paul M
@Alex, I respect your experience and I trust that you're right. But in my own experience I've found cases where over-normalizing made certain things that should be easy much harder, due to an atomic object suddenly being a composition of several related objects. Could you refer me to some guidelines that help one decide where to draw the line?
Blixt
@Blixt, my stance is that I can always add denormalized data later to optimize queries that are _proven_ by profiling to need such optimization (sure, that means redundancy, but then that's almost synonymous with "denormalization"!-). Scott Ambler's "Agile Database Techniques" book has a short but useful section on denormalizing for optimization (you can read most of that section on Google Book Search, but the book is well worth buying for the other material too ;-).
Alex Martelli