I need to analyze a few million geocoded records, each of which has a latitude and longitude. The records include data of at least three different types, and I want to see whether each set influences the others.

Which database is best as the underlying data store for all this data? Here are my requirements:

  • I'm familiar with the DBMS. I'm weakest with PostgreSQL, but I am willing to learn if everything else checks out.
  • It does well with GIS queries. Google searches suggest that PostgreSQL + PostGIS may be the strongest? At least a lot of products seem to use it. MySQL's Spatial Extensions seem comparatively minimal?
  • Low cost. Despite the 10GB DB limit in SQL Server Express 2008 R2, I'm not sure I want to live with this and the other limitations of the free version.
  • Not antagonistic with Microsoft .NET Framework. Thanks to Connector/Net 6.3.4, MySQL works well with C# and .NET Framework 4 programs. It fully supports .NET 4's Entity Framework. I cannot find any noncommercial PostgreSQL equivalent, although I'm not opposed to paying $180 for Devart's dotConnect for PostgreSQL Professional Edition.
  • Compatible with R. It appears all 3 of these can talk to R using ODBC, so this may not be an issue.

I've already done some development using MySQL, but I can change if necessary.

+2  A: 

PostGIS, definitely. Here's why:

  1. Postgres is far superior to MySQL in performance. The server is more fault tolerant and has out-of-the-box tools for load balancing, caching and optimization.
  2. PostGIS is becoming a standard in GIS apps.
  3. It's free.
dekomote
Thank you. And of those points, #2 is especially strong for me.
Aren Cambre
+2  A: 

If you are interested in a thorough comparison, I recommend "Cross Compare SQL Server 2008 Spatial, PostgreSQL/PostGIS 1.3-1.4, MySQL 5-6" and/or "Compare SQL Server 2008 R2, Oracle 11G R2, PostgreSQL/PostGIS 1.5 Spatial Features" by Boston GIS.

Considering your points:

  • I'm familiar with the DBMS: setting up a PostGIS database on Windows is easy, and managing it with PgAdmin3 is straightforward too
  • It does well with GIS queries: PostGIS is definitely the strongest of the three; only Oracle Spatial would be comparable, but it is disqualified once you consider its cost
  • Low cost: +1 for PostGIS for sure
  • Not antagonistic with Microsoft .NET Framework: You should at least be able to connect via ODBC (see Postgres wiki)
  • Compatible with R: shouldn't be a problem with any of the three
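
To give a feel for the kind of GIS query under discussion, here is a minimal sketch in Python that builds a parameterized PostGIS radius search. The table and column names (`records`, `id`, `record_type`, `lon`, `lat`) are hypothetical, the `%s` placeholders assume a driver that uses that style (an ODBC driver would typically use `?` instead), and the `geography` cast assumes PostGIS 1.5 or later.

```python
# Sketch only: table and column names here are hypothetical, not from the question.
def radius_query(table: str = "records") -> str:
    """Return a parameterized PostGIS query for rows within a radius of a point.

    The three placeholders are, in order: longitude, latitude, radius in meters.
    """
    # Identifiers cannot be passed as query parameters, so validate the name.
    if not table.isidentifier():
        raise ValueError(f"unsafe table name: {table!r}")
    return (
        f"SELECT id, record_type FROM {table} "
        "WHERE ST_DWithin("
        # ST_MakePoint takes (x, y), i.e. longitude first, then latitude.
        "ST_SetSRID(ST_MakePoint(lon, lat), 4326)::geography, "
        "ST_SetSRID(ST_MakePoint(%s, %s), 4326)::geography, "
        "%s)"
    )


sql = radius_query()
# Values are supplied separately so the driver can escape them.
params = (-96.80, 32.78, 500.0)  # longitude, latitude, radius in meters
```

With a live connection, `sql` and `params` would be handed to the driver's `execute` call. Casting to `geography` makes `ST_DWithin` measure the radius in meters rather than in degrees.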
underdark
Heh - Oracle Spatial was a $1 million license, last I heard
OMG Ponies
Thank you. The 2nd comparison link is helpful. I only found the first one earlier because I had MySQL in my search terms. So it looks like it's PostgreSQL for me!
Aren Cambre