views:

154

answers:

2

I have thousands of photos on my site (each with a numeric PhotoID) and I have EXIF data (photos can have different EXIF tags as well).

I want to be able to store the data effectively and search it.

Some photos have more EXIF data than others, some have the same, so on..

Basically, I want to be able to query say 'Select all photos that have a GPS location' or 'All photos with a specific camera'

I can't use MySQL (it won't scale well with the massive data size). I thought about Cassandra, but I don't think it lets me query on fields. I looked at SimpleDB, but I would rather: not pay for the system, and I want to be able to run more advanced queries on the data.

Also, I use PHP and Linux, so it would be awesome if it could interface nicely to PHP.

Edit: I would prefer to stick with some form of NoSQL database.

Any ideas?

+1  A: 

I would probably personally stick to MySQL, but if you are looking for a NoSQL style system you might want to look into Solr. That allows things like faceted searches (e.g. tells you how many of your current search result fit into each resolution / format / etc and lets you narrow your search that way).

Zarigani
Solr is an interesting idea. I guess MySQL would work too, but I would rather have datatypes for certain keys (rather than just `varchar` for the key and value)
webdestroya
Was the data types comment aimed at MySql or Solr? If the former, then you need to do a little more [research](http://dev.mysql.com/doc/refman/5.0/en/data-types.html). If the latter, then I admit to not knowing the details of Solr, but since the [tutorial](http://lucene.apache.org/solr/tutorial.html#Faceted+Search) talks about range searches on price, I can only assume it does.
Zarigani
No, datatypes wont work in MySQL that way, Since i need (photoid,exifkey,exifvalue) i have to use a generic datatype for the value - otherwise i would have many null columns
webdestroya
I guess i will concede and try mysql, tho it would be nice to have a more typed system
webdestroya
OK, I understand more of what you are doing now I see how you are planning on storing the data. Out of interest, how many fields does exif define? The wikipedia article only talks about ~30 but didn't indicate if that was a complete list. If there is a complete well-defined list, I would be tempted to give each a column and just accept that you'll have lots of nulls. If the list isn't complete then yes, you are probably stuck with the key/value pair approach.
Zarigani
+2  A: 

I also doubt that MySql would have any load problems, but have a look at CouchDB:

Apache CouchDB is a distributed, fault-tolerant and schema-free document-oriented database accessible via a RESTful HTTP/JSON API.

Gordon