views:

46

answers:

1

Hi,

I'm using Google Appengine to store a list of favorites, linking a Facebook UserID to one or more IDs from Bing. I need function calls returning the number of users who have favorited an item, and the number of times an item has been favorited (and by whom).

My question is, should I resolve this relationship into two tables for efficiency? If I have a table with columns for Facebook ID and Bing ID, I can easily use select queries for both of the functions above, however this will require that each row is searched in each query. The alternative is having two tables, one for each Facebook user's favorites and the other for each Bing item's favorited users, and using transactions to keep them in sync. The two tables option has the advantage of being able to use JSON or CSV in the database so that only one row needs to be fetched, and little manipulation needs to be done for an API.

Which option is better, in terms of efficiency and minimising cost?

Thanks, Matt

A: 

I don't think there's a hard and fast answer to questions like this. "Is this optimization worth it" always depends on many variables such as, is the lack of optimization actually a problem to start with? How much of a problem is it? What's the cost in terms of extra time and effort and risk of bugs of a more complex optimized implementation, relative to the benefits? What might be the extra costs of implementing the optimization later, such as data migration to a new schema?

Simon Hibbs
Yeah I've been thinking about that. While it's maybe more efficient to do it in two tables, it is certainly simpler to just keep the one table, and then things like debugging and migration are easier. After all, I can easily generate the two tables later on if efficiency is an issue. Thanks :)
Matt