I've been using the low-level datastore API for App Engine in Java for a while now, and I'm trying to figure out the best way to handle one-to-many relationships. Imagine a one-to-many relationship like "any one student can have zero or more computers, but every computer is owned by exactly one student".
The two options are to:
- have the student entity store a list of Keys of the computers associated with the student
- have the computer entity store a single Key of the student who owns the computer
I have a feeling option two is better, but I am curious what other people think.
The advantage of option one is that you can get all the 'manys' back without using a Query: you fetch the entities by passing the stored list of keys to get(). The problem with this approach is that the datastore cannot sort the values that get() returns, so you must do the sorting yourself. Plus, you have to manage a list of Keys rather than a single Key.
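To make option one concrete, here is a minimal in-memory sketch. It does not use the App Engine SDK: the `Map` stands in for the datastore, `getAll` stands in for a batch get(), and every name in it (`OptionOneSketch`, `Computer`, `purchaseDate`, `computersByPurchaseDate`) is made up for illustration. The point it shows is that the fetch needs no query, but the ordering has to happen in application code.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory stand-in for option one. None of these names come from the App Engine SDK.
class OptionOneSketch {

    static final class Computer {
        final long id;
        final LocalDate purchaseDate;

        Computer(long id, LocalDate purchaseDate) {
            this.id = id;
            this.purchaseDate = purchaseDate;
        }
    }

    // Plays the role of a batch get(): one lookup per stored key, no query involved.
    static List<Computer> getAll(Map<Long, Computer> store, List<Long> keys) {
        List<Computer> found = new ArrayList<>();
        for (Long key : keys) {
            Computer c = store.get(key);
            if (c != null) {
                found.add(c);
            }
        }
        return found;
    }

    // The student's key list carries no purchase-date order, so the application sorts.
    static List<Computer> computersByPurchaseDate(Map<Long, Computer> store,
                                                  List<Long> studentComputerKeys) {
        List<Computer> computers = getAll(store, studentComputerKeys);
        computers.sort(Comparator.comparing((Computer c) -> c.purchaseDate));
        return computers;
    }

    public static void main(String[] args) {
        Map<Long, Computer> store = new HashMap<>();
        store.put(1L, new Computer(1L, LocalDate.of(2011, 5, 1)));
        store.put(2L, new Computer(2L, LocalDate.of(2009, 3, 15)));
        store.put(3L, new Computer(3L, LocalDate.of(2010, 8, 20)));

        // The list of keys that the student entity would hold under option one.
        List<Long> studentComputerKeys = Arrays.asList(1L, 2L, 3L);
        for (Computer c : computersByPurchaseDate(store, studentComputerKeys)) {
            System.out.println(c.id + " " + c.purchaseDate);
        }
    }
}
```

The explicit `sort` call is the part that option one pushes onto the application; it has to be repeated, with a different comparator, for every ordering you want.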
Option two seems nice because there is no list to maintain. Also, you can sort by properties of the computer as long as there is an index for that property. Imagine trying to get all the computers for a student, sorted by purchase date. With option two it is a simple query, and no sorting is done in our code (the datastore's index takes care of it).
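Option two can be sketched the same way. Again this is an in-memory model, not the App Engine API: a `TreeSet` ordered by (owner, purchase date, id) stands in for a datastore index, and all names (`OptionTwoSketch`, `ownerId`, `INDEX_ORDER`, `computersByPurchaseDate`) are hypothetical. The query becomes a range scan over entries that are already in order, so the application never sorts.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

// In-memory stand-in for option two. None of these names come from the App Engine SDK.
class OptionTwoSketch {

    static final class Computer {
        final long id;
        final long ownerId;           // the single "Key" of the owning student
        final LocalDate purchaseDate;

        Computer(long id, long ownerId, LocalDate purchaseDate) {
            this.id = id;
            this.ownerId = ownerId;
            this.purchaseDate = purchaseDate;
        }
    }

    // Models an index on (owner, purchase date); id breaks ties so entries stay distinct.
    static final Comparator<Computer> INDEX_ORDER =
        Comparator.comparingLong((Computer c) -> c.ownerId)
                  .thenComparing((Computer c) -> c.purchaseDate)
                  .thenComparingLong(c -> c.id);

    static TreeSet<Computer> newIndex() {
        return new TreeSet<>(INDEX_ORDER);
    }

    // Range scan: start at the first possible entry for this owner and stop at the next owner.
    static List<Computer> computersByPurchaseDate(TreeSet<Computer> index, long ownerId) {
        Computer start = new Computer(Long.MIN_VALUE, ownerId, LocalDate.MIN);
        List<Computer> result = new ArrayList<>();
        for (Computer c : index.tailSet(start, true)) {
            if (c.ownerId != ownerId) {
                break;
            }
            result.add(c); // already in purchase-date order, no sort call needed
        }
        return result;
    }

    public static void main(String[] args) {
        TreeSet<Computer> index = newIndex();
        index.add(new Computer(1L, 7L, LocalDate.of(2011, 5, 1)));
        index.add(new Computer(2L, 7L, LocalDate.of(2009, 3, 15)));
        index.add(new Computer(3L, 8L, LocalDate.of(2010, 1, 1)));
        index.add(new Computer(4L, 7L, LocalDate.of(2010, 8, 20)));

        for (Computer c : computersByPurchaseDate(index, 7L)) {
            System.out.println(c.id + " " + c.purchaseDate);
        }
    }
}
```

In this model the ordering work is paid when each computer is inserted into the index, which mirrors the space-for-time trade described in the question.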
Sorting is not really hard, but it is a little more time-consuming (~O(n log n) for a sort) than reading from a sorted index (~O(n) for going through the index). The tradeoff is index storage in the datastore versus processing time in the application. As I said, my instinct tells me option two is the better general solution, because it gives the developer a little more flexibility in getting results back in order, at the cost of additional indexes (which, with the Google pricing model, are pretty cheap). Does anyone agree, disagree, or have comments?