views:

68

answers:

1

IIRC, each Facebook user can have 5000 friends. The average is 130, but the maximum is much higher. Each of those friends can have 'liked' zero or more entities drawn from a set of millions. When e.g. looking at a subset of those entities, grouped by N axes (e.g. by category and size), how would you find those that friends have liked?

With GAE, the cost is compute time rather than data size. You can't, at search-time, find all entries by any friend in a given category and size. You could add an entry for a user when each friend performs an action, but that would mean up to 5000 data entries each time a friend does something. That is a lot of CPU time, even in the background. You'd also miss new friends trying the app out, who were missed in the initial add. It makes sense to try to segment the space but friends are linked in very hard-to-group ways.

Any ideas? Have you solved similar issues?

+1  A: 

Try using low-level datastore API, with the item ID as a key and "friends" IDs as property names (property value is true or false in this case). I think that lookup with a key filter must be quite fast.

Dmitry
There are a few issues with the method as it stands, but using the property-space is probably one part of a solution so I'll give you an upvote! Thanks, Dmitry.
Richard Watson