Hi,
I have a data model like the following:
username | product1 | product2
-------------------------------
harold abc qrs
harold abc def
harold def abc
kim abc def
kim lmn qrs
...
username | friend_username
---------------------------
john harold
john kim
...
I want to build a histogram of the most frequent product1 to product2 records there are, restricted to a given product1 id, and restricted only to friends of john. So something like:
What do friends of john link to for product1, when product1='abc': Select all of john's friends from the friends table. For each friend, count and group the number of records where product1 = 'abc', sort results in desc order:
Results:
abc -> def (2 instances)
abc -> qrs (1 instance)
I know we can do the following in a relational database, but there will be some threshold where this kind of query will start utilizing a lot of resources. Users might have a large number of friend records (500+). If this query is running 5 times every time a user loads a page, I'm worried I'll run out of resources quickly.
Is there some other table I can introduce to my model to relieve the overhead of doing the above query everytime users want to see the histogram break down? All I can think of is to precompute the histograms when possible so that reads optimized.
Thanks for any ideas