What are proven, scalable data persistence solutions for consumer profiles?

Consumer profiles with analytical scores [ConsumerID, 1..n demographic variables, 1..n analytical scores, e.g. "likely to churn", "likely to buy an item worth more than $100", etc.] must be fast to query if they are to be used for customizing web sites, consumer communications etc.

Well. If you have:

A large number of consumers
Large profiles with a huge set of variables (as profiles describing human behaviour are likely to be)

...you are in trouble. If every query goes to a disk-backed relational database, and a physical disk has to start spinning somewhere to return an individual profile or a set of profiles, the profile user (a web site customizing a page, a recommendation engine making a recommendation) will have died of boredom before getting any observable results.

There is the possibility of holding the profiles in memory, which would of course improve performance hugely. What are the most proven solutions for fast-response, scalable consumer profile storage? Is there a shootout of these somewhere?
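To make the in-memory idea concrete, here is a minimal sketch, assuming the whole profile set fits in the RAM of one process. The names `ConsumerProfile` and `InMemoryProfileStore` are hypothetical and only illustrate the [ConsumerID, demographic variables, analytical scores] layout described above; this is not any particular product's API.

```python
from dataclasses import dataclass, field


@dataclass
class ConsumerProfile:
    # Hypothetical layout mirroring [ConsumerID, demographics, scores].
    consumer_id: int
    demographics: dict = field(default_factory=dict)  # e.g. {"age_band": "25-34"}
    scores: dict = field(default_factory=dict)        # e.g. {"likely_to_churn": 0.82}


class InMemoryProfileStore:
    """Illustrative store: every profile lives in one dict keyed by consumer ID."""

    def __init__(self):
        self._profiles = {}

    def put(self, profile):
        self._profiles[profile.consumer_id] = profile

    def get(self, consumer_id):
        # Hash lookup, no disk access; returns None for an unknown ID.
        return self._profiles.get(consumer_id)

    def with_score_at_least(self, score_name, threshold):
        # Linear scan over all profiles; a large deployment would want an index.
        return [
            p for p in self._profiles.values()
            if p.scores.get(score_name, 0.0) >= threshold
        ]


store = InMemoryProfileStore()
store.put(ConsumerProfile(1, {"age_band": "25-34"}, {"likely_to_churn": 0.82}))
store.put(ConsumerProfile(2, {"age_band": "45-54"}, {"likely_to_churn": 0.10}))
```

A single-profile lookup is then `store.get(1)`, and a set query is `store.with_score_at_least("likely_to_churn", 0.5)`. The sketch says nothing about persistence, replication or sharding, which is exactly where the "proven" part of the question comes in.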

Thanks for the comment, however: "working on this problem" == *proven solution* for a fast-response, scalable consumer profile storage? Just asking.

Hubbard 2010-06-02 11:25:56

Some of the Mahout contributors are responsible for giant, deployed solutions.

bmargulies 2010-06-02 11:35:51

A meager attempt to improve the odds of hitting the queried profile(s) is of course to have some type of a "reel" - that is, you constantly take a random sample off the disk, hold that part in memory, and target the query to the DB only for the parts you don't have in memory. But this seems to me a workaround, not a solution.

Hubbard 2010-06-02 11:40:51
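The "reel" idea from the comment above can be sketched roughly as follows. This is only an illustration under stated assumptions: an in-memory SQLite table stands in for the disk-backed database, profiles are stored as JSON text, and the class name `ReelCache` and the table and column names are made up for the example.

```python
import json
import sqlite3


class ReelCache:
    """Holds a random sample of profiles in memory; misses fall through to the DB."""

    def __init__(self, conn, sample_size):
        self.conn = conn
        self.sample_size = sample_size
        self.sample = {}
        self.db_hits = 0  # counts lookups that had to go to the database

    def refresh(self):
        # Replace the in-memory part with a fresh random sample from the DB.
        rows = self.conn.execute(
            "SELECT consumer_id, profile FROM profiles ORDER BY RANDOM() LIMIT ?",
            (self.sample_size,),
        ).fetchall()
        self.sample = {cid: json.loads(blob) for cid, blob in rows}

    def get(self, consumer_id):
        if consumer_id in self.sample:
            return self.sample[consumer_id]
        # Not in the sample: query the database for this one profile.
        self.db_hits += 1
        row = self.conn.execute(
            "SELECT profile FROM profiles WHERE consumer_id = ?", (consumer_id,)
        ).fetchone()
        return json.loads(row[0]) if row else None


# Stand-in database with ten toy profiles (assumed schema, not a real one).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE profiles (consumer_id INTEGER PRIMARY KEY, profile TEXT)")
conn.executemany(
    "INSERT INTO profiles VALUES (?, ?)",
    [(i, json.dumps({"likely_to_churn": i / 10})) for i in range(10)],
)
reel = ReelCache(conn, sample_size=3)
reel.refresh()
```

With a uniformly random sample the hit rate is roughly the sample size divided by the number of profiles, which is why this looks like a workaround: unless most of the data is in memory anyway, most queries still reach the disk.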
