Superfeedr is an on-demand feed-parsing service. We want to provide analytics to our users, and we're investigating what the best strategy would be.
In a nutshell, we want to track the number of operations in our system (events, such as a new entry in a given feed) as well as aggregated data (such as the number of subscribers for a feed).
Of course, the aggregated data can be computed from the events: the number of subscribers to a feed is the number of subscriptions minus the number of unsubscriptions. Yet since we want to study this over time (the number of subscribers on a daily basis), a purely evented approach may be sub-optimal, because we would recompute the same thing over and over.
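To make the trade-off concrete, here is a minimal Python sketch of the two approaches. Everything in it is hypothetical and for illustration only: the `(day, feed, delta)` event shape, the feed name, the dates, and the function names are assumptions, not our actual schema or code.

```python
from collections import defaultdict
from datetime import date
from itertools import accumulate

# Hypothetical event log: (day, feed, delta), where delta is +1 for a
# subscription and -1 for an unsubscription.
events = [
    (date(2010, 1, 1), "feed-a", +1),
    (date(2010, 1, 1), "feed-a", +1),
    (date(2010, 1, 2), "feed-a", -1),
    (date(2010, 1, 3), "feed-a", +1),
]


def count_from_events(events, feed, day):
    """Evented approach: replay every event up to and including `day`."""
    return sum(delta for d, f, delta in events if f == feed and d <= day)


def daily_totals(events, feed):
    """Aggregated approach: fold the events once into a running total per day."""
    net = defaultdict(int)
    for d, f, delta in events:
        if f == feed:
            net[d] += delta
    days = sorted(net)
    running = accumulate(net[d] for d in days)
    return dict(zip(days, running))


totals = daily_totals(events, "feed-a")
```

With the evented approach, every point on a daily graph rescans the log, so plotting N days costs N passes over the events. With the aggregated approach, the log is folded once and each day becomes a lookup in `totals`, at the price of storing and maintaining the per-day rows.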
How would you build such a component in your app? What would the information flow look like? Which data stores would you use? Which graphing solution?
I know this is quite an open question, but I am sure we're not the first ones with such a need!
[UPDATE] Infrastructure: we have a set of workers that are XMPP clients and that all interact with one another. They are built on EventMachine, which means that they do not block on IO. Desired target: we must be able to collect massive amounts of data. We are currently at about 200-300 msg/sec, and we aim for 10x-100x that.