Quick intro
I've built a system that requests stats from social network APIs for thousands of different subjects every 20 minutes. I make one call to each social network per subject, which means I'm making thousands of HTTP requests in each 20-minute slot. The results are then processed in a separate task.
Current solution
I'm running PHP from the command line, invoked periodically by Supervisor. Data is then saved to MySQL.
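For reference, the Supervisor side of this is just a program block along these lines (program name, paths, and log location are made up; this assumes the script exits after each pass and `autorestart` re-launches it):

```ini
[program:stats_fetcher]
; Hypothetical paths - adjust to your own layout.
command=/usr/bin/php /opt/stats/fetch.php
autostart=true
autorestart=true                        ; re-launch the worker when it exits
stdout_logfile=/var/log/stats_fetcher.log
```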
Lots of issues!
PHP has no native multi-threading, and my scripts currently make their HTTP requests synchronously, so fetching the data from the social networks one connection at a time takes a long time.
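For what it's worth, stock PHP can run many HTTP requests concurrently in a single process via the cURL extension's `curl_multi_*` functions, no threads required. A minimal sketch (the function name and options are my own; error handling is deliberately thin):

```php
<?php
// Fetch many URLs concurrently with curl_multi.
// Returns an array mapping each URL to its response body, or null on error.
function fetch_all(array $urls)
{
    $mh = curl_multi_init();
    $handles = [];
    foreach ($urls as $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_TIMEOUT, 30);
        curl_multi_add_handle($mh, $ch);
        $handles[$url] = $ch;
    }

    // Drive all transfers until none are still running.
    do {
        $status = curl_multi_exec($mh, $running);
        if ($running) {
            if (curl_multi_select($mh) === -1) {
                usleep(1000); // avoid busy-waiting if select fails
            }
        }
    } while ($running && $status === CURLM_OK);

    $results = [];
    foreach ($handles as $url => $ch) {
        $results[$url] = curl_errno($ch) === 0 ? curl_multi_getcontent($ch) : null;
        curl_multi_remove_handle($mh, $ch);
        curl_close($ch);
    }
    curl_multi_close($mh);
    return $results;
}
```

With something like this, one worker could fan out a batch of a few hundred requests at once instead of looping over them serially.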
As my data model for the 'subjects' grows more complicated, I may need to start joining tables, and I may also need multiple servers.
Future
More and more subjects will be added, plus analysis tools with lots of number crunching.
I'd be really interested to hear what other people are using in this kind of domain, e.g. platform / language / libraries / database / daemon tools, etc.
John