I need a system that schedules and carries out the loading of a large number of feeds. The scheduling should take into account priority values that I assign to each feed as well as the feed's past publishing frequency. Later, the system should make use of pubsub where available. I'm currently planning to implement my own system based on HBase and ZooKeeper. If no free software solution exists yet, I would propose at work that we develop ours as Free Software.
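
To make the scheduling requirement concrete, here is a minimal single-process sketch of one possible rule: take the mean gap between a feed's past publish times and divide it by the priority, so that busy or important feeds are fetched more often. Everything in it is an assumption for illustration only (the `Feed` fields, the `poll_interval` formula, the priority scale where higher means more important, and the clamping bounds); it does not touch HBase, ZooKeeper, or pubsub.

```python
import heapq
from dataclasses import dataclass, field


@dataclass
class Feed:
    url: str
    priority: float  # hypothetical scale: higher means more important
    publish_times: list = field(default_factory=list)  # past item timestamps (seconds)


def poll_interval(feed, default=3600.0, min_interval=60.0, max_interval=86400.0):
    """Estimate the seconds to wait before the next fetch of this feed."""
    times = sorted(feed.publish_times)
    if len(times) >= 2:
        # Mean gap between consecutive published items.
        mean_gap = (times[-1] - times[0]) / (len(times) - 1)
    else:
        # Not enough history yet: fall back to an assumed default.
        mean_gap = default
    # A higher priority shortens the interval; clamp to assumed bounds.
    interval = mean_gap / max(feed.priority, 0.1)
    return min(max(interval, min_interval), max_interval)


class Scheduler:
    def __init__(self):
        self._heap = []  # entries: (due_time, sequence, feed)
        self._seq = 0    # tie-breaker so Feed objects are never compared

    def schedule(self, feed, now):
        """Queue the feed for its next fetch, relative to `now`."""
        due = now + poll_interval(feed)
        heapq.heappush(self._heap, (due, self._seq, feed))
        self._seq += 1

    def pop_due(self, now):
        """Return all feeds whose due time has passed, earliest first."""
        due = []
        while self._heap and self._heap[0][0] <= now:
            due.append(heapq.heappop(self._heap)[2])
        return due
```

A worker loop would call `pop_due(now)`, fetch each returned feed, append any newly seen item timestamps to `publish_times`, and call `schedule` again. At the scale of millions of feeds the heap would have to be replaced by a distributed store, which is where something like HBase could come in.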

A: 

I'm not sure that I completely understand your question. Do you need a system that fetches RSS feeds on a regular schedule, archives them, and then allows PubSub to index them? Instead of coding your own solution, why don't you just set up a Yahoo Pipe and then allow PubSub to index that? You wouldn't even need your own server.

Reinderien
I need the system to feed my own search engine. I doubt that Y!Pipes would allow me to query it at that scale, and I also can't set up millions of pipes.
Thomas Koch