I am looking for a search engine that can regularly (daily-ish) scan about 100 pages for changes and index an associated site if changes since the last scan are found. It should be able to handle about 100 sites, each averaging 4000 pages of about 5k average size, each on a different server (but only the one centralized search engine). Each of these sites will have a search form that gets submitted to this search engine. The results that are returned must be specific to the site that submitted them. I create the templates for the external sites, so I can give the search form a hidden field that specifies which site the form is submitted from.
What would you recommend I look into?
I would love to use a Python-based system for this, if feasible.
I am currently using something called iSearch2. It doesn't seem very stable at this scale, the description of the product states it is not really intended to do multiple sites, is in PHP (which is less comfortable to me than Python), and has a few other shortcomings for my specific situation.