I'm looking to use Sphinx for site search, but not all of my site is in MySQL. Rather than reinvent the wheel, I'm wondering whether there's an open source spider that can easily store what it crawls in a MySQL database so that Sphinx can then index it.

Thanks for any advice.

A: 

There's also the XML pipe datasource (xmlpipe2), which can feed documents to Sphinx directly. I'm not sure whether setting something up to output your site's content as XML would be any easier than inserting it into the DB, but it's an option.
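
As a rough illustration of the XML route, here is a minimal sketch of a script that prints pages in the xmlpipe2 document format. The `to_xmlpipe2` function, the `(doc_id, title, body)` tuple shape, and the `title`/`content` field names are all assumptions made up for this example; how you collect the pages (a spider, a walk over static files) is left out.

```python
from xml.sax.saxutils import escape


def to_xmlpipe2(pages):
    """Render pages as an xmlpipe2 stream.

    pages: iterable of (doc_id, title, body) tuples. This shape is
    hypothetical; doc_id should be a unique positive integer.
    """
    lines = [
        '<?xml version="1.0" encoding="utf-8"?>',
        "<sphinx:docset>",
        "<sphinx:schema>",
        # Full-text fields; the names are arbitrary choices for this sketch.
        '<sphinx:field name="title"/>',
        '<sphinx:field name="content"/>',
        "</sphinx:schema>",
    ]
    for doc_id, title, body in pages:
        lines.append('<sphinx:document id="%d">' % int(doc_id))
        # escape() handles &, < and > so page text cannot break the XML.
        lines.append("<title>%s</title>" % escape(title))
        lines.append("<content>%s</content>" % escape(body))
        lines.append("</sphinx:document>")
    lines.append("</sphinx:docset>")
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder data standing in for crawled pages.
    sample = [(1, "Home", "Welcome to the site"), (2, "About", "Q & A")]
    print(to_xmlpipe2(sample))
```

In sphinx.conf the source would then use `type = xmlpipe2` with `xmlpipe_command` pointing at whatever command runs the script, so Sphinx reads the documents from its standard output.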

Ty W
A: 

If you're not 100% stuck on using Sphinx, you could consider Lucene, which this site uses. It should work regardless of the underlying technology (database-driven or static pages).

I am also currently looking to implement a site search. This question may also help.

Pool
I currently run the site (nginx and MySQL) on a single server with 2 GB of RAM. Would I be able to have Lucene running on the same box at the same time?
Ian
Sorry, I can't give a definite yes or no, as I've not implemented this and it depends on other factors. Running the two side by side shouldn't be a problem, though; I'd presume it comes down to the number of concurrent users, etc. I'd check what resources you're already using with vmstat and top, or equivalent.
Pool