I am thinking of developing a web search engine using Erlang, Mnesia, and YAWS. Is it possible to build a powerful and fast web search engine with these? What would it take to accomplish this, and how should I start? If you have any suggestions, I'll be grateful.
views: 681
answers: 4
As far as I know, Powerset's natural language processing search engine was developed using Erlang.
Have you looked at CouchDB (which is also written in Erlang) as a possible tool to help you solve some of the problems along the way?
In the 'rdbms' contrib, there is an implementation of the Porter Stemming Algorithm. It was never integrated into 'rdbms', so it's basically just sitting out there. We have used it internally, and it worked quite well, at least for datasets that weren't huge (I haven't tested it on huge data volumes).
The relevant modules are:
rdbms_wsearch.erl
rdbms_wsearch_idx.erl
rdbms_wsearch_porter.erl
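To make the idea behind a stemmed word-search index concrete, here is a minimal sketch in Python (chosen only for brevity; the same structure maps onto Erlang and Mnesia tables). It does not reproduce the 'rdbms' modules. `crude_stem` is a hypothetical, deliberately naive suffix stripper standing in for the real Porter algorithm, and all function names are assumptions made for illustration.

```python
import re
from collections import defaultdict

# Hypothetical, deliberately crude suffix stripper. This is NOT the Porter
# algorithm; it only illustrates where a stemmer plugs into indexing.
SUFFIXES = ("ing", "ed", "es", "s")

def crude_stem(word):
    """Strip the first matching suffix, keeping at least 3 characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def tokenize(text):
    """Lowercase, split into alphanumeric words, and stem each word."""
    return [crude_stem(w) for w in re.findall(r"[a-z0-9]+", text.lower())]

def build_index(docs):
    """Build an inverted index: stemmed term -> set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index

def search(index, query):
    """Return ids of documents containing every (stemmed) query term."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result

docs = {
    1: "Erlang processes are lightweight",
    2: "Indexing web pages with Erlang",
    3: "Searching indexed pages quickly",
}
index = build_index(docs)
print(search(index, "index pages"))
```

Because both documents and queries pass through the same stemmer, "Indexing" and "indexed" land under the same term, which is what lets a query for "index" find both pages.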
Then there is, of course, the Disco Map-Reduce framework.
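For readers unfamiliar with the Map-Reduce model, the following is a small in-process sketch of how it could build a search index. It does not use Disco's API; `map_phase`, `shuffle`, `reduce_phase`, and `run_job` are hypothetical names, and a real framework would spread the map and reduce calls across many machines.

```python
from collections import defaultdict

def map_phase(doc_id, text):
    """Map step: emit one (word, doc_id) pair per word in the document."""
    for word in text.lower().split():
        yield word, doc_id

def shuffle(pairs):
    """Group all emitted values by key, as the framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(word, doc_ids):
    """Reduce step: collapse the values for a word into a sorted posting list."""
    return word, sorted(set(doc_ids))

def run_job(docs):
    """Run map, shuffle, and reduce sequentially in a single process."""
    pairs = (
        pair
        for doc_id, text in docs.items()
        for pair in map_phase(doc_id, text)
    )
    grouped = shuffle(pairs)
    return dict(reduce_phase(word, ids) for word, ids in grouped.items())

pages = {
    "a.html": "erlang search engine",
    "b.html": "search with erlang erlang",
}
postings = run_job(pages)
print(postings["erlang"])
```

The map calls are independent of one another, and so are the reduce calls, which is why this shape of job parallelizes well over a large crawl.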
Whether or not you can make the fastest engine out there, I couldn't say. Is there a market for a faster search engine? I've never had problems with the speed of e.g. Google. But a search facility that increased my chances of finding good answers to my questions would interest me.
I would recommend CouchDB instead of Mnesia.
- Mnesia doesn't have Map-Reduce, CouchDB does (correction - see comments)
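- Mnesia tables have a fixed record structure (each table is defined with a set list of attributes), whereas CouchDB is a schema-free document database. Web pages are documents, so CouchDB is a better fit for the information model, in my opinion.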
- Mnesia is primarily intended to be a memory-resident database
YAWS is pretty good. You should also consider MochiWeb.
You won't go wrong with Erlang.