I'm trying to build a search engine for a website. It's mostly a collection of HTML/CSS pages with some PHP. That's all there is: all of my content is on the pages.

From what I understand, to be able to do this I would need to have the content in a database. Am I correct?

If so, I was considering creating a MySQL table with four columns: "Keywords", "Titles", "Content" and "Link".

  • Keywords - holds the words that, if they appear in the query, mark this row as the most likely result.
  • Titles - searched after Keywords; a match in the title should produce the next most relevant results.
  • Content - should be a last resort for finding something, as I believe it will be messier.
  • Link - is just the link that belongs to the particular row.
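The four-column layout above can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses Python with an in-memory SQLite database instead of PHP and MySQL so that it runs on its own, and the table name `pages` and the functions `build_index` and `search` are made up for the example. The tiering idea (keywords first, then titles, then content) is the one described in the list.

```python
import sqlite3


def build_index(rows):
    """Create an in-memory table with the four columns from the question.

    Assumption: SQLite stands in for MySQL; `pages` is a hypothetical table name.
    Each row is a (keywords, title, content, link) tuple.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pages (keywords TEXT, title TEXT, content TEXT, link TEXT)"
    )
    conn.executemany("INSERT INTO pages VALUES (?, ?, ?, ?)", rows)
    return conn


def search(conn, term):
    """Return links ordered by where the term matched.

    Tier 0 = keywords, tier 1 = title, tier 2 = content only.
    Note: '%' and '_' inside `term` act as LIKE wildcards in this sketch.
    """
    pattern = f"%{term}%"
    sql = """
        SELECT link,
               CASE
                   WHEN keywords LIKE :p THEN 0
                   WHEN title LIKE :p THEN 1
                   ELSE 2
               END AS tier
        FROM pages
        WHERE keywords LIKE :p OR title LIKE :p OR content LIKE :p
        ORDER BY tier, link
    """
    return [link for link, _tier in conn.execute(sql, {"p": pattern})]
```

Calling `search(conn, "pricing")` on a small table returns the rows whose keywords match first, then title matches, then rows where only the content matches. Substring matching with `LIKE` scans every row, so this only suits a small site.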

I will be implementing it with PHP and MySQL, and it will be tiresome to put all the content, titles, etc. into a DB. Is this a good method, or should I be looking at something else?

Thanks.

---------------EDIT-------------------

Lucene seems like a good option. However, even after reading the "Getting Started" guide and looking around a bit on the web, I can't understand how it works. Can someone point me somewhere that explains this in a very, very basic manner? Especially taking into consideration that I do not know how to compile anything.

Thank you.

+1  A: 

Sphinxsearch is an open-source full-text search server, designed from the ground up with performance, relevance (aka search quality), and integration simplicity in mind.

Sphinx lets you either batch index and search data stored in an SQL database, NoSQL storage, or just files quickly and easily — or index and search data on the fly, working with Sphinx pretty much as a database server.

The MYYN
The only thing to point out is that, of course, it needs to be installed on the server (which he may or may not be able to do, I do not know).
nico
+2  A: 

Building a search engine from scratch is painful. It is an interesting task, indeed, so if it is for learning, then do it!

However, if you just need a good search function for your web site, please use something that others have done for you. Apache Lucene is one option.

Johan
A: 

I'm assuming your pages are static HTML. You can do two things at once by moving the content of the pages into the DB, so that the pages are generated on the fly by reading their content from the DB.
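Moving the page content into the DB does not have to be done by hand. As a minimal sketch, assuming the pages are plain rendered HTML, something like the following could pull out the title and visible text of each page, ready to be inserted as a row. It is written in Python with only the standard library; `PageExtractor` and `extract` are hypothetical names for this example, not part of any existing tool.

```python
from html.parser import HTMLParser


class PageExtractor(HTMLParser):
    """Collect the <title> text and the visible text of one HTML page."""

    SKIPPED = {"script", "style"}  # their contents are not searchable text

    def __init__(self):
        super().__init__()
        self.title_parts = []
        self.text_parts = []
        self._in_title = False
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title_parts.append(data)
        elif self._skip_depth == 0:
            self.text_parts.append(data)


def extract(html, link):
    """Return a (title, content, link) tuple with whitespace collapsed."""
    parser = PageExtractor()
    parser.feed(html)
    parser.close()
    title = " ".join("".join(parser.title_parts).split())
    content = " ".join(" ".join(parser.text_parts).split())
    return (title, content, link)
```

Running `extract` over every file and inserting the resulting tuples would fill the Titles, Content and Link columns; the Keywords column would still need to be chosen by hand.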

Anyway, I think your strategy is OK, at least for a basic search engine. Also have a look at MySQL full-text search.

nico