I have a decent, lightweight search engine working for one of my sites, using MySQL fulltext indexes and PHP to parse the results. It works fine, but I'd like to offer more 'Google-like' results, with text snippets from the results and the found words highlighted. I'm looking for a PHP-based solution. Any recommendations?
Use preg_replace() (or a similar function) to replace your search string with highlighted text, e.g.:
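// preg_quote() escapes regex metacharacters in the user's input; the "i" flag ignores case; $0 keeps the matched text's original casing
$highlighted_text = preg_replace('/' . preg_quote($search, '/') . '/i', '<span class="highlighted">$0</span>', $full_text);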
For MySQL, your best bet would be to first split up your query words, clean up your values, and then concatenate everything back into a nice regular expression.
To highlight your results, you can use the <strong> tag. Its usage is semantic, since you are putting strong emphasis on an item.
// Done ONCE per page load:
$search = "Hello World";
// Remove the quotes
$search = str_replace('"', '', $search);
// Get the words array
$words = explode(' ', $search);
// Clean the array: drop empty values and stop words, remove duplicates, escape regex characters.
// Stop words are compared as whole words, so the "or" inside "World" is left alone.
function remove_empty_values($value) { return trim($value) != ''; }
function remove_stop_words($value) { return !in_array(strtolower($value), array('and', 'or')); }
function regex_escape(&$value) { $value = preg_quote($value, '/'); }
$words = array_filter($words, 'remove_empty_values');
$words = array_filter($words, 'remove_stop_words');
$words = array_unique($words);
array_walk($words, 'regex_escape');
// PHP has no "g" modifier: preg_replace() already replaces every match
$regex = '/(' . implode('|', $words) . ')/i';
// Done FOR EACH result
$result = "Something something hello there yes world fun nice";
$highlighted = preg_replace($regex, '<strong>$0</strong>', $result);
If you are using PostgreSQL, you can simply use the built-in ts_headline function, as described in the documentation.
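For illustration, here is a rough sketch of calling ts_headline from PHP through PDO. The table `pages` and its columns `id` and `body`, the constant `HEADLINE_SQL`, and the function `fetch_snippets` are all assumptions made up for this example; adjust them to your own schema. Note that the highlighted output is not necessarily safe to print into a page as-is, so sanitize the source text if it can contain markup.

```php
<?php
// Hypothetical schema: a table "pages" with columns "id" and "body".
// ts_headline() returns an excerpt of "body" with the matched terms
// wrapped in the StartSel/StopSel markers.
const HEADLINE_SQL = <<<'SQL'
SELECT id,
       ts_headline('english', body, query,
                   'StartSel=<strong>, StopSel=</strong>, MaxWords=35, MinWords=15') AS snippet
FROM pages, plainto_tsquery('english', :q) AS query
WHERE to_tsvector('english', body) @@ query
LIMIT 10
SQL;

// Hypothetical helper: runs the query for one user-supplied search string.
function fetch_snippets(PDO $pdo, string $search): array
{
    $stmt = $pdo->prepare(HEADLINE_SQL);
    $stmt->execute(array(':q' => $search));
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}
```

Usage would be something like `fetch_snippets($pdo, 'hello world')` with an existing PDO connection to PostgreSQL.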
Searching the actual database is fine until you want to add snazzy features like the one above. In my experience it is best to create a dedicated search table, with keywords and page IDs/URLs/etc. Then populate this table every n hours with content. During this population you can add snippets for each document for each keyword.
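A minimal sketch of that population step, assuming the page contents are already loaded into an array keyed by page ID. The function name `build_search_rows`, the row layout, and the choice to treat every distinct word as a keyword are my own assumptions, not a fixed recipe; in practice you would insert the returned rows into your search table from a scheduled job.

```php
<?php
// Hypothetical sketch: builds one row per (keyword, page) pair, each with a
// precomputed snippet. $pages maps page ID => plain-text content.
function build_search_rows(array $pages, int $size = 20): array
{
    $rows = array();
    foreach ($pages as $pageId => $content) {
        // Every distinct lowercase word of the page acts as a keyword.
        $keywords = array_unique(str_word_count(strtolower($content), 1));
        foreach ($keywords as $keyword) {
            // First case-insensitive occurrence (may fall inside a longer word).
            $pos = stripos($content, $keyword);
            $start = max(0, $pos - $size);
            $rows[] = array(
                'keyword' => $keyword,
                'page_id' => $pageId,
                // substr() takes a length: context before + keyword + context after.
                'snippet' => substr($content, $start, $pos - $start + strlen($keyword) + $size),
            );
        }
    }
    return $rows;
}
```

At search time you then only look up the keyword and print the stored snippet, instead of cutting up the full text on every request.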
Alternatively a quick hack might be:
<?php
$text = 'This is an example text page with content. It could be red, green or blue.';
$keyword = 'red';
$size = 5; // size of snippet either side of keyword
$pos = strpos($text, $keyword);
if ($pos !== false) {
    // Don't start before the beginning of the text
    $start = max(0, $pos - $size);
    // substr() takes a length, not an end offset; strlen() gives the keyword length
    $snippet = '...'.substr($text, $start, $pos - $start + strlen($keyword) + $size).'...';
    $snippet = str_replace($keyword, '<strong>'.$keyword.'</strong>', $snippet);
    echo $snippet;
}
?>
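If you want something a little sturdier than the hack above, it could be wrapped in a function along these lines. This is only a sketch: `make_snippet` is a name I made up, it matches case-insensitively, escapes the text for HTML output, and works on bytes, so for UTF-8 content you would probably want the mb_* string functions instead.

```php
<?php
// Hypothetical helper: returns an HTML-safe snippet around the first
// case-insensitive occurrence of $keyword, with every match in <strong>.
function make_snippet(string $text, string $keyword, int $size = 30): string
{
    if ($keyword === '' || ($pos = stripos($text, $keyword)) === false) {
        // No match: fall back to the start of the text.
        return htmlspecialchars(substr($text, 0, 2 * $size));
    }
    // Clamp the window to the bounds of the text.
    $start = max(0, $pos - $size);
    $end = min(strlen($text), $pos + strlen($keyword) + $size);
    $snippet = substr($text, $start, $end - $start);

    // Split around the matches so each piece can be escaped separately;
    // with PREG_SPLIT_DELIM_CAPTURE the odd-indexed pieces are the matches.
    $pattern = '/(' . preg_quote($keyword, '/') . ')/i';
    $parts = preg_split($pattern, $snippet, -1, PREG_SPLIT_DELIM_CAPTURE);
    $html = '';
    foreach ($parts as $i => $part) {
        $escaped = htmlspecialchars($part);
        $html .= ($i % 2 === 1) ? '<strong>' . $escaped . '</strong>' : $escaped;
    }
    // Only add an ellipsis where text was actually cut off.
    return ($start > 0 ? '...' : '') . $html . ($end < strlen($text) ? '...' : '');
}
```

For example, `echo make_snippet($text, $keyword, 5);` would replace the last few lines of the hack. Escaping each piece separately means a keyword like 'amp' cannot accidentally match inside an escaped entity.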
On a larger site, I would think that using JavaScript, with something like jQuery, would be the way to go.