I'm building basic search functionality using LIKE (I'd use full-text search, but can't at the moment). When searching for a keyword (e.g. WHERE field LIKE '%word%'), can MySQL also return the 20 words on either side of the keyword?
Use the INSTR() function to find the position of the word in the string, and then use SUBSTRING() function to select a portion of characters before and after the position.
You'd have to make sure your SUBSTRING() call doesn't get a negative start position, or you'll get weird results.
Try that, and report back.
I don't think it's possible to limit the number of words returned; however, to limit the number of chars returned you could do something like:
SELECT SUBSTRING(field_name, LOCATE('keyword', field_name) - chars_before, total_chars) FROM table_name WHERE field_name LIKE "%keyword%"
- chars_before - is the number of chars you wish to select before the keyword(s)
- total_chars - is the total number of chars you wish to select
e.g. the following example would return 30 chars of data starting from 15 chars before the keyword:
SUBSTRING(field_name, LOCATE('keyword', field_name) - 15, 30)
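Note: as aryeh pointed out, a negative start position in SUBSTRING() buggers things up considerably. MySQL counts a negative position from the end of the string, so if the keyword is found within the first [chars_before] chars of the field, the result is taken from the end of the field instead (and a start position of exactly 0 returns an empty string). Wrapping the start position in GREATEST(..., 1) avoids both cases.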
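If the windowing ends up in application code rather than in the query, the same clamp applies. Here is a minimal sketch in Python (the language, the function name char_window and its defaults are my assumptions, not something from this thread). Python slices also count a negative start from the end of the string, so the start index has to be clamped at 0, just as the SQL start position has to be clamped at 1.

```python
def char_window(text, keyword, chars_before=15, total_chars=30):
    """Return up to total_chars characters, starting chars_before
    characters ahead of the first occurrence of keyword.

    Hypothetical helper mirroring SUBSTRING(field, LOCATE(...) - n, m).
    Returns None when the keyword is absent.
    """
    pos = text.find(keyword)  # 0-based; -1 if absent (LOCATE is 1-based; 0 if absent)
    if pos == -1:
        return None
    # Clamp so a keyword near the start cannot produce a negative index,
    # which would make the slice count from the end of the string.
    start = max(pos - chars_before, 0)
    return text[start:start + total_chars]


print(char_window("keyword at the start of a long field", "keyword"))
```

When the keyword sits at the very start, the window simply begins at the first character and still returns up to total_chars characters.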
I think your best bet is to get the result via the SQL query and then apply a regular expression programmatically to retrieve a group of words before and after the searched word.
I can't test it now, but the regular expression should be something like:
((?:\w+\W+){0,20})WORD((?:\W+\w+){0,20})
where you replace WORD with the searched word and use regex group 1 as the before-words and group 2 as the after-words.
I will test it later when I can ask my RegexBuddy if it will work :) and I will post it here
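For reference, a minimal sketch of the application-side approach in Python (the language and the helper name words_around are assumptions; the thread doesn't name either). It uses bounded repetition, {0,n}, so that up to n whole words are captured on each side, and it searches rather than matching the whole string. Case-insensitive matching is assumed here because LIKE is case-insensitive under MySQL's default collations.

```python
import re


def words_around(text, keyword, n=20):
    """Return (before, after): up to n words on each side of the first
    occurrence of keyword, or None if the keyword is not found.

    Hypothetical helper; a "word" here is simply a run of \\w characters.
    """
    pattern = re.compile(
        rf"((?:\w+\W+){{0,{n}}}){re.escape(keyword)}((?:\W+\w+){{0,{n}}})",
        re.IGNORECASE,  # assumption: mirror a case-insensitive LIKE
    )
    match = pattern.search(text)
    if match is None:
        return None
    # Group 1 holds the before-words, group 2 the after-words.
    return match.group(1).strip(), match.group(2).strip()


print(words_around("the quick brown fox jumps over the lazy dog", "fox", n=2))
# ('quick brown', 'jumps over')
```

One limitation of this sketch: if the keyword only occurs inside a longer word (which LIKE '%word%' would still match), the surrounding context on that side comes back empty, because the pattern expects a non-word character between the context and the keyword.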
You can do it all in the query using SUBSTRING_INDEX:
SELECT CONCAT_WS(
    ' ',
    -- 20 words before
    TRIM(
        SUBSTRING_INDEX(
            SUBSTRING(field, 1, INSTR(field, 'word') - 1),
            ' ',
            -20
        )
    ),
    -- your word
    'word',
    -- 20 words after
    TRIM(
        SUBSTRING_INDEX(
            -- CHAR_LENGTH counts characters, like INSTR; LENGTH counts bytes
            SUBSTRING(field, INSTR(field, 'word') + CHAR_LENGTH('word')),
            ' ',
            20
        )
    )
) AS excerpt
FROM table_name
WHERE field LIKE '%word%'