I have some documents that contain sequences such as radio/tested, and I would like them to be returned as hits by queries like
select * from doc
where to_tsvector('english',body) @@ to_tsquery('english','radio')
Unfortunately, the default parser takes radio/tested as a file token (even though I'm in a Windows environment), so the document doesn't match the above query. When I run ts_debug on it, I can see that it's being recognized as a file, and the lexeme ends up being radio/tested rather than the two lexemes radio and test.
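A minimal sketch of that ts_debug check, assuming the built-in english configuration. The column list is picked for illustration only; ts_debug returns more columns than these.

```sql
-- Show how the parser tokenizes the sample string under the
-- 'english' configuration. The alias column holds the token type,
-- and lexemes holds what would actually be indexed.
SELECT alias, token, lexemes
FROM ts_debug('english', 'radio/tested');
```

The alias column is where the file token type shows up for the whole string.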
Is there any way to configure the parser not to look for file tokens? I tried
ALTER TEXT SEARCH CONFIGURATION public.english
DROP MAPPING FOR file;
...but it didn't change the output of ts_debug. If there's some way of disabling file tokens, or at least having the parser recognize both the file token and all the words that it thinks make up the directory names along the way, or if there's a way to get it to treat slashes as hyphens or spaces (without the performance hit of regexp_replace-ing them myself), that would be really helpful.
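For reference, the regexp_replace workaround I'd like to avoid would look roughly like this. It is only a sketch against the same doc table and body column as above, not something I've benchmarked.

```sql
-- Workaround sketch: turn slashes into spaces before parsing, so that
-- radio/tested is seen as two separate words instead of one file token.
-- The 'g' flag replaces every slash, not just the first one.
SELECT *
FROM doc
WHERE to_tsvector('english', regexp_replace(body, '/', ' ', 'g'))
      @@ to_tsquery('english', 'radio');
```

This rewrites body on every evaluation, which is the overhead I was hoping a parser or configuration setting could spare me.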