I have the following query, which searches within a column of the XML data type.

I want to return a substring of a node in one of two forms: either the text starting at the search term plus X characters after it, ending on a whole word, or the search term in the middle of the result with X characters before and after it, starting and ending on whole words.

The reason for the two forms is that the search term might be at the beginning of the node, in which case only the X characters after it apply. If it is deep in the text, I want to show the characters both before and after it.

My query seems to start on a new word, but I can't suss out how to make it end on one. I had a go at reversing the string, doing a PATINDEX on it, and then taking the length of the search minus the PATINDEX result, but that didn't seem to work.

Thanks

SELECT
 P.Title,
 SUBSTRING(
  DATA.value('(/PageContent/Text)[1]', 'VARCHAR(100)'),
  PATINDEX('%north%', DATA.value('(/PageContent/Text)[1]', 'VARCHAR(100)')) - 20
   + PATINDEX('% %', SUBSTRING(
      DATA.value('(/PageContent/Text)[1]', 'VARCHAR(100)'),
      PATINDEX('%north%', DATA.value('(/PageContent/Text)[1]', 'VARCHAR(100)')) - 20,
      999)),
  999) AS Data
FROM WEBPAGECONTENT W
INNER JOIN WebPage P
 ON P.ID = W.PageID
WHERE COALESCE(PATINDEX('%north%', DATA.value('(/PageContent/Text)[1]', 'VARCHAR(100)')), 0) > 0
A: 

Give this a go. I've replaced your XML query with simple variables, but the basic layout of the statement remains the same. It's a bit of a bear to work with though. Also, depending on your data and requirements you might want to change the search for a single space to include other characters, like '.' and ',' as potential ends of words.
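As a rough illustration of that last point, a bracketed character class in the PATINDEX pattern matches any one of several delimiters. The variable name and sample text below are made up for the example:

```sql
DECLARE @window VARCHAR(20)
SET @window = 'north,south and east'   -- hypothetical sample text

SELECT
 PATINDEX('% %', @window)     AS first_space,      -- 12: the space after 'south'
 PATINDEX('%[ .,]%', @window) AS first_delimiter   -- 6: the comma after 'north'
```

Swapping '% %' for '%[ .,]%' in the statement should then treat periods and commas as word ends too, assuming those are the only delimiters that matter in your data.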

As much as I'd like to explain every bit of the code, it's past my lunch time, so I'll leave that as an exercise for the reader ;)

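DECLARE
 @search_string VARCHAR(20),
 @string   VARCHAR(1000)

SET @string = 'This is a test north this is only a test'
SET @search_string = 'north'

SELECT
 SUBSTRING(@string,
 -- Start position: look at the 20 characters before the match and begin after the first space in them
 CASE
  -- Match is within the first 20 characters: start at the beginning of the string
  WHEN PATINDEX('%' + @search_string + '%', @string) <= 20 THEN 1
  -- PATINDEX on the window is relative to the window, so add the window's offset back on
  WHEN PATINDEX('% %', SUBSTRING(@string, PATINDEX('%' + @search_string + '%', @string) - 20, 20)) <> 0
   THEN PATINDEX('%' + @search_string + '%', @string) - 20 + PATINDEX('% %', SUBSTRING(@string, PATINDEX('%' + @search_string + '%', @string) - 20, 20))
  -- No space in the window: start at the match itself
  ELSE PATINDEX('%' + @search_string + '%', @string)
 END,
 -- Length: trailing part + the match + leading part
 CASE
  -- Match is within 20 characters of the end: take everything that is left
  WHEN PATINDEX('%' + @search_string + '%', @string) + LEN(@search_string) >= LEN(@string) - 20 THEN 1000
  -- Otherwise run up to the first space found in the 20 characters after the match
  WHEN PATINDEX('% %', SUBSTRING(@string, PATINDEX('%' + @search_string + '%', @string) + LEN(@search_string), 20)) <> 0
   THEN PATINDEX('% %', SUBSTRING(@string, PATINDEX('%' + @search_string + '%', @string) + LEN(@search_string), 20))
  ELSE 0
 END + LEN(@search_string) +
 -- Leading part: distance from the start position (same CASE as above) to the match
 (PATINDEX('%' + @search_string + '%', @string) - CASE
  WHEN PATINDEX('%' + @search_string + '%', @string) <= 20 THEN 1
  WHEN PATINDEX('% %', SUBSTRING(@string, PATINDEX('%' + @search_string + '%', @string) - 20, 20)) <> 0
   THEN PATINDEX('%' + @search_string + '%', @string) - 20 + PATINDEX('% %', SUBSTRING(@string, PATINDEX('%' + @search_string + '%', @string) - 20, 20))
  ELSE PATINDEX('%' + @search_string + '%', @string)
 END)
)
-- With the sample values the whole string comes back, because the match is within 20 characters of both ends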
Tom H.
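
For anyone who finds the single statement hard to follow, here is a sketch of the same idea broken into steps. It is not the code from the answer: it uses CHARINDEX instead of PATINDEX, and it uses REVERSE to find the last space in the trailing window so the result ends on the last whole word that fits. The extra variable names (@width, @pos, @tail, @before, @after, @start, @finish) and the sample sentence are assumptions for the example, and only spaces are treated as word boundaries.

```sql
DECLARE
 @search_string VARCHAR(20),
 @string   VARCHAR(1000),
 @width    INT,           -- the "X" from the question (20 assumed here)
 @pos      INT,           -- position where the search term starts
 @tail     INT,           -- first position after the search term
 @before   VARCHAR(100),  -- window of @width characters before the term
 @after    VARCHAR(100),  -- window of @width characters after the term
 @start    INT,
 @finish   INT

SET @string = 'The quick brown fox jumps over the lazy dog and then heads north along the river bank until it reaches the old stone bridge'
SET @search_string = 'north'
SET @width = 20

SET @pos  = CHARINDEX(@search_string, @string)
SET @tail = @pos + LEN(@search_string)

SET @before = CASE WHEN @pos > @width THEN SUBSTRING(@string, @pos - @width, @width) ELSE '' END
SET @after  = SUBSTRING(@string, @tail, @width)

-- Start: the character after the first space in the leading window
SET @start = CASE
  WHEN @pos <= @width THEN 1
  WHEN CHARINDEX(' ', @before) > 0 THEN @pos - @width + CHARINDEX(' ', @before)
  ELSE @pos
 END

-- Finish: the character before the last space in the trailing window
SET @finish = CASE
  WHEN @tail + @width - 1 >= LEN(@string) THEN LEN(@string)
  WHEN CHARINDEX(' ', REVERSE(@after)) > 0 THEN @tail + @width - CHARINDEX(' ', REVERSE(@after)) - 1
  ELSE @tail - 1
 END

-- NULL when the search term is not found
SELECT CASE WHEN @pos > 0 THEN SUBSTRING(@string, @start, @finish - @start + 1) END AS Data
-- For the sample above: 'dog and then heads north along the river'
```

In the sample the term starts at position 60, the leading window begins with a space so the result starts at 'dog', and the trailing window ends mid-word in 'ban', so the result stops after 'river'. To use this against the table, the variables would need to be folded back into the SELECT, for example as nested expressions like the statement above.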