Indeed, T-SQL does not natively support regular expressions, and this is the sort of problem for which regular expressions would be the tool of choice. First, I'll say that the complexity of the solution depends greatly on how consistent your data is. For example, suppose we search for items with the heading:
Select ..
From ...
Where HtmlContent Like '%<span class="heading-2">%'
This assumes no additional spacing between span and class, as well as no additional spacing after the final double quote before the closing bracket. We could write '%<span%class="heading-2"%>%' to account for the spaces, but that would also match content in which a div tag marked as heading-2 appears anywhere after a span tag, because % matches any run of characters, not just whitespace. If that latter scenario shouldn't happen but you might have varying spaces, then use this revised pattern.
Where we will really run into trouble is the closing tag. Suppose our content looks like so:
<span class="heading-2"> Foo <span class="heading-3">Bar</span> And Gamma Too</span> .... <span class="heading-4">Fubar Is the right way!</span>...
It is not so simple to find the correct closing span tag to change to </h2>. You cannot simply find the first </span> and change it to </h2>, because in the content above the first </span> closes the nested heading-3 span. If you knew that you had no nested span tags, then you could write a user-defined function that would do it:
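Create Function dbo.ReplaceSpanToH2( @HtmlContent nvarchar(max) )
Returns nvarchar(max)
As
Begin
    Declare @OpenTag nvarchar(50)
    Declare @StartPos int
    Declare @ClosePos int
    Set @OpenTag = N'<span class="heading-2">'
    Set @StartPos = CharIndex(@OpenTag, @HtmlContent)
    -- Loop so every heading-2 span is converted, not only the first one
    While @StartPos > 0
    Begin
        -- Swap this one opening tag for <h2>
        Set @HtmlContent = Stuff(@HtmlContent, @StartPos, Len(@OpenTag), N'<h2>')
        -- Find the next </span>; only correct when span tags are not nested
        Set @ClosePos = CharIndex(N'</span>', @HtmlContent, @StartPos)
        -- Stuff returns NULL for a start position of 0, so stop if no closing tag exists
        If @ClosePos = 0
            Break
        Set @HtmlContent = Stuff(@HtmlContent, @ClosePos, 7, N'</h2>')
        Set @StartPos = CharIndex(@OpenTag, @HtmlContent, @ClosePos)
    End
    Return @HtmlContent
End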
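As a rough usage sketch, the function could be applied along these lines. The table variable @Pages and its Id column are hypothetical stand-ins for your own table (HtmlContent is the column from the query above), and the call assumes the function was created in the dbo schema. The first row has no nested span tags; the second is a shortened version of the nested sample above.

```sql
-- Hypothetical sample data; substitute your own table for @Pages
Declare @Pages Table ( Id int, HtmlContent nvarchar(max) )

Insert Into @Pages ( Id, HtmlContent )
Values ( 1, N'<span class="heading-2">Foo</span> <span class="heading-4">Fubar</span>' )
Insert Into @Pages ( Id, HtmlContent )
Values ( 2, N'<span class="heading-2"> Foo <span class="heading-3">Bar</span> And Gamma Too</span>' )

-- Preview the conversion before changing any data
Select Id, HtmlContent, dbo.ReplaceSpanToH2(HtmlContent) As Converted
From @Pages
Where HtmlContent Like '%<span class="heading-2">%'

-- Once the preview looks right, apply it
Update @Pages
Set HtmlContent = dbo.ReplaceSpanToH2(HtmlContent)
Where HtmlContent Like '%<span class="heading-2">%'
```

For row 1 the preview should show <h2>Foo</h2> followed by the untouched heading-4 span. For row 2 the </h2> lands right after Bar, closing the wrong tag, which is exactly the nested case this function cannot handle. Running the Select first is a cheap way to spot such rows before committing the Update.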