Hi,

I have a table with 6 columns containing HTML content with some markup in it. Now that we are moving to a newly designed site, most of this HTML has to be deleted: more or less all tags except <B> and </B>.

Is there a nice way of doing this, i.e. identifying all tags and deleting them within the data? I'm sure there are no < > symbols in the text, so a regular expression would maybe work?

My alternative is to fetch every row, process it and update the database but I'm guessing this is possible to do in T-SQL directly.

My server is an MSSQL 2008 and is located in a hosted environment but I can fetch a local copy if needed.

Thanks, Stefan

+1  A: 

To use regular expressions from SQL 2000: http://blogs.msdn.com/b/khen1234/archive/2005/05/11/416392.aspx

And from SQL 2005 up: http://weblogs.sqlteam.com/jeffs/archive/2007/04/27/SQL-2005-Regular-Expression-Replace.aspx

Amending the code from that last link gives a regex that appears to work, going by my extremely superficial testing on SQL 2005, but only for strings of up to 4000 characters!

using System;
using System.Data;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;
using System.Text.RegularExpressions;

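public partial class UserDefinedFunctions
{
    // SQL CLR scalar function: removes every HTML tag except bare <b> and </b>.
    [Microsoft.SqlServer.Server.SqlFunction(IsDeterministic = true, IsPrecise = true)]
    public static SqlString StripAllButBoldTags(SqlString expression)
    {
        // Pass NULL through unchanged.
        if (expression.IsNull)
            return SqlString.Null;

        // Matches any opening or closing tag. Group 1 captures the tag name
        // plus any attributes, so a <b> tag carrying attributes is stripped too.
        Regex r = new Regex("</?([a-z][a-z0-9]*[^<>]*)>", RegexOptions.IgnoreCase);

        return new SqlString(r.Replace(expression.Value, new MatchEvaluator(ComputeReplacement)));
    }

    // Keeps the matched tag when its captured content is exactly "b"
    // (case-insensitive); otherwise replaces it with an empty string.
    public static String ComputeReplacement(Match m)
    {
        return string.Compare(m.Groups[1].Value, "B", true) == 0 ? m.Value : "";
    }
}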
Martin Smith
I'm doing this on a server in a hosted environment where I have limited rights. Can I still do what they are describing?
StefanE
@Stefan. Obviously it depends on your host, but I'd imagine it's quite likely that they won't let you do this. Additionally, I had a bit of a play with this and found that when passed strings longer than 4000 characters it seemed to silently truncate them, so all in all I think your proposal to do it outside of SQL Server is preferable!
Martin Smith
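For the "outside of SQL Server" route, here is a minimal sketch of the same regex logic as a plain C# method with no SQL CLR dependencies. The class name HtmlTagStripper and the sample input are made up for illustration. The fetch/update loop around it is not shown, since it depends on your table and data access code; the idea would be to call this once per column value before writing the row back.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class (not from the answer above): the same
// tag-stripping logic, usable from an ordinary console application.
public static class HtmlTagStripper
{
    // Same pattern as in the answer: group 1 captures the tag name
    // together with any attributes that follow it.
    private static readonly Regex TagPattern =
        new Regex("</?([a-z][a-z0-9]*[^<>]*)>", RegexOptions.IgnoreCase);

    // Removes every tag except bare <b> and </b> (in any letter case).
    public static string StripAllButBoldTags(string html)
    {
        if (html == null)
            return null;

        return TagPattern.Replace(html, m =>
            string.Equals(m.Groups[1].Value, "b", StringComparison.OrdinalIgnoreCase)
                ? m.Value   // keep <b> / </b> as they are
                : "");      // drop every other tag
    }

    public static void Main()
    {
        // Made-up sample input; prints: Hello <B>world</B>
        Console.WriteLine(StripAllButBoldTags("<p>Hello <B>world</B><br/></p>"));
    }
}
```

As with the CLR version, a <b> tag that carries attributes is removed as well, because the comparison is made against the whole captured group. A plain .NET string has no 4000-character cap, so the truncation seen inside SQL Server should not come up in this method itself.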
OK, thanks for your help! (And I'm buying a book to learn a bit more advanced SQL :) )
StefanE