Hi,

I have a database containing Page objects with HTML content. Many of the rows in the db contain this content:

  <p style="float: left; margin-right: 20px; height: 300px;">
        <img src="...">More html ...
 </p>

So I created a super simple regex replace:

 foreach (var page in db.Pages)
 {
     string pattern = @"<p style=""float: left; margin-right: 20px;"">(.*)</p>/ms";
     if (Regex.Match(page.Content, pattern).Success)
     {
         page.Content = Regex.Replace(page.Content, pattern, "<div class=\"contentimage\" >$1</div>");
     }
 }
 // db.SubmitChanges();

When I run the regex in a regex testing tool, it works, but in the C# code it doesn't. Can anyone help me out, please?

If anyone knows how to do the update with a regex replace in SQL, that would be fine too.

Regex isn't my strongest point (although that's a great shame), but it is on my list of things to learn asap ;)

+2  A: 

Your problem is the "/ms". You're trying to specify a couple of regex flags, but C# specifies flags differently than PHP/Perl, so .NET treats "/ms" as literal text that has to appear after the closing tag. Your regex tester probably tests regexes aimed at those languages; I suggest Expresso (it's free) for working with .NET regexes. Change your pattern to this:

string pattern = @"<p style=""float: left; margin-right: 20px; height: 300px;"">(.*)</p>";

(Also note that I added the "height" property to the style so that the pattern matches your sample markup. Was leaving it out just a typo?)

And change your Regex.Match call to this:

if (Regex.Match(page.Content, pattern, RegexOptions.Multiline | RegexOptions.Singleline).Success)

And it should work.

[EDIT] Oh, and the Replace call needs the same options:

page.Content = Regex.Replace(page.Content, pattern, "<div class=\"contentimage\" >$1</div>", RegexOptions.Multiline | RegexOptions.Singleline);
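
Putting the pieces of this answer together, here is a minimal sketch of the whole fix. The class and method names (ContentImageFix, Rewrite) are made up for illustration and are not from the question; only the pattern, the replacement string, and the options come from above.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; the name is illustrative only.
public static class ContentImageFix
{
    // Pattern from the answer: no trailing "/ms", which .NET would try to match literally.
    private const string Pattern =
        @"<p style=""float: left; margin-right: 20px; height: 300px;"">(.*)</p>";

    // Singleline lets '.' match newlines, which the multi-line <p> block needs.
    // Multiline only changes '^' and '$', which this pattern does not use;
    // it is kept here to mirror the flags the question asked for.
    private const RegexOptions Options =
        RegexOptions.Multiline | RegexOptions.Singleline;

    // Returns the content with the matching <p> block turned into a <div>.
    public static string Rewrite(string content)
    {
        return Regex.Replace(content, Pattern,
            "<div class=\"contentimage\" >$1</div>", Options);
    }
}
```

In the loop you would then write page.Content = ContentImageFix.Rewrite(page.Content);. Regex.Replace returns the input unchanged when nothing matches, so the separate Regex.Match check is optional. If you prefer to keep the flags inside the pattern, .NET also accepts inline options at the start of the pattern, e.g. (?s) for Singleline.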
NickAldwin
And I completely agree with Marc that unless your HTML is always going to be very similar to your example, Regex is not really the way to go.
NickAldwin
Thanks a lot, worked like a charm. And @Marc Gravell: regex was the right tool for this job. Try putting this in less than 10 lines with an HTML parser :D This works like a charm, ergo: regex 1 - htmlparser 0 ;) I wasn't a fan of regex myself, but more and more I am becoming one.
Nealv
Well as long as the HTML is always going to be perfectly formed like this, it'll work OK. Any other case, though, and it'll break.
NickAldwin
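
One concrete way it can break, as a rough sketch: with Singleline on, the greedy (.*) runs to the last </p> in the content, so a page with any later paragraph gets swallowed into the div. The GreedyDemo class and its lazy parameter below are invented for this illustration; a lazy (.*?) stops at the first </p>, but that is only a partial mitigation, not a substitute for a parser on irregular HTML.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class; not part of the original question or answer.
public static class GreedyDemo
{
    private const string Open =
        @"<p style=""float: left; margin-right: 20px; height: 300px;"">";

    public static string Rewrite(string content, bool lazy)
    {
        // (.*)  is greedy: it extends to the LAST </p> in the input.
        // (.*?) is lazy: it stops at the first </p> after the opening tag.
        string pattern = Open + (lazy ? "(.*?)" : "(.*)") + "</p>";
        return Regex.Replace(content, pattern,
            "<div class=\"contentimage\" >$1</div>", RegexOptions.Singleline);
    }
}
```

For content where the styled paragraph is followed by another <p>...</p>, the greedy version closes the div after the second paragraph, while the lazy version leaves the second paragraph alone.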
It worked great, and of course I had already figured the replace out ;) Great help, thanks again.
Nealv