Hi
I would like to parse out any HTML data that is returned wrapped in CDATA.
As an example <![CDATA[<table><tr><td>Approved</td></tr></table>]]>
Thanks!
I know this might seem incredibly simple, but have you tried string.Replace()?
string x = "<![CDATA[<table><tr><td>Approved</td></tr></table>]]>";
string y = x.Replace("<![CDATA[", string.Empty).Replace("]]>", string.Empty);
There are probably more efficient ways to handle this, but something this simple may be all you need.
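One caveat with Replace() is that it removes the delimiter text wherever it occurs, not only at the two ends. If the whole string is known to be a single CDATA section (that is an assumption, the question doesn't say), a minimal sketch that strips exactly one leading and one trailing delimiter could look like this. The names CDataStripper and Unwrap are made up for illustration.

```csharp
using System;

static class CDataStripper
{
    const string Open = "<![CDATA[";
    const string Close = "]]>";

    // Removes one leading "<![CDATA[" and one trailing "]]>" when both are present;
    // otherwise returns the input unchanged.
    public static string Unwrap(string s)
    {
        if (s.Length >= Open.Length + Close.Length &&
            s.StartsWith(Open, StringComparison.Ordinal) &&
            s.EndsWith(Close, StringComparison.Ordinal))
        {
            return s.Substring(Open.Length, s.Length - Open.Length - Close.Length);
        }
        return s;
    }

    static void Main()
    {
        string x = "<![CDATA[<table><tr><td>Approved</td></tr></table>]]>";
        Console.WriteLine(Unwrap(x)); // prints <table><tr><td>Approved</td></tr></table>
    }
}
```

Anything that is not wrapped comes back untouched, so it is safe to call on values that may or may not be CDATA.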
You haven't given much detail, but a very simple regex should match it, assuming there is no complexity you didn't describe:
/<!\[CDATA\[(.*?)\]\]>/
The regex to find CDATA sections would be:
(?:<!\[CDATA\[)(.*?)(?:\]\]>)
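For what it's worth, here is a rough C# sketch of using the lazy pattern to pull out every CDATA section in a string. It assumes the content may span several lines, which is why RegexOptions.Singleline is set (without it "." does not match a newline). The class and method names are made up, and the sample input is invented for the demo.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class CDataExtractor
{
    // Lazy (.*?) stops at the first "]]>"; Singleline lets "." match newlines too.
    static readonly Regex CData =
        new Regex(@"<!\[CDATA\[(.*?)\]\]>", RegexOptions.Singleline);

    // Returns the content of every CDATA section found in the input, in order.
    public static List<string> ExtractAll(string input)
    {
        var results = new List<string>();
        foreach (Match m in CData.Matches(input))
        {
            results.Add(m.Groups[1].Value);
        }
        return results;
    }

    static void Main()
    {
        // Invented sample: two sections, the second containing a newline and brackets
        string input = "<a><![CDATA[<b>one</b>]]></a>\n<a><![CDATA[<i>two\n[x]</i>]]></a>";
        foreach (string html in ExtractAll(input))
        {
            Console.WriteLine(html);
        }
    }
}
```

Because the match is lazy, two sections in the same string come back as two separate results instead of one big match running from the first opening delimiter to the last closing one.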
The expression to handle your example would be
\<\!\[CDATA\[(?<text>[^\]]*)\]\]\>
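Where the group "text" will contain your HTML. Note that [^\]]* stops at the first "]", so this pattern will not match if the HTML itself contains a "]" character; in that case use a lazy (?<text>.*?) together with RegexOptions.Singleline instead.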
The C# code you need is:
using System.Text.RegularExpressions;
RegexOptions options = RegexOptions.None;
Regex regex = new Regex(@"\<\!\[CDATA\[(?<text>[^\]]*)\]\]\>", options);
string input = @"<![CDATA[<table><tr><td>Approved</td></tr></table>]]>";
// Run the match once and check whether it succeeded
Match match = regex.Match(input);
if (match.Success)
{
    // The named group "text" holds the HTML between the CDATA delimiters
    string HTMLtext = match.Groups["text"].Value;
}
The "input" variable is only there to hold the sample input you provided.
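A different approach from the regexes above: if the CDATA actually arrives inside a well-formed XML document (an assumption, the question doesn't say where it comes from), an XML parser can hand you the CDATA content directly, with no pattern matching at all. This is only a sketch using LINQ to XML; the class name and the element names in the sample are invented.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

static class CDataFromXml
{
    // Parses the XML and returns the text of every CDATA node in document order.
    public static List<string> ExtractAll(string xml)
    {
        return XDocument.Parse(xml)
            .DescendantNodes()
            .OfType<XCData>()
            .Select(c => c.Value)
            .ToList();
    }

    static void Main()
    {
        // Hypothetical wrapper elements around the sample from the question
        string xml = "<response><status><![CDATA[<table><tr><td>Approved</td></tr></table>]]></status></response>";
        foreach (string html in ExtractAll(xml))
        {
            Console.WriteLine(html);
        }
    }
}
```

This only works when the surrounding document parses as XML; for a bare fragment or malformed input, XDocument.Parse throws and the string-based approaches are the fallback.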