My input string is:

"<!--<clientHtml>--><br><br><br><b>Job Title:</b> Test text
<br><b>JobId:</b> 56565-116503
<br><br><b>City:</b> San Diego
<br><b>State:</b> CA
<br><b>Zip Code:</b> 92108
<br><br><br><b>Description:</b> 
            We are recruiting for a Controller to oversee all accounting and finance for a growing manufacturing company.  We are looking for someone who is hands on full cycle accounting.  


<br><br>
<!--<apply>test/apply><email></email><OriginalFetchUrl>http:test.xml</OriginalFetchUrl><OriginalWrapUrl>http://test.html&lt;/OriginalWrapUrl&gt;&lt;/clientHtml&gt;--&gt;";

I need to extract the following string using C# and regular expressions:

"We are recruiting for a Controller to oversee all accounting and finance for a growing manufacturing company. We are looking for someone who is hands on full cycle accounting."

I also want to get rid of the line: test/apply>http:test.xmlhttp://test.html-->

Can I please get help with the code?

Thanks for reading.

A: 

Try something like this: (I didn't test it.)

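string result = "";
// "line" holds the whole input string.
// No ^ anchor here: <b>Description:</b> is not at the start of the input.
// Singleline lets '.' match line breaks, so a multi-line description is captured too.
Match m = Regex.Match(line, @"<b>\s*Description\s*:\s*</b>\s*(?<result>.*?)\s*<", RegexOptions.IgnoreCase | RegexOptions.Singleline);
if (m.Success)
{
    result = m.Groups["result"].Value;
}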
John Fisher
A: 

Try something like this:

Description:</b>([^<]+)

Here is an example of how to use it:

using System;
using System.Text.RegularExpressions;

class Example
{
    static void Main()
    {
     String str = @"<!--<clientHtml>--><br><br><br><b>Job Title:</b> Test text
      <br><b>JobId:</b> 56565-116503
      <br><br><b>City:</b> San Diego
      <br><b>State:</b> CA
      <br><b>Zip Code:</b> 92108
      <br><br><br><b>Description:</b> 
           We are recruiting for a Controller to oversee all accounting and finance for a growing manufacturing company.  We are looking for someone who is hands on full cycle accounting.  


      <br><br>
      <!--<apply>test/apply><email></email><OriginalFetchUrl>http:test.xml</OriginalFetchUrl><OriginalWrapUrl>http://test.html&lt;/OriginalWrapUrl&gt;&lt;/clientHtml&gt;--&gt;";

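     // Capture everything after the closing </b> up to the next '<'.
     Regex expression = new Regex(@"Description:</b>([^<]+)",
         RegexOptions.Compiled |
         RegexOptions.CultureInvariant |
         RegexOptions.IgnoreCase);

     Match match = expression.Match(str);

     if (match.Success)
     {
         // Trim removes the surrounding spaces and line breaks.
         Console.WriteLine(match.Groups[1].Value.Trim());
     }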
    }
}
Andrew Hare
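One detail worth noting: the string requested in the question has single spaces between sentences, while the input has two. Trim() only removes whitespace at the ends, so the inner double spaces survive. Below is a minimal sketch of one way to collapse them, assuming the same pattern as above. The names DescriptionExtractor and Extract are made up for illustration, and the sample input is a shortened stand-in for the real string.

```csharp
using System;
using System.Text.RegularExpressions;

static class DescriptionExtractor
{
    // Hypothetical helper: returns the description with whitespace collapsed,
    // or an empty string when no description is found.
    public static string Extract(string html)
    {
        Match match = Regex.Match(html, @"Description:</b>([^<]+)", RegexOptions.IgnoreCase);
        if (!match.Success)
        {
            return "";
        }

        // Replace every run of whitespace (spaces, tabs, line breaks) with one space,
        // then trim the single space that may remain at either end.
        return Regex.Replace(match.Groups[1].Value, @"\s+", " ").Trim();
    }
}

class Program
{
    static void Main()
    {
        // Shortened stand-in for the input in the question.
        string html = "<br><b>Description:</b> \n     First sentence.  Second sentence.  \n\n<br><br>";
        Console.WriteLine(DescriptionExtractor.Extract(html)); // prints "First sentence. Second sentence."
    }
}
```

Whether to collapse whitespace at all depends on how the text is displayed afterwards; if the original spacing matters, Trim() alone is enough.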
Thanks. How do I use a regular expression to achieve the following? Input: start<!--abcd-->end. Output should be: start end
Ed
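For the follow-up about comments, one possible approach is Regex.Replace with a lazy pattern that swaps each <!-- ... --> block for a single space. This is only a sketch: the names CommentStripper and Strip are made up for illustration, and it assumes the comment delimiters appear literally in the input.

```csharp
using System;
using System.Text.RegularExpressions;

static class CommentStripper
{
    // Hypothetical helper: replaces each <!-- ... --> comment with a single space.
    public static string Strip(string input)
    {
        // Singleline lets '.' match line breaks, so multi-line comments are covered.
        // The lazy '.*?' stops at the first '-->' rather than the last one.
        return Regex.Replace(input, @"<!--.*?-->", " ", RegexOptions.Singleline);
    }
}

class Program
{
    static void Main()
    {
        Console.WriteLine(CommentStripper.Strip("start<!--abcd-->end")); // prints "start end"
    }
}
```

Note that in the input from the question the last comment ends in --&gt; rather than a literal -->, so this pattern would not match it as it stands. Decoding the entities first (for example with System.Net.WebUtility.HtmlDecode) would probably be needed there.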