A: 

You could loop over the HTML string, detect the angle brackets, and build up an array of tags, recording whether each one has a matching closing tag. The problem is that HTML allows tags that are never closed (void elements such as img, br and meta), so you'd need to know about those. You would also need rules to check the order of closing, because simply matching each opening tag with a closing tag doesn't make valid HTML: if you open a div, then a p, and then close the div before closing the p, that isn't valid.
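A minimal sketch of that idea, written in Python for brevity (the same stack approach carries over to C#). The names `VOID_TAGS`, `TAG_RE` and `find_nesting_error` are made up for this illustration, the void-element list is partial, and the regex only tokenizes tags: it assumes no comments, no scripts and no `>` inside attribute values, and it does not know about HTML's optional closing tags.

```python
import re

# Void elements never take a closing tag (partial list, for illustration only).
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}

# Captures the optional slash and the tag name of an opening or closing tag.
TAG_RE = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>")


def find_nesting_error(html):
    """Return a description of the first nesting problem, or None if tags balance."""
    stack = []  # names of the tags that are currently open, outermost first
    for match in TAG_RE.finditer(html):
        closing = match.group(1) == "/"
        name = match.group(2).lower()
        if name in VOID_TAGS or match.group(0).endswith("/>"):
            continue  # nothing to match up for void or self-closed tags
        if not closing:
            stack.append(name)
        elif not stack:
            return f"unexpected </{name}>"
        elif stack[-1] != name:
            # Closing order matters: only the innermost open tag may be closed.
            return f"expected </{stack[-1]}> but found </{name}>"
        else:
            stack.pop()
    return f"unclosed <{stack[-1]}>" if stack else None
```

With this sketch, `find_nesting_error("<div><p>text</div></p>")` reports the div/p case described above, while `"<div><p>text</p></div>"` yields `None`.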

Sohnee
Can you please give me some sample code?
+1  A: 

Your requirement is very unclear so most of this is guesswork. Also, you have provided no code which would help to clarify what it is you want to do.

One solution could be:

a. Find the text between the <p> and the </p> tags. You can use the following Regex for this or use a simple string search:

\<p\>(.*?)\</p\>

b. In the found text, apply a Substring() to extract the required text.

c. Put back the extracted text between the <p> and the </p> tags.
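Steps a to c could look roughly like this. It is sketched in Python rather than C#, and `P_RE` and `truncate_paragraphs` are hypothetical names. It assumes plain `<p>` tags with no attributes and no nested markup inside the paragraph; with nested tags, the cut could land in the middle of a tag.

```python
import re

# Step a: capture the text between <p> and </p>.
# DOTALL lets "." match line breaks, so multi-line paragraphs are found too.
P_RE = re.compile(r"<p>(.*?)</p>", re.DOTALL)


def truncate_paragraphs(html, length):
    """Shorten the text of every <p>...</p> pair to at most `length` characters."""
    def shorten(match):
        inner = match.group(1)
        # Step b: take the substring. Step c: put it back between the tags.
        return "<p>" + inner[:length] + "</p>"

    return P_RE.sub(shorten, html)
```

For example, `truncate_paragraphs("<p>Hello, world</p>", 5)` gives `<p>Hello</p>`.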

Cerebrus
But I think he has just given the P tag as an example. He might have to pull out a substring from any type of tag.
rahul
Yes, I have now modified the question to make it clearer.
@phoenix: Your intuition is quite possibly true.
Cerebrus
+2  A: 

You need to teach your code how to understand that your string is actually HTML or XML. Just treating it like a string won't allow you to work with it the way you want to. This means first transforming it to the correct format and then working with that format.

Use an XSL stylesheet

If your HTML is well-formed XML, load it into an XmlDocument and run it through an XSL stylesheet that does something like the following:

<xsl:template match="p">
  <xsl:value-of select="substring(text(), 1, 10)" />
</xsl:template>

Use an HTML parser

If it's not well-formed XML (as in your example, where you have a sudden </p> in the middle), you'll need to use an HTML parser of some kind, such as the HTML Agility Pack (see this question about C# HTML parsers).
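To illustrate the parser-based approach, here is a rough sketch that uses Python's standard `html.parser` module as a stand-in for the HTML Agility Pack (it is not that library's API). The names `TruncatingParser` and `truncate_html` are invented for this example. It copies markup until a text budget is used up, drops everything after that, ignores stray closing tags, and finally closes whatever is still open.

```python
from html import escape
from html.parser import HTMLParser

# Void elements never take a closing tag (partial list, for illustration only).
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class TruncatingParser(HTMLParser):
    """Copies markup to `self.out` until `limit` text characters were emitted."""

    def __init__(self, limit):
        super().__init__(convert_charrefs=True)
        self.remaining = limit  # text characters we may still emit
        self.out = []           # pieces of the output document
        self.open_tags = []     # tags opened in the output and not yet closed

    def handle_starttag(self, tag, attrs):
        if self.remaining <= 0:
            return  # budget used up: drop everything that follows
        self.out.append(self.get_starttag_text())
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_startendtag(self, tag, attrs):
        # Self-closed tags such as <br/> are copied but never pushed on the stack.
        if self.remaining > 0:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        # Stray closers (tags that were never opened) are ignored.
        if self.remaining > 0 and tag in self.open_tags:
            while self.open_tags:
                top = self.open_tags.pop()
                self.out.append(f"</{top}>")
                if top == tag:
                    break

    def handle_data(self, data):
        if self.remaining <= 0:
            return
        piece = data[:self.remaining]
        self.remaining -= len(piece)
        # Entities were decoded by the parser, so re-escape the text.
        self.out.append(escape(piece, quote=False))


def truncate_html(html, limit):
    """Keep at most `limit` characters of text and leave the tags balanced."""
    parser = TruncatingParser(limit)
    parser.feed(html)
    parser.close()
    # Close whatever is still open, innermost tag first.
    closers = [f"</{tag}>" for tag in reversed(parser.open_tags)]
    return "".join(parser.out) + "".join(closers)
```

For instance, `truncate_html("<p>Hello <b>world</b> and more</p>", 8)` returns `<p>Hello <b>wo</b></p>`: the character count applies to the text only, and the tags that were open at the cut are closed in the right order.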

Don't use regular expressions: HTML can nest to arbitrary depth and tolerates unclosed or misplaced tags, which makes it too complex to parse reliably with regex.

Rahul