Hi there,

I have the following body:

<body><p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Praesent leo leo, ultrices eu venenatis et, rutrum fringilla dolor.</p></body>

The code:

HtmlNode body = doc.DocumentNode.SelectSingleNode("//body");

Dictionary<HtmlNode, HtmlNode> toReplace = new Dictionary<HtmlNode, HtmlNode>();

// I do some logic here adding nodes to the toReplace dictionary.

foreach (HtmlNode replaceNode in toReplace.Keys)
{
    replaceNode.ParentNode.ReplaceChild(toReplace[replaceNode], replaceNode);
}

After I do this, the InnerHtml of the body node remains the same as at the beginning, although OuterHtml and InnerText show the correct result. Is there something wrong with my code?

The result:

// body.InnerHtml
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Praesent leo leo, ultrices eu venenatis et, rutrum fringilla dolor.</p>

// body.OuterHtml
<body><p>Lorem ipsum dolor sit amet...</p></body>
A: 

I think it may have something to do with the way you are adding the nodes that replace the old ones. See if this approach, which truncates the paragraph content directly, works for you. I did a quick test, and all three properties (InnerHtml, InnerText and OuterHtml) gave me consistent results.

HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(htmlString);
HtmlNode body = doc.DocumentNode.SelectSingleNode("//body");

foreach (var paragraph in body.Descendants("p"))
{
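    // Guard against paragraphs shorter than 25 characters, where Substring would throw.
    string html = paragraph.InnerHtml;
    paragraph.InnerHtml = html.Length > 25 ? html.Substring(0, 25) + "..." : html;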
}

Console.WriteLine(body.InnerHtml);
Console.WriteLine(body.InnerText);
Console.WriteLine(body.OuterHtml); 
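
If you want to keep the ReplaceChild approach from the question, here is a rough sketch of how I would structure it. It assumes the HtmlAgilityPack package is referenced; the htmlString value, the ReplaceSketch class and the Truncate helper are made up for illustration and are not part of your code. I can't say this is what fixes your case, it is only a clean baseline to compare against.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack; // assumption: the HtmlAgilityPack NuGet package is referenced

public static class ReplaceSketch
{
    // Hypothetical helper: cut text to maxLength characters and append "...".
    public static string Truncate(string text, int maxLength = 25)
    {
        return text.Length > maxLength ? text.Substring(0, maxLength) + "..." : text;
    }

    public static void Main()
    {
        // Illustrative input, not the real document from the question.
        string htmlString = "<body><p>Lorem ipsum dolor sit amet, consectetur adipiscing elit.</p></body>";

        HtmlDocument doc = new HtmlDocument();
        doc.LoadHtml(htmlString);
        HtmlNode body = doc.DocumentNode.SelectSingleNode("//body");

        // Take a snapshot of the paragraphs first, so the tree is not
        // modified while it is still being enumerated.
        var toReplace = new Dictionary<HtmlNode, HtmlNode>();
        foreach (HtmlNode paragraph in body.Descendants("p").ToList())
        {
            toReplace[paragraph] = HtmlNode.CreateNode("<p>" + Truncate(paragraph.InnerText) + "</p>");
        }

        // Swap each old node for its replacement through the parent.
        foreach (KeyValuePair<HtmlNode, HtmlNode> pair in toReplace)
        {
            pair.Key.ParentNode.ReplaceChild(pair.Value, pair.Key);
        }

        Console.WriteLine(body.InnerHtml);
        Console.WriteLine(body.OuterHtml);
    }
}
```

Building the dictionary from a snapshot and doing the replacements in a separate loop keeps the "collect" and "modify" steps apart, which makes it easier to see which of the two is misbehaving.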
Rohit Agarwal
It's true, it had something to do with what I did to the body before replacing the nodes. I did some work on other bugs and the good thing is that now this works. Unfortunately, I don't know why :)
morsanu