ansaurus

Question

Create a dictionary or list from a string (HTML tags included) in C#

Answer 1

+10 A:

For example: (Tested)

var doc = new HtmlDocument();
doc.LoadHtml(s);
var dict = doc.DocumentNode.Descendants("tr")
              .ToDictionary(
                  tr => int.Parse(tr.Descendants("td").First().InnerText),
                  tr => int.Parse(tr.Descendants("td").Last().InnerText)
              );

If the HTML will always be well-formed, you can use LINQ-to-XML; the code would be almost identical.
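
A minimal sketch of what the LINQ-to-XML variant might look like. It is not part of the original answer; it assumes the markup is well-formed XML and reuses the sample rows posted elsewhere in this thread. Because the fragment has several top-level tr elements, it is wrapped in an arbitrary "root" element before parsing. It is written as top-level statements (C# 9 or later); in older versions, place the statements inside a method.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Sample input taken from the other answers in this thread.
string s = "<tr><td>11</td><td>12</td></tr>" +
           "<tr><td>21</td><td>22</td></tr>" +
           "<tr><td>31</td><td>32</td></tr>";

// XElement.Parse needs a single root element, so wrap the fragment.
// The element name "root" is arbitrary.
XElement root = XElement.Parse("<root>" + s + "</root>");

// First cell of each row becomes the key, last cell the value.
// The explicit (int) conversion on XElement parses the element's text.
Dictionary<int, int> dict = root.Elements("tr")
    .ToDictionary(
        tr => (int)tr.Elements("td").First(),
        tr => (int)tr.Elements("td").Last());

foreach (var pair in dict)
    Console.WriteLine("{0}={1}", pair.Key, pair.Value);
```

Unlike the HTML Agility Pack, XElement.Parse throws an XmlException on markup that is not well-formed, so this only suits input you control.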

SLaks 2010-03-07 19:49:43

Very helpful tips and answers, and I learned about the HTML Agility Pack. It is the best solution for tasks like this. Thanks.

loviji 2010-03-07 20:17:37

Answer 2

A:

If you don't want to use the HTML Agility Pack, you could try something similar to:

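// Split into rows on the closing tag. The string[] overload of Split is used
// because it is available in all .NET versions.
var arr = s.Replace("<tr>", "").Split(new[] { "</tr>" }, StringSplitOptions.RemoveEmptyEntries);

var d = new Dictionary<int, int>();
foreach (var row in arr) {
  var itm = row.Replace("<td>", "").Split(new[] { "</td>" }, StringSplitOptions.RemoveEmptyEntries);
  if (itm.Length < 2) continue; // skip fragments without two cells, e.g. trailing whitespace
  d.Add(int.Parse(itm[0]), int.Parse(itm[1]));
}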

(untested)

Sani Huttunen 2010-03-07 19:54:08

As Andrew M mentioned, that way lies madness. It's equivalent to using regular expressions. http://www.codinghorror.com/blog/2009/11/parsing-html-the-cthulhu-way.html

TrueWill 2010-03-07 19:59:28

Answer 3

+3 A:

Code

using RE=System.Text.RegularExpressions;

....

public void Run()
{
    string s=@"
<tr>
<td>11</td><td>12</td>
</tr>
<tr>
<td>21</td><td>22</td>
</tr>
<tr>
<td>31</td><td>32</td>
</tr>";

    var mcol= RE.Regex.Matches(s,"<td>(\\d+)</td><td>(\\d+)</td>");
    var d = new Dictionary<int, int>();

    foreach(RE.Match match in mcol)
        d.Add(Int32.Parse(match.Groups[1].Value),
              Int32.Parse(match.Groups[2].Value));

    foreach (var key in d.Keys)
        System.Console.WriteLine("  {0}={1}", key, d[key]);
}

Cheeso 2010-03-07 19:56:48

You should not do this. If you do, you should probably ignore whitespace in the tags. http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags
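
To illustrate the whitespace point only, here is a hedged sketch of a more tolerant pattern. It is not from the original comment, the input string is a made-up example, and it still breaks on attributes, nested tags, and anything else a real HTML parser would handle. It is written as top-level statements (C# 9 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical input with stray whitespace and mixed-case tags.
string html = "<tr>\n<td >11</td>  <TD>12</td >\n</tr>\n<tr><td> 21 </td>\n<td>22</td></tr>";

// \s* allows optional whitespace before '>' and around the digits;
// RegexOptions.IgnoreCase accepts <TD> as well as <td>.
var pattern = new Regex(
    @"<td\s*>\s*(\d+)\s*</td\s*>\s*<td\s*>\s*(\d+)\s*</td\s*>",
    RegexOptions.IgnoreCase);

var pairs = new Dictionary<int, int>();
foreach (Match m in pattern.Matches(html))
    pairs.Add(int.Parse(m.Groups[1].Value), int.Parse(m.Groups[2].Value));

foreach (var pair in pairs)
    Console.WriteLine("{0}={1}", pair.Key, pair.Value);
```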

SLaks 2010-03-07 20:03:41

Maybe, but it worked for his HTML. I think you're saying that it won't work for other HTML. Fair point.

Cheeso 2010-03-07 21:10:23

Answer 4

A:

var s = "<tr><td>11</td><td>12</td></tr><tr><td>21</td><td>22</td></tr><tr><td>31</td><td>32</td></tr>";

var rows = s.Split( new[] { "</tr>" }, StringSplitOptions.None );

var results = new Dictionary<int, int>();
foreach ( var row in rows )
{
    var cols = row.Split( new[] { "</td>" }, StringSplitOptions.None );
    var vals = new List<int>();

    foreach ( var col in cols )
    {
        var val = col.Replace( "<td>", string.Empty ).Replace( "<tr>", string.Empty );

        int intVal;
        if ( int.TryParse( val, out intVal ) )
            vals.Add( intVal );
    }

    if ( vals.Count == 2 )
        results.Add( vals[0], vals[1] );
}

Thomas 2010-03-07 20:01:28

Answer 5

+1 A:

string s =
@"<tr> 
<td>11</td><td>12</td> 
</tr> 
<tr> 
<td>21</td><td>22</td> 
</tr> 
<tr> 
<td>31</td><td>32</td> 
</tr>";

XPathDocument doc = new XPathDocument(XmlReader.Create(new StringReader(s), new XmlReaderSettings { ConformanceLevel = ConformanceLevel.Fragment, IgnoreWhitespace = true }));

Dictionary<int, int> dict = doc.CreateNavigator()
   .Select("tr")
   .Cast<XPathNavigator>()
   .ToDictionary(
      r => r.SelectSingleNode("td[1]").ValueAsInt,
      r => r.SelectSingleNode("td[2]").ValueAsInt
   );

Max Toro 2010-03-07 20:04:51

Answer 6

A:

using RE=System.Text.RegularExpressions;

....

public void Run()
{
    string s=@"
<tr>
<td>11</td><td>12</td>
</tr>
<tr>
<td>21</td><td>22</td>
</tr>
<tr>
<td>31</td><td>32</td>
</tr>";

    var mcol= RE.Regex.Matches(s,"<td>(\\d+)</td><td>(\\d+)</td>");
    var d = new Dictionary<int, int>();

    foreach(RE.Match match in mcol)
        d.Add(Int32.Parse(match.Groups[1].Value),
              Int32.Parse(match.Groups[2].Value));

    foreach (var key in d.Keys)
        System.Console.WriteLine("  {0}={1}", key, d[key]);
}

2010-03-08 09:29:03
