Hello,

I am using tDOM version 0.8.2 to parse HTML pages.

From the help pages, I found the following commands for getting an element by its ID:

Tcl code

set html {<html>
<head>
</head>
<body>
<div id="m"> 
</div>
</body>
</html>
}
package require tdom
set doc [ dom parse -html $html ] 
set node  [ $doc getElementById m]

But when I execute the second set command, I get an empty string, even though the div tag clearly has an id of m. Can someone tell me where I am going wrong?

Regards, Mithun

+3  A: 

The problem is that your document lacks a <!DOCTYPE> declaration, so tDOM doesn't know that the id attribute is to be interpreted as an ID.

If we add a DOCTYPE, it all works...

package require tdom
set html {<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN">
<html>
  <head>
  </head>
  <body>
    <div id="m"> 
    </div>
  </body>
</html>}
set doc [ dom parse -html $html ] 
set node  [ $doc getElementById m]
puts [$node asList]

Produces this output for me:

div {id m} {}

You could have checked that the document was being parsed at all by using XPath to search for elements that have an id attribute, like this:

puts [[$doc selectNodes "//*\[@id\]"] asList]

Since that did produce the right output (as above) it was clear that the problem had to be in the interpretation of the attribute, which in turn pointed straight at the missing DOCTYPE.
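
Because the XPath search finds the element even when getElementById does not, it can also serve as a workaround on an affected tDOM version. The following is only a sketch: findById is a hypothetical helper name, not part of tDOM, and it assumes the id value contains no single quote, since the value is spliced directly into the XPath expression.

```tcl
package require tdom

# Hypothetical helper (not a tDOM command): find the first element whose
# id attribute equals $id, using XPath instead of getElementById.
# Assumes $id contains no single quote.
proc findById {doc id} {
    # selectNodes returns a list of matching nodes; lindex yields the
    # first one, or an empty string when nothing matches.
    return [lindex [$doc selectNodes "//*\[@id='$id'\]"] 0]
}

set doc [dom parse -html {<html><body><div id="m"></div></body></html>}]
set node [findById $doc m]
if {$node ne ""} {
    puts [$node nodeName]
}
```

Unlike getElementById, this matches on the plain attribute name, so it does not depend on how the parser classifies the attribute.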


Update

It's actually a bug that was fixed in tDOM 0.8.3.
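
To see which tDOM version is actually loaded, something like the following sketch may help. It relies on package require returning the version number of the package it loaded and on package vcompare for the comparison; the 0.8.3 threshold comes from the fix mentioned above, and the message wording is purely illustrative.

```tcl
# package require returns the version number of the loaded package.
set version [package require tdom]

# package vcompare returns -1, 0 or 1, like string compare.
if {[package vcompare $version 0.8.3] < 0} {
    puts "tDOM $version predates the getElementById fix; upgrade to 0.8.3 or later"
} else {
    puts "tDOM $version should include the getElementById fix"
}
```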

Donal Fellows
Thanks for the reply. Which tDOM version are you using? For me this still does not work: the command puts [[$doc selectNodes "//*\[@id\]"] asList] produces the same output as yours, but the set node [$doc getElementById m] command does not work.
mithunmo
You need tDOM 0.8.3. Apparently (according to the changelog) it was a bug fixed on 2007-10-30.
Donal Fellows