I am using the Perl module HTML::DOM (available on CPAN) to build an HTML DOM tree from HTML code and then change it using the standard DOM's removeAttribute, removeChild, innerHTML, createElement and so on.

But I have found that it is really, really slow and uses too much memory (it is written entirely in Perl, after all). So I thought there must be some C/C++ library that does this faster and more efficiently (since the same thing happens in every browser that has JavaScript support).

So far, I have not found anything. Maybe I am searching wrong?

edit: I should add that I would like it to work similarly to the linked Perl module. By that I mean that I could directly use HTML's innerHTML, className, idName... Is that possible, or will I need to use a general XML parser and then write these myself?

edit2: OK, the slowness of the Perl module was actually my fault entirely. However, since I already asked, the question still stands :)

+1  A: 

libgdome is a library adding a DOM implementation on top of libxml2.

Many of the faster higher-level language modules for this purpose (such as, in the Python world, lxml) tend to be built directly on libxml2, doing the DOM bits themselves.
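
If the library you pick only offers the generic DOM calls, the HTML conveniences can be layered on top as thin helpers. Here is a rough sketch of that idea using Python's standard-library xml.dom.minidom purely as a stand-in for a generic DOM; it is not libgdome's or lxml's API. The helper names (get_inner_html, set_inner_html, get_class_name, set_class_name) are made up for illustration, and the sketch assumes well-formed markup, so real tag-soup HTML would need an HTML-aware parser.

```python
from xml.dom import minidom


def get_inner_html(element):
    """Serialize the children of `element`, roughly like reading innerHTML."""
    return "".join(child.toxml() for child in element.childNodes)


def set_inner_html(element, markup):
    """Replace the children of `element` with nodes parsed from `markup`.

    Assumption: `markup` is well-formed XML, not arbitrary tag soup.
    """
    # Drop the existing children.
    while element.firstChild is not None:
        element.removeChild(element.firstChild).unlink()
    # Wrap in a throwaway root so fragments with several top-level nodes parse.
    fragment = minidom.parseString("<wrapper>" + markup + "</wrapper>")
    for child in list(fragment.documentElement.childNodes):
        # importNode makes a deep copy owned by the target document.
        element.appendChild(element.ownerDocument.importNode(child, True))
    fragment.unlink()


def get_class_name(element):
    """className is just the `class` attribute under another name."""
    return element.getAttribute("class")


def set_class_name(element, value):
    element.setAttribute("class", value)


if __name__ == "__main__":
    doc = minidom.parseString('<div class="old"><p>Hello</p></div>')
    div = doc.documentElement
    set_class_name(div, "new")
    set_inner_html(div, "<b>Hi</b> there")
    print(div.toxml())
```

The same pattern (serialize children for the getter, parse a wrapped fragment and import its nodes for the setter) should carry over to a libxml2-based DOM, but the exact calls there will differ.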

Charles Duffy
Thanks. However, I can't find things like "className" or "innerHTML" etc. in the documentation...
Karel Bílek