So I am grabbing RSS feeds via AJAX. After processing them, I have an HTML string that I want to manipulate using various jQuery functions. In order to do this, I need a tree of DOM nodes.

I can pass an HTML string into the jQuery() function.
I can add it as innerHTML to some hidden node and use that.
I have even tried using Mozilla's nonstandard range.createContextualFragment().

The problem with all of these solutions is that when my HTML snippet has an <img> tag, Firefox dutifully fetches whatever image is referenced. Since this processing is background stuff that isn't being displayed to the user, I'd like to just get a DOM tree without the browser loading all the images contained in it.

Is this possible with JavaScript? I don't mind if it's Mozilla-only, as I'm already using JavaScript 1.7 features (which seem to be Mozilla-only for now).

+2  A: 

The obvious answer is to parse the string and remove the src attributes from img tags (and similar for other external resources you don't want to load). But you'll have already thought of that and I'm sure you're looking for something less troublesome. I'm also assuming you've already tried removing the src attribute after having jQuery parse the string but before appending it to the document, and found that the images are still being requested.

I'm not coming up with anything else, but you may not need to do full parsing; this replacement should do it in Firefox with some caveats:

    // A string pattern only replaces the first match, so use a global regex to catch every image.
    thestring = thestring.replace(/<img /g, "<img src='' ");

The caveats:

  • This appears to work in the current Firefox. That doesn't mean that subsequent versions won't choose to handle duplicated src attributes differently.
  • This assumes the literal string "<img " never appears anywhere except at the start of an image tag. As a general purpose assumption that isn't safe: that string could appear in an attribute value on a sufficiently...interesting...page, especially in an inline onclick handler like this: <a href='#' onclick='$("frog").html("<img src=\"spinner.gif\">")'> (Although in that example, the false positive replacement is harmless.)

This is obviously a hack, but in a limited environment with reasonably well-known data...
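
For anyone who wants to try the replacement on a string with several images, here is a minimal sketch. The helper name blankImageSources and the sample markup are made up for illustration, and using a regular expression is an assumption on my part so that every "<img" is touched, whatever its letter case and whatever whitespace follows it:

```javascript
// Hypothetical helper, not part of the answer above.
function blankImageSources(html) {
  // Match "<img" only when followed by whitespace, "/" or ">", in any letter case,
  // and put an empty src in front of whatever attributes the tag already has.
  return html.replace(/<img(?=[\s\/>])/gi, function (match) {
    return match + " src=''";
  });
}

// Made-up sample input with two differently written image tags.
var feedHtml = '<p>Hi</p><img src="a.png"><IMG\nsrc="b.png">';
var blanked = blankImageSources(feedHtml);
console.log(blanked);
```

Like the one-liner, this still rewrites "<img " when it appears inside an attribute value, so the false-positive caveat applies unchanged, and it still relies on the browser keeping the first of two duplicated src attributes.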

T.J. Crowder
@T. J. - You're right, works in every browser except firefox, seeing if there's another way. Also to make yours more robust, I'd suggest just `src=` replaced with `blah=`, this would eliminate javascript fetches too.
Nick Craver
@Nick: The parse-then-remove works except in FF? Heh. Classic, everything but the one browser the OP wanted to use. :-) I didn't try to muck about with the `src=` because it makes the replacement *much* more complicated, have to be sure that it's appearing inside a tag, etc., etc.
T.J. Crowder
@T.J. no no, my solution worked everywhere except FF which is why I didn't see, but yes same irony :)
Nick Craver
@Nick: Sorry, wasn't clear, that's what I meant. :-)
T.J. Crowder
cheers :) I've ended up modifying src= to _src=, since I want to (at some point) reverse the process and get the image urls back. And given that I'm reversing it before it is eventually displayed, the false positives should be negligible.
gfxmonk
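
The reversible rename described in the comment above could look roughly like this. It is only a sketch: the function names hideSrc and restoreSrc are invented here, and requiring whitespace before src= is an assumption made to avoid touching things like data-src=. It does not check that the match is inside a tag, so the false positives discussed earlier still apply.

```javascript
// Hypothetical helpers; these names do not come from the thread.
function hideSrc(html) {
  // Rename src= to _src= wherever it follows whitespace, keeping the original letter case.
  return html.replace(/(\s)(src=)/gi, "$1_$2");
}

function restoreSrc(html) {
  // Undo hideSrc. Assumes the input never contained " _src=" of its own.
  return html.replace(/(\s)_(src=)/gi, "$1$2");
}

// Made-up sample entry content.
var entryHtml = '<p>text</p><img class="x" src="http://example.com/a.png">' +
                '<script src="b.js"></script>';
var hidden = hideSrc(entryHtml);
var restored = restoreSrc(hidden);
console.log(hidden);
```

Because the rename is applied to every src=, script and iframe sources are disabled along with images, which matches the suggestion to avoid JavaScript fetches too.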
+2  A: 

You can use the DOM parser to manipulate the nodes. Just replace the src attributes, store their original values and add them back later on.

Sample:

    (function () {
        var s = "<img src='http://www.google.com/logos/olympics10-skijump-hp.png' /><img src='http://www.google.com/logos/olympics10-skijump-hp.png' />";
        // Parse as XML: the img nodes are plain XML elements here, so building the tree requests nothing.
        var parser = new DOMParser();
        var dom = parser.parseFromString("<div id='mydiv' >" + s + "</div>", "text/xml");
        var imgs = dom.getElementsByTagName("img");
        var stored = [];
        for (var i = 0; i < imgs.length; i++) {
            var img = imgs[i];
            // Remember the original URL and tag the node with its index.
            stored.push(img.getAttribute("src"));
            img.setAttribute("myindex", i);
            // Remove src entirely; setAttribute("src", null) would store the string "null".
            img.removeAttribute("src");
        }
        // Serialize the src-less tree and add it to the page.
        $(document.body).append(new XMLSerializer().serializeToString(dom));
        alert("Images appended");
        window.setTimeout(function () {
            alert("loading images");
            // Put the stored URLs back, which is when the images are actually requested.
            $("#mydiv img").each(function () {
                this.src = stored[$(this).attr("myindex")];
            });
            alert("images loaded");
        }, 2000);
    })();
andras
Thanks, that's a great answer. The only problem (for my case) is that it only supports valid XML, which is probably not going to work for arbitrary RSS feed content (how I wish it would). But for others if you can ensure valid XML, you ought to use this ;)
gfxmonk
"It is very easy to parse RSS feeds with Javascript, since RSS feeds are just plain XML." From "Parsing RSS feeds with AJAX/Javascript": http://www.captain.at/howto-ajax-parse-rss.php :-)
andras
@andras: yes, the RSS feed is valid XML. However, the entry content is just CDATA containing whatever mish-mash of HTML the author published as the "contents" of the entry. That is (sadly) the part I wish to parse.
gfxmonk