Hi,

I have an HTML string that contains text and images, e.g.

<p>...</p>
<img src="..." />
<p>...</p>

I need to get the src attribute of the first image. The image may or may not come after a <p> and there could be more than one image, or no image at all.

Initially, I tried loading the string into a jQuery element and filtering, but when I do this the browser requests all of the external resources. In my situation, this adds a lot of unnecessary overhead.

My initial approach:

var holder = $('<div></div>');
holder.html(content);
var src = holder.find('img:first').attr('src');

How can I get the src of the first image without appending the HTML? Do I need to use a regular expression, or is there a better way?

The solution needs to be JavaScript/jQuery based – I'm not using any server-side languages.

My question is very similar to: http://forum.jquery.com/topic/parsing-html-without-retrieving-external-resources

Thanks

+2  A: 

This should work:

<html>
<head>
    <script type="text/javascript" src="http://ajax.googleapis.com/ajax/libs/jquery/1.4.2/jquery.min.js"></script>
    <script type="text/javascript">
        //Document Ready: Everything inside this function fires after the page is loaded
        $(document).ready(function () {
            //Your HTML string
            var t = "<p><img src='test.jpg'/></p>";
            alert($(t).find('img:first').attr('src'));
        });
    </script>
</head>
<body>
</body>
</html>
Brandon Boone
You just have to make sure there is a root element, because find() only searches descendants. You may need to wrap your HTML, e.g. $('<myRoot>' + htmlStuff + '</myRoot>').find...
hacksteak25
I think this will still have the problem of downloading image files even though you aren't appending the HTML to the DOM. If you create new image elements, browsers download the files right away, regardless of whether you append the image elements to the DOM.
Dave Aaron Smith
Dave Aaron Smith is right – external resources are still downloaded.
DB
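One way to sidestep the downloads described in the comments above may be to parse the string into a separate, detached document instead of creating live image elements. The sketch below is not from the thread: it assumes a browser that supports `DOMParser` with the `text/html` type (which browsers contemporary with jQuery 1.4.2 did not), and `firstImageSrcInert` is a hypothetical helper name. A document produced this way is not attached to the page, so browsers are generally not expected to fetch its images, but that is worth confirming in the network panel of the browsers you target.

```javascript
// Hypothetical helper: read the src of the first <img> from an HTML string
// by parsing it into a detached document rather than into live elements.
// Assumption: DOMParser with 'text/html' is available; otherwise null is returned.
function firstImageSrcInert(html) {
  if (typeof DOMParser === 'undefined') {
    // No parser in this environment (old browser, or not a browser at all).
    return null;
  }
  var doc = new DOMParser().parseFromString(html, 'text/html');
  var img = doc.querySelector('img');
  // getAttribute returns the value exactly as written in the markup.
  return img ? img.getAttribute('src') : null;
}
```

Because the whole string is parsed as a document, no wrapper root element is needed, and a string without any image simply yields null.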
+1  A: 

This

$("<p><blargh src='whatever.jpg' /></p>").find("blargh:first").attr("src")

returns whatever.jpg so I think you could try

$(content.replace("<img", "<blargh")).find("blargh:first").attr("src")
Dave Aaron Smith
Why do you replace img with blargh?
hacksteak25
@DB mentioned that his original approach, loading jQuery objects according to his code, caused the browser to download all the resources. By getting rid of `img` tags I hoped to avoid that problem.
Dave Aaron Smith
This approach works, but I had to do `$(content.replace(/<img/gi, "<blargh")).find("blargh:first").attr("src");` It feels a bit dirty...
DB
@DB, I hear that. Uuuuugly.
Dave Aaron Smith
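The question also asks whether a regular expression would do. A minimal sketch of that route, which never builds any elements and so cannot trigger downloads, might look like the following. It is not from the thread: `firstImageSrc` is a hypothetical helper name, and the pattern assumes reasonably well-formed markup (for example, no `>` character inside an attribute value that precedes `src`). It is a string heuristic, not a real HTML parser.

```javascript
// Hypothetical helper: return the src of the first <img> tag in an HTML
// string, or null if there is no <img> with a src attribute.
function firstImageSrc(html) {
  // <img, then any attributes, then whitespace + src = value, where the value
  // is double-quoted, single-quoted, or unquoted. Requiring whitespace before
  // "src" keeps attributes such as data-src from matching.
  var pattern = /<img\b[^>]*?\ssrc\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s>]+))/i;
  var match = pattern.exec(html);
  if (!match) {
    return null;
  }
  // Exactly one of the three capture groups holds the value.
  return match[1] || match[2] || match[3] || '';
}

var content = '<p>...</p><img src="first.png" /><p>...</p><img src="second.png" />';
var src = firstImageSrc(content);
```

Here `src` ends up as `first.png`. Since exec() stops at the first match, later images are ignored, and a string without images gives null rather than throwing.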