ansaurus

Question

How to get the html source of a specific element with selenium?

Answer 1

+1 A:

What about using jQuery?

Edit:

First you have to add the required .js files to the page; you can get them from www.jQuery.com.

Then all you need to do is call a simple jQuery selector:

alert($("div#1").html());

hminaya 2009-11-29 18:07:07

I don't know jQuery. Can you give me an example? Thanks!

Rivka 2009-11-29 18:08:33

Answer 2

+3 A:

Use xpath. From selenium.py:

Without an explicit locator prefix, Selenium uses the following default strategies:

dom, for locators starting with "document."

xpath, for locators starting with "//"

identifier, otherwise

In your case, you could try

selenium.get_text("//div[@id='1']/descendant::*[not(self::h1)]")
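The default-strategy rules quoted above can be sketched as a small function. This is only an illustration of the quoted docstring, not Selenium's own code: the name default_strategy is hypothetical, and locators with an explicit prefix are deliberately not handled.

```python
def default_strategy(locator: str) -> str:
    """Return the strategy the quoted docstring describes for an unprefixed locator.

    Hypothetical helper for illustration; explicit prefixes are out of scope here.
    """
    if locator.startswith("document."):
        return "dom"
    if locator.startswith("//"):
        return "xpath"
    return "identifier"


# The locator suggested above starts with "//", so it falls under the xpath rule.
suggested = "//div[@id='1']/descendant::*[not(self::h1)]"
print(default_strategy(suggested))
print(default_strategy("document.forms[0]"))
print(default_strategy("1"))
```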

You can learn more about xpath here.

P.S. I haven't found any good HTML documentation for python-selenium; on the other hand, the docstrings in the selenium.py file seem to constitute comprehensive documentation. So I'd suggest reading the source to get a better understanding of how it works.

int3 2009-11-29 18:14:55

Answer 3

+2 A:

The following code will give you the HTML in the div element:

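from selenium import selenium  # the Selenium RC client class defined in selenium.py
sel = selenium('localhost', 4444, browser, my_url)  # browser: an RC browser string such as '*firefox'; my_url: the site under test
# Call sel.start() and open the page first; get_eval then runs the JavaScript and returns the div's innerHTML as a string.
html = sel.get_eval("this.browserbot.getCurrentWindow().document.getElementById('1').innerHTML")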

Then you can use BeautifulSoup to parse it and extract what you really want.
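As a rough sketch of that parsing step: BeautifulSoup would be the natural choice, but to stay dependency-free this example swaps in the standard library's html.parser instead. The sample markup, the TextOutsideTag class, and the text_without helper are all hypothetical; the sample string stands in for whatever get_eval returns, and the sketch assumes the goal is the div's text minus its h1 heading.

```python
from html.parser import HTMLParser


class TextOutsideTag(HTMLParser):
    """Collect text, skipping anything nested inside the given tag."""

    def __init__(self, skip_tag):
        super().__init__()
        self.skip_tag = skip_tag
        self.depth = 0    # how many skip_tag elements we are currently inside
        self.chunks = []  # stripped text fragments found outside skip_tag

    def handle_starttag(self, tag, attrs):
        if tag == self.skip_tag:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == self.skip_tag and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        # Keep only non-blank text that is not inside skip_tag.
        if self.depth == 0 and data.strip():
            self.chunks.append(data.strip())


def text_without(html, skip_tag="h1"):
    """Return the text of an HTML fragment, leaving out skip_tag's contents."""
    parser = TextOutsideTag(skip_tag)
    parser.feed(html)
    parser.close()  # flush any text still buffered at the end of the input
    return " ".join(parser.chunks)


# Hypothetical innerHTML, standing in for the string returned by sel.get_eval(...)
sample_html = "<h1>Title</h1><p>First <b>bold</b> paragraph.</p>"
print(text_without(sample_html))
```

The same filtering could be written more compactly with BeautifulSoup if it is installed; the stdlib version is used here only so the sketch runs without extra packages.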

I hope it helps.

luc 2009-11-29 20:48:21

Thanks! It solved the problem :)

Rivka 2009-11-30 07:39:21

So why don't you accept the answer? :)

luc 2009-11-30 08:03:47

Sorry, I'm new to this site... You meant clicking on the v, right?

Rivka 2009-11-30 08:17:17

No problem. Thanks. I spent some time a few weeks ago on a similar problem, and I am happy to know that it fixed yours too.

luc 2009-11-30 08:35:33

And welcome to Stack Overflow :)

luc 2009-11-30 08:46:35
