I'm attempting to use selenium-dotnet-2.0a5 to iterate through many tables, and I have to use XPath. For example:

var tableRows = _table.FindElements(By.TagName("tr"));

foreach (var row in tableRows)
{
    var cells = row.FindElements(By.XPath("td|th"));
    // iterate through the table cells and get the text of each
}

Average times to iterate through about 50 rows: Firefox 0-2 sec, Chrome 6-8 sec, IE 60-70 sec.

Most of my tests need to be run in IE. Any tips on what I can do to get better XPath performance?

+1  A: 

I always had the same issue with Selenium 1. I improved it by updating the third-party XPath library it used; not sure if this still applies to Selenium 2... but ultimately, without XPath being native to the browser, it wasn't quick enough.

In the end, if I needed to do something like your example and CSS selectors just wouldn't cut it, I'd just return the entire DOM from Selenium and parse the tree in code using another library, iterating through it that way. A bit of a dirty hack, but it does get around IE's slow XPath.

Bill
In this particular example you are trying to get the text of both the td and th cells of the table. Have you tried using two loops, one for row.FindElements(By.TagName("th")) and a second for row.FindElements(By.TagName("td"))?
ZloiAdun
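The two-loop idea above could be sketched like this (an assumption-laden sketch, not tested against the asker's page; `_table` is the table element from the question):

```csharp
// Sketch: avoid the "td|th" XPath union by issuing two
// By.TagName lookups per row instead, which IE handles natively.
var tableRows = _table.FindElements(By.TagName("tr"));

foreach (var row in tableRows)
{
    // Header cells first...
    foreach (var header in row.FindElements(By.TagName("th")))
        Console.WriteLine(header.Text);

    // ...then data cells.
    foreach (var cell in row.FindElements(By.TagName("td")))
        Console.WriteLine(cell.Text);
}
```

Note this changes the cell ordering within a row (all th before all td), which may or may not matter for your case.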
+1  A: 

If you have access to change the HTML, try putting a class declaration on the table data elements. Then you could use By.ClassName instead of XPath.
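For instance, assuming you can tag each cell with a class (the class name "data-cell" here is made up for illustration), the lookup might look like:

```csharp
// Assumes the page's td elements carry class="data-cell" (hypothetical name).
// By.ClassName avoids the XPath engine entirely.
ReadOnlyCollection<IWebElement> cells =
    driver.FindElements(By.ClassName("data-cell"));

foreach (IWebElement cell in cells)
{
    Console.WriteLine(cell.Text);
}
```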

But before I go any further, what exactly are you trying to do?

Once CssSelectors is fully supported in .Net and IE it'll be a great option, but for now it's not reliable. Remember that for now, your document needs to be rendered in standards mode.

You'll want to consider looking at just td rather than both td and th. While handling both is certainly doable, it adds a certain amount of complexity, so I've used just td below for simplicity's sake. Typically you'd know how many th elements there are and what they hold, and deal with them separately.

Getting to the code, I found there was a slight speedup switching to By.TagName. This took about 20 seconds over 43 rows by 4 columns:

        IWebElement table = driver.FindElement(By.TagName("table"));
        ReadOnlyCollection<IWebElement> cells = table.FindElements(By.TagName("td"));
        foreach (IWebElement cell in cells)
        {
            Console.WriteLine(cell.Text);
        }

But then I tried loading the page source into memory and parsing it with the HtmlAgilityPack. (Be wary of using XML parsers to read HTML documents; you'll find HTML may not be well-formed XML.) The following code took an almost obscene 96 milliseconds:

        HtmlDocument html = new HtmlDocument();
        html.LoadHtml(driver.PageSource);
        HtmlNodeCollection nodeCollect =  html.DocumentNode.SelectNodes("//td");
        foreach (HtmlNode node in nodeCollect)
        {
            Console.WriteLine(node.InnerText);
        }

Go with loading the page source and parsing it if all you want to do is iterate through a document checking elements. Revert to your driver when you need to navigate or interact.

pnewhook
+1 for the recommendation to parse the source for getting text and using the driver for interactions.
Tom E
Thanks for the recommendation on HtmlAgilityPack; I've re-coded my classes to use it and everything is much faster.
Mikey