I'm attempting to use selenium-dotnet-2.0a5 to iterate through many tables, and I have to use XPath. For example:

var tableRows = _table.FindElements(By.TagName("tr"));

foreach (var row in tableRows)
{
    var cells = row.FindElements(By.XPath("td|th"));
    // iterate through the table cells and get the text of each
}

Average times to iterate through about 50 rows: Firefox 0-2 sec, Chrome 6-8 sec, IE 60-70 sec.

Most of my tests need to be run in IE. Any tips on what I can do to get better XPath performance?

+1  A: 

I always had the same issue with Selenium 1. I improved it by updating the third-party XPath library it used; not sure if this still applies to Selenium 2... but ultimately, without XPath being native to the browser, it wasn't quick enough.

In the end, if I needed to do something like your example and CSS selectors just wouldn't cut it, I'd just return the entire DOM from Selenium and parse the tree in code using another library, iterating through it that way. A bit of a dirty hack, but it does get around IE's slow XPath.

Bill
In this particular example you are trying to get the text of both the td and th cells of the table. Have you tried using two loops, one for row.FindElements(By.TagName("th")) and a second for row.FindElements(By.TagName("td"))?
ZloiAdun
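The two-loop idea above could be sketched like this (an assumption-laden sketch, not tested against the asker's page; `_table` is the table element from the question):

```csharp
// Sketch: avoid the "td|th" XPath union by issuing two
// By.TagName lookups per row instead, which IE handles natively.
var tableRows = _table.FindElements(By.TagName("tr"));

foreach (var row in tableRows)
{
    // Header cells first...
    foreach (var header in row.FindElements(By.TagName("th")))
        Console.WriteLine(header.Text);

    // ...then data cells.
    foreach (var cell in row.FindElements(By.TagName("td")))
        Console.WriteLine(cell.Text);
}
```

Note this changes the cell ordering within a row (all th before all td), which may or may not matter for your case.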
+1  A: 

If you have access to change the HTML, try putting a class declaration on the table data elements. Then you could use By.ClassName instead of XPath.
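For instance, assuming you can tag each cell with a class (the class name "data-cell" here is made up for illustration), the lookup might look like:

```csharp
// Assumes the page's td elements carry class="data-cell" (hypothetical name).
// By.ClassName avoids the XPath engine entirely.
ReadOnlyCollection<IWebElement> cells =
    driver.FindElements(By.ClassName("data-cell"));

foreach (IWebElement cell in cells)
{
    Console.WriteLine(cell.Text);
}
```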

But before I go any further, what exactly are you trying to do?

Once CssSelectors is fully supported in .Net and IE it'll be a great option, but for now it's not reliable. Remember that for now, your document needs to be rendered in standards mode.

You'll want to consider looking at just td rather than both td and th. While handling both is certainly doable, it adds a certain amount of complexity, so I've used just td below for simplicity's sake. Typically you'd know how many th elements there are and what they hold, and deal with them separately.

Getting to the code, I found there was a slight speedup switching to By.TagName. This took about 20 seconds over 43 rows by 4 columns:

        IWebElement table = driver.FindElement(By.TagName("table"));
        ReadOnlyCollection<IWebElement> cells = table.FindElements(By.TagName("td"));
        foreach (IWebElement cell in cells)
        {
            Console.WriteLine(cell.Text);
        }

But then I tried loading the page source into memory and parsing it with the HtmlAgilityPack. (Be wary of using XML parsers to read HTML documents; you'll find HTML may not be well-formed XML.) The following code took an almost obscene 96 milliseconds:

        HtmlDocument html = new HtmlDocument();
        html.LoadHtml(driver.PageSource);
        HtmlNodeCollection nodeCollect =  html.DocumentNode.SelectNodes("//td");
        foreach (HtmlNode node in nodeCollect)
        {
            Console.WriteLine(node.InnerText);
        }

Go with loading the page source and parsing it if all you want to do is iterate through a document checking elements. Revert to your driver when you need to navigate or interact.

pnewhook
+1 for the recommendation to parse the source for getting text and using the driver for interactions.
Tom E
Thanks for the recommendation on HtmlAgilityPack; I've re-coded my classes to use it and everything is much faster.
Mikey