views:

60

answers:

3

I've been looking around and trying to find a way to click on a link in selenium that's matched by a regexp.

Here is the code that works;

from selenium import selenium
sel = selenium("localhost", 4444, "*chrome", "http://www.ncbi.nlm.nih.gov/")
sel.start()
sel.open('/pubmed')
sel.type("search_term", "20032207[uid]")
sel.click("search")
sel.click("linkout-icon-unknown-vir_full")

However if I search across different IDs the link-text will be different but it always matches the regexp linkout-icon[\w-_]*.

But I can't seem to find the right command for clicking a link which matches a regexp ... I've tried:

sel.click('link=regex:linkout-icon[\w-_]*')
sel.click('regex:linkout-icon[\w-_]*')
sel.click('link=regexp:linkout-icon[\w-_]*')
sel.click('regexp:linkout-icon[\w-_]*')

But none of them seem to work at all. Any suggestions?

EDIT:

So after comments in an answer below: The clicked item is actually an image with the id=linkout-icon-unknown-viro_full. The full line is below:

<a href="http://vir.sgmjournals.org/cgi/pmidlookup?view=long&amp;amp;pmid=20032207" ref="PrId=3051&amp;itool=Abstract-def&amp;uid=20032207&amp;nlmid=0077340&amp;db=pubmed&amp;log$=linkouticon" target="_blank"><img alt="Click here to read" id="linkout-icon-unknown-vir_full" border="0" src="http://www.ncbi.nlm.nih.gov/corehtml/query/egifs/http:--highwire.stanford.edu-icons-externalservices-pubmed-standard-vir_full.gif" /></a> </div>

If your wondering I got the code from the Selenium IDE recorder.

A: 

I think you are very close. First, regexp: is the right text pattern for saying you want to use a regular expression.

The other thing that probably isn't quite right is saying link=, as that refers to the text of the link, i.e.:

<a href="path/to/mylink">Text of the link, this is what will be searched</a>

So what part of the anchor do you want to use your regular expression on, the href?

Something that might lead towards the right answer is this: http://stackoverflow.com/questions/2007367/selenium-is-it-possible-to-use-the-regexp-in-selenium-locators/2007531#2007531

Maybe that get function could be repurposed to search all a.href properties for your regexp and then return the XPath of each of them to then be fed to click()

Ryley
+1  A: 

sel.click can take an XPath as an argument. Using Firebug I found (what I believe is) the XPath to "linkout-icon-unknown-vir_full" link:

sel.click("//*[@id='linkout-icon-unknown-vir_full']")

Using the above command takes me to this page.


I wasn't able to get matches to work -- I'm not sure why -- but this seems to work using contains:

sel = selenium.selenium("localhost", 4444, "*firefox", "http://www.ncbi.nlm.nih.gov/")
sel.start()
sel.open('/pubmed')
sel.type("search_term", "20032207[uid]")
sel.click("search")
sel.wait_for_page_to_load(30000)
sel.click("//*[contains(@id,'linkout')]")
unutbu
Right idea but I need a regexp since I'm going to be taking these from lists of searches. For different searches I'll need to match different links.
JudoWill
perfect, that works sooo much better than my solution!
JudoWill
A: 

After doing some hacking around I've come up with probably the most asinine way to do it but it works until someone can provide me with a better answer:

import re
val = re.findall('linkout-icon-unknown[\w-]*', sel.get_html_source())[0]
sel.click(val)

It requires me to search the entire html and will likely come up with issues if the design changes.

I'd love to see a more robust method.

JudoWill