page.getByXPath("//*[@href='http://www.example.com/index.do/abc/1_*']");

Do I need to escape any characters?

I am trying to get all links on the page whose href starts with:

http://www.example.com/index.do/abc/1_

So these should all be retrieved:

http://www.example.com/index.do/abc/1_asdf-asdfasdf
http://www.example.com/index.do/abc/1_223
http://www.example.com/index.do/abc/1_as.php
http://www.example.com/index.do/abc/1_2222233
+2  A: 

XPath has no wildcards inside string comparisons. You want something like this instead:

page.getByXPath("//*[contains(@href,'http://www.example.com/index.do/abc/1_')]");

This relies on the `contains` function, which matches the string anywhere in the attribute value. To match only at the beginning, you can also use the `starts-with` function:

//*[starts-with(@href,'http://www.example.com/index.do/abc/1_')]
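To see how the two functions differ, here is a minimal sketch using the JDK's built-in `javax.xml.xpath` (XPath 1.0) rather than the `page.getByXPath` call from the question. The class name `HrefPrefixDemo` and the sample markup are made up for illustration; the last link deliberately carries the prefix in the middle of its URL.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class HrefPrefixDemo {
    static final String PREFIX = "http://www.example.com/index.do/abc/1_";

    // Hypothetical sample page (well-formed XML so the JDK parser accepts it).
    static final String SAMPLE =
        "<html><body>"
        + "<a href='http://www.example.com/index.do/abc/1_223'>one</a>"
        + "<a href='http://www.example.com/index.do/abc/1_as.php'>two</a>"
        + "<a href='http://www.example.com/index.do/abc/2_no'>three</a>"
        + "<a href='http://other.example/r?to=http://www.example.com/index.do/abc/1_x'>four</a>"
        + "</body></html>";

    // Evaluates an XPath 1.0 expression against SAMPLE and returns the
    // href attribute of every element it selects.
    static List<String> hrefs(String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(
            expression,
            new InputSource(new StringReader(SAMPLE)),
            XPathConstants.NODESET);
        List<String> result = new ArrayList<String>();
        for (int i = 0; i < nodes.getLength(); i++) {
            result.add(((Element) nodes.item(i)).getAttribute("href"));
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // starts-with: only hrefs that begin with the prefix.
        System.out.println(hrefs("//*[starts-with(@href,'" + PREFIX + "')]"));
        // contains: also picks up the link with the prefix mid-URL.
        System.out.println(hrefs("//*[contains(@href,'" + PREFIX + "')]"));
    }
}
```

With this sample, `starts-with` selects the first two links, while `contains` additionally selects the fourth, so `starts-with` is probably the closer match for a "begins with this URL" requirement.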
Welbog
Yeah, that's why it's in my answer alongside `contains`.
Welbog
A: 

If you are using XPath 1.0, you cannot do wildcard (or regular expression) matches in that way. (XPath 2.0 adds a `matches` function for regular expressions, so upgrading may allow that.)

For this case, I'd suggest doing a `contains` test for the prefix:

//a[contains(@href, 'http://www.example.com/index.do/abc/1_')]

(Note: I limited the selection to just `a` tags.)
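If a real pattern match is needed and only XPath 1.0 is available, one possible fallback is to select the links with a broad expression such as `//a[@href]` and filter the href strings in Java with `java.util.regex`. The sketch below shows only the filtering step; the class name `HrefRegexFilter` and the extra non-matching URL are assumptions for illustration. `Pattern.quote` takes care of escaping the dots in the URL.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class HrefRegexFilter {
    // Literal prefix (quoted so '.' is not treated as a regex metacharacter)
    // followed by any suffix.
    static final Pattern LINK = Pattern.compile(
        Pattern.quote("http://www.example.com/index.do/abc/1_") + ".*");

    // Keeps only the hrefs that match LINK in full.
    static List<String> filter(List<String> hrefs) {
        List<String> kept = new ArrayList<String>();
        for (String href : hrefs) {
            if (LINK.matcher(href).matches()) {
                kept.add(href);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> hrefs = Arrays.asList(
            "http://www.example.com/index.do/abc/1_asdf-asdfasdf",
            "http://www.example.com/index.do/abc/1_223",
            "http://www.example.com/index.do/abc/2_other"); // hypothetical non-match
        System.out.println(filter(hrefs));
    }
}
```

For a plain prefix this is more machinery than `starts-with` needs, but the `.*` part can be tightened (for example to digits only) in ways XPath 1.0 string functions cannot easily express.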

Aaron
A: 

See if your XPath library supports `starts-with(string1, string2)` and use:

page.getByXPath("//*[starts-with(@href, 'http://www.example.com/index.do/abc/1_')]");

Also, can't you replace `*` with `a`?

bruno conde
I'm using Java 1.6.
mrblah