views: 59 · answers: 4

I'm trying to verify that all my page links are valid, and also something similar: that all the pages have a specific link, like Contact. I use Python unit testing and Selenium IDE to record the actions that need to be tested. So my question is: can I verify the links in a loop, or do I need to try every link on my own? I tried to do this with __iter__ but didn't get anywhere close; the reason may be that I'm poor at OOP, but I still think there must be another way of testing links than clicking them and recording them one by one.

A: 

What exactly is "testing links"?

If it means checking that they lead to non-4xx URIs, I'm afraid you must visit them.

As for the existence of given links (like "Contact"), you can look for them using XPath.
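
For example, a minimal sketch with Selenium WebDriver in Python (the URL and the exact link text are placeholders, not something from your site):

    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Firefox()
    driver.get("http://example.com/")  # placeholder URL

    # Find every anchor whose visible text is exactly "Contact".
    contact_links = driver.find_elements(By.XPATH, '//a[text()="Contact"]')
    assert contact_links, "no Contact link on this page"

    driver.quit()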

Almad
+1  A: 

Though the tool is in Perl, have you checked out linklint? It should fit your needs exactly: it parses the links in an HTML document and tells you which ones are broken.

If you're trying to automate this from a Python script, you'd need to run it as a subprocess and collect the results, but I think it would get you what you're looking for.
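
A rough sketch of that subprocess call (the linklint arguments here are an assumption; check the linklint documentation for the options that match your site layout):

    import subprocess

    # Hypothetical invocation: check the links under a local site root.
    result = subprocess.run(
        ["linklint", "-root", "/var/www/mysite", "/@"],
        capture_output=True,
        text=True,
    )
    print(result.stdout)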

bedwyr
I need to do more than just verify the links; rather, I thought of putting all the links on a page in a list and then using that list to verify all the elements of the page.
decebal
A: 

I would just use standard shell commands for this:

  • You can use wget (its --spider mode is handy here) to detect broken links.
  • If you use wget to download the pages, you can then scan the resulting files with grep --files-without-match to find those that don't have a contact link.

If you're on Windows, you can install Cygwin or the Win32 ports of these tools. A sketch of driving the same pipeline from Python follows below.
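
Since you're in Python anyway, here is a sketch of the same idea from a script (wget must be installed; the URL is a placeholder, and the grep step is done in Python instead):

    import pathlib
    import subprocess

    # Mirror the site; wget logs broken links as it goes.
    subprocess.run(
        ["wget", "--recursive", "--level=2", "--directory-prefix=mirror",
         "http://example.com/"],
        check=False,
    )

    # Stand-in for grep --files-without-match: report pages lacking a contact link.
    for page in pathlib.Path("mirror").rglob("*.html"):
        if "contact" not in page.read_text(errors="ignore").lower():
            print("no contact link in", page)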

Wim Coenen
good idea using this spider, thanks
decebal
A: 

You could, as yet another alternative, use BeautifulSoup to parse the links on your page and try to retrieve them via urllib2.
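
A sketch of that approach (Python 2 with BeautifulSoup 3, to match the urllib2 suggestion; the URL is a placeholder):

    import urllib2
    import urlparse
    from BeautifulSoup import BeautifulSoup

    page_url = "http://example.com/"  # placeholder
    soup = BeautifulSoup(urllib2.urlopen(page_url).read())

    for anchor in soup.findAll("a", href=True):
        # Resolve relative links against the page URL before fetching.
        link = urlparse.urljoin(page_url, anchor["href"])
        try:
            urllib2.urlopen(link)
        except (urllib2.HTTPError, urllib2.URLError), err:
            print link, "->", err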

Wayne Werner