I'm trying to verify that all my page links are valid, and also, similarly, that all the pages have a specified link like Contact. I use Python unit testing and Selenium IDE to record the actions that need to be tested.
So my question is: can I verify the links in a loop, or do I need to try every link on my own?
I tried to do this with __iter__, but it didn't get me anywhere close. The reason may be that I'm poor at OOP, but I still think there must be another way of testing links than clicking them and recording them one by one.
What exactly is "testing links"?
If it means checking that they lead to non-4xx URIs, I'm afraid you must visit them.
As for the existence of given links (like "Contact"), you may look for them using XPath.
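For example, here is a minimal sketch of both checks in plain Python, assuming lxml is installed for the XPath query; the URL list and the "Contact" link text are placeholders for your own site:

    import urllib2
    import lxml.html

    # Placeholder list of pages to check; substitute your site's URLs.
    PAGES = [
        "http://www.example.com/",
        "http://www.example.com/about",
    ]

    for url in PAGES:
        try:
            html = urllib2.urlopen(url).read()
        except urllib2.HTTPError as e:
            # urllib2 raises HTTPError for 4xx/5xx responses.
            print "BROKEN: %s (%d)" % (url, e.code)
            continue
        # XPath existence check for a "Contact" link on the page.
        doc = lxml.html.fromstring(html)
        if not doc.xpath("//a[contains(text(), 'Contact')]"):
            print "NO CONTACT LINK ON %s" % url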
Though the tool is in Perl, have you checked out linklint? It's a tool that should fit your needs exactly: it parses the links in an HTML document and tells you which ones are broken.
If you're trying to automate this from a Python script, you'd need to run it as a subprocess and collect the results, but I think it would get you what you're looking for.
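If you do drive it from Python, a minimal sketch with subprocess might look like this; the hostname, output directory, and flags are assumptions, so check linklint's documentation for the options that match your setup:

    import subprocess

    # Assumed invocation: crawl the whole site over HTTP and write the
    # reports into "linklint-out". Adjust flags for your linklint version.
    status = subprocess.call(
        ["linklint", "-http", "-host", "www.example.com",
         "-doc", "linklint-out", "/@"])
    print "linklint exited with status", status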
I would just use standard shell commands for this:
- You can use wget to detect broken links.
- If you use wget to download the pages, you can then scan the resulting files with grep --files-without-match to find those that don't have a contact link (see the sketch below).
If you're on Windows, you can install Cygwin or the Win32 ports of these tools.
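If you want to drive those same commands from your Python tests, a minimal sketch with subprocess could look like this; the site URL, output directory, and "contact" pattern are assumptions:

    import subprocess

    # 1) Detect broken links: --spider checks links without saving pages;
    #    wget's log reports any broken links found during the crawl.
    subprocess.call(
        ["wget", "--spider", "-r", "-o", "spider.log",
         "http://www.example.com/"])

    # 2) Download the pages, then list the files that never mention a
    #    contact link.
    subprocess.call(["wget", "-r", "-P", "site", "http://www.example.com/"])
    subprocess.call(["grep", "-r", "--files-without-match", "contact", "site"])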
You could, as yet another alternative, use BeautifulSoup to parse the links on your page and try to retrieve them via urllib2.
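For instance, a minimal sketch assuming Python 2 (to match urllib2) with BeautifulSoup installed; the start URL is a placeholder:

    import urllib2
    import urlparse
    from BeautifulSoup import BeautifulSoup

    BASE = "http://www.example.com/"  # placeholder start page

    # Parse every anchor on the page, resolve relative hrefs, and try
    # to retrieve each one; HTTPError signals a broken (4xx/5xx) link.
    soup = BeautifulSoup(urllib2.urlopen(BASE).read())
    for anchor in soup.findAll("a", href=True):
        url = urlparse.urljoin(BASE, anchor["href"])
        try:
            urllib2.urlopen(url)
        except urllib2.HTTPError as e:
            print "BROKEN: %s (%d)" % (url, e.code)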