I'm building an app in Python, and I need to get the URLs of all the links on a webpage. I already have a function that uses urllib to download the HTML file from the web and turn it into a list of strings with readlines().
Currently I have this code that uses regex (I'm not very good at it) to search for links in every line:
    for line in lines:
        result = re.match('/href="(.*)"/iU', line)
        print result
This is not working: it prints "None" for every line in the file, even though I'm sure there are at least 3 links in the file I'm opening.
Can someone give me a hint on this?
Thanks in advance
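
For reference, here is a minimal sketch of how the loop above might be adapted. It rests on a few observations about Python's re module: re.match only tests the start of each string, the /.../iU delimiters are PHP-style syntax that Python treats as literal characters, and case-insensitivity is passed as a flag while non-greedy matching is written as `.*?`. The names extract_links and sample_lines are hypothetical, and the pattern assumes double-quoted href values; a real HTML parser would be more robust than a regex.

```python
import re

# Non-greedy capture of the value of a double-quoted href attribute.
# re.IGNORECASE replaces the PHP-style "i" modifier; ".*?" replaces "U".
HREF_PATTERN = re.compile(r'href="(.*?)"', re.IGNORECASE)


def extract_links(lines):
    """Return every double-quoted href value found in the given lines."""
    links = []
    for line in lines:
        # findall scans the whole line, unlike re.match,
        # which only tests the beginning of the string.
        links.extend(HREF_PATTERN.findall(line))
    return links


# Hypothetical sample input standing in for the downloaded page.
sample_lines = [
    '<p><a href="http://example.com/a">A</a> and <a HREF="/b">B</a></p>\n',
    '<p>No links here.</p>\n',
]

print(extract_links(sample_lines))
```

With the sample lines above, this should print the two href values in order; lines without a link simply contribute nothing instead of printing None.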