How can I do this:

Open this website (http://www.samair.ru/proxy/time-01.htm), get the list of IP addresses, and turn it into a dictionary variable?

With this code in particular, I can only get the first IP on the page:

import re
import urllib

ip = urllib.urlopen('http://www.samair.ru/proxy/time-01.htm').read()
clientIp = re.search(r"(\d+\.\d+\.\d+\.\d+)", ip).group()  # search() stops at the first match

print clientIp
+1  A: 

Use findall instead of search:

ip = urllib.urlopen('http://www.samair.ru/proxy/time-01.htm').read()
clientIp = re.findall(r"\d+\.\d+\.\d+\.\d+", ip)

Note the “raw” string r"…", which prevents the backslashes from being interpreted as escape characters.

This gives you a list of strings containing the IP addresses. To turn it into a dictionary you need key–value pairs.
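As a minimal sketch of that last step: the question does not say what the keys should be, so the example below assumes the position of each address in the list is an acceptable key. The helper name `ips_to_dict` and the sample HTML (including its second address) are made up for illustration, and the code is written for Python 3.

```python
import re

# Same pattern as above: four dot-separated groups of digits.
IP_RE = re.compile(r"\d+\.\d+\.\d+\.\d+")

def ips_to_dict(html):
    """Return {position: ip} for every IP-like string found in html.

    Using the list position as the key is an assumption; any other
    key (for example the IP itself) works the same way.
    """
    ips = IP_RE.findall(html)
    return dict(enumerate(ips))

# Hypothetical stand-in for the downloaded page.
sample_html = "<td>75.21.47.2</td><td>10.0.0.1</td>"
print(ips_to_dict(sample_html))
```

If the IP itself should be the key, `dict.fromkeys(ips)` builds a dictionary with every value set to `None`, ready to be filled in later.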

Philipp
How can I do it with the ports too? I'm trying to add ':\d+\' for the port, but it's turning the 'ip' variable into a string.
Shady
The page uses JavaScript to hide the port number, so you have to “decode” it first. For example, one of the lines reads `<td>75.21.47.2<script type="text/javascript">document.write(":"+t+z+t+z)</script></td>`, with constants `t` and `z` defined somewhere. This is much harder and cannot be done with a simple regular expression search; please post it as a new question.
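To give a rough idea of what the decoding involves, here is a sketch that only handles cells shaped exactly like the line quoted above. How the page defines `t` and `z` is not shown here, so the constants are passed in by the caller; the values `8` and `0` are invented for the example, and `decode_proxies` is a hypothetical helper name. Python 3 is assumed.

```python
import re

# One table cell as quoted above: an IP followed by a script that writes
# ":" plus a chain of single constants joined with "+".
CELL_RE = re.compile(
    r'(\d+\.\d+\.\d+\.\d+)<script[^>]*>'
    r'document\.write\(":"((?:\+\w+)+)\)</script>'
)

def decode_proxies(html, constants):
    """Return {ip: port}, substituting each JavaScript constant by its value.

    constants maps names such as "t" and "z" to digits; finding those
    definitions in the page is left to the caller.
    """
    proxies = {}
    for ip, expr in CELL_RE.findall(html):
        names = expr.lstrip("+").split("+")  # "+t+z+t+z" -> ["t", "z", "t", "z"]
        # JavaScript concatenates here because the chain starts with a string.
        port = "".join(str(constants[name]) for name in names)
        proxies[ip] = int(port)
    return proxies

sample = ('<td>75.21.47.2<script type="text/javascript">'
          'document.write(":"+t+z+t+z)</script></td>')
print(decode_proxies(sample, {"t": 8, "z": 0}))  # assumed values for t and z
```

This breaks as soon as the page changes the shape of the script or computes the constants in a less direct way, which is why a plain regular expression search is not enough on its own.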
Philipp
+1  A: 

Use re.findall() instead of re.search()

Fabian