views:

777

answers:

5

Hello. Is it possible to fetch pages with urllib2 through a SOCKS proxy on a one-SOCKS-server-per-opener basis? I've seen the solution using the setdefaultproxy method, but I need different SOCKS servers in different openers.

There is the SocksiPy library, which works great, but it has to be used this way:

import socks
import socket

# Replace the socket class globally, before urllib2 is imported
socket.socket = socks.socksocket
import urllib2

socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, "x.x.x.x", y)

That is, it sets the same proxy for ALL urllib2 requests. How can I have different proxies for different openers?

A: 

== EDIT == (an old HTTP-proxy example was here)

My fault: urllib2 has no built-in support for SOCKS proxying.

There are some 'hacks' that add SOCKS support to urllib2 (or to the socket object in general) here,
but I doubt they will work with multiple proxies the way you require.

As long as you don't want to hook or subclass urllib2.ProxyHandler, I would suggest going with pycurl.

Shirkrin
It isn't working: urllib2.URLError: <urlopen error [Errno 10054] An existing connection was forcibly closed by the remote host>. The proxy itself works fine, so that's not the problem.
roddik
Strange, in my tests (I'm behind an HTTP proxy) it works fine. Did you try multiple simultaneous connections?
Shirkrin
No, just your snippet without authentication. Are you sure we're both talking about SOCKS proxies?
roddik
A: 

All openers share the same socket module, and SOCKS support is implemented at the socket level. So you can't.
I suggest you use the pycurl library; it's much more flexible.

Andrew
Is there an easy way to use pycurl with Python 2.6 on Windows?
roddik
Nope, it looks like the project is dead (the last update was two years ago) and it doesn't compile on Windows with a new curl.
Andrew
+1  A: 

Try with pycurl:

import pycurl

# Each Curl handle carries its own proxy settings, so two handles
# can use two different SOCKS servers side by side.
c1 = pycurl.Curl()
c1.setopt(pycurl.URL, 'http://www.google.com')
c1.setopt(pycurl.PROXY, 'localhost')
c1.setopt(pycurl.PROXYPORT, 8080)
c1.setopt(pycurl.PROXYTYPE, pycurl.PROXYTYPE_SOCKS5)

c2 = pycurl.Curl()
c2.setopt(pycurl.URL, 'http://www.yahoo.com')
c2.setopt(pycurl.PROXY, 'localhost')
c2.setopt(pycurl.PROXYPORT, 8081)
c2.setopt(pycurl.PROXYTYPE, pycurl.PROXYTYPE_SOCKS5)

c1.perform()
c2.perform()
systempuntoout
A: 

You might be able to use a threading lock if there aren't too many connections being made at once and you need access from multiple threads:

import socks
import socket
import thread  # Python 2's low-level threading module

socket.socket = socks.socksocket
import urllib2

lock = thread.allocate_lock()

def GetConn(proxy_host, proxy_port, url):
    lock.acquire()
    try:
        # The default proxy is module-global state, so hold the lock
        # from setting it until the connection has been opened.
        socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, proxy_host, proxy_port)
        conn = urllib2.urlopen(url)
    finally:
        lock.release()
    return conn
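The same serialize-and-restore idea can be sketched with the higher-level threading module; the network call is replaced by simply recording which default was in effect, since the point is only that each caller sees its own setting while holding the lock:

```python
import threading

lock = threading.Lock()
default_proxy = None   # stands in for socks.setdefaultproxy's module-global state
results = []

def get_conn(proxy):
    """Set the shared default and 'fetch' while holding the lock."""
    global default_proxy
    with lock:  # serialize all access to the global default
        default_proxy = proxy
        # a real version would open the connection here
        results.append((proxy, default_proxy))

threads = [threading.Thread(target=get_conn, args=("socks%d" % i,))
           for i in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Every caller saw exactly the default it set itself.
print(all(wanted == seen for wanted, seen in results))  # True
```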

You might also be able to use something like this every time you need to get a connection:

import imp
# execfile() returns None, so load a private copy of the module instead:
urllib2 = imp.load_module('urllib2_private', *imp.find_module('urllib2'))
urllib2.socket = dummy_class()  # dummy_class needs the socket module's methods
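A private copy of a module can also be produced with the import machinery itself; on newer Pythons this can be sketched with importlib (json is used here purely as a stand-in module):

```python
import importlib.util
import json

def load_private_copy(name):
    """Load a second, independent instance of an importable module."""
    spec = importlib.util.find_spec(name)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module

private_json = load_private_copy("json")

# Patching the private copy does not touch the shared module.
private_json.dumps = lambda obj: "patched"
print(private_json.dumps({}))  # patched
print(json.dumps({}))          # {}
```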

These are obviously not fantastic solutions, but I've put in my 2¢ anyway :-)

David Morrissey
+1  A: 

You could do it by setting the environment variable HTTP_PROXY in the following format:

user:pass@proxy:port

or, if you use a bat/cmd file, add this before calling the script:

set HTTP_PROXY=user:pass@proxy:port

I use such a cmd file to make easy_install work behind a proxy.
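As a quick sanity check that the variable is picked up, urllib's proxy detection reads the environment at call time (Python 3's urllib.request shown; the address is a hypothetical placeholder, not a real proxy):

```python
import os
import urllib.request  # urllib2's successor on Python 3

# Hypothetical placeholder address in the user:pass@proxy:port format.
os.environ["http_proxy"] = "http://user:pass@proxy:8080"

proxies = urllib.request.getproxies_environment()
print(proxies["http"])  # http://user:pass@proxy:8080
```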

Dmitry Kochkin