Multithreaded Downloading Through Proxies In Python | ansaurus

Q:

Multithreaded Downloading Through Proxies In Python

What would be the best library for multithreaded harvesting/downloading with multiple proxy support? I've looked at Tkinter and it looks good, but there are so many options. Does anyone have a specific recommendation? Many thanks!

+1 A:

anthony 2009-10-20 20:28:57

Thanks, I'm taking a look now

Cookies 2009-10-20 20:39:28

A:

Is this something you can't just do by passing a URL to newly spawned threads and calling urllib2.urlopen in each one, or is there a more specific requirement?
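For illustration, a minimal sketch of the thread-per-URL approach described here. It assumes Python 3, where urllib2.urlopen lives on as urllib.request.urlopen. The names fetch and download_all and the opener parameter are hypothetical and are not from the original answer.

```python
import threading
import urllib.request  # Python 3 home of what used to be urllib2


def fetch(url, results, opener=urllib.request.urlopen):
    """Download one URL and store its body (or the exception) under its key."""
    try:
        with opener(url, timeout=30) as response:
            results[url] = response.read()
    except Exception as exc:  # one failed download should not stop the others
        results[url] = exc


def download_all(urls, opener=urllib.request.urlopen):
    """Spawn one thread per URL, wait for all of them, return {url: body}."""
    results = {}  # each thread writes only to its own key
    threads = [
        threading.Thread(target=fetch, args=(url, results, opener))
        for url in urls
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

One thread per URL is fine for a handful of downloads. For a long list of files a fixed number of worker threads is usually a better fit, since nothing here limits how many threads get started at once.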

Kylotan 2009-10-20 20:36:02

urllib2 isn't thread-safe from what I've seen, but I could have just been doing it wrong because I'm a noob to threading. I am downloading a lot of files, so I'd rather use something a bit more powerful than just urllib anyway.

Cookies 2009-10-20 20:40:55

It's almost certain to be thread-safe unless you do something inherently dangerous like trying to access the same object from multiple threads.
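One way to avoid touching the same object from several threads, sketched under assumptions: each worker thread builds its own opener bound to a single proxy, and the threads only share queue.Queue objects, which are designed for that. This is written for Python 3 (urllib.request instead of urllib2). The names make_opener, worker, download_via_proxies and opener_factory are hypothetical, as is the one-thread-per-proxy layout.

```python
import queue
import threading
import urllib.request


def make_opener(proxy_url):
    """Build a private opener that routes HTTP(S) traffic through one proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


def worker(proxy_url, jobs, results, opener_factory=make_opener):
    """Pull URLs off the shared queue until a None sentinel arrives."""
    opener = opener_factory(proxy_url)  # owned by this thread only
    while True:
        url = jobs.get()
        if url is None:
            break
        try:
            with opener.open(url, timeout=30) as response:
                results.put((url, proxy_url, response.read()))
        except Exception as exc:  # report the failure instead of dying
            results.put((url, proxy_url, exc))


def download_via_proxies(urls, proxies, opener_factory=make_opener):
    """Run one worker per proxy; return a list of (url, proxy, body_or_error)."""
    jobs, results = queue.Queue(), queue.Queue()
    for url in urls:
        jobs.put(url)
    threads = []
    for proxy_url in proxies:
        jobs.put(None)  # one stop sentinel per worker, queued after all URLs
        threads.append(threading.Thread(
            target=worker, args=(proxy_url, jobs, results, opener_factory)))
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    # Every URL produced exactly one result, so this never blocks.
    return [results.get() for _ in urls]
```

Calling download_via_proxies(urls, ["http://127.0.0.1:8080", "http://127.0.0.1:8081"]) would spread the URLs over two workers. The proxy addresses are placeholders. Results come back in completion order, not in the order of the input list.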

Kylotan 2009-10-20 22:10:59

A:

Also take a look at http://scrapy.org/, which is a scraping framework built on top of Twisted.

twneale 2009-10-20 21:24:04

Excellent. I don't see anything about proxy support, but I think I could do that myself.

Cookies 2009-10-20 21:36:35

No. Support for HTTP proxies is not currently implemented in Scrapy, but it will be in the future. For more information about this, follow this ticket. Setting the http_proxy environment variable won’t work because Twisted (the library used by Scrapy to download pages) doesn’t support it. See this Twisted ticket for more info.

Cookies 2009-10-20 21:39:02
