I am trying to download content from a provider that charges me every time I access a document. My code downloads the content correctly and saves it to a local file, but it apparently requests the file twice, so I am being charged double. I can't tell where the second request comes from. Here is my code:

    import urllib2

    # Register the credentials for the top-level URL (None matches any realm).
    password_mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
    password_mgr.add_password(None, top_level_url, username, password)
    handler = urllib2.HTTPBasicAuthHandler(password_mgr)
    # Create an "opener" (OpenerDirector instance).
    opener = urllib2.build_opener(handler)
    # Use the opener to fetch the URL.
    file_stream = opener.open(url)

    # Write the response body to a local file.
    # Binary mode ("wb") avoids corrupting non-text documents.
    local_file = open(directory + doc_name, "wb")
    local_file.write(file_stream.read())
    local_file.close()

I need to figure out how to read the content while only requesting the document once. Any help would be greatly appreciated.

+1  A: 

Could it be that the file is requested twice but only downloaded once? The first request would be a normal GET without an "Authorization" header, which the server answers with HTTP 401 (Authorization Required). urllib2 then repeats the same request, this time with the Authorization header.

If that's the case, you should talk to your content provider, since you actually accessed the document only once.
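If the provider will not change how it counts requests, one possible workaround is to send the credentials with the very first request, so the 401 round trip never happens. The sketch below assumes the server uses HTTP Basic authentication and accepts credentials it has not asked for; `build_preemptive_request` is a hypothetical helper name, not part of urllib2, and the URL and credentials are placeholders.

```python
import base64

try:
    import urllib2 as urlrequest          # Python 2, as in the question
except ImportError:
    import urllib.request as urlrequest   # Python 3 equivalent


def build_preemptive_request(url, username, password):
    """Return a Request that already carries a Basic Authorization header.

    Hypothetical helper: assumes the server uses HTTP Basic auth.
    """
    # Basic auth is base64("username:password").
    credentials = ("%s:%s" % (username, password)).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = urlrequest.Request(url)
    request.add_header("Authorization", "Basic " + token)
    return request


# Usage (placeholder URL and credentials; not executed here):
# request = build_preemptive_request("http://example.com/doc", "user", "pass")
# file_stream = urlrequest.urlopen(request)
```

Because the header is attached up front, no `HTTPBasicAuthHandler` is needed and only one GET should reach the server. It is worth checking the provider's access log or your bill after a single test download to confirm that.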

knitti