I have a program that fetches other pages with Twisted's getPage and parses them using BeautifulSoup. Later in the program I print the information that the deferred processing produces. Currently my program tries to print it before the deferred has returned the data. How can I make it wait?

def twisAmaz(contents): #This parses the page (amazon api xml file)
 stonesoup = BeautifulStoneSoup(contents)
 if stonesoup.find("mediumimage") == None:
  imageurl.append("/images/notfound.png")
 else:
  imageurl.append(stonesoup.find("mediumimage").url.contents[0])
 usedPdata = stonesoup.find("lowestusedprice")
 newPdata = stonesoup.find("lowestnewprice")
 titledata = stonesoup.find("title")
 reviewdata = stonesoup.find("editorialreview")
 if stonesoup.find("asin") != None:
  asin.append(stonesoup.find("asin").contents[0])
 else:
  asin.append("None")
 reactor.stop()


deferred = dict()
for tmpISBN in isbn:  #Go through ISBN numbers and get Amazon API information for each
 deferred[(tmpISBN)] = getPage(fetchInfo(tmpISBN))
 deferred[(tmpISBN)].addCallback(twisAmaz)
 reactor.run()

.....print info on each ISBN
+2  A: 

First, you shouldn't put a reactor.stop() in your deferred method, as it kills everything.

Now, in Twisted, "waiting" is not allowed. To print the results of your callback, just add another callback after the first one.

Luc Stepniewski
Thanks, Luc! May I ask where the reactor.stop() should go?
Jody S
When I said not to put in a reactor.stop(), I meant not to put it in that first deferred callback, as it would stop everything. So you should put it in the last callback (the one that prints the results), where you're sure you want to stop your program. Just a note: you should use addCallbacks(method1, error_method) to catch potential errors.
Luc Stepniewski
Look at the tutorial about deferred on http://twistedmatrix.com/documents/current/core/howto/deferredindepth.html, especially the section named 'Callbacks can return deferreds'.
Luc Stepniewski
Okay, thanks! And the only issue I have now is that I'm accessing several sites and trying to print the data in a specific order, so if I have a function for each site then it might print them out of order...
Jody S
+2  A: 

It looks like you're trying to create and run multiple reactors. There is only one reactor, and everything gets attached to it. Here's how to use a DeferredList to wait for all of your callbacks to finish.

Also note that twisAmaz now returns a value. That value is passed through the DeferredList (named callbacks below) and comes out as value in printResult. Since a DeferredList keeps the order of the things that are put into it, you can cross-reference the index of each result with the index of your ISBNs.

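from twisted.internet import defer, reactor
from twisted.web.client import getPage
from BeautifulSoup import BeautifulStoneSoup  # BeautifulSoup 3

# isbn and fetchInfo are defined elsewhere in the asker's program.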

def twisAmaz(contents):
    stonesoup = BeautifulStoneSoup(contents)
    ret = {}
    if stonesoup.find("mediumimage") is None:
        ret['imageurl'] = "/images/notfound.png"
    else:
        ret['imageurl'] = stonesoup.find("mediumimage").url.contents[0]
    ret['usedPdata'] = stonesoup.find("lowestusedprice")
    ret['newPdata'] = stonesoup.find("lowestnewprice")
    ret['titledata'] = stonesoup.find("title")
    ret['reviewdata'] = stonesoup.find("editorialreview")
    if stonesoup.find("asin") is not None:
        ret['asin'] = stonesoup.find("asin").contents[0]
    else:
        ret['asin'] = 'None'
    return ret

callbacks = []
for tmpISBN in isbn:  #Go through ISBN numbers and get Amazon API information for each
    callbacks.append(getPage(fetchInfo(tmpISBN)).addCallback(twisAmaz))

def printResult(result):
    for e, (success, value) in enumerate(result):
        print ('[%r]:' % isbn[e]),
        if success:
            print 'Success:', value
        else:
            print 'Failure:', value.getErrorMessage()

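callbacks = defer.DeferredList(callbacks)
callbacks.addCallback(printResult)
# Stop the reactor once everything has been printed (or printing failed).
callbacks.addBoth(lambda _: reactor.stop())

reactor.run()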
Aaron Gallagher
Looks good, thanks Aaron!
Jody S