views:

313

answers:

1

Jon Skeet has the following reputation tracker which is built by C#.

I am interested in building a similar app by Python such that at least the following modules are used

  • beautiful soup
  • defaultdict

We apparently need

How can you build a similar reputation system as Jon's one by Python?

+3  A: 

The screenscraping is easy, if I understand the SO HTML format correctly, e.g., to get my rep (as I'm user 95810):

import urllib
import BeautifulSoup
page = urllib.urlopen('http://stackoverflow.com/users/95810')
soup = BeautifulSoup.BeautifulSoup(page)
therep = str(soup.find(text='Reputation').parent.previous.previous).strip()
print int(therep.replace(',',''))

I'm not sure what you want to do with defaultdict here, though -- what further processing do you desire to perform on this int, that would somehow require storing it in a defaultdict?

Alex Martelli
Thank you Alex! --- You are right that we do not need `defaultdict` here.
Masi
@ Why do you use two times `previous` in `--.parent.previous.previous--`? In other words, how do you know that you need two `previous`?
Masi
@Masi, because I player around with the DOM for that page and experimentally observed that the number is "one up and two previous" from the string 'Reputation' -- how else do you go about html scraping except empirically in such ways?
Alex Martelli
And obviously, such HTML scraping is fragile and prone to breaking anytime in the future. This is why using a solution such as a JSON API will be much more robust. If you insist on using BeautifulSoup, you will end up with this kind of solution. If you are open to other methods such as JSON, you can build a much better solution.
Greg Hewgill