I'm trying to use the MD5 algorithm on web pages to avoid seeing duplicates. Is there an easy way to convert the result from BeautifulSoup into a string that MD5 can digest?

Many thanks

+4  A: 

Just turn it into a string with str:

from BeautifulSoup import BeautifulSoup
doc = "<html><h1>Heading</h1><p>Text"
soup = BeautifulSoup(doc)

str(soup)

(from the docs)

Ned Batchelder
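
To get from that string to an MD5 digest, something like the following might work. It is only a sketch, assuming Python 3, where hashlib.md5 needs bytes rather than str, so the text is encoded as UTF-8 first. The names md5_of_page, is_duplicate and seen are made up for illustration and are not part of BeautifulSoup or hashlib.

```python
import hashlib


def md5_of_page(page):
    """Return the hex MD5 digest of str(page), encoded as UTF-8.

    `page` can be a soup object or a plain string; anything str() accepts.
    """
    text = str(page)
    return hashlib.md5(text.encode("utf-8")).hexdigest()


# Digests of pages already seen (hypothetical in-memory store).
seen = set()


def is_duplicate(page):
    """Return True if a page with the same digest was seen before."""
    digest = md5_of_page(page)
    if digest in seen:
        return True
    seen.add(digest)
    return False
```

With the soup from the answer above you would call is_duplicate(soup). Note that this only catches pages whose markup is byte-for-byte identical after parsing; any difference, such as a timestamp in the page, gives a different digest.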