I'm trying to use the MD5 algorithm on web pages to avoid seeing duplicates. Is there an easy way to convert the result from BeautifulSoup into a string that MD5 can digest?
Many thanks
Just turn it into a string with str:
from BeautifulSoup import BeautifulSoup
doc = "<html><h1>Heading</h1><p>Text"
soup = BeautifulSoup(doc)
str(soup)
(from the docs)
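To get from that string to an MD5 digest, something along these lines should work. This is a minimal sketch assuming Python 3, where hashlib wants bytes, so the text is encoded first (UTF-8 is my assumption). The names page_digest, is_duplicate and seen are made up for illustration, and the page variable is just a stand-in for str(soup).

```python
import hashlib


def page_digest(markup: str) -> str:
    """Return the MD5 hex digest of a page's markup."""
    # hashlib hashes bytes, not text, so encode first (UTF-8 assumed).
    return hashlib.md5(markup.encode("utf-8")).hexdigest()


seen = set()  # digests of pages already visited


def is_duplicate(markup: str) -> bool:
    """Record the page's digest and report whether it was seen before."""
    digest = page_digest(markup)
    if digest in seen:
        return True
    seen.add(digest)
    return False


# Stand-in for str(soup); in real use, pass the string built from the soup.
page = "<html><h1>Heading</h1><p>Text</p></html>"
print(is_duplicate(page))  # first visit: False
print(is_duplicate(page))  # same markup again: True
```

Note that this only catches pages whose markup is byte-for-byte identical; any difference in whitespace or attributes gives a different digest.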