How can I use BeautifulSoup to find all the links in a page pointing to a specific domain?
views: 303
answers: 1
+3
A:
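Use SoupStrainer:
from BeautifulSoup import BeautifulSoup, SoupStrainer
import re
# doc is assumed to hold the page's HTML as a string
# Find all links
links = SoupStrainer('a')
all_links = [tag for tag in BeautifulSoup(doc, parseOnlyThese=links)]
# Find only links whose href contains "example.com/"
linkstodomain = SoupStrainer('a', href=re.compile(r'example\.com/'))
domain_links = [tag for tag in BeautifulSoup(doc, parseOnlyThese=linkstodomain)]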
Edit: Modified the example from the official documentation.
                  viksit
                   2010-01-28 00:23:30
                
              I would be more selective with the regex; that one could result in false positives.
                  Ignacio Vazquez-Abrams
                   2010-01-28 05:07:33
@Ignacio: right, this example has that caveat. The regex should be as specific as possible to avoid those false positives.
                  viksit
                   2010-01-28 07:57:39
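
One way to reduce the false positives discussed in the comments is to compare the parsed hostname of each href against the target domain rather than searching the raw string. The following is only a minimal sketch using the Python 3 standard library. The helper name points_to_domain and the default domain "example.com" are hypothetical and do not come from the answer above.

```python
from urllib.parse import urlparse

def points_to_domain(href, domain="example.com"):
    """Return True if href has a host equal to domain or a subdomain of it.

    Hypothetical helper for illustration; not part of BeautifulSoup.
    """
    if not href:
        # Covers None (anchor without an href) and empty strings
        return False
    host = urlparse(href).hostname  # None for relative URLs such as "/about"
    if host is None:
        return False
    host = host.lower()
    # Exact match or subdomain match; "notexample.com" fails both checks
    return host == domain or host.endswith("." + domain)
```

A substring regex would accept hrefs such as "http://notexample.com/" or "http://evil.org/?next=example.com/", while this check rejects both. Relative links are also rejected, so they would need to be resolved against the page URL first if they should count. Assuming the installed BeautifulSoup version accepts a callable as an attribute filter, the helper could be passed in place of the regex, as in SoupStrainer('a', href=points_to_domain).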