I'm building a test scraping site with Django. For some reason the following code only returns a single image, whereas I'd like it to output every image, every link, and every price. Any help? (Also, if you know how to store this data in a database model so I don't have to scrape the site every time, I'm all ears, but that may be a separate question.) Cheers!
Here is the template file:
{% extends "base.html" %}
{% block title %}Boats{% endblock %}
{% block content %}
<img src="{{ fetch_boats }}"/>
{% endblock %}
Here is the views.py file:
#views.py
from django.shortcuts import render_to_response
from django.template.loader import get_template
from django.template import Context
from django.http import Http404, HttpResponse
from fetch_images import fetch_imagery
def fetch_it(request):
    fi = fetch_imagery()
    return render_to_response('fetch_image.html', {'fetch_boats' : fi})
Here is the fetch_images module:
#fetch_images.py
from BeautifulSoup import BeautifulSoup
import re
import urllib2
def fetch_imagery():
    response = urllib2.urlopen("http://www.boattrader.com/search-results/Type")
    html = response.read()
    # create a BeautifulSoup object
    soup = BeautifulSoup(html)
    # all boat images have attribute height=165
    images = soup.findAll("img", height="165")
    for image in images:
        return image['src']  # return the URL of the image only
    # all links to detailed boat information have class lfloat
    links = soup.findAll("a", {"class" : "lfloat"})
    for link in links:
        return link['href']
        #print link.string
    # all prices are spans and have the class rfloat
    prices = soup.findAll("span", { "class" : "rfloat" })
    for price in prices:
        return price
        #print price.string
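For what it's worth, a `return` inside a `for` loop exits the function on the first iteration, which would explain why only one image comes back and the link and price loops never run. Below is a minimal sketch of collecting everything into lists instead. The helper name `collect_boat_data` is hypothetical, and plain dicts and strings stand in for the BeautifulSoup tags so the snippet runs without the network:

```python
def collect_boat_data(images, links, prices):
    """Gather every value into lists instead of returning inside the loops.

    `images` and `links` are assumed to support item access like
    BeautifulSoup tags do (tag['src'], tag['href']).
    """
    return {
        "images": [image["src"] for image in images],
        "links": [link["href"] for link in links],
        "prices": [price for price in prices],
    }


# Stand-in data (hypothetical values, not scraped from the real site)
sample = collect_boat_data(
    images=[{"src": "a.jpg"}, {"src": "b.jpg"}],
    links=[{"href": "/boat/1"}, {"href": "/boat/2"}],
    prices=["$1,000", "$2,500"],
)
```

With a dict of lists like this passed to the template, the template would presumably need to loop over each list rather than render a single `{{ fetch_boats }}` value.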
Lastly, in case it's needed, the URL mapping in the URLconf is below:
from django.conf.urls.defaults import *
from mysite.views import fetch_it
urlpatterns = patterns('', ('^fetch_image/$', fetch_it))