tags:

views:

57

answers:

2

Hey,

Could anyone explain to me how a price comparison website works? If it works the way I think it does (explained below), then I have a really good project to work on for myself, which should make a lot of money long term.

How I think it works is that each supplier provides an RSS feed which the comparison website connects to. Each product has its own unique code (provided by the manufacturer), and this is supplied in a standard tag in the RSS feed. The comparison website collects data about who supplies which product and compares the suppliers that offer a specific product. For example, book publishers may provide RSS feeds containing the blurb on the back of each book, the ISBN, publisher's name, authors, title (obviously), RRP, genre, fact / fiction and some other useful information, all in tags such as <ISBN>ISBN here</ISBN>. Comparison websites which sell books may filter these books by genre, author, etc. and display them how they want using an XSL stylesheet (the RSS feed being provided in XML).
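
The feed idea described above can be sketched as follows. This is only an illustration under stated assumptions: the supplier names, the sample feeds and the <ISBN> and <price> elements are all hypothetical (standard RSS items only define elements such as title, link and description, so product tags like these would have to be custom extensions agreed with each supplier).

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical feeds: supplier names, element names and values are invented.
FEEDS = {
    "Shop A": """<rss><channel>
        <item><ISBN>9780000000001</ISBN><title>Example Book</title><price>7.99</price></item>
        <item><ISBN>9780000000002</ISBN><title>Other Book</title><price>12.50</price></item>
    </channel></rss>""",
    "Shop B": """<rss><channel>
        <item><ISBN>9780000000001</ISBN><title>Example Book</title><price>6.49</price></item>
    </channel></rss>""",
}


def collect_offers(feeds):
    """Map each ISBN to a list of (supplier, price) offers, cheapest first."""
    offers = defaultdict(list)
    for supplier, xml_text in feeds.items():
        root = ET.fromstring(xml_text)
        for item in root.iter("item"):
            isbn = item.findtext("ISBN")
            price = item.findtext("price")
            if isbn is None or price is None:
                continue  # skip items that lack the fields we compare on
            offers[isbn].append((supplier, float(price)))
    for isbn in offers:
        offers[isbn].sort(key=lambda offer: offer[1])
    return dict(offers)


offers = collect_offers(FEEDS)
```

Here `offers["9780000000001"]` lists Shop B before Shop A because its price is lower. A real site would fetch each feed over HTTP rather than use inline strings.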

Is this correct, or do I have completely the wrong idea of what RSS is or how it works? What other ways are there to achieve the same thing?

Many thanks in advance.

Regards,

Richard

PS: If this is tagged wrongly and should be moved to another section, then please could someone retag it. I tried to find "price comparison websites" or something similar, but there's no such tag. I have used the most relevant tags I could think of.

+5  A: 

Your concept describes the ideal-world situation. In reality, the site uses a crawler, like any other search engine, but optimizes it to scrape prices and product information. For example, they know that the price for an item on Dell's site is in a div with id "foo" and the product info is in the 3rd p tag after the text "Buy now".
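
A minimal sketch of that kind of site-specific rule, using only Python's standard html.parser. The sample page, the class name ProductScraper and the exact rules are hypothetical; a real scraper needs one set of rules per retailer and has to cope with nested or malformed markup.

```python
from html.parser import HTMLParser

# Hypothetical product page; real markup differs for every retailer.
SAMPLE_PAGE = """
<html><body>
<div id="foo">$499.99</div>
<a href="/cart">Buy now</a>
<p>Free shipping</p>
<p>In stock</p>
<p>15-inch laptop, 8 GB RAM</p>
</body></html>
"""


class ProductScraper(HTMLParser):
    """Take the price from div#foo and the info from the 3rd <p> after 'Buy now'."""

    def __init__(self):
        super().__init__()
        self.price = None
        self.info = None
        self._in_price_div = False
        self._seen_buy_now = False
        self._p_count = 0
        self._in_info_p = False

    def handle_starttag(self, tag, attrs):
        if tag == "div" and dict(attrs).get("id") == "foo":
            self._in_price_div = True
        elif tag == "p" and self._seen_buy_now:
            # Count only paragraphs that open after the marker text.
            self._p_count += 1
            self._in_info_p = self._p_count == 3

    def handle_endtag(self, tag):
        if tag == "div":
            # Simplification: assumes the price div has no nested divs.
            self._in_price_div = False
        elif tag == "p":
            self._in_info_p = False

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._in_price_div and self.price is None:
            self.price = text
        if self._in_info_p and self.info is None:
            self.info = text
        if "Buy now" in text:
            self._seen_buy_now = True


scraper = ProductScraper()
scraper.feed(SAMPLE_PAGE)
```

After feeding the page, `scraper.price` and `scraper.info` hold the extracted strings. If the retailer renames "foo" or moves the paragraphs, both come back as None, which is one way to detect that a rule needs updating.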

They most likely also limit the sites they crawl to known retailers.

speshak
Surely it's a lot of work, done the way you described, to keep an eye on websites for layout changes (for example if "foo" was changed to "bar"). I am thinking of starting up my own website which utilises feeds from both the manufacturer and supplier (I won't say any more than that, for confidentiality purposes, but it would basically allow viewers to choose products based on data from suppliers and then give them a choice of retailers to buy from). This won't be for another year or so though. Richard
ClarkeyBoy
Yes, it is a lot of work, but it doesn't require the cooperation of the sites being crawled.
David Dorward
It is a lot of work, and the quality of the price comparison site depends a lot on how good their scraping engine is.
speshak
Is it not possible to simply load the page into a string, then use ajax to find tags through scripts such as $("div.foo")? That, I guess, would be a fairly easy way to do this compared to using spiders (which I know nothing about), wouldn't it?
ClarkeyBoy
That's basically what a spider does: it downloads the HTML and then locates the HTML elements using whatever construct the language has available. What you write it in isn't important, though I'd lean towards a language with good string manipulation libraries, like Perl or Python.
speshak
I just came up with another idea: it wouldn't be a selling site, so price comparison is really not applicable here. Therefore it would be in the retailers' interests to tell the world that they supply whatever products they happen to supply; they have nothing to lose by it. All they would have to do is keep an up-to-date feed of the products they sell (including their part numbers or whatever uniquely identifies them). This way I would be retrieving feeds from both the manufacturers and retailers. No spiders or anything involved.
ClarkeyBoy
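
The two-feed idea in the comment above amounts to joining manufacturer data with retailer stock lists on a shared part number. A rough sketch, assuming the feeds have already been parsed into plain Python structures; every name, part number and field here is hypothetical.

```python
# Hypothetical data, as if already parsed from manufacturer and retailer feeds.
manufacturer_feed = [
    {"part_number": "ABC-100", "name": "Widget", "description": "A small widget"},
    {"part_number": "ABC-200", "name": "Gadget", "description": "A large gadget"},
]
retailer_feeds = {
    "Retailer One": ["ABC-100", "ABC-200"],
    "Retailer Two": ["ABC-200", "XYZ-999"],  # XYZ-999 has no manufacturer entry
}


def build_catalogue(products, stockists):
    """Attach to each manufacturer product the retailers that list its part number."""
    catalogue = {p["part_number"]: {**p, "retailers": []} for p in products}
    for retailer, part_numbers in stockists.items():
        for part_number in part_numbers:
            entry = catalogue.get(part_number)
            if entry is not None:  # ignore parts the manufacturer feed lacks
                entry["retailers"].append(retailer)
    return catalogue


catalogue = build_catalogue(manufacturer_feed, retailer_feeds)
```

The join only works if retailers and manufacturers use exactly the same identifier for a product, so in practice some normalisation of part numbers would probably be needed first.
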
+1  A: 

As speshak said, they generally use a crawler and parse the HTML.

The reason store sites don't provide RSS feeds is that they hate comparison sites, which focus the competition strictly on price. If 20 stores are monitored, then, all else being equal, each store has a 19 in 20 chance of not being the lowest. Stores would rather you visit their sites directly, where they have a chance of convincing you to pay their prices based on service.

James Curran
Hmm, it may be best to stick to feeds for manufacturers (who wouldn't sell retail and would have an interest in getting their products out there), then scour retailers' sites for prices. Thanks for the input.
ClarkeyBoy