views:

78

answers:

4

I was asked to build a control system for huuto.net, an eBay-like Finnish auction site. The system would reopen closed auctions according to specific rules. It would be completely separate from the main site and run on an external website.

However, the site is unwilling to release its API and schema. I know of no way to build such a system without knowing its API.

How do you build a site that interacts with another site without access to its API and schema?

+1  A: 

It might be possible to get the data you need by screen scraping the site. You could perform the operations you want by POSTing data to their forms, or by using a WebClient-type API to make your program act like a web browser, but that's likely to be an extremely brittle solution.
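As a rough illustration of the form-POSTing idea, here is a minimal sketch using only the Python standard library. It reads the first form on a page, keeps its hidden fields, and builds (but does not send) a POST request. The URL, the `/reopen` action and the `token` and `auction_id` field names are made up for the example; the real huuto.net forms are unknown and will certainly differ.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin
from urllib.request import Request


class FormParser(HTMLParser):
    """Collect the action URL and named <input> fields of the first <form>."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.fields = {}
        self._in_form = False
        self._done = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and not self._done:
            self._in_form = True
            self.action = attrs.get("action", "")
        elif tag == "input" and self._in_form and attrs.get("name"):
            # Keep default values so hidden fields (e.g. tokens) are sent back.
            self.fields[attrs["name"]] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form" and self._in_form:
            self._in_form = False
            self._done = True


def build_form_post(page_url, html, overrides):
    """Build a POST request for the page's first form without sending it."""
    parser = FormParser()
    parser.feed(html)
    data = {**parser.fields, **overrides}
    return Request(
        urljoin(page_url, parser.action),
        data=urlencode(data).encode("utf-8"),
        method="POST",
    )


# Hypothetical markup standing in for a page fetched from the target site.
SAMPLE_HTML = """
<form action="/reopen" method="post">
  <input type="hidden" name="token" value="abc123">
  <input type="text" name="auction_id" value="">
</form>
"""

request = build_form_post(
    "https://example.invalid/item/42", SAMPLE_HTML, {"auction_id": "42"}
)
print(request.get_method(), request.full_url, request.data)
```

Passing `request` to `urllib.request.urlopen` would actually submit it. A real script would also need to log in and carry cookies between requests, which is the part a browser-like library handles for you.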

Honestly though, without an API, there really is no good solution.

Eric Petroelje
+2  A: 

You could try some form of automatic browsing: mechanize

Edit: Examples here.

wuub
Is Mechanize a browser that loads web pages into a temporary location, as Firefox does, and then applies the wanted changes to the page?
Masi
@Masi: Mechanize allows your program to behave exactly like a normal browser: fill out forms, click links, and store cookies. To the website it looks just like another user clicking, when in fact it's your script performing the tedious task of reopening auctions :)
wuub
A: 

You need either access to the database or an API; otherwise there is no point in even trying.

Virat Kadaru
+1  A: 

I think you're asking about building a site that interacts with another site without using a well-defined API. Is that right?

You can interact with an external site without using an official API. To do so, you need to imitate a normal site visitor and send your requests to the site's frontend (in much the same way as a web crawler does). Tools like hpricot, mechanize and curl can help you parse the content of pages and send requests, but a system built this way may be quite brittle: any change to the target site might mean you have to rewrite portions of your system.
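One way to soften the brittleness is to keep everything page-specific in a single place, so that a markup change means editing a small table rather than the parsing logic. The sketch below uses only the Python standard library. The tag, class and `data-*` attribute names are hypothetical and would have to be replaced with whatever the real pages use.

```python
from collections import defaultdict
from html.parser import HTMLParser

# Hypothetical page details, kept together so that a change on the target
# site only requires editing this dict.
SELECTORS = {
    "row_tag": "li",
    "row_class": "auction",
    "id_attr": "data-id",
    "status_attr": "data-status",
}


class AuctionListParser(HTMLParser):
    """Group auction ids by status, driven entirely by the selector table."""

    def __init__(self, selectors):
        super().__init__()
        self.sel = selectors
        self.by_status = defaultdict(list)

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == self.sel["row_tag"] and self.sel["row_class"] in classes:
            status = attrs.get(self.sel["status_attr"]) or "unknown"
            self.by_status[status].append(attrs.get(self.sel["id_attr"]))


def group_auctions(html, selectors=SELECTORS):
    """Return a plain dict mapping status -> list of auction ids."""
    parser = AuctionListParser(selectors)
    parser.feed(html)
    return dict(parser.by_status)


# Made-up listing markup standing in for a scraped page.
SAMPLE = """
<ul>
  <li class="auction" data-id="101" data-status="closed">Chair</li>
  <li class="auction featured" data-id="102" data-status="open">Lamp</li>
  <li class="auction" data-id="103" data-status="closed">Table</li>
</ul>
"""

groups = group_auctions(SAMPLE)
print(groups.get("closed", []))
```

This does not make scraping robust, but it narrows what has to be touched when the target site changes its layout.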

Mr. Matt
@Matt: Thank you for pointing out the programs! I like Python, so I may be able to use Beautiful Soup and defaultdict for this problem too. Your last point is good: I do not want to build a solution that keeps breaking. Therefore, I am trying to find tools that allow me to make rapid updates.
Masi