ansaurus

Question

Answer 1

+3 A:

It's because the lyrics are loaded by JavaScript, and the 'normal' method doesn't execute JavaScript when you scrape the page, so you only get the empty shell of the document.

Seems like you're out of luck, unfortunately, unless you manage to execute the JavaScript function called in the source:

<body onload="javascript:getContent('aerosmith', 'crazy', '1281384888', '0475352e376cf1c3906afd8ec1b8ac70')">

Which I'm pretty sure you won't be able to do, since it's probably put there to prevent just that. :)
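For what it's worth, reading the arguments out of that onload attribute is the easy part. Here is a minimal sketch using only the standard library; the helper name extract_get_content_args is hypothetical, and it assumes the attribute always has the single-quoted form shown above. It only extracts the values; it does nothing to make the site accept them.

```python
import re

# Matches the getContent(...) call inside the onload attribute.
CALL_RE = re.compile(r"getContent\((.*?)\)")
# Matches each single-quoted argument inside the call.
ARG_RE = re.compile(r"'([^']*)'")


def extract_get_content_args(html):
    """Return the quoted arguments of the first getContent(...) call, or None."""
    call = CALL_RE.search(html)
    if call is None:
        return None
    return ARG_RE.findall(call.group(1))


# The body tag quoted above, used here as sample input.
sample = ("<body onload=\"javascript:getContent('aerosmith', 'crazy', "
          "'1281384888', '0475352e376cf1c3906afd8ec1b8ac70')\">")
args = extract_get_content_args(sample)
print(args)
```

The last two values look like a timestamp and a token, so they would presumably have to be re-read from a fresh copy of the page each time.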

Yngve B. Nilsen 2010-08-09 20:16:33

Seems like this site has done a good job of having terrible SEO!

strager 2010-08-09 20:19:18

Hehe, you can say that :)

Yngve B. Nilsen 2010-08-09 20:27:17

Answer 2

+1 A:

If you really want to do this, it is possible. You will need to control a browser engine such as Gecko or WebKit (e.g. via pywebkitgtk, which wraps WebKit) to open the web page in a full browser that can execute JavaScript, and then get the source code from that once it has finished rendering.

However, you won't be able to do it with anything less than that. If you look at the JavaScript source, you'll see that it just makes an AJAX POST request to content.php:

var url = "content.php?artist=" + artist + "&title=" + title + "&time=" + time + "&check=" + check;

where check is probably a hashed session ID. This is undoubtedly there to stop people from doing exactly what you are doing.
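To make the shape of that request concrete, here is a rough Python equivalent of the string concatenation above. It is only a sketch: build_content_url is a hypothetical helper, the sample values are the ones from the onload attribute quoted in the first answer, and nothing here suggests the server would accept a request built this way, since time and check are presumably tied to the page load that produced them.

```python
from urllib.parse import urlencode


def build_content_url(artist, title, time, check, base="content.php"):
    """Build the same query string as the JavaScript, with URL encoding added."""
    # Unlike the plain JS concatenation, urlencode escapes special characters.
    query = urlencode({"artist": artist, "title": title,
                       "time": time, "check": check})
    return base + "?" + query


url = build_content_url("aerosmith", "crazy", "1281384888",
                        "0475352e376cf1c3906afd8ec1b8ac70")
print(url)
```

The result is relative to wherever the page itself lives, so it would still need to be joined with the site's base address before it could be requested.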

katrielalex 2010-08-09 21:12:29

Answer 3

A:

If you're on Windows, you could use PAMIE to drive a browser (Internet Explorer) from Python.

Chris Curvey 2010-08-10 01:56:56


How do I get this page programmatically?
