Hi everybody

I'm parsing HTML pages to extract specific information, but on some pages I can't get everything that is displayed in the browser, for example in this page

I can't get the review information. By the way, if you view the page source there are a lot of empty lines, and the review information doesn't appear in it.

Do you know why? Is there a library that can read this type of page?

Thanks

+1  A: 

I'm willing to bet they are using some sort of JavaScript to load the review information after the initial page is served, which is why it doesn't show up in the page source. To access that information, you will need to either mimic the request the script makes or evaluate the JavaScript and then parse the resulting page. I would suggest examining their JavaScript and mimicking the request it uses to download the review information, as that will be much easier than trying to evaluate the JavaScript in your own code.
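As a rough illustration of mimicking the request, here is a minimal Python sketch using only the standard library. Everything specific in it is an assumption, not something taken from the page in question: the endpoint `REVIEWS_URL`, the request headers, and the JSON shape `{"reviews": [{"text": ...}]}` are all hypothetical. You would replace them with whatever the browser's network inspector shows the page actually requesting.

```python
import json
import urllib.request

# Hypothetical endpoint: find the real one by watching the requests
# the page makes in the browser's network inspector.
REVIEWS_URL = "https://example.com/api/reviews?product_id=123"


def build_request(url):
    """Build a request that resembles a browser's background call."""
    return urllib.request.Request(
        url,
        headers={
            # Assumed headers; copy the ones the real request sends.
            "User-Agent": "Mozilla/5.0",
            "X-Requested-With": "XMLHttpRequest",
            "Accept": "application/json",
        },
    )


def extract_reviews(body):
    """Return review texts from a JSON string.

    Assumes the hypothetical shape {"reviews": [{"text": "..."}]}.
    """
    data = json.loads(body)
    return [item["text"] for item in data.get("reviews", [])]


def fetch_reviews(url):
    """Download the review data and return the review texts."""
    with urllib.request.urlopen(build_request(url), timeout=10) as resp:
        # Assumes the response body is UTF-8 encoded.
        return extract_reviews(resp.read().decode("utf-8"))
```

Calling `fetch_reviews(REVIEWS_URL)` would perform the download. If the real endpoint returns an HTML fragment rather than JSON, you would run the response through your existing HTML parser instead of `extract_reviews`.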

Chris Thompson