I have a Perl script that uses LWP::UserAgent to download a webpage, which it then processes with regular expressions. The problem is that parts of the page that are normally plain HTML are not returned to LWP::UserAgent: the site seems to detect that the client doesn't have Flash installed and returns HTML prompting us to download Flash instead of the HTML we need to parse.

How can I make LWP::UserAgent appear to have Flash installed to the web server we're requesting the page from? I'm using the following code to initialize LWP::UserAgent:

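use strict;
use warnings;
use LWP::UserAgent;
# In-memory cookie jar; the empty requests_redirectable list disables automatic redirects.
my $ua = LWP::UserAgent->new(cookie_jar => {}, requests_redirectable => []);
$ua->agent('Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:9.9.9.9) Gecko/20079999 Firefox/2.0.0.1');
$ua->timeout(10);  # seconds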

Thanks in advance for your help!

A: 

The site is probably testing whether Flash is installed using JavaScript. Often this test is client-side only and therefore should not affect the page the server returns. But they may be firing off an asynchronous request that tells the server Flash is installed. To check, install the TamperData Firefox add-on, open the TamperData window from the Tools menu, and refresh the page. The window shows every request being fired off, and you can inspect each one. If there is a request like http://whatever.com/flash_test.php?flash_installed=true , you can replay it using LWP.
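As a rough sketch of what replaying such a request with LWP might look like: the endpoint and the flash_installed parameter below come from the made-up example URL above, and the Referer value is an assumption, so substitute whatever the captured traffic actually shows. The request is only built and printed here; the lines that send it are left commented out.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;
use URI;

# Hypothetical endpoint and parameter, taken from the example URL above.
my $uri = URI->new('http://whatever.com/flash_test.php');
$uri->query_form(flash_installed => 'true');

# Reuse one user agent (and cookie jar) for the replayed request and the page fetch.
my $ua = LWP::UserAgent->new(cookie_jar => {});
$ua->timeout(10);

# Build the replayed request; copy over any headers seen in the capture.
my $req = HTTP::Request->new(GET => $uri);
$req->header(Referer => 'http://whatever.com/');   # assumed value

# Show what would be sent.
print $req->as_string;

# Uncomment to actually send it, then fetch the real page with the same $ua
# so that any cookies set by this response are sent back to the server:
# my $res = $ua->request($req);
# print $res->status_line, "\n";
```

Sharing a single $ua between the replayed request and the later page fetch matters if the server records the Flash check in a cookie or session.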

Another option is to decompile the Flash app. This is very easy to do, and often you'll get the full source including code comments; decompilers with free trials are available.

Rook
Firebug would be good here too. The important thing is to capture the conversation(s) between the client and server so you can reproduce the relevant bits.
Ether
Yep, that seems to be the case: http://www.adobe.com/support/flash/how/shock/javaplugs/javaplugs02.html
Maxwell Troy Milton King
Yeah, Firebug is great, but it gives you less traffic information than TamperData.
Rook
+1  A: 

I would recommend using Firebug for that; it is a very nice and powerful add-on for Firefox. I agree with Michael that the server can learn such information about the client only from the headers sent to it, or from a script that runs on the client and can talk to the server (JavaScript, Flex, ...). For the JavaScript case, you can temporarily disable JavaScript in Firefox and reload the page: if the server then answers the same way it does for LWP, you know the answer.
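For the header case, one way to compare what LWP sends with what the browser sends is to print the headers of every request and response. The following is only a sketch: header_lines is a hypothetical helper written for this example, and the URL and header values in the offline demonstration are placeholders. The add_handler phases request_send and response_done are the ones documented by LWP::UserAgent.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;

# Hypothetical helper: render a message's headers as sorted "Name: value" lines.
sub header_lines {
    my ($message) = @_;
    my $headers = $message->headers;
    my @names   = sort { $a cmp $b } $headers->header_field_names;
    return map { "$_: " . join(', ', $headers->header($_)) } @names;
}

my $ua = LWP::UserAgent->new(cookie_jar => {}, requests_redirectable => []);
$ua->timeout(10);

# Print the headers of every request LWP sends and every response it receives.
# Returning nothing from request_send lets the request proceed normally.
$ua->add_handler(request_send  => sub { print "> $_\n" for header_lines($_[0]); return });
$ua->add_handler(response_done => sub { print "< $_\n" for header_lines($_[0]); return });

# Offline demonstration on a hand-built request (placeholder URL and values).
my $req = HTTP::Request->new(GET => 'http://example.com/');
$req->header('User-Agent' => 'Mozilla/5.0', Accept => 'text/html');
print "$_\n" for header_lines($req);
```

With the handlers installed, any later $ua->get(...) call prints its request headers prefixed with ">" and the response headers prefixed with "<", which can then be set side by side with the browser capture.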

dma_k
+1  A: 

Both @Michael & @dma_k were correct. The server was not checking whether LWP::UserAgent had Flash installed. Instead, for some reason the returned content was not being dumped correctly while we were debugging the script. Unfortunately, we didn't figure out a way to fix this, but after some trial and error we worked out how to pull the appropriate fields from the page. Sorry that there isn't really a correct answer for this one.

Russell C.