In PHP, how would I grab all the JavaScript from a page given its URL? Is there a good regular expression to get the src of all script tags, or the script inside them?

+1  A: 

You can use PHP Simple HTML DOM to traverse the DOM for <script> tags. Inline scripts can be read directly into a string; for externally linked scripts, get the src attribute and download them with curl or something similar. It will require some coding; I don't know of a 'magic' script that would do all of this automatically for you.
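
As a rough sketch of the traversal described above, here is the same idea using PHP's built-in DOM extension (DOMDocument) instead of PHP Simple HTML DOM, so that it runs without extra libraries. The helper name collectScripts and the sample markup are assumptions made for illustration; in practice the HTML would first be fetched from the URL, for example with file_get_contents or curl.

```php
<?php
// Minimal sketch, assuming the page HTML has already been downloaded into a string.
// collectScripts() is a hypothetical helper name, not part of any library.
function collectScripts(string $html): array
{
    $doc = new DOMDocument();

    // Real-world markup is often invalid; collect libxml warnings quietly
    // instead of emitting them, then restore the previous setting.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $external = []; // values of src attributes (still to be downloaded)
    $inline = [];   // code found directly inside <script> elements

    foreach ($doc->getElementsByTagName('script') as $script) {
        if ($script->hasAttribute('src')) {
            $external[] = $script->getAttribute('src');
        } elseif (trim($script->textContent) !== '') {
            $inline[] = $script->textContent;
        }
    }

    return ['external' => $external, 'inline' => $inline];
}

// Sample input, made up for this example.
$html = '<html><head><script src="/js/app.js"></script></head>'
      . '<body><script>var x = 1;</script></body></html>';

$scripts = collectScripts($html);
print_r($scripts);
```

The src values may be relative, so they would still need to be resolved against the page URL before downloading them.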

stagas
Don't forget `javascript:` URIs and intrinsic event handler attributes (such as `onclick`).
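
One possible way to pick those up, sketched with PHP's built-in DOMDocument and DOMXPath: walk every attribute in the document and keep those whose name starts with "on" or whose value starts with "javascript:". The helper name collectInlineHandlers, the sample markup, and the "starts with on" test are assumptions made for illustration; the name test is a heuristic, not an exhaustive list of event handler attributes.

```php
<?php
// Minimal sketch; collectInlineHandlers() is a hypothetical helper name.
function collectInlineHandlers(string $html): array
{
    $doc = new DOMDocument();

    // Tolerate invalid markup without emitting warnings.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $found = [];

    // '//@*' selects every attribute node in the document.
    foreach ($xpath->query('//@*') as $attr) {
        $name = strtolower($attr->nodeName);
        $value = trim($attr->nodeValue);

        if (strpos($name, 'on') === 0) {
            // Heuristic: onclick, onload, onsubmit, ...
            $found[] = ['type' => 'handler', 'name' => $name, 'code' => $value];
        } elseif (stripos($value, 'javascript:') === 0) {
            // javascript: URI, e.g. in href; strip the scheme prefix.
            $found[] = [
                'type' => 'uri',
                'name' => $name,
                'code' => substr($value, strlen('javascript:')),
            ];
        }
    }

    return $found;
}

// Sample input, made up for this example.
$html = '<html><body>'
      . '<a href="javascript:void(0)" onclick="go()">link</a>'
      . '<button onclick="save()">Save</button>'
      . '</body></html>';

$handlers = collectInlineHandlers($html);
print_r($handlers);
```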
David Dorward
Would PHP Simple HTML DOM work even if the site doesn't have a valid DOM structure (like most sites)?
QuinnBaetz
@QuinnBaetz: I guess it would in most cases. You need to try it to be certain, but where I've used it I didn't run into any issues. It's also very easy to use.
stagas
A: 

This should place the values of all src attributes contained in script tags into an array in the variable $matches. Check out the preg_match_all documentation for the format of the array, as there is another parameter (the flags argument) that will allow you to modify it.

preg_match_all('/<script[^>]*src=[\'"]([^\'"]+)[\'"]/', $string, $matches);
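
A small usage sketch of how the result array is laid out, using made-up sample markup. Note that the + sits inside the capturing group here; with it outside the parentheses, only the last character of each URL would be captured. The i modifier is an addition so that upper-case tags also match.

```php
<?php
// Sample input, made up for this example.
$string = '<script src="/js/a.js"></script>'
        . "<script type='text/javascript' src='https://example.com/b.js'></script>"
        . '<script>var inline = true;</script>';

$pattern = '/<script[^>]*src=[\'"]([^\'"]+)[\'"]/i';

// Default flag (PREG_PATTERN_ORDER): $matches[0] holds the full matches,
// $matches[1] holds what the first capturing group matched (the src values).
preg_match_all($pattern, $string, $matches);
$sources = $matches[1];

// PREG_SET_ORDER groups the results per match instead:
// $sets[0] is the first match, $sets[0][1] its src value.
preg_match_all($pattern, $string, $sets, PREG_SET_ORDER);

print_r($sources);
```

The inline script without a src attribute is skipped by this pattern, so it only covers externally linked scripts.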
GApple
It's generally a bad idea to parse HTML with regular expressions...
Javier Constanzo
A: 

I would suggest htmlSQL.

http://www.jonasjohn.de/lab/htmlsql.htm

With it you can get the code in script tags as well as inline JavaScript in event attributes such as onclick.

Thanashyam