I'm looking for a way to simulate a browser's resource-expansion behavior.
The flow I'm trying to address is the following:
- Access an initial URL (e.g. http://example.dmn/index.htm)
- Parse the HTML response received (e.g. index.htm)
- Find the resources that a browser will fetch as a result of the index parsing, e.g.:
  - Images
  - Flash
  - Embedded videos/audio
  - Frames/iframes
- Repeat the process recursively for each new resource found
I don't want to follow links (href), only the page resources that a browser fetches automatically when the page is first accessed.
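For illustration, the flow above could be sketched roughly as follows, using only Python's standard library (`html.parser`, `urllib`). This is a minimal sketch under stated assumptions, not a full browser simulation: the tag/attribute table `RESOURCE_ATTRS` is illustrative and incomplete, the names `ResourceParser`, `fetch_html`, and `expand` are hypothetical, only frames/iframes are parsed again as HTML, and resources pulled in by CSS or JavaScript are not detected.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen

# Tag -> attribute holding a URL that a browser fetches automatically.
# Illustrative and incomplete (no CSS, scripts, srcset, <base>, etc.).
RESOURCE_ATTRS = {
    "img": "src",
    "embed": "src",    # e.g. Flash
    "object": "data",  # e.g. Flash
    "video": "src",
    "audio": "src",
    "source": "src",
    "frame": "src",
    "iframe": "src",
}
FRAME_TAGS = {"frame", "iframe"}  # only these are parsed again as HTML


class ResourceParser(HTMLParser):
    """Collects (absolute_url, is_frame) pairs; <a href> links are ignored."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attr = RESOURCE_ATTRS.get(tag)
        if attr is None:
            return
        value = dict(attrs).get(attr)
        if value:
            url = urljoin(self.base_url, value)
            self.resources.append((url, tag in FRAME_TAGS))


def fetch_html(url):
    """Default fetcher: download the URL and decode it as text."""
    with urlopen(url) as resp:
        return resp.read().decode("utf-8", errors="replace")


def expand(url, fetch=fetch_html, seen=None):
    """Return the set of URLs reachable from `url` via page resources."""
    seen = set() if seen is None else seen
    if url in seen:  # guards against frames that include each other
        return seen
    seen.add(url)
    parser = ResourceParser(url)
    parser.feed(fetch(url))
    for res_url, is_frame in parser.resources:
        if is_frame:
            expand(res_url, fetch, seen)  # recurse into nested documents
        else:
            seen.add(res_url)  # leaf resource: record it, do not parse
    return seen
```

Passing a custom `fetch` callable (for example, a lookup in a dict of canned pages) makes it possible to try the traversal without network access.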
Do you have any suggestions on how to perform this simulation?
Are there any Python projects/libraries that may help?
Thanks