Hi All,
We have a requirement to cache web pages as accurately as possible, so that we can go back and view a version of a page at any previous point in time. We'd like to be able to view the page as it really was, with the right CSS, JavaScript, images, etc.
Are there any open-source libraries (any language) that will fetch a page, download all externally linked assets and rewrite the links so that they point to the locally cached assets?
Or is this a case of rolling our own?
Thanks
Edit: I realise that this is not going to be 100% possible for dynamically generated links etc. unless we do DOM rendering. However, for the time being we can probably live without that.
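To make the fetch / download / rewrite steps concrete, here is a rough sketch of what rolling our own might look like in Python, using only the standard library. Everything in it is an assumption for illustration rather than an existing tool: the names `AssetCollector`, `local_name`, `rewrite_links` and `snapshot`, the `snapshots/<timestamp>/assets/` layout, and the short list of tags treated as assets are all made up. It also ignores assets referenced from inside CSS files, `srcset` attributes and anything generated by scripts.

```python
import hashlib
import os
import time
import urllib.request
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Tag -> attribute pairs treated as assets (an assumption, not exhaustive).
ASSET_ATTRS = {"img": "src", "script": "src", "link": "href"}


class AssetCollector(HTMLParser):
    """Collects (raw attribute value, absolute URL) pairs for static assets."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.assets = []

    def handle_starttag(self, tag, attrs):
        attr = ASSET_ATTRS.get(tag)
        if attr is None:
            return
        attrs = dict(attrs)
        if tag == "link":
            # Only stylesheets and icons; skip canonical, alternate, etc.
            rel = (attrs.get("rel") or "").lower()
            if "stylesheet" not in rel and "icon" not in rel:
                return
        value = attrs.get(attr)
        if value:
            self.assets.append((value, urljoin(self.base_url, value)))


def local_name(url):
    """Stable local file name: hash of the URL plus the original extension."""
    ext = os.path.splitext(urlparse(url).path)[1]
    return hashlib.sha1(url.encode("utf-8")).hexdigest()[:16] + ext


def rewrite_links(html, base_url, asset_dir="assets"):
    """Return (rewritten HTML, {absolute asset URL: local relative path})."""
    collector = AssetCollector(base_url)
    collector.feed(html)
    mapping = {}
    for raw, absolute in collector.assets:
        mapping[absolute] = asset_dir + "/" + local_name(absolute)
        # Naive textual replacement of the quoted attribute value. It misses
        # unquoted or entity-escaped values and may also touch an identical
        # quoted string elsewhere in the page.
        for quote in ('"', "'"):
            html = html.replace(quote + raw + quote,
                                quote + mapping[absolute] + quote)
    return html, mapping


def snapshot(url, out_root="snapshots"):
    """Fetch a page and its assets into a timestamped directory."""
    stamp = time.strftime("%Y%m%dT%H%M%SZ", time.gmtime())
    out_dir = os.path.join(out_root, stamp)
    os.makedirs(os.path.join(out_dir, "assets"), exist_ok=True)
    with urllib.request.urlopen(url) as resp:
        html = resp.read().decode("utf-8", errors="replace")
    rewritten, mapping = rewrite_links(html, url)
    for absolute, rel_path in mapping.items():
        try:
            with urllib.request.urlopen(absolute) as resp:
                data = resp.read()
        except (OSError, ValueError):
            continue  # asset unavailable; the rewritten link will be broken
        with open(os.path.join(out_dir, rel_path), "wb") as f:
            f.write(data)
    with open(os.path.join(out_dir, "index.html"), "w", encoding="utf-8") as f:
        f.write(rewritten)
    return out_dir
```

Calling `snapshot("https://example.com/")` on a schedule would give one directory per capture time, which is roughly the "view the page as it was" behaviour we are after. We would much rather use an existing library than maintain something like this ourselves.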