I'm trying to download a static mirror of a wiki using wget. I only want the latest version of each article (not the full history or diffs between versions). It would be easy to just download the whole thing and delete unnecessary pages later, but doing so would take too much time and place an unnecessary strain on the server.
There are a number of pages I clearly don't need, such as:
WhoIsDoingWhat?action=diff&date=1184177979
Is there a way to tell wget not to download or recurse into URLs that contain 'action=diff'? Or, more generally, to exclude URLs that match some regex?
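To be concrete about the kind of filter I mean, here is a rough Python sketch of the test I would like applied to each URL before it is fetched. This is not wget itself, only an illustration of the matching rule. The function name `should_skip` and the second and third sample URLs are made up for the example; only the first URL comes from the wiki.

```python
import re

# Match 'action=diff' as a whole query parameter, so that a hypothetical
# parameter such as 'action=different' would not be caught by accident.
SKIP_PATTERN = re.compile(r"[?&]action=diff(?:&|$)")


def should_skip(url: str) -> bool:
    """Return True if the URL looks like a diff page that should not be fetched."""
    return SKIP_PATTERN.search(url) is not None


if __name__ == "__main__":
    samples = [
        "WhoIsDoingWhat?action=diff&date=1184177979",  # real example from the wiki
        "WhoIsDoingWhat",                               # hypothetical plain article URL
        "WhoIsDoingWhat?action=print",                  # hypothetical non-diff action
    ]
    for url in samples:
        print(url, "->", "skip" if should_skip(url) else "fetch")
```

Ideally the check would run before the request is made, so that the diff pages are never downloaded and never followed during recursion.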