Of course. The easiest way would be the Web::Scraper module. It lets you define scraper objects that consist of
- hash key names,
- XPath expressions that locate elements of interest,
- and code to extract bits of data from them.
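For instance, a minimal scraper might look something like this (the URL and XPath expressions are made up for illustration):

```perl
use strict;
use warnings;
use URI;
use Web::Scraper;

# key name, XPath locating the element, and what to extract from it
my $story_scraper = scraper {
    process '//div[@id="story"]/h1', headline => 'TEXT';
    process '//div[@id="story"]//a', link     => '@href';
};

my $result = $story_scraper->scrape( URI->new('http://example.com/news') );
print "$result->{headline}\n$result->{link}\n";
```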
Scraper objects take a URL and return a hash of the extracted data. The extractor code for each key can itself be another scraper object, so you can define how to scrape repeated compound page elements: an outer scraper provides the XPath that finds each compound element, and an inner scraper provides further XPaths to pull out its individual bits. The result is automatically a nested data structure, as sketched below.
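Here is a sketch of such a nested scraper, again with an invented page structure and XPaths; the `[]` suffix on the key tells Web::Scraper to collect one inner hash per matched element:

```perl
use strict;
use warnings;
use URI;
use Web::Scraper;
use Data::Dumper;

my $results_scraper = scraper {
    # outer XPath finds each compound element
    process '//ul[@class="results"]/li', 'results[]' => scraper {
        # inner XPaths are evaluated relative to each matched <li>
        process './/a',                   title => 'TEXT', url => '@href';
        process './/span[@class="date"]', date  => 'TEXT';
    };
};

my $data = $results_scraper->scrape( URI->new('http://example.com/search?q=perl') );
print Dumper $data;
# { results => [ { title => ..., url => ..., date => ... }, ... ] }
```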
In short, you can very elegantly suck data from all over a page into a Perl data structure. In doing so, the full power of XPath + Perl is available for use against any page. Since the page is parsed with HTML::TreeBuilder, it does not matter how nasty a tag soup it is. The resulting scraper scripts are much easier to maintain and far more tolerant of minor markup variations than regex-based scrapers.
The bad news: as yet, its documentation is almost non-existent, so you have to make do with googling for something like [miyagawa web::scraper] to find example scripts posted by the module’s author.