Is there a reliable way to extract text from a PDF? My first concern is that a PDF page may have multiple columns, so the extraction mechanism needs to know the logical structure (the reading order) somehow. I understand that some PDF documents are "tagged", but I need to support pretty much any PDF document.
Are there any third-party components that could help here?
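
To make the column concern concrete, here is a minimal sketch of what I mean. It assumes the text fragments and their page coordinates have already been pulled out of the PDF somehow, and that the number of columns is known, neither of which holds for arbitrary documents. The names `Fragment`, `naive_order`, and `column_order` are hypothetical and do not come from any library.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    """A piece of text with its position on the page (hypothetical type)."""
    x: float  # left edge of the fragment
    y: float  # top edge, measured downward from the top of the page
    text: str


def naive_order(fragments):
    """Sort top-to-bottom, then left-to-right. This interleaves columns."""
    return [f.text for f in sorted(fragments, key=lambda f: (f.y, f.x))]


def column_order(fragments, page_width, columns=2):
    """Sort by column first, assuming equal-width columns, then top-to-bottom."""
    col_width = page_width / columns

    def key(f):
        # Clamp so a fragment at the far right edge stays in the last column.
        col = min(int(f.x // col_width), columns - 1)
        return (col, f.y, f.x)

    return [f.text for f in sorted(fragments, key=key)]


# A two-column page, 600 units wide, with two lines in each column.
page = [
    Fragment(50, 100, "Left 1"),
    Fragment(320, 100, "Right 1"),
    Fragment(50, 120, "Left 2"),
    Fragment(320, 120, "Right 2"),
]

print(naive_order(page))
print(column_order(page, page_width=600))
```

The naive ordering mixes lines from the two columns, while the column-aware ordering reads the left column fully before the right one. The hard part for untagged PDFs is that the column count and boundaries are not given and would have to be inferred from the layout.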