I have a collection of ebooks in different formats (e.g. PDF, LIT, CHM, and others). I would like to extract the first page of each book and have it in plain text. What would be the best language to do so?
A language that is portable between Linux and Windows XP would be a big plus.
My prime candidates at the moment are Java and Ruby, mostly because they are portable and have a large collection of available components for processing different file formats, but are these languages the best choices?