Hi,
I have been using PDFBox to extract text information from PDFs. I have successfully parsed all the text properties, such as font name, font face, size, position, etc.
PROBLEM: I am using PDFBox 1.2.1 (the latest version). getCharacter() in the TextPosition class returns the full string except for the last character. The last character comes back as a separate string.
Example: "How are you" is parsed as "How are yo" and "u" (2 separate strings).
I don't want the string to be split like that.
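To make clear what I am after, here is a rough sketch of the merging I would otherwise have to do by hand. It does not use PDFBox at all: Fragment is a made-up stand-in class for a piece of text with a position, and the 0.5 tolerance and the coordinates are just guesses for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class FragmentMerger {

    // Hypothetical stand-in for a positioned run of text; NOT a PDFBox class.
    static final class Fragment {
        final String text;
        final float x;      // left edge of the run
        final float y;      // baseline of the run
        final float width;  // horizontal extent of the run

        Fragment(String text, float x, float y, float width) {
            this.text = text;
            this.x = x;
            this.y = y;
            this.width = width;
        }
    }

    // Joins each fragment onto the previous one when both sit on the same
    // baseline and the new fragment starts where the previous one ends.
    static List<Fragment> merge(List<Fragment> input, float tolerance) {
        List<Fragment> out = new ArrayList<>();
        for (Fragment f : input) {
            if (!out.isEmpty()) {
                Fragment prev = out.get(out.size() - 1);
                boolean sameLine = Math.abs(prev.y - f.y) <= tolerance;
                boolean adjacent = Math.abs((prev.x + prev.width) - f.x) <= tolerance;
                if (sameLine && adjacent) {
                    float mergedWidth = (f.x + f.width) - prev.x;
                    out.set(out.size() - 1,
                            new Fragment(prev.text + f.text, prev.x, prev.y, mergedWidth));
                    continue;
                }
            }
            out.add(f);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Fragment> parts = new ArrayList<>();
        parts.add(new Fragment("How are yo", 10f, 100f, 50f));
        parts.add(new Fragment("u", 60f, 100f, 5f));
        for (Fragment f : merge(parts, 0.5f)) {
            System.out.println(f.text);
        }
    }
}
```

I would rather not post-process like this if the library can hand me the whole string directly.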
Has anybody come across this? Am I doing something wrong? Waiting for a reply.
Thanks and Regards, Magggi