String s = parseText(filename, position)
Where is this method defined? I'm guessing that it's your own method, which opens the file and extracts a particular chunk of the data. Somewhere in this process it's getting converted from bytes to characters, probably using the default encoding for your JVM.
If the default encoding of your running JVM doesn't match the actual encoding of the file, you're going to get incorrect characters in your string. On top of that, if you're reading content that uses a multi-byte encoding (such as UTF-8), your "position" may point into the middle of a multi-byte sequence.
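Here is a minimal sketch of both failure modes. The class name `EncodingMismatchDemo`, the `decode` helper, and the sample text are hypothetical, chosen only for illustration; your `parseText` may well behave differently.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingMismatchDemo
{
    // Hypothetical helper: decodes a slice of raw bytes with an explicit charset.
    static String decode(byte[] bytes, int offset, int length, Charset charset)
    {
        return new String(bytes, offset, length, charset);
    }

    public static void main(String[] argv) throws Exception
    {
        // "caf" followed by U+00F2; in UTF-8 that is the bytes 63 61 66 C3 B2
        byte[] utf8 = "caf\u00f2".getBytes(StandardCharsets.UTF_8);

        // matching charset: four characters, as expected
        System.out.println(decode(utf8, 0, utf8.length, StandardCharsets.UTF_8));

        // mismatched charset: C3 and B2 are each decoded as a separate character
        System.out.println(decode(utf8, 0, utf8.length, StandardCharsets.ISO_8859_1));

        // offset 4 lands on the second byte of the two-byte sequence,
        // so the decoder substitutes U+FFFD (the replacement character)
        System.out.println(decode(utf8, 4, 1, StandardCharsets.UTF_8));
    }
}
```

Passing the charset explicitly, rather than relying on the JVM default, is what keeps the first case correct on any machine.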
If the source files are in well-formed XML, you'll be much better off using a real parser (such as the one built into the JDK) to parse them, since the parser will provide the correct translation of bytes to characters. Then use an XPath expression to retrieve the values.
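As a rough sketch of that approach, using only the parser and XPath classes that ship with the JDK: the element names (`doc`, `title`), the class name `XPathDemo`, and the `evaluate` helper are made up for the example, since I don't know what your files look like.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

public class XPathDemo
{
    // Hypothetical helper: parses raw XML bytes and returns the string value of an XPath expression.
    // The parser works from bytes and reads the encoding from the XML declaration,
    // so the JVM default charset is not involved.
    static String evaluate(byte[] xmlBytes, String expression) throws Exception
    {
        Document doc = DocumentBuilderFactory.newInstance()
                                             .newDocumentBuilder()
                                             .parse(new ByteArrayInputStream(xmlBytes));
        return XPathFactory.newInstance().newXPath().evaluate(expression, doc);
    }

    public static void main(String[] argv) throws Exception
    {
        // stand-in for the bytes of one of your files
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                   + "<doc><title>perch\u00f2</title></doc>";
        System.out.println(evaluate(xml.getBytes(StandardCharsets.UTF_8), "/doc/title"));
    }
}
```

For a real file you would hand the parser a `File` or `InputStream` rather than a `Reader` or `String`, so that it, not your code, does the byte-to-character conversion.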
If you haven't used an XML parser in the past, here are two documents that I wrote on parsing and XPath.
Edit: one thing that you may find helpful is to print out the actual character values in the string, using something like the following:
public static void main(String[] argv) throws Exception
{
    String s = "testing\u20ac";
    // print the index, numeric code, and glyph of each character
    for (int ii = 0 ; ii < s.length() ; ii++)
    {
        System.out.println(ii + ": " + (int)s.charAt(ii) + " = " + s.charAt(ii));
    }
}
You should probably also print out your default character set, so that you know how any particular sequence of bytes is translated to characters:
import java.nio.charset.Charset;

public static void main(String[] argv) throws Exception
{
    System.out.println(Charset.defaultCharset());
}
And finally, you should examine the served page as raw bytes, to see exactly what's being returned to the client.
Edit #2: the character ò is Unicode value 00F2, which would be UTF-8 encoded as C3 B2. These two bytes don't correspond to the symbols that you showed in your earlier answer.
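If you want to check byte values like these yourself, something along the following lines should work. The class name `HexDemo` and the `hex` helper are just names I picked for the sketch.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class HexDemo
{
    // Hypothetical helper: encodes the string with the given charset
    // and returns the resulting bytes as space-separated hex pairs.
    static String hex(String s, Charset charset)
    {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(charset))
        {
            if (sb.length() > 0)
                sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] argv) throws Exception
    {
        System.out.println(hex("\u00f2", StandardCharsets.UTF_8));       // prints C3 B2
        System.out.println(hex("\u00f2", StandardCharsets.ISO_8859_1));  // prints F2
    }
}
```

Running the same helper over the string you actually get back from your code, with the charset you believe the page uses, lets you compare it against the raw bytes served to the client.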
For more on Unicode characters, see the code charts at Unicode.org.