Hi,

I want to know how to read a string from a file in Java. The file contains letters from several different languages.

I used UTF-8 to decode it. Letters from some languages come through correctly, but Latin letters are not displayed correctly.

How can I read letters from all languages? Or is there another encoding I should use that handles all of them?

The code is:

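import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;

URL url = new URL("http://google.cm");
URLConnection urlc = url.openConnection();
StringBuilder builder = new StringBuilder();
// Decode the response as UTF-8 and copy it one character at a time.
try (BufferedReader buffer = new BufferedReader(new InputStreamReader(urlc.getInputStream(), "UTF-8"))) {
    int charRead; // read() returns a decoded character, not a raw byte
    while ((charRead = buffer.read()) != -1) {
        builder.append((char) charRead);
    }
}
String text = builder.toString();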

When I display "text", the letters are not displayed correctly.

Thanks in advance.

+2  A: 

Reading a UTF-8 file is fairly simple in Java:

Reader r = new InputStreamReader(new FileInputStream(filename), "UTF-8"); 

If that isn't working, the issue lies elsewhere.
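As a rough sketch of reading a whole UTF-8 file into a String, something like the following should work. The class name Utf8FileReader and the method readAll are made up for illustration; passing StandardCharsets.UTF_8 instead of the string "UTF-8" is just a convenience that avoids UnsupportedEncodingException.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper class, not part of the original answer.
class Utf8FileReader {

    // Reads the entire file and decodes its bytes as UTF-8.
    static String readAll(Path file) throws IOException {
        StringBuilder builder = new StringBuilder();
        try (Reader reader = new BufferedReader(
                new InputStreamReader(Files.newInputStream(file), StandardCharsets.UTF_8))) {
            char[] chunk = new char[4096];
            int n;
            // read(char[]) returns the number of characters decoded, or -1 at end of stream.
            while ((n = reader.read(chunk)) != -1) {
                builder.append(chunk, 0, n);
            }
        }
        return builder.toString();
    }

    public static void main(String[] args) throws IOException {
        // The file path is taken from the command line.
        System.out.println(readAll(Paths.get(args[0])));
    }
}
```

If the result still looks wrong, the file is probably not UTF-8 in the first place, or the console or font used to display the text cannot show those characters.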

EDIT: According to iconv, Google Cameroon is serving invalid UTF-8. It seems to actually be ISO-8859-1.

EDIT2: Actually, I was wrong. It serves (and declares) valid UTF-8 if the user agent contains "Mozilla/5.0" (or higher), but valid ISO-8859-1 in (some) other cases. The best bet is to call getContentType on the URLConnection and check which charset the server declares before decoding.
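One way to do that check, as a minimal sketch: pull the charset parameter out of the Content-Type header and hand it to the InputStreamReader. The class CharsetAwareFetcher and its methods are hypothetical names, the regular expression is a simplification of real header parsing, and falling back to ISO-8859-1 when no charset is declared is an assumption (it was the traditional HTTP default for text types). Setting the User-Agent is optional and only mirrors the behaviour observed above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class CharsetAwareFetcher {

    // Simplified pattern for "charset=..." in a Content-Type value.
    private static final Pattern CHARSET =
            Pattern.compile("charset\\s*=\\s*\"?([^\\s;\"]+)", Pattern.CASE_INSENSITIVE);

    // Returns the charset declared in the Content-Type value, or the fallback
    // if none is declared or the declared name is not supported by this JVM.
    static Charset charsetFrom(String contentType, Charset fallback) {
        if (contentType != null) {
            Matcher m = CHARSET.matcher(contentType);
            if (m.find()) {
                try {
                    return Charset.forName(m.group(1));
                } catch (IllegalArgumentException e) {
                    // Illegal or unsupported charset name: use the fallback.
                }
            }
        }
        return fallback;
    }

    static String fetch(String address) throws IOException {
        URLConnection connection = new URL(address).openConnection();
        // Optional: must be set before the connection is opened.
        connection.setRequestProperty("User-Agent", "Mozilla/5.0");
        // getContentType() opens the connection if it is not open yet.
        // Assumption: treat an undeclared charset as ISO-8859-1.
        Charset charset = charsetFrom(connection.getContentType(), StandardCharsets.ISO_8859_1);
        StringBuilder builder = new StringBuilder();
        try (Reader reader = new BufferedReader(
                new InputStreamReader(connection.getInputStream(), charset))) {
            char[] chunk = new char[4096];
            int n;
            while ((n = reader.read(chunk)) != -1) {
                builder.append(chunk, 0, n);
            }
        }
        return builder.toString();
    }
}
```

With this, String text = CharsetAwareFetcher.fetch("http://google.cm"); would decode the page using whatever charset the server declares for that request, rather than a hard-coded "UTF-8".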

Matthew Flaschen