views:

113

answers:

4

Hi, everyone,

I have a question about Java. I have used URLConnection to obtain a DataInputStream, and I want to convert the DataInputStream into a String variable. How should I do this? Can anyone help me? Thank you.

The following is my code:

URL data = new URL("http://google.com");
URLConnection dataConnection = data.openConnection();
DataInputStream dis = new DataInputStream(dataConnection.getInputStream());
String data_string;
// convert the DataInputStream to the String
+5  A: 

You can use commons-io IOUtils.toString(dataConnection.getInputStream(), encoding) in order to achieve your goal.

DataInputStream is not meant for what you want, which is to read the content of a website as a String.

Bozho
This does not take into account the content encoding for the URL you are accessing. You should use the two argument version of the `IOUtils.toString` method in order to explicitly specify the encoding.
Grodriguez
@Grodriguez or use an `InputStreamReader`. I added the encoding, a good practice indeed.
Bozho
Even if you pass an `InputStreamReader` instead, you still need to specify the encoding when the `InputStreamReader` is created, otherwise you will have the same problem (the default platform encoding would be used, which may or may not match the encoding of the URL content).
Grodriguez
@Grodriguez that's what I meant by the `InputStreamReader` suggestion. (Btw the downvote can be removed, I guess)
Bozho
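For readers who would rather not add commons-io, the `InputStreamReader` approach discussed in the comments above might look roughly like this. It is only a minimal sketch: `StreamText` and `readAll` are hypothetical names, not part of any library, and it assumes the caller already knows which charset the content uses.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;

// Hypothetical helper class; the name is illustrative and not from the thread.
final class StreamText {
    private StreamText() {}

    // Decodes every byte of 'in' with the given charset and returns the text.
    // The stream is closed when reading finishes.
    static String readAll(InputStream in, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buffer = new char[4096];
        try (Reader reader = new InputStreamReader(in, charset)) {
            int n;
            while ((n = reader.read(buffer)) != -1) {
                sb.append(buffer, 0, n);
            }
        }
        return sb.toString();
    }
}
```

With the question's code this could be called as `StreamText.readAll(dataConnection.getInputStream(), StandardCharsets.UTF_8)`, assuming the page really is UTF-8. Passing the charset explicitly avoids falling back to the platform default, which is the problem raised in the comments.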
A: 

Try this. In the code, "dis" is your DataInputStream.

StringBuffer sb = new StringBuffer();
try{
    String line = null;
    while((line=dis.readLine()) != null){
        sb.append(line+"\n");
    }
}catch(Exception ex){
    // Report the failure instead of silently discarding it
    ex.printStackTrace();
}

return sb.toString();
Sachin Shanbhag
`DataInputStream.readLine()` is deprecated.
Grodriguez
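To illustrate why the deprecation matters: `DataInputStream.readLine()` turns each byte directly into one char, so multi-byte encodings such as UTF-8 come out garbled. The sketch below compares it with a `BufferedReader` wrapped around an `InputStreamReader`. `ReadLineDemo` and its method names are hypothetical, and it reads from an in-memory byte array rather than a real URL so it can be run on its own.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;

// Hypothetical demo class; names are illustrative only.
final class ReadLineDemo {
    private ReadLineDemo() {}

    // Reads the first line with the deprecated method,
    // which maps each byte to exactly one char.
    @SuppressWarnings("deprecation")
    static String firstLineDeprecated(byte[] bytes) throws IOException {
        try (DataInputStream dis =
                new DataInputStream(new ByteArrayInputStream(bytes))) {
            return dis.readLine();
        }
    }

    // Reads the first line through a Reader that decodes
    // the bytes with the given charset.
    static String firstLineDecoded(byte[] bytes, Charset charset) throws IOException {
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(bytes), charset))) {
            return r.readLine();
        }
    }
}
```

Given the UTF-8 bytes of "café", `firstLineDeprecated` returns "cafÃ©" while `firstLineDecoded` with UTF-8 returns "café".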
+4  A: 

If you want to read data from a generic URL (such as www.google.com), you probably don't want to use a DataInputStream at all. Instead, create a BufferedReader and read line by line with the readLine() method. Use the URLConnection.getContentEncoding() method to find out the content encoding (you will need this in order to create your reader properly).

Example:

URL data = new URL("http://google.com");
URLConnection dataConnection = data.openConnection();

// Find out content encoding, default to ISO-8859-1 if unknown
String contentEncoding = dataConnection.getContentEncoding();
if (contentEncoding == null) {
    contentEncoding = "ISO-8859-1";
}

// Create reader and read string data
BufferedReader r = new BufferedReader(
        new InputStreamReader(dataConnection.getInputStream(), contentEncoding));
String content = "";
String line;
while ((line = r.readLine()) != null) {
    content += line + "\n";
}
Grodriguez
+1 Nice one. Thanks for letting me know.
org.life.java
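One caveat on the answer above: `URLConnection.getContentEncoding()` returns the value of the Content-Encoding header, which typically names a compression scheme such as gzip rather than a character set. When the server sends a charset, it is normally a parameter of the Content-Type header (for example `text/html; charset=UTF-8`), available through `URLConnection.getContentType()`. Below is a minimal sketch of pulling the charset out of that header value. `ContentTypeCharset` and `charsetFrom` are hypothetical names, and the parsing is deliberately simple (it is not a complete header parser).

```java
import java.nio.charset.Charset;

// Hypothetical helper class; the name is illustrative and not from the thread.
final class ContentTypeCharset {
    private ContentTypeCharset() {}

    // Returns the charset named in a Content-Type value such as
    // "text/html; charset=UTF-8", or 'fallback' if none is usable.
    static Charset charsetFrom(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            // Case-insensitive check for a "charset=" parameter
            if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                String name = p.substring(8).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Illegal or unsupported charset name
                    return fallback;
                }
            }
        }
        return fallback;
    }
}
```

In the example above this could replace the encoding lookup, e.g. `charsetFrom(dataConnection.getContentType(), StandardCharsets.ISO_8859_1)`, with the result passed to the `InputStreamReader` constructor. The ISO-8859-1 fallback simply mirrors the default chosen in the answer.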
+2  A: 
import java.net.*;
import java.io.*;

class ConnectionTest {
    public static void main(String[] args) {
        try {
            URL google = new URL("http://www.google.com/");
            URLConnection googleConnection = google.openConnection();
            DataInputStream dis = new DataInputStream(googleConnection.getInputStream());
            StringBuffer inputLine = new StringBuffer();
            String tmp; 
            while ((tmp = dis.readLine()) != null) {
                inputLine.append(tmp);
                System.out.println(tmp);
            }
            // inputLine.toString() now holds the whole page source
            dis.close();
        } catch (MalformedURLException me) {
            System.out.println("MalformedURLException: " + me);
        } catch (IOException ioe) {
            System.out.println("IOException: " + ioe);
        }
    }
}  

This is what you want.

org.life.java
@org.life.java, thank you for your answer, but I think the problem was misunderstood. After 'System.out.println(inputLine);', inputLine is 'null', whereas I want inputLine="<html><head..." so that I can use it later in another class. Would you mind giving me another suggestion? Thank you.
Questions
@Questions updated the code
org.life.java
@org.life.java, a great, great help. Thank you very much, and sorry for taking up your time.
Questions
I don't believe this can work. `readUTF()` expects string data to be stored in a specific way (see http://download.oracle.com/javase/1.3/docs/api/java/io/DataInput.html#readUTF%28%29). This will not be the case if you try to read content from an arbitrary URL.
Grodriguez
@Grodriguez Thanks for letting me know that. I have changed it back to readLine. I know it's deprecated; other solutions, like Bozho's, are already here.
org.life.java
If you use `DataInputStream.readLine()`, your solution will not work correctly if the content encoding of the URL you are accessing is anything different than plain ASCII. This is why the `readLine` method is deprecated. See my answer to this same question for a way to read the contents of the URL taking into account the content encoding, without resorting to any external libraries.
Grodriguez
Isn't the readLine method deprecated?
SidCool
@SidCool already mentioned that.
org.life.java
Yes, didn't read your entire comment...
SidCool