Hi,

I've tried reading a JPG file using the StreamReader class' ReadToEnd() method which returns a string.

For some reason though, when I write this string out to a file, it doesn't open.

Is something lost when reading data into a string? What is it really good for?

If I use a byte array and read into that, then it's fine.

A: 

You just can't do it this way. Use FileStream instead.

You can't use a string to read binary files; some bytes won't survive the conversion, as far as I know.

goodwill
+2  A: 

String is designed for holding Unicode characters, not binary data. For binary, use a byte[] or Stream, or an Image etc. for more specialized image handling.

Despite the name, StreamReader is actually a specialized TextReader - i.e. it is a TextReader that reads from a Stream. Images are not text, so this isn't the right option.

Marc Gravell
+1  A: 

Unfortunately there is a serious problem with class names in the System.IO namespace. StreamReader is designed to read text, not binary data. You should use FileStream for binary files, as @goodwill suggested.

aku
+19  A: 

Strings are for text data. They're not for binary data - if you use them this way you will lose data (there are encodings you can use which won't lose data if you're lucky, but there are subtle issues which still make it a really bad idea.)
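
A minimal sketch of that data loss, assuming the bytes are decoded as UTF-8 (which is what StreamReader uses when it is not given another encoding). The names `RoundTripDemo`, `RoundTrip` and `SurvivesRoundTrip` are made up for illustration and are not part of .NET.

```csharp
using System;
using System.Linq;
using System.Text;

public static class RoundTripDemo
{
    // Decodes the bytes as UTF-8 text, then encodes that text back to bytes.
    public static byte[] RoundTrip(byte[] original)
    {
        string asText = Encoding.UTF8.GetString(original);
        return Encoding.UTF8.GetBytes(asText);
    }

    // True only if the bytes come back unchanged after the trip through a string.
    public static bool SurvivesRoundTrip(byte[] original)
    {
        return original.SequenceEqual(RoundTrip(original));
    }
}
```

For example, `RoundTripDemo.SurvivesRoundTrip(new byte[] { 0xFF, 0xD8, 0xFF, 0xE0 })` (the bytes a typical JPEG starts with) should return false, because 0xFF is never valid in UTF-8 and the decoder substitutes the replacement character U+FFFD. Plain ASCII bytes come back unchanged.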

If you're actually dealing with a file, the easiest way of reading the whole thing is to call File.ReadAllBytes. If you have to deal with an arbitrary stream, have a look at "Creating a byte array from a stream".
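
Both options can be sketched roughly as follows. `BinaryReadDemo`, `ReadFile` and `ReadToEnd` are hypothetical names; `File.ReadAllBytes`, `Stream.CopyTo` and `MemoryStream` are the real framework calls. This is one way to do it, not necessarily what the linked article shows.

```csharp
using System.IO;

public static class BinaryReadDemo
{
    // Reads a whole file as raw bytes; no text decoding is involved.
    public static byte[] ReadFile(string path)
    {
        return File.ReadAllBytes(path);
    }

    // Reads an arbitrary stream to its end by copying it into a MemoryStream.
    public static byte[] ReadToEnd(Stream input)
    {
        using (var buffer = new MemoryStream())
        {
            input.CopyTo(buffer);
            return buffer.ToArray();
        }
    }
}
```

Note that on a network stream, `ReadToEnd` only returns once the other side closes the connection, since that is what marks the end of the stream.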

Jon Skeet
A: 

Strings are used to represent text. They are good at representing text. Very good, in fact, as they support Unicode and protect you from all sorts of typical string processing bugs.

They aren't good at representing binary data, because that isn't what they're designed for. As you mention, a byte array is much better for this.

It's not a matter of one being better than the other, it's simply fitness for purpose and understanding when to choose one or the other. Text = string, binary = byte array or stream.

Greg Beech
A: 

I'm actually reading from a network connection and, as far as I'm aware, there is no ReadToEnd() for binary data.

Sir Psycho
Please have a look at Jon Skeet's link; it shows how to read the bytes off a network stream.
Mark S. Rasmussen
You could write the number of bytes first, before streaming the data itself.
Osama ALASSIRY
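
A rough sketch of that length-prefix idea, assuming both ends agree on a 4-byte length written by BinaryWriter (little-endian). `LengthPrefixedDemo`, `WriteMessage` and `ReadMessage` are hypothetical names.

```csharp
using System.IO;

public static class LengthPrefixedDemo
{
    // Writes a 4-byte length followed by the raw payload.
    // The writer is deliberately not disposed, so the stream stays open.
    public static void WriteMessage(Stream output, byte[] payload)
    {
        var writer = new BinaryWriter(output);
        writer.Write(payload.Length);
        writer.Write(payload);
        writer.Flush();
    }

    // Reads the 4-byte length, then exactly that many payload bytes.
    public static byte[] ReadMessage(Stream input)
    {
        var reader = new BinaryReader(input);
        int length = reader.ReadInt32();
        byte[] payload = reader.ReadBytes(length);
        if (payload.Length != length)
            throw new EndOfStreamException("Stream ended before the full message arrived.");
        return payload;
    }
}
```

With the length sent up front, the receiver knows when it has the whole image without waiting for the connection to close.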
+3  A: 

As all Real Programmers know, the only useful data structure is the Array. Strings, Lists, Structures, Sets-- these are all special cases of arrays and can be treated that way just as easily without messing up your programming language with all sorts of complications. The worst thing about fancy data types is that you have to declare them, and Real Programming Languages, as we all know, have implicit typing based on the first letter of the (six character) variable name.

Besides, the determined Real Programmer can write Fortran programs in any language.


Whoever modded this down clearly has either no sense of humour or no knowledge of folklore. The above is excerpted from a very famous 1983 letter to the editor of Datamation, by Ed Post of Tektronix. The letter is titled Real Programmers Don't Use Pascal.

Peter Wone
A: 

Always remember, text data is binary data but binary data is not text data.

jussij
+1  A: 

I notice that nobody has answered the actual questions.

Is something lost when reading data into a string?

A JPEG file contains a picture rather than words. This picture has a binary representation as a sequence of bytes. StreamReader decodes those bytes as text, using UTF-8 unless told otherwise. Many byte sequences in a JPEG are not valid UTF-8, and the decoder replaces each one with the Unicode replacement character (U+FFFD), so the original byte values are gone by the time you have a string.

When you write the string out to a file, the replacement characters are encoded as different bytes from the ones that were read. As a result, the file is not a faithful copy of the binary image and is rejected by the validation logic of software trying to interpret it as a JPEG.

So data generally is lost when you load a string with non-textual data. The problem here is that you have effectively made an invalid typecast, but neither the compiler nor the runtime has stopped you, and the result is data corruption.

What is it really good for?

Several things. As others have said, strings are designed to contain text. In .NET, strings are Unicode, so they are not limited to plain old ASCII. There is also extensive support for text manipulation. Look up format specifiers in help for a spectacular example of string manipulation.

Do C# strings use NUL for end of string?

Not as a terminator for the content. A .NET string stores its length, so a NUL (0x00) is a legal character inside one. The runtime does keep a trailing NUL after the character data, which is a legacy thing that simplifies marshalling strings in and out of managed code. BSTR does the same thing for the same reasons.

Peter Wone
I realize that in C/C++ the NUL character (0x00) marks the end of a string. However, in .NET, strings are much more complex objects. Is there any reason why a string could not contain a NUL character, other than specific implementation details?
Kibbee
@Kibbee interop maybe?
Camilo Martin
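
On the question of whether a .NET string can contain a NUL character, a quick sketch (with a made-up `NulInStringDemo` class) suggests it can: the string keeps its full length and the characters after the NUL are still there.

```csharp
public static class NulInStringDemo
{
    // Builds a five-character string with a NUL (0x00) in the middle.
    public static string Build()
    {
        return "ab\0cd";
    }

    // Number of characters in the string, counting the embedded NUL.
    public static int LengthWithNul()
    {
        return Build().Length;
    }
}
```

Here `LengthWithNul()` returns 5, and `Build()[4]` is still `'d'`. Truncation at NUL is something that can happen once the string is handed to unmanaged code that expects C-style strings.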