What is the difference between plaintext and binary data?

Many languages have functions which only process "plaintext", not binary. Does this mean that only characters within the ASCII range will be allowed?

Binary is just a series of bytes. Isn't it similar to plaintext, which is just a series of bytes interpreted as characters? So, can plaintext store the same data formats / protocols as binary?

+3 A:

Plain text is human readable; a binary file is usually unreadable by a human, since it's composed of both printable and non-printable characters.

Try to open a JPEG file with a text editor (e.g. Notepad or vim) and you'll understand what I mean.

A binary file is usually laid out in a way that optimizes speed, since little or no parsing is needed. A plain text file is editable by hand; a binary file is not.
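
The same experiment can be sketched in Python without a text editor. This is only an illustration: the four bytes are the usual start of a JPEG/JFIF file, and the variable names are made up for the example.

```python
# First four bytes of a typical JPEG (JFIF) file.
jpeg_header = bytes([0xFF, 0xD8, 0xFF, 0xE0])

try:
    jpeg_header.decode("utf-8")
    decodable = True
except UnicodeDecodeError:
    # The byte 0xFF never appears in well-formed UTF-8,
    # so these bytes cannot be read as UTF-8 text.
    decodable = False

# The same bytes shown the way a hex viewer would show them.
hex_view = " ".join(f"{b:02x}" for b in jpeg_header)

print(decodable, hex_view)
```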

klez 2009-09-16 19:06:56

Jon Skeet can read binary files.

Rob Hruska 2009-09-16 19:07:17

I hope this is the dawn of Chuck Norris-style Jon Skeet jokes.

JMP 2009-09-16 19:09:51

Chuck Norris can read binary files right off the platter.

fbrereto 2009-09-16 19:10:12

Good answer, but I was referring to the programming context. Can functions that only accept plaintext store the same data formats / protocols as functions that accept binary?

Jenko 2009-09-16 19:10:33

The problem is that binary files aren't organized into lines (a newline byte inside one is just data), so it's difficult, but not impossible.

klez 2009-09-16 19:11:44

@presleyster and fbrereto: see http://meta.stackoverflow.com/questions/9134/jon-skeet-facts

T.E.D. 2009-09-16 19:16:29

+2 A:

"Plaintext" can have several meanings.

The one most useful in this context is that it is merely a binary file organized in byte sequences that a particular computer system can translate into a finite set of what it considers "text" characters.

A second meaning, somewhat connected, is a restriction that said system should display these "text characters" as symbols readable by a human as members of a recognizable alphabet. Often, the unwritten implication is that the translation mechanism is ASCII.

A third, even more restrictive meaning is that this system must be a "simple" text editor/viewer, usually implying ASCII encoding. But, really, there is VERY little difference between you, the human, reading text encoded in some funky format and displayed by a proprietary program, vs. the vi text editor reading an ASCII-encoded file.

Within a programming context, your programming environment (comprising OS + system APIs + your language's capabilities) defines both a set of "text" characters and a set of encodings it is able to read and convert to these "text" characters. Please note that this does not necessarily imply ASCII, English, or 8 bits. As an example, Perl can natively read and use the full Unicode set of "characters".
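
As a rough illustration of "a set of characters plus a set of encodings", here is a small Python sketch. Python is used only for illustration (the answer mentions Perl), and the sample word is arbitrary.

```python
# One piece of text; "\u00ef" is the letter i with a diaeresis.
text = "na\u00efve"

# Two encodings map the same characters to different byte sequences.
utf8_bytes = text.encode("utf-8")      # the accented letter takes two bytes
latin1_bytes = text.encode("latin-1")  # the accented letter takes one byte

# Reading bytes with the wrong encoding still "works" here,
# but yields different characters than the writer intended.
misread = utf8_bytes.decode("latin-1")

print(len(utf8_bytes), len(latin1_bytes), misread == text)
```

The bytes are the same in both readings; only the table used to turn them into characters differs.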

To answer your specific question, you can definitely use "character" strings to transmit arbitrary byte sequences, with the caveat that the string termination conventions of the environment must be respected. The problem is that the functions that already exist to "process character data" would probably not have any useful functionality to deal with your binary data.
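
One common way to push arbitrary bytes through a text-only channel is Base64. The answer above does not name it, so treat this Python sketch as one possible approach rather than the method it had in mind; the variable names are illustrative.

```python
import base64

# Every possible byte value, including 0, 10 and 13,
# which text-oriented functions often treat specially.
payload = bytes(range(256))

# Encode to a string that contains only printable ASCII characters.
as_text = base64.b64encode(payload).decode("ascii")

# The receiver reverses the step and gets the original bytes back.
restored = base64.b64decode(as_text)

print(restored == payload, len(payload), len(as_text))
```

The cost is size: the encoded text is roughly a third larger than the raw bytes.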

DVK 2009-09-16 19:11:13

+1 A:

Generally, it depends on the language/environment/functionality.

Binary data is always that: binary. It is transferred without modification.

"Plain text" mode may mean one or more of the following things:

- The stream of bytes is split into lines. The line delimiters are \r, \n, \r\n, or \n\r. Sometimes this is OS-dependent (*nix likes \n, while Windows likes \r\n). The line ending may be adjusted for the reading application.
- Character encoding may be adjusted. The environment might detect and/or convert the source encoding into the encoding the application expects.
- Probably some other conversions should be added to this list, but I can't think of any more at this moment.
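
The line-ending adjustment can be seen in a short Python sketch. It assumes Python's built-in open(); the temporary file name is made up, and the \r\n convention is requested explicitly so the result does not depend on the OS.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# Text mode: every "\n" written is translated to "\r\n" on disk.
with open(path, "w", encoding="ascii", newline="\r\n") as f:
    f.write("one\ntwo\n")

# Binary mode: the bytes come back exactly as stored.
with open(path, "rb") as f:
    raw = f.read()

# Text mode with default newline handling: "\r\n" is translated back to "\n".
with open(path, "r", encoding="ascii") as f:
    cooked = f.read()

print(raw, repr(cooked))
```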

Rom 2009-09-16 19:12:31

+2 A:

One thing it often means is that the language might feel free to interpret certain control characters, such as the values 10 or 13, as logical line terminators. In other words, an output operation might automagically append these characters at the end, and an input operation might strip them from the input (and/or terminate reading there).

In contrast, language I/O operations that advertise working on "binary" data will usually include an input parameter for the length of data to operate on, since there is no other way (short of reading past end of file) to know when it is done.
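
A small Python sketch of the two styles, using an in-memory stream so that nothing depends on a real file. The sample bytes are arbitrary.

```python
import io

# Six bytes of "binary" data that happen to contain the value 10 ("\n").
data = b"\x01\x02\n\x03\x04\x05"

# Line-oriented read: stops right after the byte with value 10.
first_line = io.BytesIO(data).readline()

# Length-based read: the caller says how many bytes to take.
stream = io.BytesIO(data)
chunk = stream.read(4)
rest = stream.read()  # no length given: read until end of stream

print(first_line, chunk, rest)
```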

T.E.D. 2009-09-16 19:12:59

Suppose the function I'm supplying plaintext to takes it as a string. Can it not measure the length before transmission, instead of relying on control chars?

Jenko 2009-09-16 19:18:03

That depends on the language. In Ada, certainly. In C, the only way to do that is to look for the string terminator (NUL, ASCII 0). That means you are unable to output that value into a file using "ASCII" I/O routines, but can using the length-based "binary" routines.
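
The C behaviour can be imitated from Python with the ctypes module, which exposes both a NUL-terminated view and a raw view of the same buffer. This is only a sketch of the idea, not C code.

```python
import ctypes

# Five bytes with a zero byte in the middle.
data = b"ab\x00cd"
buf = ctypes.create_string_buffer(data)

# C-string view: everything from the first zero byte on is invisible.
c_string_view = buf.value

# Length-based view: take exactly len(data) bytes, zero byte included.
length_based_view = buf.raw[:len(data)]

print(c_string_view, length_based_view)
```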

T.E.D. 2009-09-16 19:21:34

Sure, and it might add a control character (such as \r\n), or even do character set conversions on that string. If the data is treated as binary, nothing would be added or altered.

nos 2009-09-16 19:21:59

Technically, nothing: plain text is a form of binary data. However, a major difference is how values are stored. Think of how an integer might be stored. In binary data it would typically use a two's complement format, probably taking 32 bits of space. In text format a number would instead be stored as a series of Unicode digit characters. So the number 50 would be stored as 0x32 (padded to take up 32 bits) in binary, but as the characters '5' '0' (bytes 0x35 0x30 in ASCII) in plain text.
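
The two representations of 50 can be compared with a short Python sketch. It assumes a 32-bit little-endian layout for the binary form; byte order is a choice the paragraph above does not specify.

```python
import struct

n = 50

# Binary form: a 32-bit signed integer, little-endian ("<i").
as_binary = struct.pack("<i", n)

# Text form: the decimal digits '5' and '0' as ASCII characters.
as_text = str(n).encode("ascii")

# Both forms decode back to the same number.
from_binary = struct.unpack("<i", as_binary)[0]
from_text = int(as_text)

# Two's complement shows up for negative values: -1 is all bits set.
minus_one = struct.pack("<i", -1)

print(as_binary.hex(), as_text.hex(), from_binary, from_text, minus_one.hex())
```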

G_Morgan 2009-09-16 19:59:53
