Is it possible to get strings, ints, etc. in binary format? For example, assume I have the string

"Hello" and I want to store it in binary format, so assume "Hello" is

11110000110011001111111100000000 in binary (I know it is not; I just typed something quickly).

Can I store the above binary not as a string, but in the actual format with the bits?

In addition to this, is it actually possible to store fewer than 8 bits? What I am getting at is: if the letter A is the most frequent letter in a text, can I use 1 bit to store it for compression purposes, instead of building a binary tree?

+2  A: 

What encoding would you be assuming?

Gurdas Nijor
this should really be a comment
jmein
Agreed, don't have the rep for that however ;)
Gurdas Nijor
There, now you do. :P
James Jones
Huffman encoding is the assumption.
Xaisoft
+1  A: 

You can use things like:

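BitConverter.GetBytes(1);            // int -> 4 bytes
Encoding.ASCII.GetBytes("text");     // string -> 1 byte per character (Encoding is in System.Text)
Encoding.Unicode.GetBytes("text");   // string -> UTF-16, 2 bytes per character for this text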

Once you have the bytes, you can do all the bit twiddling you want. You would need an algorithm of some sort before we can give you much more useful information.
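As a rough illustration of that bit twiddling, here is a minimal sketch. The class name `ByteDemo` and its two helpers are hypothetical names made up for this example; only `Convert.ToString(byte, int)` and the LINQ calls are real .NET APIs.

```csharp
using System;
using System.Linq;

public static class ByteDemo
{
    // Render each byte as eight '0'/'1' characters, most significant bit first.
    public static string ToBitString(byte[] bytes) =>
        string.Join(" ", bytes.Select(b => Convert.ToString(b, 2).PadLeft(8, '0')));

    // Test a single bit; bit 0 is the least significant bit of the byte.
    public static bool IsBitSet(byte value, int bit) => (value & (1 << bit)) != 0;
}
```

For instance, `ByteDemo.ToBitString(Encoding.ASCII.GetBytes("Hello"))` yields `01001000 01100101 01101100 01101100 01101111`. The byte order returned by `BitConverter.GetBytes` depends on the machine's endianness, so its output is not shown here.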

John Fisher
+2  A: 

What I am getting at is: if the letter A is the most frequent letter in a text, can I use 1 bit to store it for compression purposes, instead of building a binary tree?

The algorithm you're describing is known as Huffman coding. To relate to your example, if 'A' appears frequently in the data, the algorithm might represent 'A' as simply 1. If 'B' also appears frequently (but less frequently than 'A'), the algorithm would typically represent 'B' as 01. The rest of the characters would then get codes of the form 00xxxxx, and so on.

In essence, the algorithm performs statistical analysis on the data and generates a prefix code (no code is the start of another) that gives the most compression achievable when every character gets a whole number of bits.
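To make this concrete, here is a minimal sketch of how such a code table could be derived. It is not a production implementation: `HuffmanSketch`, `BuildCodes`, and `Node` are hypothetical names, the codes are kept as strings of '0' and '1' for readability, and the list is simply re-sorted on every pass instead of using a priority queue. It does build a small tree internally, but once the table exists the tree is no longer needed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class HuffmanSketch
{
    private sealed class Node
    {
        public char? Symbol;   // set only on leaves
        public int Weight;     // frequency of the symbol(s) below this node
        public Node Left, Right;
    }

    public static Dictionary<char, string> BuildCodes(string text)
    {
        // One leaf per distinct character, weighted by how often it occurs.
        var nodes = text.GroupBy(c => c)
                        .Select(g => new Node { Symbol = g.Key, Weight = g.Count() })
                        .ToList();

        // Repeatedly merge the two lightest nodes until a single root remains.
        while (nodes.Count > 1)
        {
            nodes = nodes.OrderBy(n => n.Weight).ToList();
            var merged = new Node
            {
                Left = nodes[0],
                Right = nodes[1],
                Weight = nodes[0].Weight + nodes[1].Weight
            };
            nodes.RemoveRange(0, 2);
            nodes.Add(merged);
        }

        var codes = new Dictionary<char, string>();
        if (nodes.Count == 1) Assign(nodes[0], "", codes);
        return codes;
    }

    private static void Assign(Node node, string prefix, Dictionary<char, string> codes)
    {
        if (node.Symbol.HasValue)
        {
            // A text with only one distinct character still needs a 1-bit code.
            codes[node.Symbol.Value] = prefix.Length > 0 ? prefix : "0";
            return;
        }
        Assign(node.Left, prefix + "0", codes);
        Assign(node.Right, prefix + "1", codes);
    }
}
```

With the input `"AAAAAABBBC"`, this sketch assigns `A` the code `1`, `B` the code `01`, and `C` the code `00`, which mirrors the pattern described above.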

James Jones
We had to write "Huff" and "Puff" in college, brings back memories.
Joel Coehoorn
So is it actually possible to store this representation without having to build a binary tree?
Xaisoft
Storing the representation does not require building a binary tree. However, the code is generally represented *visually* as a binary tree because it's easier to read that way.
James Jones
A: 

The string is actually stored in binary format, as are all strings.

The difference between a string and another data type is that when your program displays the string, it retrieves the binary and shows the corresponding (ASCII) characters.

If you were to store data in a compressed format, you would need to assign more than 1 bit per character. How else would you identify which character is the most frequent?

If 1 represents an 'A', what does 0 mean? All the other characters?

pavium
+2  A: 

What you are looking for is something like Huffman coding, which represents more common values with shorter bit patterns.

How you store the bit codes is still limited to whole bytes: no data type uses less than a byte. To store variable-width bit values, you pack them end to end in a byte array. That gives you a stream of bit values, but it also means you can only read the stream from start to end. Because the codes vary in width, there is no random access to individual values like you have with the bytes in a byte array.
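Packing codes end to end could look roughly like the sketch below. `BitPacker` and its members are hypothetical names for this example, and the codes are passed in as strings of '0' and '1' purely for readability.

```csharp
using System;
using System.Collections.Generic;

public sealed class BitPacker
{
    private readonly List<byte> _bytes = new List<byte>();
    private int _bitCount; // total number of bits written so far

    public int BitCount => _bitCount;

    // Append one bit, filling each byte from its most significant bit down.
    public void WriteBit(bool bit)
    {
        if (_bitCount % 8 == 0) _bytes.Add(0);   // start a fresh byte when needed
        if (bit) _bytes[_bytes.Count - 1] |= (byte)(0x80 >> (_bitCount % 8));
        _bitCount++;
    }

    // Append a variable-width code given as a string of '0' and '1'.
    public void WriteCode(string code)
    {
        foreach (char c in code) WriteBit(c == '1');
    }

    // The last byte is padded with zero bits if BitCount is not a multiple of 8.
    public byte[] ToArray() => _bytes.ToArray();
}
```

Writing the codes `1`, `01`, `1`, `00` in that order produces the six bits `101100`, which end up in a single byte `0xB0`. A reader needs `BitCount` (or some end marker) to know that the last two bits are padding and not data.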

Guffa
Why the downvote? If you don't say what it is that you dislike, it's rather pointless...
Guffa
+2  A: 

Is it possible to get strings, ints, etc. in binary format?

Yes. There are several different methods for doing so. One common method is to make a MemoryStream out of an array of bytes, and then make a BinaryWriter on top of that memory stream, and then write ints, bools, chars, strings, whatever, to the BinaryWriter. That will fill the array with the bytes that represent the data you wrote. There are other ways to do this too.
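A minimal sketch of that approach follows. `BinaryWriterDemo` and `Serialize` are hypothetical names, and for brevity this version lets the `MemoryStream` grow its own buffer instead of wrapping a pre-allocated byte array.

```csharp
using System.IO;

public static class BinaryWriterDemo
{
    public static byte[] Serialize(int number, bool flag, string text)
    {
        using (var stream = new MemoryStream())
        using (var writer = new BinaryWriter(stream))
        {
            writer.Write(number); // 4 bytes, little-endian
            writer.Write(flag);   // 1 byte: 1 for true, 0 for false
            writer.Write(text);   // length prefix, then the UTF-8 bytes
            writer.Flush();
            return stream.ToArray();
        }
    }
}
```

Calling `BinaryWriterDemo.Serialize(1, true, "Hello")` returns 11 bytes: four for the int, one for the bool, one for the string's length prefix, and five for the characters. A `BinaryReader` over the same bytes can read the values back in the same order.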

Can I store the above binary not as a string, but in the actual format with the bits?

Sure, you can store an array of bytes.

Is it actually possible to store fewer than 8 bits?

No. The smallest unit of storage in C# is a byte. However, there are classes that will let you treat an array of bytes as an array of bits. You should read about the BitArray class.
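As a starting point, here is a small sketch of `BitArray` (from `System.Collections`) moving between bits and bytes. `BitArrayDemo`, `Pack`, and `GetBit` are hypothetical helper names for this example.

```csharp
using System.Collections;

public static class BitArrayDemo
{
    // Pack individual bits into as few bytes as will hold them.
    public static byte[] Pack(bool[] bits)
    {
        var bitArray = new BitArray(bits);
        var bytes = new byte[(bits.Length + 7) / 8];
        bitArray.CopyTo(bytes, 0); // bit 0 becomes the least significant bit of bytes[0]
        return bytes;
    }

    // Read one bit back out of a byte array by its overall bit index.
    public static bool GetBit(byte[] bytes, int index) => new BitArray(bytes)[index];
}
```

For example, `BitArrayDemo.Pack(new[] { true, false, true })` returns a single byte with the value 5, because bits 0 and 2 are set. The storage is still a whole byte; `BitArray` only changes how you address it.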

Eric Lippert
Thanks, I will look into your suggestions.
Xaisoft