I used to think that each memory location contains 8, 16, 32 or 64 bits, so 0101 would be stored on an 8-bit machine as 00000101 (sign-extended if it were negative). This was all fine and dandy until, out of curiosity, I wrote a Java program to learn more about the inner workings of this system.
The method in question looks like this:
import java.io.File;
import java.io.FileInputStream;

public void printBinaryRep(File f) {
    // try-with-resources closes the stream even if reading fails
    try (FileInputStream inputStream = new FileInputStream(f)) {
        int next;
        // read() returns the next byte as an int in 0..255, or -1 at end of file
        while ((next = inputStream.read()) != -1) {
            System.out.println((char) next + " : " + Integer.toBinaryString(next));
        }
    } catch (Exception e) {
        System.out.println(e);
    }
}
Running it on a file that contains Hello World gave this output:
H : 1001000
e : 1100101
l : 1101100
l : 1101100
o : 1101111
: 100000
W : 1010111
o : 1101111
r : 1110010
l : 1101100
d : 1100100
All of it looks fine except for the space. It has 6 bits instead of 8. I'm now wondering how all of that information is stored in memory. If all of it were stored in 8-bit chunks, like
Hello: 10010001100101110110011011001101111
then you could simply look at each 8-bit chunk and figure out which number it represents (and then which ASCII character that number refers to). How does it work when a differently sized character (like the 6-bit space or the 4-bit \n) is stored along with them? And wouldn't storing a small number in a large bit space waste a lot of bits?
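A minimal sketch of the fixed-width model described above, assuming every character is an ASCII byte that occupies exactly 8 bits with its leading zeros kept. The class and method names (FixedWidthBits, toEightBits, encode, decode) are made up for illustration and are not part of the program above.

```java
import java.nio.charset.StandardCharsets;

public class FixedWidthBits {
    // Render the low 8 bits of a value as exactly 8 binary digits.
    // Integer.toBinaryString omits leading zeros, so they are padded back in here.
    public static String toEightBits(int value) {
        String bits = Integer.toBinaryString(value & 0xFF);
        return String.format("%8s", bits).replace(' ', '0');
    }

    // Concatenate the 8-digit form of every byte in the text (ASCII assumed).
    public static String encode(String text) {
        StringBuilder out = new StringBuilder();
        for (byte b : text.getBytes(StandardCharsets.US_ASCII)) {
            out.append(toEightBits(b));
        }
        return out.toString();
    }

    // Walk the bit string in 8-digit chunks and turn each chunk back into a character.
    public static String decode(String bits) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i + 8 <= bits.length(); i += 8) {
            out.append((char) Integer.parseInt(bits.substring(i, i + 8), 2));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toEightBits(' '));   // 00100000
        System.out.println(toEightBits('\n'));  // 00001010
        System.out.println(toEightBits(-5));    // 11111011 (two's complement in 8 bits)
        String bits = encode("Hello");
        System.out.println(bits);               // 0100100001100101011011000110110001101111
        System.out.println(decode(bits));       // Hello
    }
}
```

With this padding, the space and the newline take up the same 8 digits as every other character, so the chunk-by-chunk decoding never has to guess where one character ends.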
I think I have some of the fundamentals wrong (or maybe the program is wrong somewhere...). Sorry if the question sounds strange or unnecessarily in-depth. I just want to know. I've done some googling, but it didn't turn up anything relevant. If you can let me know where I've gone wrong or point me in the right direction, I'd greatly appreciate it. Thanks!