views: 202
answers: 6

Why is a bit called a bit? Why are 8 bits called a byte? What made people call 16 bits a word, and so on? Where and why did these aliases come about?

I would love for other people to include things like basic ASM types, then branch out to C/C++ and move on to data types in SQL and the like. (A quick sizeof check in C follows the list below.)

  1. 1-Bit
    1. Bit - Binary digit.
    2. Bool - Named after George Boole, the inventor of Boolean logic.
  2. 4-Bits
    1. Nibble - Half the size of a byte (half a "bite").
  3. 8-Bits
    1. Byte - Coined from "bite" but respelled to avoid accidental mutation to "bit".
    2. Char - Short for "character".
    3. Octet - A grouping of eight bits, from the Latin "octo", meaning "eight".
  4. 16-Bits
    1. Word (unsigned integer)
    2. short (signed integer)
  5. 32-Bits
    1. Double Word
    2. int (signed integer)
    3. unsigned (unsigned integer)
    4. float (4-byte floating-point number)
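
For anyone who wants to check the sizes in the list above, here is a minimal C sketch; the numbers in the comments assume a typical 32/64-bit desktop compiler and may differ elsewhere:

```c
/* Quick check of the sizes listed above.  The exact values are
 * platform-dependent; the comments assume a common desktop target. */
#include <stdio.h>

int main(void)
{
    printf("char   : %zu byte(s)\n", sizeof(char));   /* always 1 */
    printf("short  : %zu byte(s)\n", sizeof(short));  /* usually 2 (16 bits) */
    printf("int    : %zu byte(s)\n", sizeof(int));    /* usually 4 (32 bits) */
    printf("float  : %zu byte(s)\n", sizeof(float));  /* usually 4 (32 bits) */
    printf("double : %zu byte(s)\n", sizeof(double)); /* usually 8 (64 bits) */
    return 0;
}
```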
+2  A: 
  • A bit is a binary digit.
  • A float should be clear (floating-point semantics).

The rest I could only guess at.

Space_C0wb0y
So it's called a `bit` because it's a `b`inary dig`it`?
Mark Tomlin
@Mark: That's right. Check the link in @Aaron's answer.
Space_C0wb0y
I always thought it was `bi`nary digi`t` ;)
jalf
+13  A: 

Wikipedia is your friend:

  • bit
  • nibble
  • byte
  • "char" is just short for "character"
  • "short" is an alias for "short int"
  • word "is the native or most efficient size the CPU can handle" (thanks to Tony for pointing that out).
  • "int" is short for "integer". The size is undefined (can be 16, 32 or 64 bits).
  • "float" is short for "floating point number"
  • "double" is short for "double precision floating point number"
Aaron Digulla
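
On the point that the size of int is implementation-defined, here is a minimal C sketch using limits.h; the standard guarantees only the minimum ranges, and the values reported depend on the compiler and target:

```c
#include <limits.h>
#include <stdio.h>

int main(void)
{
    /* The C standard only guarantees minimum ranges: int must cover at
     * least -32767..32767 (16 bits); most desktop compilers make it 32
     * bits.  limits.h reports what this particular implementation uses. */
    printf("SHRT_MAX = %d\n",  SHRT_MAX);
    printf("INT_MAX  = %d\n",  INT_MAX);
    printf("LONG_MAX = %ld\n", LONG_MAX);
    return 0;
}
```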
I think this is proof that you can ask a stupid question and get an intelligent answer on here. Thank you @Aaron.
Mark Tomlin
I understand "word" to be the native or most efficient size the CPU can handle - an untyped amount of memory. This may be compared and contrasted to int, which though numeric was also initially meant to be sized based on maximum utility (range) without compromising efficiency, but given CPUs sometimes have a couple equally efficient sizes, and certain restrictions and expectations have been put on int (esp. given long, long long), the two may actually differ in size. not necessarily the smallest. Anything to support your assertion? I can cite http://en.wikipedia.org/wiki/Word_(computing)
Tony
@Tony, agreed. I'd like to see a source for this as well. Quoting Donald Knuth would work, if he has written about it. I understand that word is used to denote the smallest addressable unit of data by a CPU, but why is it called a word then? Is it because humans communicate in words, and letters (bits) make up those words? I'm looking for the foundation of these names.
Mark Tomlin
@Mark: "agreed... word...denote[s] the smallest" <-- that's exactly what I said it wasn't! :-). I can only guess reasons: a binary-digit (bit) is rarely meaningful independent of other things around it (unless it's a boolean, but even then you often have to extract, mask and/or shift the _right_ bit into a convenient word-sized register). But words are often numerically meaningful at the problem-domain level: big enough for the range of numbers in most common use - discrete at a meaning level - like a human-language _word_, vs. a syllable that can't be interpreted independently.
Tony
@Mark: a word isn't the smallest addressable unit. (That's usually a byte.) If you want a counterexample, the x86 CPU you're using right now defines a word to be 16 bits wide, but the smallest addressable unit is still an 8-bit byte (or an octet).
jalf
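
To make the byte-versus-word distinction in these comments concrete, here is a minimal C sketch; it assumes a target where uint16_t matches what x86 calls a word (16 bits):

```c
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint8_t  bytes[4];  /* smallest addressable units */
    uint16_t words[4];  /* "word" in the x86 sense: 16 bits */

    /* Addresses are counted in bytes, whatever the machine's word size:
     * consecutive bytes are 1 address apart, consecutive 16-bit words
     * are 2 apart. */
    printf("byte stride: %td\n", (char *)&bytes[1] - (char *)&bytes[0]); /* 1 */
    printf("word stride: %td\n", (char *)&words[1] - (char *)&words[0]); /* 2 */
    return 0;
}
```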
@Mark Tomlin: There are no stupid questions, only stupid answers. If a toddler asks "What is 1+1?" is that stupid? :-)
Aaron Digulla
@Tony: Thanks, fixed.
Aaron Digulla
+3  A: 

One that Aaron forgot was Bool: this goes back to the logician George Boole, who is credited with the invention of "Boolean" logic.

jdv
http://www.suite101.com/content/george-boole-biography-a212610
Mark Tomlin
Here is the link: http://en.wikipedia.org/wiki/Boolean
Péter Török
+1  A: 

I always thought 8 bits was called an octet; you live and learn. ;)

tiredcoder
Isn't it telecom people that call bytes octets?
jdv
@jdv, would octet not be the most correct name for it? And that brings us around to IP addresses, which use 4 bytes (actually called octets in this case) to make up an IP address.
Mark Tomlin
@jdv: the french seem to be doing it a lot.
snemarch
@snemarch, so it's a cultural thing? Does that mean the inventor of the IP address was French?
Mark Tomlin
@Mark: probably also differs wrt. field of work - and my comment on the frenchies is based entirely on empirical observation of forums, mailing-lists and various pieces of software :)
snemarch
An octet is an unambiguous term for an ordered set of 8 bits. A byte is the smallest addressable unit on a computer, which nowadays is almost always also 8 bits. In network protocols, octet is used in order to be completely unambiguous, because you are by definition not talking about a specific computer architecture. For instance, if you defined an IP address as 4 bytes instead of 4 octets, a PDP-10 systems programmer would write you a network stack with 36-bit IP addresses.
JeremyP
@JeremyP, great observation.
Mark Tomlin
Yeah, octet is the most "proper" name for an 8-bit unit. A byte is *usually* the same size, so most people conflate the two, but if you want to be specific, a byte is the smallest addressable unit (which is *usually* 8 bits on modern hardware), and an octet is a group of 8 bits. So in the end the size is usually the same, but they're derived differently.
jalf
Keep in mind: in C (the standard), a byte is the char data type. However, in C, a byte need not be 8 bits; it's CHAR_BIT bits.
nos
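
Here is a minimal C sketch of the byte-versus-octet distinction discussed in these comments; it relies only on limits.h and stdint.h, and the IPv4 example is purely illustrative:

```c
#include <limits.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    /* A byte is CHAR_BIT bits (at least 8); an octet is exactly 8 bits.
     * On mainstream hardware the two coincide. */
    printf("CHAR_BIT = %d\n", CHAR_BIT);

    /* An IPv4 address is defined as 4 octets; the exact-width uint8_t is
     * the usual way to insist on exactly 8 bits in portable code. */
    uint8_t ip[4] = { 192, 168, 0, 1 };
    printf("%d.%d.%d.%d\n", ip[0], ip[1], ip[2], ip[3]);
    return 0;
}
```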
A: 

You might as well ask: why is a meter called "m"? Why is 1 km represented as 1000 m?

Tough question... Think of it in a simple way. Don't get yourself tense about it.

wengseng
Why is the scientific symbol for the meter `m`? Because m is the first letter of meter, and m was not reserved by any other unit of measure. Kilo (symbol 'k', lowercase) is a unit prefix in the metric system denoting one thousand.
Mark Tomlin
A: 

Your convention of short/int/long/word/dword for signed types is not just an x86-ism; it's a Windows-ism (SHORT/LONG/WORD/DWORD). I don't see why Windows programmers like them so much when the standard (u)intN_t types are clearer to pretty much everyone (a short sketch follows below).

I don't think x86 naturally comes with "word" and "double word"; registers are al, ah (8-bit), ax (16-bit), eax (32-bit). I forget how you specify the size of a memory-to-memory move, though.

M68K instructions have .b (byte), .w (word), and .l (long) suffixes. No double/quad-word IIRC.

ARM has ldrb (byte), ldrh (halfword), and ldr (word, i.e. "load register").

PPC has byte, halfword, word, and doubleword IIRC.

In general, it's pretty meaningless to talk about "word size", since it's highly architecture-dependent, and even then it tends to change (I doubt that modern x86 implements 16-bit arithmetic any faster than 32-bit arithmetic).

Then there's also the "pointer size" definition, but amd64 only has 48-bit virtual addresses (the top 17 bits are supposed to be all 1 or all 0).

tc.
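
As a rough sketch of the point about the standard (u)intN_t names: the WORD_LIKE and DWORD_LIKE aliases below are hypothetical, written only to line up with the Win32-style WORD/DWORD naming, and _Static_assert is C11:

```c
#include <stdint.h>

/* Hypothetical aliases, shown only for comparison with the Win32-style
 * naming; real code would just use the stdint.h names directly. */
typedef uint16_t WORD_LIKE;   /* Win32 calls this WORD  */
typedef uint32_t DWORD_LIKE;  /* Win32 calls this DWORD */

/* Wherever the exact-width types exist, their sizes are fixed by name,
 * with no architecture-specific "word" terminology involved. */
_Static_assert(sizeof(uint8_t)  == 1, "uint8_t  is 1 byte");
_Static_assert(sizeof(uint16_t) == 2, "uint16_t is 2 bytes");
_Static_assert(sizeof(uint32_t) == 4, "uint32_t is 4 bytes");
_Static_assert(sizeof(uint64_t) == 8, "uint64_t is 8 bytes");

int main(void) { return 0; }
```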
I don't know why someone downvoted you, because I found this quite insightful. So +1 to you.
Mark Tomlin