Why do SQL databases use UTF-8 encoding? Do they both use 8 bits to store a character?

+1  A: 

For "normal" (ASCII) characters, only 8 bits are used. For characters that do not fit in 8 bits, more bits can be used, in whole bytes. This makes UTF-8 a variable-length encoding.
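To make the variable-length point concrete, here is a minimal Python sketch (the sample characters are my own picks, not from the question) that counts how many bytes UTF-8 needs for each one:

```python
# Count the UTF-8 bytes needed for a few sample characters.
# The samples are arbitrary illustrations: A, e-acute, euro sign, an emoji.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

byte_lengths = {ch: len(ch.encode("utf-8")) for ch in samples}

for ch, n in byte_lengths.items():
    print(f"U+{ord(ch):04X} -> {n} byte(s)")
```

The ASCII letter takes one byte, and the others take two, three, and four bytes respectively.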

Wikipedia has a good article on UTF-8.

ASCII defines only 128 characters, so it needs only 7 bits, but it is normally stored with 8 bits per character. RS232 (old serial communication) can be used with bytes of 7 bits.
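A quick way to check the 7-bit claim, sketched in Python with an arbitrary sample string of my own: every ASCII byte is below 128, and the same text encoded as UTF-8 gives identical bytes.

```python
# Sample text is an arbitrary illustration containing only ASCII characters.
text = "Hello, SQL!"

ascii_bytes = text.encode("ascii")
utf8_bytes = text.encode("utf-8")

# Every ASCII byte fits in 7 bits (values 0..127).
fits_in_7_bits = all(b < 128 for b in ascii_bytes)

# For pure ASCII text, the UTF-8 encoding is byte-for-byte the same.
same_bytes = ascii_bytes == utf8_bytes

# A character outside ASCII cannot be encoded as ASCII at all.
try:
    "\u00e9".encode("ascii")
    non_ascii_rejected = False
except UnicodeEncodeError:
    non_ascii_rejected = True
```

This is why existing ASCII data is already valid UTF-8.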

GvS
The amount of space used depends on the character code. For some characters, it'll use up to 4 bytes for a single char. The ones that use that much space aren't in the BMP, though.
cHao
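To illustrate the comment above, here is a rough sketch of how the UTF-8 byte count follows from the code point value. The function name `utf8_length` is made up for this example; the ranges are the standard UTF-8 boundaries, and code points above U+FFFF (outside the BMP) are the ones that need 4 bytes.

```python
def utf8_length(code_point: int) -> int:
    """Return how many bytes UTF-8 uses for a code point (hypothetical helper)."""
    if code_point < 0x80:       # ASCII range
        return 1
    if code_point < 0x800:
        return 2
    if code_point < 0x10000:    # rest of the BMP
        return 3
    return 4                    # outside the BMP, up to U+10FFFF


# Compare against Python's own encoder at the range boundaries.
# Surrogates (U+D800..U+DFFF) are skipped because they cannot be encoded.
boundaries = [0x41, 0x7F, 0x80, 0x7FF, 0x800, 0xFFFF, 0x10000, 0x10FFFF]
matches = [utf8_length(cp) == len(chr(cp).encode("utf-8")) for cp in boundaries]
```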
A: 

ASCII can represent only a limited set of characters (128 of them). It isn't very useful for representing any language that isn't based on the Latin character set. UTF-8, however, is an encoding standard for UCS-4 (Unicode) and can represent almost any language. It does this by chaining multiple bytes together to represent one character (or code point, to be more correct).
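As a sketch of what "chaining bytes together" looks like, the Python below prints the bit patterns for the euro sign (U+20AC, my own choice of example) and reassembles the code point by hand. In a three-byte sequence the first byte starts with `1110` and each following byte starts with `10`; the remaining bits carry the code point.

```python
# Encode the euro sign (U+20AC) and look at the individual bytes.
encoded = "\u20ac".encode("utf-8")
bits = [format(b, "08b") for b in encoded]
print(bits)

# Reassemble the code point manually from the payload bits:
# 4 bits from the lead byte, 6 bits from each continuation byte.
b0, b1, b2 = encoded
code_point = ((b0 & 0x0F) << 12) | ((b1 & 0x3F) << 6) | (b2 & 0x3F)
print(f"U+{code_point:04X}")
```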

Torlack
Isn't what you are describing UTF-16?
Microkernel
What do you mean? UTF-16 is just another encoding standard. UTF-7, 8, 16, and 32 are all encoding standards that encode UCS-4 (this wasn't always the case, and I wouldn't be shocked if there are still some exceptions).
Torlack
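To show that these are just different encodings of the same code points, here is a small Python sketch that encodes two sample characters (my own picks) three ways. The `-le` codec variants are used so that no byte order mark is added to the counts.

```python
# Two sample characters: one inside the BMP, one outside it.
euro = "\u20ac"        # EURO SIGN, inside the BMP
emoji = "\U0001F600"   # an emoji, outside the BMP

encodings = ("utf-8", "utf-16-le", "utf-32-le")

euro_sizes = {enc: len(euro.encode(enc)) for enc in encodings}
emoji_sizes = {enc: len(emoji.encode(enc)) for enc in encodings}

# Decoding any of the encodings gives back the same character.
round_trips = all(euro.encode(enc).decode(enc) == euro for enc in encodings)
```

The euro sign takes 3 bytes in UTF-8, 2 in UTF-16, and 4 in UTF-32, while the emoji takes 4 bytes in all three.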
+1  A: 

UTF-8 is used to support a large range of characters. In UTF-8, up to 4 bytes can be used to represent a single character.

Joel has written an article on this subject that you may want to refer to:

The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)

Bo Tian