As well as CHAR (CHARACTER) and VARCHAR (CHARACTER VARYING), SQL offers an NCHAR (NATIONAL CHARACTER) and NVARCHAR (NATIONAL CHARACTER VARYING) type. In some databases, this is the better datatype to use for character (non-binary) strings:
In SQL Server,
NCHARis stored as UTF-16LE and is the only way to reliably store non-ASCII characters,CHARbeing a single-byte codepage only;in Oracle,
NVARCHARmay be stored as UTF-16 or UTF-8 rather than a single-byte collation;but in MySQL,
NVARCHARisVARCHAR, so it makes no difference, either type can be stored with UTF-8 or any other collation.
So, what does NATIONAL actually conceptually mean, if anything? The vendors' docs only tell you about what character sets their own DBMSs use, rather than the actual rationale. Meanwhile the SQL92 standard explains the feature even less helpfully, stating only that NATIONAL CHARACTER is stored in an implementation-defined character set. As opposed to a mere CHARACTER, which is stored in an implementation-defined character set. Which might be a different implementation-defined character set. Or not.
Thanks, ANSI. Thansi.
Should one use NVARCHAR for all character (non-binary) storage purposes? Are there currently-popular DBMSs in which it will do something undesirable, or which just don't recognise the keyword (or N'' literals)?