Apart from the basic character set you mentioned, all of the other character sets are implementation-defined. That means they could be anything, but the implementation (that is, the C compiler, libraries, and toolchain) must document those decisions. The key paragraphs of the standard are:
§3.4.1 implementation-defined behavior
unspecified behavior where each implementation documents how the choice is made
§3.4.2 locale-specific behavior
behavior that depends on local conventions of nationality, culture, and language that each implementation documents
§5.2.1 Character sets, paragraph 1
Two sets of characters and their associated collating sequences shall be defined: the set in which source files are written (the source character set), and the set interpreted in the execution environment (the execution character set). Each set is further divided into a basic character set, whose contents are given by this subclause, and a set of zero or more locale-specific members (which are not members of the basic character set) called extended characters. The combined set is also called the extended character set. The values of the members of the execution character set are implementation-defined.
So, look at your C compiler's documentation to find out what the other character sets are. For example, in my man page for gcc, some of the command line options state:
-fexec-charset=charset
Set the execution character set, used for string and character
constants. The default is UTF-8. charset can be any encoding
supported by the system's "iconv" library routine.
-fwide-exec-charset=charset
Set the wide execution character set, used for wide string and
character constants. The default is UTF-32 or UTF-16, whichever
corresponds to the width of "wchar_t". As with -fexec-charset,
charset can be any encoding supported by the system's "iconv"
library routine; however, you will have problems with encodings
that do not fit exactly in "wchar_t".
-finput-charset=charset
Set the input character set, used for translation from the
character set of the input file to the source character set used by
GCC. If the locale does not specify, or GCC cannot get this
information from the locale, the default is UTF-8. This can be
overridden by either the locale or this command line option.
Currently the command line option takes precedence if there's a
conflict. charset can be any encoding supported by the system's
"iconv" library routine.
To get a list of the encodings supported by iconv, run iconv -l. My system has 143 different encodings to choose from.