Updated: "update 1", "update 2"
A single character in an 8-bit ANSI code page can hold 8 bits. ASCII itself defines only 7 bits (0x00 to 0x7F), and UTF-8 stores only those same 128 code points in a single byte.
But if you want to use ASCII, for example, you shouldn't use the first 32 codes (0x00 to 0x1F, i.e. everything that fits in the lowest 5 bits: 0001 1111 = 0x1F) or the character 0x7F. These are control characters (escape, null, start of text, end of text, ...) that cannot be copied and pasted. So in an 8-bit code page you could store at most 256 - 32 - 1 = 223 different values in one single character, and in plain 7-bit ASCII only 95.
If you use UTF-16, you have 2 bytes = 16 bits per character, minus the control characters, to store your information.
A in UTF-8 Encoding: 0x41 (a single byte; only characters above 0x7F take more than one byte)
A in UTF-16 Encoding: 0x0041 (for other characters the first 2 digits can be higher than 0)
A in ASCII Encoding: 0x41
A in ANSI Encoding: 0x41
See the images at the end of this post!
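To check the byte values listed above yourself, a minimal sketch like the following prints the encoded bytes of "A" as hex. The class name `EncodingDemo` and the helper `Hex` are made up for this illustration. ANSI is left out because, depending on the runtime, code pages such as 1252 may need extra setup before they can be used.

```csharp
using System;
using System.Text;

public static class EncodingDemo
{
    // Hypothetical helper: returns the encoded bytes of text as hex, e.g. "00-41".
    public static string Hex(Encoding encoding, string text)
    {
        return BitConverter.ToString(encoding.GetBytes(text));
    }

    public static void Main()
    {
        Console.WriteLine(Hex(Encoding.ASCII, "A"));            // 41
        Console.WriteLine(Hex(Encoding.UTF8, "A"));             // 41
        Console.WriteLine(Hex(Encoding.BigEndianUnicode, "A")); // 00-41
        Console.WriteLine(Hex(Encoding.Unicode, "A"));          // 41-00 (UTF-16, little-endian byte order)
        Console.WriteLine(Hex(Encoding.UTF8, "\u20AC"));        // E2-82-AC (the euro sign needs 3 bytes)
    }
}
```

Note that `Encoding.Unicode` writes the low byte first, so 0x0041 appears as 41-00 in memory or in a file.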
update 1:
If you do not need to modify the values by hand, i.e. without a tool (a C# tool, a JavaScript-based web page, ...), you can alternatively Base64-encode your information, or zip it first and then Base64-encode it. This solution avoids the problem that you describe in your 2nd update: "here are lots of control characters that I cannot use. How could I manage which characters are ok to use?"
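A rough sketch of the zip+Base64 idea could look like this. It uses `GZipStream` as the compression step, which is my stand-in for "zip" and not necessarily the format you want; the names `SafeText`, `Pack` and `Unpack` are invented for the example.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class SafeText
{
    // Compresses the bytes with gzip, then Base64-encodes the result.
    // The output contains only A-Z, a-z, 0-9, '+', '/' and '='.
    public static string Pack(byte[] data)
    {
        using (var buffer = new MemoryStream())
        {
            using (var gzip = new GZipStream(buffer, CompressionMode.Compress))
            {
                gzip.Write(data, 0, data.Length);
            } // disposing the GZipStream flushes the remaining compressed bytes
            return Convert.ToBase64String(buffer.ToArray());
        }
    }

    // Reverses Pack: Base64-decodes the text, then decompresses it.
    public static byte[] Unpack(string text)
    {
        using (var input = new MemoryStream(Convert.FromBase64String(text)))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            gzip.CopyTo(output);
            return output.ToArray();
        }
    }

    public static void Main()
    {
        byte[] raw = { 0x00, 0x1F, 0x7F, 0xFF };          // includes control codes
        Console.WriteLine(Convert.ToBase64String(raw));   // AB9//w== (plain Base64)
        byte[] roundTrip = Unpack(Pack(raw));
        Console.WriteLine(BitConverter.ToString(roundTrip)); // 00-1F-7F-FF
    }
}
```

For very short inputs the gzip header makes the result longer than plain Base64, so compression probably only pays off for larger payloads.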
If this is not an option, you cannot avoid using some kind of lookup table.
The shortest form of a lookup table is:
var illegalCharCodes = new byte[]{0x00, 0x01, 0x02, ..., 0x1f, 0x7f};
or you code it like this:
//The example is based on ANSI encoding, but in principle it is the same with UTF-16
var value = 0;
if (charcode > 0x7f)
    value = charcode - 0x1f - 1; //-1 because 0x7f is the first illegal char code higher than 0x1f
else
    value = charcode - 0x1f;
value -= 1; //because you need a 0 value
//charcode: 0x20 (' ') -> value: 0
//charcode: 0x21 ('!') -> value: 1
//charcode: 0x22 ('"') -> value: 2
//charcode: 0x7e ('~') -> value: 94
//charcode: 0x80 ('€') -> value: 95
//charcode: 0x81 (unassigned in Windows-1252) -> value: 96
//..
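The offset arithmetic above can also be written as a pair of functions, so that you can convert in both directions. This is only a sketch for an 8-bit code page; the names `AnsiValueCodec`, `ToValue` and `ToCharCode` are made up, and it does not check whether a code above 0x7F is actually assigned in your code page.

```csharp
using System;

public static class AnsiValueCodec
{
    // Maps a usable char code (0x20..0x7E or 0x80..0xFF) to a value 0..222.
    public static int ToValue(int charCode)
    {
        if (charCode < 0x20 || charCode == 0x7F || charCode > 0xFF)
            throw new ArgumentOutOfRangeException(nameof(charCode));
        // Skip the 32 control codes, plus 0x7F for everything above it.
        return charCode > 0x7F ? charCode - 0x21 : charCode - 0x20;
    }

    // Maps a value 0..222 back to its char code.
    public static int ToCharCode(int value)
    {
        if (value < 0 || value > 222)
            throw new ArgumentOutOfRangeException(nameof(value));
        // Values 0..94 are 0x20..0x7E; values 95..222 are 0x80..0xFF.
        return value >= 95 ? value + 0x21 : value + 0x20;
    }

    public static void Main()
    {
        Console.WriteLine(ToValue(0x20));               // 0
        Console.WriteLine(ToValue(0x7E));               // 94
        Console.WriteLine(ToValue(0x80));               // 95
        Console.WriteLine(ToCharCode(222).ToString("X2")); // FF
    }
}
```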
update 2:
For Unicode (UTF-16) you can use this table: http://www.tamasoft.co.jp/en/general-info/unicode.html
You should not use any character that the table shows as a placeholder symbol or as an empty cell.
So you cannot store 50,000 possible values in one UTF-16 character if the text has to survive copy and paste. You need a special encoder, and you must use 2 UTF-16 characters, like:
//charcode: 0x0020 + 0x0020 ('  ') > value: 0
//charcode: 0x0020 + 0x0021 (' !') > value: 1
//charcode: 0x0021 + 0x0041 ('!A') > value: something higher than 40,000, I don't know exactly because I haven't counted the illegal characters in UTF-16 :D
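Such a two-character encoder could be sketched roughly as follows. The filter in `BuildAlphabet` is my own assumption about which characters are "illegal" (control, surrogate, unassigned, private-use, format and line/paragraph separator characters); it is not derived from the table linked above, and the exact alphabet size depends on the Unicode data of your runtime. All names (`PairCodec`, `BuildAlphabet`, `Encode`, `Decode`) are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class PairCodec
{
    // Collects every UTF-16 code unit from 0x0020 upwards that passes the
    // (assumed) safety filter. Index 0 is ' ', index 1 is '!', and so on.
    public static char[] BuildAlphabet()
    {
        var allowed = new List<char>();
        for (int code = 0x20; code <= 0xFFFF; code++)
        {
            char c = (char)code;
            UnicodeCategory cat = char.GetUnicodeCategory(c);
            bool unsafeChar = cat == UnicodeCategory.Control
                || cat == UnicodeCategory.Surrogate
                || cat == UnicodeCategory.OtherNotAssigned
                || cat == UnicodeCategory.PrivateUse
                || cat == UnicodeCategory.Format
                || cat == UnicodeCategory.LineSeparator
                || cat == UnicodeCategory.ParagraphSeparator;
            if (!unsafeChar) allowed.Add(c);
        }
        return allowed.ToArray();
    }

    // Encodes a value as two characters: value = high * N + low,
    // where N is the alphabet size.
    public static string Encode(long value, char[] alphabet)
    {
        long n = alphabet.Length;
        if (value < 0 || value >= n * n)
            throw new ArgumentOutOfRangeException(nameof(value));
        return new string(new[] { alphabet[(int)(value / n)], alphabet[(int)(value % n)] });
    }

    // Decodes two characters back into the value (linear search, fine for a sketch).
    public static long Decode(string pair, char[] alphabet)
    {
        if (pair == null || pair.Length != 2)
            throw new ArgumentException("Expected exactly 2 characters.", nameof(pair));
        long high = Array.IndexOf(alphabet, pair[0]);
        long low = Array.IndexOf(alphabet, pair[1]);
        if (high < 0 || low < 0)
            throw new ArgumentException("Character is not in the alphabet.", nameof(pair));
        return high * alphabet.Length + low;
    }

    public static void Main()
    {
        char[] alphabet = BuildAlphabet();
        Console.WriteLine("[" + Encode(0, alphabet) + "]"); // [  ]
        Console.WriteLine("[" + Encode(1, alphabet) + "]"); // [ !]
        Console.WriteLine(Decode("!A", alphabet) == alphabet.Length + 33); // True
    }
}
```

Even with this filter, some remaining characters (combining marks, for example) may still be changed by applications that normalize text on paste, so you would probably want to narrow the alphabet further and test it against the programs you actually use.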