Is it possible to provide Unicode input to a console app, and read the Unicode char/string via Console.ReadKey()?

I know Unicode works when reading the input via other methods, but unfortunately I need to use the 'interception' feature provided by ReadKey.

Update:

When pasting a Unicode character such as U+03BB (λ) into the console, 3 keys are read.

  1. Alt + NumPad1
  2. Alt + NumPad1
  3. Alt + NumPad8

I have tried to work out whether this is some kind of encoding, but I cannot see a pattern.

A: 

The ConsoleKeyInfo object returned by Console.ReadKey() has a property called KeyChar containing the Unicode char of the pressed key or key combination (if the key or key combination has a Unicode equivalent). So...

char c = Console.ReadKey().KeyChar;

You'll get a '\0' char if the key doesn't have a Unicode equivalent (for example, a function key).

You can use a StringBuilder to concatenate these chars together into a Unicode string if necessary.
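As a minimal sketch of the StringBuilder approach described above: the class and method names (`KeyCollector`, `Collect`, `ConsoleKeys`) are hypothetical, and stopping at Enter is an assumption made for illustration. It only collects whatever `KeyChar` values ReadKey reports, so it does not by itself address the pasted-character problem from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class KeyCollector
{
    // Appends the KeyChar of each key to a string, skipping keys that have
    // no character equivalent ('\0'), and stops at the first Enter key.
    public static string Collect(IEnumerable<ConsoleKeyInfo> keys)
    {
        var sb = new StringBuilder();
        foreach (ConsoleKeyInfo key in keys)
        {
            if (key.Key == ConsoleKey.Enter) break;
            if (key.KeyChar != '\0') sb.Append(key.KeyChar);
        }
        return sb.ToString();
    }

    // Endless sequence of intercepted (non-echoed) keystrokes from the console.
    public static IEnumerable<ConsoleKeyInfo> ConsoleKeys()
    {
        while (true) yield return Console.ReadKey(intercept: true);
    }

    static void Main()
    {
        // Type some keys, then press Enter to print what was collected.
        Console.WriteLine(Collect(ConsoleKeys()));
    }
}
```

Control characters such as backspace arrive as ordinary `KeyChar` values and are appended unchanged; a real routine would probably want to handle them.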

Eric Rosenberger
Yes, you would expect that, but it does not return Unicode; it only works for ASCII.
leppie
+2  A: 

Unfortunately, Console.ReadKey is only able to process keyboard events. Keyboard events can only represent things that can be typed on the keyboard (using the real and "virtual" keys defined in the ConsoleKey enumeration). So when using ReadKey you will only get two things: a raw key code, which corresponds to a key on the keyboard, and the translated character, which is the Unicode character that the raw key code maps to in the console's input code page (and a single-byte code page can map at most 256 characters). You cannot use ReadKey to read any other kind of data, namely characters that cannot be typed directly and/or have no mapping in the input code page.

Moreover, when you paste a Unicode character into the console, the API used by ReadKey attempts to translate the character into a Windows ALT+nnn sequence (i.e., hold down ALT and type the code point number on the keypad). Unfortunately, it translates the character first, using the rules defined for the input code page. So even if you reconstitute the number from the keypad digits, you won't get the actual character that was pasted; you'll get whatever character the code page maps it to.
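To make the point concrete, here is a small sketch that reassembles the keypad digits from the question (NumPad1, NumPad1, NumPad8) into a number. The `AltNumpad` class and its `Decode` method are hypothetical names used only for illustration. The recovered value is 118, whereas the code point of the pasted λ is 0x03BB (955), so the digits alone cannot give back the original character.

```csharp
using System;
using System.Collections.Generic;

static class AltNumpad
{
    // Rebuilds the decimal number typed as ALT+NumPad digits.
    // Returns -1 if any key in the sequence is not a NumPad digit.
    public static int Decode(IEnumerable<ConsoleKey> keys)
    {
        int value = 0;
        foreach (ConsoleKey key in keys)
        {
            if (key < ConsoleKey.NumPad0 || key > ConsoleKey.NumPad9) return -1;
            value = value * 10 + (key - ConsoleKey.NumPad0);
        }
        return value;
    }

    static void Main()
    {
        // The three keys reported in the question after pasting U+03BB.
        var pasted = new[] { ConsoleKey.NumPad1, ConsoleKey.NumPad1, ConsoleKey.NumPad8 };
        Console.WriteLine(Decode(pasted));   // prints 118
        Console.WriteLine(0x03BB);           // prints 955, the actual code point
    }
}
```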

The reason it all works when using Read or ReadLine is that these are stream-based, rather than keyboard-based, methods. Any character whatsoever can come in via the input stream, since no keyboard or code page translation takes place. But ReadKey cannot get at the input stream directly, only the keyboard (and if the input stream has been redirected from somewhere other than the keyboard, ReadKey will fail outright by throwing an InvalidOperationException).
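One way to cope with the redirection case is to check `Console.IsInputRedirected` (available since .NET Framework 4.5) before deciding which method to call. The following is only a sketch; `InputSource` and `ReadChar` are hypothetical names, and the fallback reads a single UTF-16 code unit from the stream rather than emulating ReadKey's interception.

```csharp
using System;
using System.IO;

static class InputSource
{
    // Reads one UTF-16 code unit from a stream-style reader.
    // Returns null when the end of the input has been reached.
    public static char? ReadChar(TextReader reader)
    {
        int next = reader.Read();
        return next < 0 ? (char?)null : (char)next;
    }

    static void Main()
    {
        if (Console.IsInputRedirected)
        {
            // Stream-based path: no keyboard or code page translation involved.
            char? c = ReadChar(Console.In);
            Console.WriteLine(c.HasValue ? "Stream char: " + c.Value : "End of input");
        }
        else
        {
            // Keyboard-based path: safe to call ReadKey because input is a real console.
            ConsoleKeyInfo key = Console.ReadKey(intercept: true);
            Console.WriteLine("Key: " + key.Key);
        }
    }
}
```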

There may be some way to replicate the "intercept" functionality of ReadKey using the input stream if you manually use the console API with P/Invoke, but it would be nontrivial, and the console isn't really designed to do that sort of thing so you'd probably be fighting it the whole way.


Edit: All that said, you could still implement your own key combination to let users enter Unicode characters from the keyboard. For example, the user types CTRL+ALT+U followed by four hex digits; your ReadKey routine detects the CTRL+ALT+U, grabs the next four keystrokes, parses them as a hexadecimal number, and converts the result into a char. Of course, this wouldn't allow for pasting.
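A rough sketch of that idea follows. All names here (`HexEntry`, `IsTrigger`, `TryParseHex4`, `ReadChar`) are hypothetical, and returning '\0' for invalid digits is an arbitrary choice. It also assumes the console actually reports CTRL+ALT+U as the U key with both modifiers set, which may not hold on keyboard layouts where CTRL+ALT acts as AltGr.

```csharp
using System;
using System.Globalization;

static class HexEntry
{
    // True when the key is U pressed with exactly Ctrl and Alt held down.
    public static bool IsTrigger(ConsoleKeyInfo key) =>
        key.Key == ConsoleKey.U &&
        key.Modifiers == (ConsoleModifiers.Control | ConsoleModifiers.Alt);

    // Converts exactly four hex digits into a char; returns false for anything else.
    public static bool TryParseHex4(string digits, out char result)
    {
        result = '\0';
        if (digits == null || digits.Length != 4) return false;
        if (!ushort.TryParse(digits, NumberStyles.AllowHexSpecifier,
                             CultureInfo.InvariantCulture, out ushort value))
            return false;
        result = (char)value;
        return true;
    }

    // Reads one character. A normal key yields its KeyChar; the trigger
    // consumes four more keys and yields the char they spell in hex,
    // or '\0' if those four keys are not valid hex digits.
    public static char ReadChar(Func<ConsoleKeyInfo> readKey)
    {
        ConsoleKeyInfo key = readKey();
        if (!IsTrigger(key)) return key.KeyChar;

        var digits = new char[4];
        for (int i = 0; i < 4; i++) digits[i] = readKey().KeyChar;
        return TryParseHex4(new string(digits), out char c) ? c : '\0';
    }

    static void Main()
    {
        // Press Ctrl+Alt+U, then e.g. 0 3 B B, to enter U+03BB.
        char c = ReadChar(() => Console.ReadKey(intercept: true));
        Console.WriteLine((int)c);
    }
}
```

Passing the key source as a `Func<ConsoleKeyInfo>` is a design choice made here so that the routine can be exercised without a live console.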

Eric Rosenberger
Thanks for the explanation :)
leppie