My .NET library has to marshal strings to a C library that expects text encoded using the system's default ANSI code page. Since .NET supports Unicode, this makes it possible for users to pass a string to the library that doesn't properly convert to ANSI. For example, on an English machine, "デスクトップ" will turn in to "?????" when passed to the C library.
To address this, I wrote a method that detects when this will happen by comparing the orginal string to a string converted using the ANSI code page. I'd like to test this method, but I really need a string that's guaranteed to be not encodable. For example, we test our code on English and Japanese machines (among other languages.) If I write the test to use the Japanese string above, the test will fail when the Japanese system properly encodes the string. I could write the test to check the current system's encoding, but then I have a maintenance nightmare every time we add/remove a new language.
Is there a unicode character that doesn't encode with any ANSI code page? Failing that, could a string be constructed with characters from enough different code pages to guarantee failure? My first attempt was to use Chinese characters since we don't cover Chinese, but apparently Japanese can convert the Chinese characters I tried.
edit I'm going to accept the answer that proposes a Georgian string for now, but was really expecting a result with a smattering of characters from different languages. I don't know if we plan on supporting Georgian so it seems OK for now. Now I have to test it on each language. Joy!