I have a .Net application that fires up a process, passing a long argument list through Process.StartInfo.Arguments. The new process can only handle 8-bit characters in the arguments passed to its main() function. Therefore, I've encoded the string in Process.StartInfo.Arguments so that each character is an 8-bit value.

The problem is, the new process doesn't see the same 8-bit values that I've used. Values less than 128 pass through unmolested. Other values get changed somehow, and in fact, the argument list seen by the new process is often longer than what I'd passed in.

What encoding is being used to translate the arguments as they're passed to the new process? Can I modify that encoding?

I see the encodings associated with the process' standard output and standard error; I assume those are irrelevant.

+1  A: 

This is not something you can fix in .NET code. The .NET Process class, like Windows itself, uses Unicode. The conversion from Unicode to an 8-bit char string happens inside the C Runtime Library embedded in the program you started. That conversion is based on the current system code page: it uses the WideCharToMultiByte() API function with CodePage = CP_ACP. There is no way to change that conversion short of changing the system code page, which has drastic effects on the entire operating system.
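To make the effect of the code page concrete, here is a rough sketch in Python (chosen only because it runs anywhere; it is not what the CRT actually executes). It assumes Python's "cp1252" and "cp437" codecs are reasonable stand-ins for the corresponding Windows code pages, and the helper name to_ansi_bytes is made up for illustration. Windows' own best-fit mapping may differ for some characters.

```python
# Approximates a Unicode-to-code-page conversion with Python codecs.
# Assumption: the "cp1252" and "cp437" codecs stand in for Windows code pages;
# the real conversion is done by WideCharToMultiByte() and may differ in detail.

def to_ansi_bytes(argument: str, code_page: str) -> bytes:
    """Hypothetical helper: convert a Unicode argument to 8-bit bytes."""
    return argument.encode(code_page, errors="replace")

# The string holds the character U+00E9, not the raw byte value 0xE9.
argument = "caf\u00e9"

# The byte the target program sees depends on the code page in effect.
print(to_ansi_bytes(argument, "cp1252").hex())  # 636166e9
print(to_ansi_bytes(argument, "cp437").hex())   # 63616682
```

The ASCII characters come out identical under both code pages, which matches the observation that values below 128 pass through unchanged, while the non-ASCII character maps to a different byte depending on the code page.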

Of course, this is a lossy conversion: it can only handle characters that are defined in the code page. If you pass an argument that contains a Unicode character that cannot be represented in the code page, the program will see a question mark in its place. No amount of string manipulation in your .NET code can prevent this, short of omitting or substituting that character, but then you are no longer passing the same string.
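The lossy behaviour can be illustrated the same way. The sketch below, again in Python and again assuming its "cp1252" and "cp437" codecs approximate the Windows code pages, checks whether an argument would survive a round trip through a given code page. The function names are hypothetical, and the question-mark substitution shown is Python's, used here to mimic what the target program would see.

```python
# Checks whether a string can be represented in a code page without loss.
# Assumption: Python codecs approximate the Windows code pages of the same name.

def survives_code_page(argument: str, code_page: str) -> bool:
    """Hypothetical check: True if every character exists in the code page."""
    try:
        return argument.encode(code_page).decode(code_page) == argument
    except UnicodeEncodeError:
        return False

def lossy_convert(argument: str, code_page: str) -> bytes:
    """Hypothetical helper: unrepresentable characters become '?'."""
    return argument.encode(code_page, errors="replace")

print(survives_code_page("\u20ac", "cp1252"))  # True: the euro sign is in cp1252
print(survives_code_page("\u20ac", "cp437"))   # False: not defined in cp437
print(lossy_convert("a\u4e2db", "cp1252"))     # b'a?b'
```

A check like this only tells you in advance that an argument will be damaged; it does not offer a way around the conversion.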

Hans Passant