How can I escape an unknown string for passing to Process.Start as an argument?

I currently escape basic quotes and backslashes, but recently my input has started to contain characters such as the fullwidth quotation mark (U+FF02, see http://www.fileformat.info/info/unicode/char/ff02/index.htm).

So my question is, what all do I need to escape to safely pass a string as an argument for Process.Start?

Edit: To clarify, what I am really looking for is a list of all characters that have to be escaped inside a quoted string ("foo") for cmd.exe. I originally handled the double quote and backslash characters, but I eventually received input containing a fullwidth quotation mark (as referenced above), which also needed to be escaped. So the question is: what else do I need to escape in a quoted string argument passed to cmd.exe with Process.Start?

+1  A: 

This might be useful:

  • First, multiple arguments are normally separated from one another by spaces. In Figure 2.3, the command has three arguments, c:*.bak, e:\backup, and /s. Occasionally, other characters are used as argument separators. For example, the COPY command can use + characters to separate multiple filenames.

  • Second, any argument that contains spaces or begins or ends with spaces must be enclosed in double quotes. This is particularly important when using long file and directory names, which frequently contain one or more spaces. If a double-quoted argument itself contains a double quote character, the double quote must be doubled. For example, enter "Quoted" Argument as """Quoted"" Argument".

  • Third, command switches always begin with a slash / character. A switch is an argument that modifies the operation of the command in some way. Occasionally, switches begin with a + or - character. Some switches are global, and affect the command regardless of their position in the argument list. Other switches are local, and affect specific arguments (such as the one immediately preceding the switch).

  • Fourth, all reserved shell characters not in double quotes must be escaped. These characters have special meaning to the Windows NT command shell. The reserved shell characters are:

    & | ( ) < > ^

To pass reserved shell characters as part of an argument for a command, either the entire argument must be enclosed in double quotes, or the reserved character must be escaped. Prefix a reserved character with a caret (^) character to escape it. For example, the following command example will not work as expected, because < and > are reserved shell characters:

    C:\>echo <dir>
    The syntax of the command is incorrect.

  Instead, escape the two reserved characters, as follows:

    C:\>echo ^<dir^>
    <dir>

Typically, the reserved shell characters are not used in commands, so collisions that require the use of escapes are rare. They do occur, however. For example, the popular PKZIP program supports a -& switch to enable disk spanning. To use this switch correctly under Windows NT, -^& must be typed.

Tip: The caret character is itself a reserved shell character. Thus, to type a caret character as part of a command argument, type two carets instead. Escaping is necessary only when the normal shell interpretation of reserved characters must be bypassed.

  • Finally, the maximum allowed length of a shell command appears to be undocumented by Microsoft. Simple testing shows that the Windows NT command shell allows very long commands—in excess of 4,000 characters. Practically speaking, there is no significant upper limit to the length of a command.

http://technet.microsoft.com/en-us/library/cc723564.aspx
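
The two escaping rules quoted above (double any embedded double quote inside a quoted argument; caret-escape reserved characters outside quotes) can be sketched roughly as follows. This is only an illustration in Python, not .NET code for Process.Start, and the names RESERVED, caret_escape and quote_argument are made up for this example. It covers only what the quoted article states: it does not address the fullwidth quotation mark from the question, nor how the target program parses its own command line.

```python
# Reserved shell characters as listed in the quoted article.
RESERVED = "&|()<>^"


def caret_escape(arg: str) -> str:
    """For an UNQUOTED argument: prefix each reserved character with a caret."""
    return "".join("^" + ch if ch in RESERVED else ch for ch in arg)


def quote_argument(arg: str) -> str:
    """For a QUOTED argument: wrap in double quotes and double any embedded
    double quote, following the article's second rule."""
    return '"' + arg.replace('"', '""') + '"'


if __name__ == "__main__":
    print(caret_escape("<dir>"))                 # ^<dir^>
    print(caret_escape("-&"))                    # -^&
    print(quote_argument('"Quoted" Argument'))   # """Quoted"" Argument"
```

The printed values match the article's own examples (^<dir^>, -^& and """Quoted"" Argument"). Whether the doubled-quote form is interpreted as intended depends on the program receiving the argument, so treat this as a starting point rather than a complete answer.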

Zyphrax
This helps a bit, but what I'm really looking for is: when in a quoted string, what all needs to be escaped within the quotes.
thelsdj
As far as I can tell: you only have to escape the fullwidth quotation mark within the fullwidth quotation marks (by doubling them). Step four of the article mentions reserved characters "NOT in double quotes", I assume that you CAN use the characters within the double quotes. And I wouldn't use non-ASCII characters.
Zyphrax