I have a single string that contains the command-line parameters to be passed to another executable, and I need to extract the string[] containing the individual parameters in the same way that C# would if the commands had been specified on the command line. The string[] will be used when executing another assembly's entry point via reflection.

Is there a standard function for this? Or is there a preferred method (regex?) for splitting the parameters correctly? It must handle '"' delimited strings that may contain spaces correctly, so I can't just split on ' '.

Example string:

string parameterString = @"/src:""C:\tmp\Some Folder\Sub Folder"" /users:""[email protected]"" tasks:""SomeTask,Some Other Task"" -someParam foo";

Example result:

string[] parameterArray = new string[] { 
  @"/src:C:\tmp\Some Folder\Sub Folder",
  @"/users:[email protected]",
  @"tasks:SomeTask,Some Other Task",
  @"-someParam",
  @"foo"
};

I do not need a command-line parsing library, just a way to get the String[] that should be generated.

Update: I had to change the expected result to match what is actually generated by C# (removed the extra "'s in the split strings)

A: 

Yes, the string object has a built-in function called Split() that takes a single parameter specifying the character to look for as a delimiter, and returns an array of strings (string[]) with the individual values in it.

Charles Bretana
This would split the src:"C:\tmp\Some Folder\Sub Folder" portion incorrectly.
Anton
What about quotes inside the string that temporarily switch off splitting on spaces?
Daniel Earwicker
+6  A: 

Google says: C#/.NET Command Line Arguments Parser

spoulson
I do not want key-value pairs, I want the same thing the system would do to generate the string[] of parameters.
Anton
A: 

Not sure if I understood you, but is the problem that the character used as the splitter is also found inside the text (except that it is escaped with a double ")?

If so, I would create a for loop and replace every instance of <"> with <|> (or another "safe" character), making sure that it only replaces <"> and not <"">.

After iterating over the string, I would split it as previously posted, but now on the character <|>.

EDIT: For readability, I've added angle brackets, i.e. " is written as <">, since it became a bit unclear what I meant when I only wrote "" and ", or |.

Israr Khan
The double ""'s are because it's a @".." string literal. The double "'s inside the @".." string are equivalent to a \-escaped " in a normal string.
Anton
"the only restriction (I believe) is that the strings are space-delimited, unless the space occurs within a "..." block" -> Might be shooting a bird with a bazooka, but use a boolean that goes "true" when inside a quote; if a space is detected while it is "true", continue, else < > = <|>
Israr Khan
+1  A: 

This Code Project article is what I've used in the past. It's a good bit of code, but it might work.

This MSDN article is the only thing I could find that explains how C# parses command-line args.

Hope that helps!

Zachary Yates
I tried Reflector'ing into the C# library, but it goes to a native C++ call that I don't have the code for, and I can't see any way to call it without P/Invoking it. I also do not want a command-line parsing library; I just want the string[].
Anton
+3  A: 

The Windows command-line parser behaves just as you say: it splits on spaces unless there's an unclosed quote before them. I would recommend writing the parser yourself. Something like this, maybe:

    static string[] ParseArguments(string commandLine)
    {
        char[] parmChars = commandLine.ToCharArray();
        bool inQuote = false;
        for (int index = 0; index < parmChars.Length; index++)
        {
            if (parmChars[index] == '"')
                inQuote = !inQuote;
            if (!inQuote && parmChars[index] == ' ')
                parmChars[index] = '\n';
        }
        return (new string(parmChars)).Split('\n');
    }
Jeffrey L Whitledge
I ended up with the same thing, except I used .Split(new char[] { '\n' }, StringSplitOptions.RemoveEmptyEntries) in the final line in case there were extra ' 's between params. Seems to be working.
Anton
I assume Windows must have a way to escape quotes in the parameters... this algorithm does not take that into account.
rmeador
Removing blank lines, removing outside quotes, and handling escaped quotes are left as an exercise for the reader.
Jeffrey L Whitledge
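One possible take on that exercise, offered as a sketch rather than a faithful copy of the Windows rules: walk the string once, build each argument in a StringBuilder, drop the quote characters, skip empty entries, and treat \" as a literal quote. The names ArgSplitter and SplitArgs are made up for illustration, and backslashes that are not directly followed by a quote are simply kept as they are.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class ArgSplitter
{
    // Hypothetical helper: splits on whitespace outside quotes, removes the
    // quote characters themselves, and treats \" as a literal quote.
    // Simplification: runs of backslashes before a quote are not collapsed
    // the way the real Windows parser does it.
    public static string[] SplitArgs(string commandLine)
    {
        var args = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;
        bool hasToken = false; // true once the current argument has started

        for (int i = 0; i < commandLine.Length; i++)
        {
            char c = commandLine[i];
            if (c == '\\' && i + 1 < commandLine.Length && commandLine[i + 1] == '"')
            {
                // Escaped quote: keep the quote, drop the backslash.
                current.Append('"');
                hasToken = true;
                i++;
            }
            else if (c == '"')
            {
                // Unescaped quote: toggle quoting, do not emit the character.
                inQuotes = !inQuotes;
                hasToken = true;
            }
            else if (char.IsWhiteSpace(c) && !inQuotes)
            {
                // Separator: finish the current argument, if any.
                if (hasToken)
                {
                    args.Add(current.ToString());
                    current.Clear();
                    hasToken = false;
                }
            }
            else
            {
                current.Append(c);
                hasToken = true;
            }
        }

        if (hasToken)
            args.Add(current.ToString());

        return args.ToArray();
    }
}
```

With this sketch, @"/src:""C:\tmp\Some Folder\Sub Folder"" -someParam foo" comes back as three entries with the quotes removed from the first one. A quoted path that ends in a backslash still swallows its closing quote, so it is not a drop-in replacement for the system parser.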
+19  A: 

It annoys me that there's no function to split a string based on a function that examines each character. If there was, you could write it like this:

    public static IEnumerable<string> SplitCommandLine(string commandLine)
    {
        bool inQuotes = false;

        return commandLine.Split(c =>
                                 {
                                     if (c == '\"')
                                         inQuotes = !inQuotes;

                                     return !inQuotes && c == ' ';
                                 })
                          .Select(arg => arg.Trim().TrimMatchingQuotes('\"'))
                          .Where(arg => !string.IsNullOrEmpty(arg));
    }

Although, having written that, why not write the necessary extension methods? Okay, you talked me into it...

Firstly, my own version of Split that takes a function that has to decide whether the specified character should split the string:

    public static IEnumerable<string> Split(this string str, 
                                            Func<char, bool> controller)
    {
        int nextPiece = 0;

        for (int c = 0; c < str.Length; c++)
        {
            if (controller(str[c]))
            {
                yield return str.Substring(nextPiece, c - nextPiece);
                nextPiece = c + 1;
            }
        }

        yield return str.Substring(nextPiece);
    }

It may yield some empty strings depending on the situation, but maybe that information will be useful in other cases, so I don't remove the empty entries in this function.

Secondly (and more mundanely) a little helper that will trim a matching pair of quotes from the start and end of a string. It's more fussy than the standard Trim method - it will only trim one character from each end, and it will not trim from just one end:

    public static string TrimMatchingQuotes(this string input, char quote)
    {
        if ((input.Length >= 2) && 
            (input[0] == quote) && (input[input.Length - 1] == quote))
            return input.Substring(1, input.Length - 2);

        return input;
    }

And I suppose you'll want some tests as well. Well, alright then. But this must be absolutely the last thing! First a helper function that compares the result of the split with the expected array contents:

    public static void Test(string cmdLine, params string[] args)
    {
        string[] split = SplitCommandLine(cmdLine).ToArray();

        Debug.Assert(split.Length == args.Length);

        for (int n = 0; n < split.Length; n++)
            Debug.Assert(split[n] == args[n]);
    }

Then I can write tests like this:

        Test("");
        Test("a", "a");
        Test(" abc ", "abc");
        Test("a b ", "a", "b");
        Test("a b \"c d\"", "a", "b", "c d");

Here's the test for your requirements:

        Test(@"/src:""C:\tmp\Some Folder\Sub Folder"" /users:""[email protected]"" tasks:""SomeTask,Some Other Task"" -someParam",
             @"/src:""C:\tmp\Some Folder\Sub Folder""", @"/users:""[email protected]""", @"tasks:""SomeTask,Some Other Task""", @"-someParam");

Note that the implementation has the extra feature that it will remove quotes around an argument if that makes sense (thanks to the TrimMatchingQuotes function). I believe that's part of the normal command-line interpretation.

Daniel Earwicker
I had to un-mark this as the answer because I didn't have the right expected outputs. The actual output should not have the "'s in the final array
Anton
I come to Stack Overflow to get away from requirements that change all the time! :) You could use Replace("\"", "") instead of TrimMatchingQuotes() to get rid of all quotes. But Windows supports \" to allow a quote character to be passed through. My Split function can't do that.
Daniel Earwicker
Nice one Earwicker :) Anton: This is the solution I was trying to describe to you in my earlier post, but Earwicker did a much better job of writing it down ;) And also extended it a lot ;)
Israr Khan
Good answer Earwicker!
Sam Meldrum
a whitespace is not the only separating character for command line arguments, is it?
Louis Rhys
@Louis Rhys - I'm not sure. If that is a concern it is pretty easy to solve: use `char.IsWhiteSpace` instead of `== ' '`
Daniel Earwicker
+1  A: 

Environment.GetCommandLineArgs()

Mark Cidade
A: 

Good thread... while I knew about C#/.NET Command Line Arguments Parser, I did not know about Environment.GetCommandLineArgs()

Thanks.

david valentine
A: 

Currently, this is the code that I have:

 private String[] SplitCommandLineArgument(String argumentString)
 {
  StringBuilder translatedArguments = new StringBuilder(argumentString);
  bool escaped = false;
  for (int i = 0; i < translatedArguments.Length; i++)
  {
   if (translatedArguments[i] == '"')
   {
    escaped = !escaped;
   }
   if (translatedArguments[i] == ' ' && !escaped)
   {
    translatedArguments[i] = '\n';
   }
  }

  string[] toReturn = translatedArguments.ToString().Split(new char[] { '\n' }, StringSplitOptions.RemoveEmptyEntries);
  for(int i = 0; i < toReturn.Length; i++)
  {
   toReturn[i] = RemoveMatchingQuotes(toReturn[i]);
  }
  return toReturn;
 }

 public static string RemoveMatchingQuotes(string stringToTrim)
 {
  int firstQuoteIndex = stringToTrim.IndexOf('"');
  int lastQuoteIndex = stringToTrim.LastIndexOf('"');
  while (firstQuoteIndex != lastQuoteIndex)
  {
   stringToTrim = stringToTrim.Remove(firstQuoteIndex, 1);
   stringToTrim = stringToTrim.Remove(lastQuoteIndex - 1, 1); //-1 because we've shifted the indices left by one
   firstQuoteIndex = stringToTrim.IndexOf('"');
   lastQuoteIndex = stringToTrim.LastIndexOf('"');
  }
  return stringToTrim;
 }

It doesn't work with escaped quotes, but it works for the cases that I've come up against so far.

Anton
A: 

This is a reply to Anton's code, which does not work with escaped quotes. I modified three places.

  1. The constructor for StringBuilder in SplitCommandLineArgument, replacing any \" with \r
  2. In the final for-loop in SplitCommandLineArgument, I now replace the \r character back to \".
  3. Changed the SplitCommandLineArgument method from private to public static.


public static string[] SplitCommandLineArgument( String argumentString )
{
    StringBuilder translatedArguments = new StringBuilder( argumentString ).Replace( "\\\"", "\r" );
    bool InsideQuote = false;
    for ( int i = 0; i < translatedArguments.Length; i++ )
    {
        if ( translatedArguments[i] == '"' )
        {
            InsideQuote = !InsideQuote;
        }
        if ( translatedArguments[i] == ' ' && !InsideQuote )
        {
            translatedArguments[i] = '\n';
        }
    }

    string[] toReturn = translatedArguments.ToString().Split( new char[] { '\n' }, StringSplitOptions.RemoveEmptyEntries );
    for ( int i = 0; i < toReturn.Length; i++ )
    {
        toReturn[i] = RemoveMatchingQuotes( toReturn[i] );
        toReturn[i] = toReturn[i].Replace( "\r", "\"" );
    }
    return toReturn;
}

public static string RemoveMatchingQuotes( string stringToTrim )
{
    int firstQuoteIndex = stringToTrim.IndexOf( '"' );
    int lastQuoteIndex = stringToTrim.LastIndexOf( '"' );
    while ( firstQuoteIndex != lastQuoteIndex )
    {
        stringToTrim = stringToTrim.Remove( firstQuoteIndex, 1 );
        stringToTrim = stringToTrim.Remove( lastQuoteIndex - 1, 1 ); //-1 because we've shifted the indices left by one
        firstQuoteIndex = stringToTrim.IndexOf( '"' );
        lastQuoteIndex = stringToTrim.LastIndexOf( '"' );
    }
    return stringToTrim;
}
CS
+2  A: 

In addition to the good, purely managed solution by Earwicker, it may be worth mentioning, for the sake of completeness, that Windows also provides the CommandLineToArgvW function for breaking up a string into an array of strings:

LPWSTR *CommandLineToArgvW(
    LPCWSTR lpCmdLine, int *pNumArgs);

Parses a Unicode command line string and returns an array of pointers to the command line arguments, along with a count of such arguments, in a way that is similar to the standard C run-time argv and argc values.

An example of calling this API from C# and unpacking the resulting string array in managed code can be found in “Converting Command Line String to Args[] using CommandLineToArgvW() API.” Below is a slightly simpler version of the same code:

[DllImport("shell32.dll", SetLastError = true)]
static extern IntPtr CommandLineToArgvW(
    [MarshalAs(UnmanagedType.LPWStr)] string lpCmdLine, out int pNumArgs);

public static string[] CommandLineToArgs(string commandLine)
{
    int argc;
    var argv = CommandLineToArgvW(commandLine, out argc);        
    if (argv == IntPtr.Zero)
        throw new System.ComponentModel.Win32Exception();
    try
    {
        var args = new string[argc];
        for (var i = 0; i < args.Length; i++)
        {
            var p = Marshal.ReadIntPtr(argv, i * IntPtr.Size);
            args[i] = Marshal.PtrToStringUni(p);
        }

        return args;
    }
    finally
    {
        Marshal.FreeHGlobal(argv);
    }
}
Atif Aziz
This function requires that you escape the trailing backslash of a path inside quotes. "C:\Program Files\" must be "C:\Program Files\\" for this function to parse the string correctly.
Manga Lee
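The trailing-backslash behaviour in that comment follows from the way backslashes directly in front of a double quote are read as escapes. A hedged sketch of a helper that quotes a single argument accordingly is shown here; ArgQuoting and QuoteArgument are hypothetical names, and the rule it applies (double any backslashes that precede a quote, escape embedded quotes with a backslash) is my reading of the documented Microsoft argument-parsing convention, not something taken from this thread.

```csharp
using System;
using System.Text;

public static class ArgQuoting
{
    // Hypothetical helper: wraps one argument in double quotes so that a
    // parser following the Microsoft convention reads it back unchanged.
    public static string QuoteArgument(string arg)
    {
        var sb = new StringBuilder("\"");
        int backslashes = 0; // pending backslashes not yet written

        foreach (char c in arg)
        {
            if (c == '\\')
            {
                backslashes++;
            }
            else if (c == '"')
            {
                // Backslashes before a quote are doubled, plus one more
                // backslash to escape the quote itself.
                sb.Append('\\', backslashes * 2 + 1);
                sb.Append('"');
                backslashes = 0;
            }
            else
            {
                // Backslashes before an ordinary character stay literal.
                sb.Append('\\', backslashes);
                sb.Append(c);
                backslashes = 0;
            }
        }

        // Trailing backslashes sit right before the closing quote,
        // so they are doubled as well.
        sb.Append('\\', backslashes * 2);
        sb.Append('"');
        return sb.ToString();
    }
}
```

For the path in the comment, QuoteArgument(@"C:\Program Files\") yields "C:\Program Files\\" including the surrounding quotes, which is the form the comment says the function needs.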
A: 

I took the answer from Jeffrey L Whitledge and enhanced it a little. I do not have enough credits yet to comment on his answer.

It now supports both single and double quotes. You can use quotes in the parameters themselves by wrapping them in the other type of quote.

It also strips the quotes from the arguments since these do not contribute to the argument information.

    public static string[] SplitArguments(string commandLine)
    {
        var parmChars = commandLine.ToCharArray();
        var inSingleQuote = false;
        var inDoubleQuote = false;
        for (var index = 0; index < parmChars.Length; index++)
        {
            if (parmChars[index] == '"' && !inSingleQuote)
            {
                inDoubleQuote = !inDoubleQuote;
                parmChars[index] = '\n';
            }
            if (parmChars[index] == '\'' && !inDoubleQuote)
            {
                inSingleQuote = !inSingleQuote;
                parmChars[index] = '\n';
            }
            if (!inSingleQuote && !inDoubleQuote && parmChars[index] == ' ')
                parmChars[index] = '\n';
        }
        return (new string(parmChars)).Split(new[] { '\n' }, StringSplitOptions.RemoveEmptyEntries);
    }
Vapour in the Alley