I want to split a command-line-like string into individual string parameters. What would the regular expression for this look like? The problem is that the parameters can be quoted. For example:

"param 1" param2 "param 3"

should result in:

param 1, param2, param 3

+3  A: 

What programming language are you using? In C and C++, the arguments are already stored in the argv parameter to main. In shell scripts this is also already done for you.

QuantumPete

He said command line 'like' not a command line String.
GavinCattell
+3  A: 

Without regard to implementation language, your regex might look something like this:

("[^"]*"|[^"]+)(\s+|$)

The first part "[^"]*" looks for a quoted string that doesn't contain embedded quotes, and the second part [^"]+ looks for a sequence of non-quote characters. The \s+ matches a separating sequence of spaces, and $ matches the end of the string.

Greg Hewgill
In which regex dialect does a|$ work? It has to be \z
Vinko Vrsalovic
That's always worked for me in Python and Perl.
Greg Hewgill
True, I got confused.
Vinko Vrsalovic
Failure case for your regex: <" " param2 "" bozo "ninny" "param 3"> Notice 1) quotes left in the answer 2) includes trailing whitespace after bozo. Probably has other bugs too.
Sorry these comments trim out whitespace - there should be lots of spaces after "bozo".
+3  A: 

I tend to use regexlib for this kind of problem. If you go to: http://regexlib.com/ and search for "command line" you'll find three results which look like they are trying to solve this or similar problems - should be a good start.

This may work: http://regexlib.com/Search.aspx?k=command+line&c=-1&m=-1&ps=20

Sam Meldrum
A: 

Something like:

"(?:(?<=")([^"]+)"\s*)|\s*([^"\s]+)

or a simpler one:

"([^"]+)"|\s*([^"\s]+)

(just for the sake of finding a regexp ;) )

Apply it repeatedly to the string: each match gives you one parameter, in group 1 if it was surrounded by double quotes and in group 2 if it was not.
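As a rough sketch of "applying it repeatedly", here is how the simpler pattern could be driven from Java with Matcher.find(). The class name RegexSplitter and the method name split are made up for this illustration, and the pattern is taken as-is from above, so it inherits its limits (for instance, an empty "" parameter is skipped because [^"]+ needs at least one character).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named only for this example.
final class RegexSplitter {
    // The simpler pattern from above: group 1 captures a quoted parameter
    // (without its quotes), group 2 captures an unquoted one.
    private static final Pattern PARAM =
            Pattern.compile("\"([^\"]+)\"|\\s*([^\"\\s]+)");

    static List<String> split(String input) {
        List<String> params = new ArrayList<>();
        Matcher m = PARAM.matcher(input);
        // find() resumes after the previous match, so this walks the whole string.
        while (m.find()) {
            // Only one of the two groups takes part in any given match.
            params.add(m.group(1) != null ? m.group(1) : m.group(2));
        }
        return params;
    }

    public static void main(String[] args) {
        System.out.println(split("\"param 1\" param2 \"param 3\""));
    }
}
```

For the input from the question this prints [param 1, param2, param 3].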

VonC
+2  A: 

Most languages have other functions (either built-in or provided by a standard library) which will parse command lines far more easily than building your own regex, plus you know they'll do it accurately out of the box. If you edit your post to identify the language that you're using, I'm sure someone here will be able to point you at the one used in that language.

Regexes are very powerful tools and useful for a wide range of things, but there are also many problems for which they are not the best solution. This is one of them.
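To illustrate the point with one language, assuming Java purely for the sake of example (the asker has not said what they use): the standard library's java.io.StreamTokenizer can be configured to treat double-quoted runs as single tokens. The class name TokenizerSplitter and its split method are invented for this sketch. Note that StreamTokenizer also interprets backslash escapes such as \n inside quotes, which may or may not be what you want.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, named only for this example.
final class TokenizerSplitter {
    static List<String> split(String input) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(input));
        st.resetSyntax();             // drop the default number and comment handling
        st.whitespaceChars(0, ' ');   // control characters and space separate tokens
        st.wordChars('!', 255);       // everything else in this range is part of a word
        st.quoteChar('"');            // a double-quoted run becomes a single token

        List<String> params = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                // sval holds the text for both plain words and quoted strings.
                if (st.ttype == StreamTokenizer.TT_WORD || st.ttype == '"') {
                    params.add(st.sval);
                }
            }
        } catch (IOException e) {
            // Not expected when reading from an in-memory string.
            throw new UncheckedIOException(e);
        }
        return params;
    }

    public static void main(String[] args) {
        System.out.println(split("\"param 1\" param2 \"param 3\""));
    }
}
```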

Dave Sherohman
+4  A: 

You should not use regular expressions for this. Write a parser instead, or use one provided by your language.
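A hand-written parser for this can be quite small. The following is only a sketch in Java (the asker's language is unknown), with made-up names QuoteAwareParser and parse. It assumes one particular convention that the question does not specify: a doubled quote inside a quoted section stands for a literal quote character.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, named only for this example.
final class QuoteAwareParser {
    static List<String> parse(String input) {
        List<String> params = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        boolean hasToken = false;   // true once a token has started, even an empty "" one

        for (int i = 0; i < input.length(); i++) {
            char ch = input.charAt(i);
            if (ch == '"') {
                if (inQuotes && i + 1 < input.length() && input.charAt(i + 1) == '"') {
                    current.append('"');   // assumed convention: "" inside quotes is a literal quote
                    i++;                   // skip the second quote of the pair
                } else {
                    inQuotes = !inQuotes;  // opening or closing quote; not part of the value
                    hasToken = true;
                }
            } else if (Character.isWhitespace(ch) && !inQuotes) {
                if (hasToken) {            // whitespace outside quotes ends the current token
                    params.add(current.toString());
                    current.setLength(0);
                    hasToken = false;
                }
            } else {
                current.append(ch);
                hasToken = true;
            }
        }
        if (hasToken) {                    // flush the last token (an unclosed quote is tolerated)
            params.add(current.toString());
        }
        return params;
    }

    public static void main(String[] args) {
        System.out.println(parse("\"param 1\" param2 \"param 3\""));
    }
}
```

Unlike the regex approaches, the loop makes each rule (what a quote does, what ends a token) an explicit branch, so adding escapes or single quotes later means adding a case rather than rewriting a pattern.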

hop
I agree. This would be a better solution, especially if you need to put quotes inside the string: "param""1" param2...
rslite
+1 - like parsing XML, this is not a good problem for regexes.
slim
Absolute nonsense. This is a simple problem for regexes, and it has nothing in common with parsing XML.
A: 

(Reading your question again just prior to posting, I note you say command-line-LIKE string, so this information may not be useful to you, but as I have written it I will post it anyway. Please disregard it if I have misunderstood your question.)

If you clarify your question I will try to help, but from the general comments you have made I would say: don't do that :-). You are asking for a regexp to split a series of parameters into an array. Instead of doing this yourself, I would strongly suggest you consider using getopt; there are versions of this library for most programming languages. Getopt will do what you are asking, and it scales to much more sophisticated argument processing should you need that in the future.

If you let me know what language you are using I will try and post a sample for you.

Here is a sample of the home pages:

http://www.codeplex.com/getopt (.NET)

http://www.urbanophile.com/arenn/hacking/download.html (java)

A sample (from the Java page above):

 // Uses the GNU Getopt Java port (import gnu.getopt.Getopt;).
 // In "ab:c::d", 'b' requires an argument and 'c' takes an optional one.
 Getopt g = new Getopt("testprog", argv, "ab:c::d");
 int c;
 String arg;
 while ((c = g.getopt()) != -1) {
     switch (c) {
         case 'a':
         case 'd':
             // Options without an argument
             System.out.print("You picked " + (char) c + "\n");
             break;
         case 'b':
         case 'c':
             // Options with an argument (null if the optional one was omitted)
             arg = g.getOptarg();
             System.out.print("You picked " + (char) c +
                              " with an argument of " +
                              ((arg != null) ? arg : "null") + "\n");
             break;
         case '?':
             break; // getopt() already printed an error
         default:
             System.out.print("getopt() returned " + c + "\n");
     }
 }
Scott James
A: 

If it's just the quotes you are worried about, then just write a simple loop that copies the input character by character into a string, ignoring the quotes.

Alternatively, if you are using a string manipulation library, you can use it to remove all the quotes and then concatenate the remaining pieces.

Sridhar Iyer