I'm trying to parse a command-line in Python which looks like the following:

$ ./command -o option1 arg1 -o option2 arg2 arg3

In other words, the command takes an unlimited number of arguments, and each argument may optionally be preceded by an -o option that applies specifically to that argument. I think this is called "prefix notation".

In the Bourne shell I would do something like the following:

while test -n "$1"
do
    option=""    # reset so a previous -o does not leak into this argument
    if test "$1" = '-o'
    then
        option="$2"
        shift 2
    fi
    # Work with $1 (the argument) and $option (the option)
    # ...
    shift
done

Looking around at Bash tutorials and the like, this seems to be the accepted idiom, so I'm guessing Bash is optimized to work with command-line arguments this way.

Trying to implement this pattern in Python, my first guess was to use pop(), as this is basically a stack operation. But I'm guessing this won't work as well on Python because the list of arguments in sys.argv is in the wrong order and would have to be processed like a queue (i.e. pop from the left). I've read that lists are not optimized for use as queues in Python.

So, my ideas are: convert argv to a collections.deque and use popleft(), reverse argv using reverse() and use pop(), or maybe just work with the int list indices themselves.
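For illustration, the first idea (a collections.deque with popleft()) could be sketched roughly as below. This is only a sketch: the function name parse_prefixed and the choice to return (option, argument) pairs are assumptions made for the example, not anything required by the problem.

```python
from collections import deque


def parse_prefixed(argv):
    """Return (option, argument) pairs; option is None when no -o precedes the argument.

    parse_prefixed is a hypothetical helper name used only for this sketch.
    """
    queue = deque(argv)
    pairs = []
    while queue:
        option = None
        token = queue.popleft()
        if token == '-o':
            # -o must be followed by its value and then by the argument it applies to
            if len(queue) < 2:
                raise ValueError("-o needs an option value followed by an argument")
            option = queue.popleft()
            token = queue.popleft()
        pairs.append((option, token))
    return pairs


pairs = parse_prefixed('-o option1 arg1 -o option2 arg2 arg3'.split())
print(pairs)
# -> [('option1', 'arg1'), ('option2', 'arg2'), (None, 'arg3')]
```

In a real script the input would be sys.argv[1:] rather than a split string.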

Does anyone know of a better way to do this? Otherwise, which of my ideas would be best practice in Python?

+1  A: 

You can do

argv.pop(0)

which pulls off the first element and returns it. That's probably inefficient, though. Maybe. I'm not sure how argv is implemented under the hood. (Then again, if efficiency is that important, why are you using Python?)

A more Pythonic solution, though, would be to iterate through the list without popping the elements. Like so:

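import sys

option = None
args = iter(sys.argv[1:])          # skip the script name
for a in args:
    if a == '-o':
        option = next(args, None)  # the value that follows -o
        continue
    # do whatever with a (the argument) and option (None if no -o preceded it)
    option = None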

Also, I think the optparse module deserves a mention; it's pretty standard for handling options and arguments in Python programs, although it might be overkill for this task since you already have several perfectly functional solutions.

David Zaslavsky
You're right; I could easily implement this pattern in C if performance was most important. I guess my question is more about how argv is implemented and how best to access it as a stack (because it must be a stack internally).
ejm
My guess would be a standard C array, although as I said, I don't know for sure. You could always take a look at the Python interpreter's source code and try to track down the array implementation, if you're curious. But for use in practice, it really doesn't matter.
David Zaslavsky
A: 

Something like:

import sys

for arg in sys.argv[1:]:
    print(arg)  # do something with arg

should work well for you. Unless you are expecting an extremely large number of arguments, I would go for whatever code is simplest (and not worry too much about performance). The sys.argv[1:] slice skips the first argv value, which will be the name of the script.

jkasnicki
+1  A: 

No need to reinvent the wheel: the getopt module is designed for exactly this. If that doesn't suit your needs, try the optparse module, which is more flexible but more complicated.

Etaoin
Thanks for your answer. Could you please give an example of how to implement the "prefix notation" pattern with getopt or optparse? I've used both of these before, but I thought you could only parse "options" and then "non-options", not option-argument, option-argument, ... the way I need to.
ejm
Aren't non-option args returned as a list from parser.parse_args()?
iondiode
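To make the point above concrete, here is a small sketch of what optparse hands back for the command line in the question. The dest name 'opts' is just a name picked for this example. As far as I can tell, the -o values and the positional arguments come back as two separate lists, so the link between an option and the argument it preceded is not preserved.

```python
from optparse import OptionParser

parser = OptionParser()
# 'opts' is an arbitrary destination name chosen for this sketch
parser.add_option('-o', action='append', dest='opts')

options, args = parser.parse_args('-o option1 arg1 -o option2 arg2 arg3'.split())
print(options.opts)  # every -o value, in order of appearance
print(args)          # every non-option argument, in order of appearance
```

So the positional arguments are indeed returned as a list, but pairing each one with its -o would still need extra bookkeeping.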
+2  A: 

another stdlib module: argparse

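import argparse

p = argparse.ArgumentParser()
p.add_argument('-o', action='append')  # collect every -o value into a list
for i in range(1, 4):
    p.add_argument('arg%d' % i)        # positionals arg1, arg2, arg3
args = p.parse_args('-o option1 arg1 -o option2 arg2 arg3'.split())
print(args)
# -> Namespace(arg1='arg1', arg2='arg2', arg3='arg3', o=['option1', 'option2'])
# (attribute order in the repr can differ between Python versions)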
J.F. Sebastian
I wish I had known about argparse. optparse is nice, but if you want your argument to have multiple values, you have to make your own class. Why so many option parsers in the stdlib?
Austin