views:

337

answers:

10

I'm writing a script in Python that will allow the user to input a string, which will be a command that instructs the script to perform a specific action. For the sake of argument, I'll say my command list is:

lock
read
write
request
log

Now, I want the user to be able to enter the word "log" and it will peform a specific action, which is very simple. However, I would like to match partial words. So, for example, if a user enters "lo", it should match "lock", as it's higher in the list. I've tried using strncmp from libc using ctypes to accomplish this, but have yet to make heads or tails of it.

+2  A: 

you can use startswith

eg

myword = "lock"
if myword.startswith("lo"):
   print "ok"

or if you want to find "lo" in the word, regardless of position, just use the "in" operator

if "lo" in myword

therefore, one way you can do this:

for cmd in ["lock","read","write","request","log"]:
    if cmd.startswith(userinput):
        print cmd
        break
ghostdog74
@ghostdog74, better read in more detail: "I would like to match partial words. So, for example, if a user enters "lo", it should match "lock", as it's higher in the list."
Peter Hansen
Peter Hansen is correct. Partial words will need to be matched to make the system easier to use. I will (Eventually) have some complex commands, and being able to abbreviate them to a single letter is very convenient.
Mike Trpcic
+13  A: 

If you are accepting input from a user, then why are you worried about the speed of comparison? Even the slowest technique will be far faster than the user can perceive. Use the simplest most understandable code you can, and leave efficiency concerns for tight inner loops.

cmds = [
    "lock",
    "read",
    "write",
    "request",
    "log",
    ]

def match_cmd(s):
    matched = [c for c in cmds if c.startswith(s)]
    if matched:
        return matched[0]
Ned Batchelder
+1 good point about using simplest most understandable code
Lukman
wish I could upvote that 10x. That's why I love UI development so much -- time scales are absolutely (well, relatively) enormous. You can take a leisurly 100ms to do something and nobody notices.
Bryan Oakley
+1  A: 

jaro_winkler() in python-Levenshtein might be what you're looking for.

Ignacio Vazquez-Abrams
+5  A: 

This will do what you want:

def select_command(commands, user_input):
    user_input = user_input.strip().lower()
    for command in commands:
        if command.startswith(user_input):
            return command
    return None

However:

You seem overworried about the wrong thing. So 50 users means 50 milliseconds -- you're not going to be run out of town for that kind of "lag". Worry about inefficient database access or problems caused by users typing "r" and getting "read" when they thought they'd get "request". Minimising user keystrokes at the risk of errors is so 1960s that it's not funny. What are they using? ASR33 teletypes? At the very least you could insist on a unique match -- "rea" for read and "req" for request.

John Machin
A: 

Replace with your favorite string compare function. Fairly fast, and to the point.

matches = ( x for x in list if x[:len(stringToSearchFor)] == stringToSearchFor )
print matches[0]
Bear
(1) see `http://docs.python.org/library/stdtypes.html#str.startswith`(2) don't use `list`; it will shadow the `list()` built-in.
John Machin
+2  A: 

I suggest you look at using the readline python library, rather than reinventing the wheel. The user will have to hit tab to complete the word, but you can set readline up so that tab matches as far as possible or cycles through all words starting wit the current stub.

This seems to be a fairly decent introduction to readline in python http://www.doughellmann.com/PyMOTW/readline/index.html

Michael Anderson
A: 

This is adapted from J.Tauber's Trie implementation in Python, which you could compare and/or re-adapt with whatever extra features you need. See also the Wikipedia entry on tries.

class Trie:
    def __init__(self):
        self.root = [None, {}]

    def add(self, key):
        curr_node = self.root
        for ch in key:
            curr_node = curr_node[1].setdefault(ch, [key, {}])
        curr_node[0] = key

    def find(self, key):
        curr_node = self.root
        for ch in key:
            try:
                curr_node = curr_node[1][ch]
            except KeyError:
                return None
        return curr_node[0]

Setup (order of addition matters!):

t = Trie()
for word in [
   'lock',
   'read',
   'write',
   'request',
   'log']:
   t.add(word)

Then call like this:

>>> t.find('lo')
'lock'
>>> t.find('log')
'log'
>>> t.find('req')
'request'
>>> t.find('requiem')
>>>
Peter Hansen
Holy overkill, Batperson!
John Machin
No kidding, but yet another "iterate over list with startswith()" seemed *really* boring. ;-)
Peter Hansen
At least my stratswith effort addressed the OP's (pointless) efficiency concern by baling out upon the first match :-)
John Machin
@John, ah, so you think it's acceptable that if "lo" also appears in the list, but after "lock", that entering "lo" will return "lock" as the match? I did not think that.
Peter Hansen
@Peter Hansen: +1 *non* *sequitur* of the month. You have no reason for thinking that I think that that is acceptable. It's clearly not acceptable. However it's a natural consequence of what the OP said he wanted. He's obviously aware that order in the list matters. One would need to trust (a) that he wouldn't put "lo" after "lock" or "log" (b) failing that, that he would test his code and find out that a user could not access the "lo" functionality.
John Machin
@John, It can hardly be a "natural consequence", when my solution doesn't suffer from the problem. I'll also note that "bailing on the first match" while the overall solution is still slower because it's traversing a long list of commands is hardly a good way to address performance concerns. (Of course, we'd need more background on performance needs and actual list size before we'd accomplish anything by that discussion.)
Peter Hansen
@Peter: Your solution doesn't suffer from the problem either because of serendipitous accident or (more charitably) because you imposed an extra constraint on top of the OP's implicit left-to-right ordering. Baling out on the first match: you missed the smiley in my comment :-)
John Machin
A: 

If i understand your Q correctly, you want a snippet that will return the answer as soon as it has it, without traversing further through your 'command list.' This should do what you want:

from itertools import ifilter

def check_input(some_string, code_book) :
    for q in ifilter(code_book.__contains__, some_string) :
        return True
    return False
doug
+3  A: 

This is optimized at runtime like you requested... (although most likely not needed)

Here is a simple bit of code which will take an input dictionary of command mapped to function, and results in an output dictionary of all non-duplicate sub commands mapped to the same function.

So you run this when you start your service, and then you have 100% optimized lookups. I am sure there is a more clever way to do this, so feel free to edit.

commands = {
  'log': log_function,
  'exit': exit_function,
  'foo': foo_function,
  'line': line_function,
  }

cmap = {}
kill = set()
for command in commands:
  for pos in range(len(1,command)):
    subcommand = command[0:pos]
    if subcommand in cmap:
      kill.add(subcommand)
      del(cmap[subcommand])
    if subcommand not in kill:
      cmap[subcommand] = commands[command]

#cmap now is the following - notice the duplicate prefixes removed?
{
  'lo': log_function,
  'log': log_function,
  'e': exit_function,
  'ex': exit_function,
  'exi': exit_function,
  'exit': exit_function,
  'f' : foo_function,
  'fo' : foo_function,
  'foo' : foo_function,
  'li' : line_function,
  'lin' : line_function,
  'line' : line_function,
}
gahooa
A: 
import timeit

cmds = []
for i in range(1,10000):
    cmds.append("test")

def get_cmds(user_input):
    return [c for c in cmds if c.startswith(user_input)]

if __name__=='__main__':
    t = timeit.Timer("get_cmds('te')", "from __main__ import get_cmds")
    print "%0.3f seconds" % (t.timeit(number=1))

#>>> 0.008 seconds

So basically, per my comment, you're asking how to optimise an operation that takes no measurable time or CPU. I used 10,000 commands here and the test string matches every one just to show that even under extreme circumstances you could still have hundreds of users doing this and they would never see any lag.

SpliFF