ansaurus

Question

Answer 1

+3 A:

The regex 'S(.{2,6}?)N' matches only when the text between S and N is 2 to 6 characters long; the group captures that text.

To return the shortest matching substring, use sorted(s1, key=len)[0].

Full example:

import re
p = re.compile(r'S(.{2,6}?)N')
s = 'ASDFANSAAAAAFGNDASMPRKYNSAAN'
s1 = p.findall(s)  # captured text of each non-overlapping match
if s1:
    print(sorted(s1, key=len)[0])
    print(min(s1, key=len))  # as suggested by Nick Presta

This works by sorting the list returned by findall by length, then returning the first item in the sorted list.
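
As a rough illustration of the steps just described, the sketch below prints the intermediate list that findall returns for the sample string before picking the shortest entry. It assumes Python 3, and the names pattern, matches and shortest are made up for this example.

```python
import re

# Same pattern and sample string as in the answer above.
pattern = re.compile(r'S(.{2,6}?)N')
s = 'ASDFANSAAAAAFGNDASMPRKYNSAAN'

# findall returns only the captured group of each non-overlapping match.
matches = pattern.findall(s)
print(matches)

# Sorting by length puts the shortest capture first
# (sorted is stable, so ties keep their original order).
shortest = sorted(matches, key=len)[0] if matches else None
print(shortest)
```

On the sample string this should print ['DFA', 'MPRKY', 'AA'] and then AA. The run 'AAAAAFG' does not show up because it is 7 characters long, which is outside the {2,6} range.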

Edit: Nick Presta's answer is more elegant; I was not aware that min could also take a key argument...

codeape 2009-04-27 06:13:47

Answer 2

+8 A:

If you already have the list, you can use the min function and pass len as its key argument.

>>> s1 = ['DFA', 'AAAAAFG', 'MPRKY']
>>> min(s1, key=len)
'DFA'

EDIT:
If several elements tie for the shortest length, min returns only the first of them. You can extend the approach to produce a list containing every element that has the shortest length:

>>> s2 = ['foo', 'bar', 'baz', 'spam', 'eggs', 'knight']
>>> s2_min_len = len(min(s2, key=len))
>>> [e for e in s2 if len(e) == s2_min_len]
['foo', 'bar', 'baz']

The above should work when there is only 1 'shortest' element too.
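
The same idea can be wrapped in a small helper. This is only a sketch: the function name all_shortest is hypothetical, and the empty-list handling is an added assumption, since min raises ValueError when given an empty sequence.

```python
def all_shortest(items):
    """Return every element of items that has the minimum length."""
    if not items:
        # min() raises ValueError on an empty sequence, so return early.
        return []
    min_len = min(len(e) for e in items)
    # Compare lengths with ==, not is: they are values, not identities.
    return [e for e in items if len(e) == min_len]


print(all_shortest(['foo', 'bar', 'baz', 'spam', 'eggs', 'knight']))
print(all_shortest(['DFA', 'AAAAAFG', 'MPRKY']))
print(all_shortest([]))
```

With a single shortest element the result is a one-item list, so callers can treat both cases the same way.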

EDIT 2: Just to be complete, it should be faster, at least according to my simple tests, to compute the length of the shortest element once, before the list comprehension, and compare against that value inside it. Updated above.
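
One way to check this yourself is a quick timeit comparison. The sketch below is not the original test; it assumes the slower variant calls min inside the comprehension, and the names words, precomputed and recomputed are invented for the example. Timings will vary by machine, so none are quoted here.

```python
import timeit

# Made-up test data: 300 short strings.
words = ['foo', 'bar', 'baz', 'spam', 'eggs', 'knight'] * 50


def precomputed(items):
    # The shortest length is computed once, before the comprehension.
    min_len = len(min(items, key=len))
    return [e for e in items if len(e) == min_len]


def recomputed(items):
    # min() is re-evaluated for every element, so the work grows quadratically.
    return [e for e in items if len(e) == len(min(items, key=len))]


print('precomputed:', timeit.timeit(lambda: precomputed(words), number=20))
print('recomputed: ', timeit.timeit(lambda: recomputed(words), number=20))
```

Both functions return the same list; only the amount of repeated work differs.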

Nick Presta 2009-04-27 06:24:32

+1 Far more elegant than my sorted()[0] solution...

codeape 2009-04-27 06:35:41


Find shortest substring
