ansaurus

Question

Better Way to Write This List Comprehension?

Answer 1

+6 A:

You can cut your field_breaks list in half by doing:

field_breaks = [0, 2, 10, 13, 21, 32, 43, ..., 250, 300]
s = ...
data = [s[x[0]:x[1]].strip() for x in zip(field_breaks[:-1], field_breaks[1:])]
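To see what the zip produces, here is a minimal sketch. The record `s` is a made-up 43-character sample, not the asker's real data, and the break list is cut off at the first seven values:

```python
# Hypothetical sample record (43 characters); the real data is not shown here.
s = "4100100297LICACTIVE  09-JUN-198131-DEC-2010"
field_breaks = [0, 2, 10, 13, 21, 32, 43]

# Pair every break with the one after it: (0, 2), (2, 10), (10, 13), ...
pairs = list(zip(field_breaks[:-1], field_breaks[1:]))

# Slice each field out of the record and trim the padding.
data = [s[x[0]:x[1]].strip() for x in pairs]
print(pairs)
print(data)
```

Each break appears once as the end of one field and once as the start of the next, which is why the list only needs N+1 numbers for N fields.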

dan04 2010-07-02 05:25:05

+1: Great idea to cut down on redundancy and risk of clerical errors. Combine this with Tomasz Wysocki's solution and it's perfect. Easy to read, too.

Tim Pietzcker 2010-07-02 06:14:55

Thanks! I'm going to go with this one because I like the idea of reducing the size of field_breaks.

Swingley 2010-07-02 15:08:41

Answer 2

+7 A:

You can use tuple unpacking for cleaner code:

data = [s[a:b].strip() for a,b in field_breaks]

Tomasz Wysocki 2010-07-02 05:53:04

+1, and this could be combined with dan04's idea as well (possibly using `pairwise` from the [`itertools` documentation](http://docs.python.org/library/itertools.html))
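A rough sketch of that combination, assuming a flat list of breaks and a made-up sample record. The `pairwise` helper follows the recipe in the itertools documentation; on Python 3.10 and later, `itertools.pairwise` should do the same job:

```python
from itertools import tee

def pairwise(iterable):
    """s -> (s0, s1), (s1, s2), (s2, s3), ... (recipe from the itertools docs)."""
    a, b = tee(iterable)
    next(b, None)  # advance the second iterator by one element
    return zip(a, b)

# Hypothetical sample record and a truncated break list, for illustration only.
s = "4100100297LICACTIVE  09-JUN-198131-DEC-2010"
field_breaks = [0, 2, 10, 13, 21, 32, 43]

# Tuple unpacking gives each slice boundary a name.
data = [s[a:b].strip() for a, b in pairwise(field_breaks)]
print(data)
```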

David Zaslavsky 2010-07-02 06:04:07

Answer 3

A:

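Here is a way using map (Python 2 only, since `__getslice__` was removed in Python 3; note that it does not strip whitespace):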

data = map(s.__getslice__, *zip(*field_breaks))
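For Python 3, something along these lines should behave similarly. This is a sketch, assuming `field_breaks` is a list of (start, end) pairs and using a made-up sample record; a lambda stands in for `__getslice__`, and the `.strip()` call is an addition:

```python
# Hypothetical sample record and (start, end) pairs, for illustration only.
s = "4100100297LICACTIVE  09-JUN-198131-DEC-2010"
field_breaks = [(0, 2), (2, 10), (10, 13), (13, 21), (21, 32), (32, 43)]

# zip(*field_breaks) transposes the pairs into a tuple of starts and a tuple of ends.
starts, ends = zip(*field_breaks)

# map() with two iterables calls the function with one item from each.
data = list(map(lambda a, b: s[a:b].strip(), starts, ends))
print(data)
```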

gnibbler 2010-07-02 06:06:34

Answer 4

+3 A:

To be honest, I don't find the parse-by-column-number approach very readable, and I question its maintainability (off-by-one errors and the like), though I'm sure the list comprehensions are very virtuous and efficient in this case, and the suggested zip-based solution has a nice functional tweak to it.

Instead, I'm going to throw softballs from out here in left field, since list comprehensions are supposed to be in part about making your code more declarative. For something completely different, consider the following approach based on the pyparsing module:

from pyparsing import Word, Combine, Literal, nums, alphas

def Fixed(chars, width):
    # Match exactly `width` characters drawn from `chars`
    return Word(chars, exact=width)

myDate = Combine(Fixed(nums,2) + Literal('-') + Fixed(alphas,3) + Literal('-')
                 + Fixed(nums,4))

# The parentheses let the expression continue onto the next line
fullRow = (Fixed(nums,2) + Fixed(nums,8) + Fixed(alphas,3) + Fixed(alphas,8)
           + myDate + myDate + ...)

data = fullRow.parseString(s)
# should be ['41', '00100297', 'LIC', 'ACTIVE  ',
#            '09-JUN-1981', '31-DEC-2010', ...]

To make this even more declarative, you could name each of the fields as you come across them. I have no idea what the fields actually are, but something like:

someId = Fixed(nums,2)
someOtherId = Fixed(nums,8)
recordType = Fixed(alphas,3)
recordStatus = Fixed(alphas,8)
birthDate = myDate
issueDate = myDate
# The parentheses let the expression continue onto the next line
fullRow = (someId + someOtherId + recordType + recordStatus
           + birthDate + issueDate + ...)

Now an approach like this probably isn't going to break any land speed records. But, holy cow, wouldn't you find this easier to read and maintain?

Owen S. 2010-07-02 06:14:15

Very nice - all I would add would be a parse action to convert myDate to a Python datetime during parsing, and some results names, so that the values would be easily accessible post-parsing and the dates would already be usable as datetimes. (Fixed is a nice little helper, too.)
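As a rough illustration of the conversion such a parse action might perform, here is a standard-library sketch. The name `to_date` is hypothetical; the assumption is that it would be attached with something like `myDate.setParseAction(to_date)`, and that results names would be given with `setResultsName`:

```python
from datetime import datetime

def to_date(tokens):
    # A pyparsing parse action can accept just the matched tokens;
    # tokens[0] is the combined date text, e.g. '09-JUN-1981'.
    # %b matches abbreviated month names case-insensitively, but it depends
    # on the current locale (English month names are assumed here).
    return datetime.strptime(tokens[0], "%d-%b-%Y")

# Called directly on a plain list, standing in for the parser's tokens.
birth = to_date(["09-JUN-1981"])
print(birth.year, birth.month, birth.day)
```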

Paul McGuire 2010-07-10 06:57:10
