views:

206

answers:

4

First of all python is an awesome language. This is my first project using python and I've made a ridiculous amount of progress already.

There's no way that this code below is the best way to do this. What's the most idiomatic way write a class definition?

class Course:

    crn =  course =  title =  tipe =  cr_hours =  seats =  instructor =  days =  begin =  end = location = exam = ""

    def __init__(self, pyQueryRow):
        self.crn = Course.get_column(pyQueryRow, 0)
        self.course = Course.get_column(pyQueryRow, 1)
        self.title = Course.get_column(pyQueryRow, 2)
        self.tipe = Course.get_column(pyQueryRow, 3)
        self.cr_hours = Course.get_column(pyQueryRow, 4)
        self.seats = Course.get_column(pyQueryRow, 5)
        self.instructor = Course.get_column(pyQueryRow, 6)
        self.days = Course.get_column(pyQueryRow, 7)
        self.begin = Course.get_column(pyQueryRow, 8)
        self.end = Course.get_column(pyQueryRow, 9)
        self.location = Course.get_column(pyQueryRow, 10)
        self.exam = Course.get_column(pyQueryRow, 11)

    def get_column(row, index):
        return row.find('td').eq(index).text()

Thanks!

+2  A: 

EDIT: Actually, the best might be:

self.crn, self.course, self.title, self.tipe, self.cr_hours, self.seats,\ 
self.instructor, self.days, self.begin, self.end, self.location, self.exam = \ 
[pq(td).text() for td in pyQueryRow.find('td')]

That assumes you've imported PyQuery as pq. This avoids ever using indices at all.


self.crn, self.course, self.title, self.tipe, self.cr_hours, self.seats,\ 
self.instructor, self.days, self.begin, self.end, self.location, self.exam = \
map(lambda index: get_column(pyQueryRow, index), xrange(0, 12))

or if you want a list comprehension:

self.crn, self.course, self.title, self.tipe, self.cr_hours, self.seats,\ 
self.instructor, self.days, self.begin, self.end, self.location, self.exam = \
[get_column(pyQueryRow, index) for index in xrange(0, 12)]

I don't know if these are the most idiomatic, but there's definitely less boilerplate.

Also, remove the crn = course =. You're assigning to the class, not the instance.

Matthew Flaschen
Yeah I thought about a similar solution but not quite as elegant as yours (list comprehension instead of map (which is dumb when you consider map is defined in terms of a list comprehension)).
Tyler
Thanks!!!!!(15)
Tyler
I like the lambda idea, but I don't think this is actually more readable, because it's hard to see which index gets mapped to which field. Imagine if you had to add one in the middle somewhere - it would be easy to make a mistake.
Evgeny
@Evgeny, I see what you mean. But since it's based on scraping an HTML page, if one is added in the middle, the others will shift down. You just have to put it between the correct two and increment the maximum.
Matthew Flaschen
+2  A: 

I'm not sure that there is a "better" way. What you have is certainly quite readable. If you wanted to avoid duplicating the Course.get_column code you could define a lambda for that, as in Matthew Flaschen's answer, eg.

class Course:
    def __init__(self, pyQueryRow):
        get_column = lambda index: pyQueryRow.find('td').eq(index).text()

        self.crn = get_column(0)
        self.course = get_column(1)
        self.title = get_column(2)
        self.tipe = get_column(3)
        self.cr_hours = get_column(4)
        self.seats = get_column(5)
        self.instructor = get_column(6)
        self.days = get_column(7)
        self.begin = get_column(8)
        self.end = get_column(9)
        self.location = get_column(10)
        self.exam = get_column(11)

Note that you don't need the line that initialises all the fields to "" beforehand - just setting them in __init__ is fine. Edit: in fact, as Matthew says, that sets class fields, not instance fields - I totally missed that.

Evgeny
+12  A: 
def__init__(self, pyQueryRow):
    for i,attr in enumerate("crn course title tipe cr_hours seats instructor"
                            " days begin end location exam".split()):
        setattr(self, attr, self.get_column(pyQueryRow, i))

This way avoids multiple calls to self.get_column

def__init__(self, pyQueryRow):
    attrs = ("crn course title tipe cr_hours seats instructor"
             " days begin end location exam".split())
    values = [td.text for td in pyQueryRow.find('td')]
    for attr, value in zip(attrs, values):
        setattr(self, attr, value)
gnibbler
Isn't this risky? What if you misspell a member in that string?
Assaf Lavie
@Assaf Lavie, Same as if you mistype an attribute name in your code. Either way Python won't complain until you try to access an attribute that doesn't exist. Normally you should have unit tests that would catch those types of errors
gnibbler
+1  A: 

Personally, I'd use a dictionary to map the property to column numbers:

class Course:

    crn =  course =  title =  tipe =  cr_hours =  seats =  instructor =  days =  begin =  end = location = exam = ""

    def __init__(self, pyQueryRow):
        course_row_mapping = {
            'crn' : 0,
            'course' : 1,
            'title' : 2,
            'tipe' : 3, # You probably mean "type"?
            'cr_hours' : 4,
            'seats' : 5,
            'instructor' : 6,
            'days' : 7,
            'begin' : 8,
            'end' : 9,
            'location' : 10,
            'exam' : 11,
        }   

        for name, col in course_row_mapping.iteritems():
            setattr(self, name, Course.get_column(pyQueryRow, col))

    def get_column(row, index):
        return row.find('td').eq(index).text()
sdolan