My program needs to do 2 things.

  1. Extract stuff from a webpage.

  2. Do stuff with a webpage.

However, there are many webpages, such as Twitter and Facebook.

Should I do this?

def facebookExtract():
    pass  # code here
def twitterExtract():
    pass  # code here
def myspaceExtract():
    pass  # code here
def facebookProcess():
    pass  # code here
def twitterProcess():
    pass  # code here
def myspaceProcess():
    pass  # code here

Or, should I have some sort of class? When is it recommended to use classes, and when is it recommend to just use functions?

+1  A: 

Put as much of the common stuff together in a single function. Once you've factored as much out as possible, build a mechanism for branching to the appropriate function for each website.

One possible way to do this is with Python's if/else clauses, but if you have many such functions, you may want something more elegant, such as

import importlib
F = importlib.import_module('yourproject.facebookmodule')

This lets you put the code that's specific to Facebook in its own area. Since you pass import_module() a string, you can change that string at runtime based on which site you're accessing, and then just call the functions of module F from your generic worker code. (The built-in __import__() also takes a string, but for a dotted name like this one it returns the top-level package yourproject rather than the submodule.)

More on that here: http://effbot.org/zone/import-confusion.htm
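
A minimal sketch of that runtime lookup, using importlib.import_module from the standard library. The module names facebookmodule and twittermodule, and the extract function inside them, are made up for illustration; they are registered by hand here only so the example runs without any files on disk.

```python
import importlib
import sys
import types


def _register_fake_module(name, label):
    """Stand-in for a real file such as facebookmodule.py (hypothetical)."""
    module = types.ModuleType(name)
    module.extract = lambda page: f"{label} data from {page}"
    sys.modules[name] = module


_register_fake_module("facebookmodule", "facebook")
_register_fake_module("twittermodule", "twitter")


def load_site_module(sitename):
    # The module name is built from a string chosen at runtime.
    return importlib.import_module(sitename + "module")


def run(sitename, page):
    # Generic worker code: it never names a specific site.
    site = load_site_module(sitename)
    return site.extract(page)


print(run("facebook", "profile page"))
```

In a real project the two _register_fake_module calls would go away and each site would get its own ordinary module file.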

Paul McMillan
+1  A: 

You use OOP when it makes sense, when it makes developing the solution quicker and when it makes the end result easier to read, understand and maintain.

In this case it might make sense to create a generic Extractor interface/class and then have subclasses for Twitter, MySpace, Facebook, etc., but this really depends on how site-specific the extraction is. The idea of this kind of abstraction is to hide such details. If you can do it, it makes sense. If you can't, you probably need a different approach.
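
One way such an Extractor hierarchy might look, as a rough sketch. The method names and the toy parsing rules are assumptions made for the example, not anything taken from the question.

```python
from abc import ABC, abstractmethod


class Extractor(ABC):
    """Generic interface: callers do not need to know which site they have."""

    @abstractmethod
    def extract(self, page):
        """Return the interesting data from the raw page text."""

    def run(self, page):
        # Site-independent steps live once, in the base class.
        return self.extract(page.strip())


class TwitterExtractor(Extractor):
    def extract(self, page):
        # Toy rule (assumption): keep the @mentions.
        return [word for word in page.split() if word.startswith("@")]


class FacebookExtractor(Extractor):
    def extract(self, page):
        # Toy rule (assumption): keep the non-empty lines.
        return [line for line in page.splitlines() if line]


for extractor in (TwitterExtractor(), FacebookExtractor()):
    print(type(extractor).__name__, extractor.run(" hi @bob \n"))
```

If the subclasses end up sharing almost nothing beyond the method name, that is a hint the abstraction is not hiding much and plain functions may do just as well.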

It may also be that similar benefits can be obtained from good decomposition of a procedural solution.

Remember at the end of the day that all these things are just tools. Pick the best one for that particular job rather than picking the hammer and then trying to turn everything into a nail.

cletus
+1  A: 

It's up to you. I personally try to stay away from Java-style classes when programming in Python. Instead, I use dicts and/or simple objects.

For instance, after defining these functions (the ones you defined in the question), I'd create a simple dict, maybe like this:

{ 'facebook' : { 'process' : facebookProcess, 'extract': facebookExtract }, 
 ..... 
}

or, better yet, use introspection to get the process/extract function automatically:

def processor(sitename):
    # e.g. 'facebook' -> facebookProcess
    return getattr(module, sitename + 'Process')

def extractor(sitename):
    # e.g. 'facebook' -> facebookExtract
    return getattr(module, sitename + 'Extract')

Where module is the current module (or the module that has these functions).

To get this module as an object:

import sys
module = sys.modules[__name__]

Assuming of course, that the generic main function does something like this:

    figure out sitename based on input.
    get the extractor function for the site
    get processor function for the site
    call the extractor then the processor
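
The four steps above can be sketched with the dict variant. The function bodies, the SITES table name, and the rule for picking the site from a URL are all placeholders invented for the example.

```python
def facebookExtract(page):
    return page.split()


def facebookProcess(words):
    return len(words)


def twitterExtract(page):
    return [w for w in page.split() if w.startswith("@")]


def twitterProcess(mentions):
    return sorted(mentions)


# One entry per site; adding a site means adding one row here.
SITES = {
    "facebook": {"extract": facebookExtract, "process": facebookProcess},
    "twitter": {"extract": twitterExtract, "process": twitterProcess},
}


def main(url, page):
    # Figure out the site name from the input (raises StopIteration if unknown).
    sitename = next(name for name in SITES if name in url)
    handlers = SITES[sitename]
    # Call the extractor, then hand its result to the processor.
    data = handlers["extract"](page)
    return handlers["process"](data)


print(main("http://twitter.com/someone", "hi @zed and @amy"))
```
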
hasen j
+5  A: 

My favorite rule of thumb: if you're in doubt (unspoken assumption: "and you're a reasonable person rather than a fanatic";-), make and use some classes. I've often found myself refactoring code originally written as simple functions into classes -- for example, any time the simple functions' best way of communicating with each other is through globals, that's a code smell, a strong hint that the system's factoring is not really good -- and often refactoring the OOP way is a reasonable fix for that.

Python is multi-paradigm, but its central paradigm is OOP (much like, say, C++). When a procedural or functional approach (maybe through generators &c) is optimal for some part of the system, that generally stands out -- for example, static functions are also a code smell, and if your classes have any substantial amount of those THAT is a hint to refactor things to avoid that requirement.

So, assuming you have a rich grasp of all the paradigms Python affords -- if you're STILL in doubt, that suggests you probably want to go OOP for that part of your system, simply because Python supports OOP even more wholly than it supports functional programming and the like.

From your very skeletal code, it seems to me that each extract/process pair belongs together and probably needs to communicate state, so a small set of classes with extraction and processing methods seems a natural fit.
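
A small sketch of what such a class might look like, with extraction leaving state behind for processing to pick up. The class names, the items attribute, and the hashtag rule are assumptions made up for the example.

```python
class Site:
    """Keeps an extract/process pair together so they can share state."""

    def __init__(self, name):
        self.name = name
        self.items = []  # state handed from extract() to process()

    def extract(self, page):
        raise NotImplementedError("each site supplies its own extraction")

    def process(self):
        # Default processing, shared by every site.
        return f"{self.name}: {len(self.items)} item(s)"


class Twitter(Site):
    def __init__(self):
        super().__init__("twitter")

    def extract(self, page):
        # Toy rule (assumption): collect the hashtags.
        self.items = [w for w in page.split() if w.startswith("#")]


site = Twitter()
site.extract("learning #python and #oop")
summary = site.process()
print(summary)
```

With plain functions, the items list would have to travel through return values or a global; on an instance it simply lives between the two calls.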

Alex Martelli
Fanatacism is just fine; as long as I don't have to listen to it ;)
Matthew Scharley
everything in python is an object; including functions. also, premature abstraction is the root of all evil.
hasen j
Not sure what "fanaticism" (hope fixing your broken spelling doesn't also qualify?!-) you're referring to, o @Matthew -- I'm a paladin of multi-paradigm programming, I just don't believe in fighting City Hall. So, _when in doubt_ among paradigms, in Python I choose OOP, just as in O'Caml I choose functional -- they're both multi-paradigm languages, but it's obvious that O'Caml is _primarily_ functional (with some OOP tacked on) while Python is _mostly_ OOP (with some functional tacked on). What's fanatical about that?!
Alex Martelli
@hasen, the `if` statement (for example) is NOT an object in Python; that's very different from Smalltalk, where there's no "if statement", but rather, `if` is a method of Booleans, and therefore IS an object. Smalltalk is *fanatically* OOP (you'd have to be crazy to program in Smalltalk by any other paradigm!), Python only *pragmatically* so (there's plenty of cases where non-OOP approaches, especially FP, are clearly preferable in Python). WRT abstraction, cfr http://us.pycon.org/2009/conference/schedule/event/75/ -- I think I discussed the issue reasonably well there;-).
Alex Martelli
@Alex: my comment was in response to your assumption that you noted, not your post in general.
Matthew Scharley
@Matthew, ah, got it, tx. Was hard to tell from context;-). Actually I find listening to fanatics on an occasional basis to be an important reminder of what an impossible task it is to make an API or other design "fool-proof" -- fools are just **too** ingenious!-)
Alex Martelli
@Alex, my point is: there's no need to turn everything into a class, especially when Python gives you many other/better alternatives.
hasen j
>>premature abstraction is the root of all evil>> :-)
foosion
@hasen, of course there's no need to turn everything into a class (especially as python _doesn't_ give you any real way to turn `if` into a class or method, differently from Smalltalk!-): MY point (since you appear to not have read it, let me repeat) is "Python is multi-paradigm, but its central paradigm is OOP" -- so, assuming you have a rich grasp of all the paradigms, if you're still in doubt defaulting to OOP is the safe choice (just like in O'Caml it would be defaulting to FP).
Alex Martelli
+1  A: 
memnoch_proxy
+2  A: 

"My program needs to do 2 things."

When you start out like that, the objects cannot be seen. Your perspective isn't right.

Change your thinking.

"My program works with stuff"

That's OO thinking. What "stuff" does your program work with? Define the stuff. Those are your basic classes. There's a class for each kind of stuff.

"My program gets the stuff from various sources"

There's a class for each source.

"My program displays the stuff"

This is usually a combination of accessor methods of the stuff plus some "reporting" classes that gather parts of the stuff to display it.

When you start out defining the "stuff" not the "do", you're doing OO programming. OO applies to everything, since every single program involves "doing" and "stuff". You can choose the "doing" POV (which can be procedural or functional), or you can choose the "stuff" POV (which is object-oriented).
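
A rough sketch of the "stuff" point of view: one class for the stuff, one for a source, one for reporting. The names Post, TwitterSource and Report, and the "author: text" line format, are invented for the example.

```python
class Post:
    """The 'stuff' the program works with."""

    def __init__(self, author, text):
        self.author = author
        self.text = text


class TwitterSource:
    """One class per source; each knows how to turn raw input into Posts."""

    def fetch(self, raw):
        posts = []
        for line in raw.splitlines():
            # Assumed input format: "author: text"
            author, _, text = line.partition(": ")
            posts.append(Post(author, text))
        return posts


class Report:
    """A 'reporting' class that gathers parts of the stuff for display."""

    def __init__(self, posts):
        self.posts = posts

    def render(self):
        return "\n".join(f"{p.author} said {p.text!r}" for p in self.posts)


posts = TwitterSource().fetch("amy: hello\nbob: hi")
print(Report(posts).render())
```

A second source would be another class with the same fetch method, and nothing in Report would need to change.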

S.Lott