I have been unable to find a function for Tukey's five-number summary in any of the standard packages, so I wrote the one below. Before throwing it toward the Cheeseshop, however, does anyone know of an already published version? Alternatively, please suggest any improvements. Thanks.

def fivenum(v):
    """Returns Tukey's five number summary (minimum, lower-hinge, median, upper-hinge, maximum) for the input vector, a list or array of numbers based on 1.5 times the interquartile distance"""
    import numpy as np
    from scipy.stats import scoreatpercentile
    try:
        np.sum(v)
    except TypeError:
        print('Error: you must provide a list or array of only numbers')
    q1 = scoreatpercentile(v,25)
    q3 = scoreatpercentile(v,75)
    iqd = q3-q1
    md = np.median(v)
    whisker = 1.5*iqd
    return np.min(v), md-whisker, md, md+whisker, np.max(v)
+5  A: 

I would get rid of these two things:

import numpy as np
from scipy.stats import scoreatpercentile

You should be importing at the module level. This means that users will be aware of missing dependencies as soon as they import your module, rather than when they call the function.
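
As a rough sketch of that layout, assuming a hypothetical file named `fivenum.py`: the imports sit at the top of the module, so a missing dependency fails when the module is imported, not on the first call. The standard library's `statistics` module stands in for numpy and scipy here so the example runs without third-party packages (an assumption of this sketch, not the question's code); the arithmetic is carried over from the question unchanged.

```python
# fivenum.py (hypothetical module name)
# Module-level import: a missing dependency is reported at import time.
import statistics  # stand-in for numpy/scipy in this sketch


def fivenum(v):
    """Summarize v, a sequence of numbers, using the question's arithmetic."""
    # Quartiles and median with linear interpolation between data points.
    q1, md, q3 = statistics.quantiles(v, n=4, method="inclusive")
    whisker = 1.5 * (q3 - q1)
    return min(v), md - whisker, md, md + whisker, max(v)
```

Any further function added to the same file can reuse the import without repeating it.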

try:
    np.sum(v)
except TypeError:
    print('Error: you must provide a list or array of only numbers')

Several problems with this:

  1. Don't type check in Python. Document what the function takes.
  2. How do you know callers will see this? They might not be running at a console, and even if they are, they might not want your error message interfering with their output.
  3. Don't type check in Python.

If you do want to raise some sort of exception for invalid data (not type checking), either let an existing exception propagate, or wrap it in your own exception type.
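
One way the wrapping option might look, as a minimal sketch: `FiveNumError` is a hypothetical exception name, and the standard library's `statistics` module again stands in for numpy and scipy (both are assumptions, not part of the question's code). Deleting the `try`/`except` gives the other option, where the underlying exception simply propagates to the caller.

```python
import statistics


class FiveNumError(ValueError):
    """Hypothetical exception type for input that cannot be summarized."""


def fivenum(v):
    """Summarize v, a sequence of numbers, using the question's arithmetic."""
    try:
        q1, md, q3 = statistics.quantiles(v, n=4, method="inclusive")
    except (TypeError, statistics.StatisticsError) as exc:
        # Raise instead of printing; "from exc" keeps the original error
        # attached as __cause__ so callers can still inspect it.
        raise FiveNumError("fivenum() needs a sequence of numbers") from exc
    whisker = 1.5 * (q3 - q1)
    return min(v), md - whisker, md, md + whisker, max(v)
```

Callers can then catch `FiveNumError` (or let it propagate) instead of relying on a message printed to the console.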

detly
Good comments both. The imports are there just as a placeholder for when it will be a module. The exception handling I'll also take up. Thanks.
Richard Careaga
You already have a module (in Python, all code is contained in a module). Just do your imports at the top level, outside the function. Not only is it arguably "more correct", but if/when you add another function to the file you won't have to write the import statements again.
Nathan Davis
It's not exactly right to call what's happening type checking; it's just bad error reporting. The code leaves the client code free to call it with `v` equal to anything that can be passed to `sum`, which is completely correct.
aaronasterling
I quite often use delayed imports inside a function. In a larger package not everyone uses every function, so we don't need to import everything by default. I think this depends a lot on the context.
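
A delayed import of that kind might look like the sketch below. `summary_to_json` is a hypothetical helper, and the standard library's `json` module stands in for an optional or expensive dependency that only some users of a package would need.

```python
def summary_to_json(summary):
    """Serialize a five-number summary tuple to a JSON array string."""
    # Delayed import: the dependency is loaded on the first call only.
    # Python caches imported modules, so later calls pay almost nothing.
    import json  # stand-in for an optional dependency

    return json.dumps(list(summary))
```

The trade-off is the one raised in the answer: a missing dependency then surfaces at call time, not at import time.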