ansaurus

Question

Finding a Value within a Range in a List of Tuple Values in Python

Answer 1

+2 A:

# bmi = <whatever>
found_bmi_range = [bmi_range for bmi_range
                   in bmi_ranges
                   if bmi_range[2] <= bmi <= bmi_range[3]
                  ][0]

You can add if clauses to list comprehensions that filter what items are included in the result.

Note: you may want to adjust your range specifications to use a non-inclusive upper bound (i.e. [a,b) + [b,c) + [c,d) et cetera), and then change the conditional to `low <= bmi < high`, so that you don't have issues with edge cases.
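A minimal sketch of that half-open variant, assuming the ranges are re-specified so that each upper bound equals the next range's lower bound. The bounds below are adapted from the question's data for illustration, and `find_bmi_range` is a hypothetical helper name:

```python
# Hypothetical half-open ranges: each tuple covers [low, high).
bmi_ranges = [
    (u'Underweight', u'Severe Thinness', 0.0, 16.0),
    (u'Underweight', u'Moderate Thinness', 16.0, 17.0),
    (u'Underweight', u'Mild Thinness', 17.0, 18.5),
    (u'Normal Range', u'Normal Range', 18.5, 25.0),
    (u'Overweight', u'Overweight', 25.0, 30.0),
    (u'Obese', u'Obese Class I', 30.0, 35.0),
    (u'Obese', u'Obese Class II', 35.0, 40.0),
    (u'Obese', u'Obese Class III', 40.0, float('inf')),
]

def find_bmi_range(bmi):
    """Return the first range with low <= bmi < high, or None if none matches."""
    matches = [r for r in bmi_ranges if r[2] <= bmi < r[3]]
    return matches[0] if matches else None
```

With bounds like these, a value such as 29.995 lands in the Overweight range instead of falling between two ranges.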

Amber 2010-09-25 18:43:08

And if you really care about performance, you could use a binary search tree to reduce the number of comparisons. But since the OP has an SQL database, it would do the same thing with proper indexes.

Andrew 2010-09-25 18:55:09

bmi = 29.9950000 ?

eumiro 2010-09-25 19:05:46

@eumiro - flaw in the original data; could easily be adapted to `bmi_range[2] <= bmi < bmi_range[3]` if the original data were specified as a `[x,y)` type of range.

Amber 2010-09-25 19:08:16

@Amber, the OP is open to any other data structure, so this might be a good hint not to use those .99 limit values. My answer uses only one value to limit each range. Your list comprehension would have to be a little more complicated to take the minValue from the next range.

eumiro 2010-09-25 19:11:48

Thanks - yes, my ranges would not allow more decimal places, but BMI standards usually use only 1-2 decimal places anyway so I could round in the assignment of BMI. I would be interested in seeing how this would work with only upper or lower ranges, though (the bisect solution is much, much slower than the list comprehension, @eumiro).

Jough Dempsey 2010-09-25 19:33:53

Answer 2

+1 A:

You can do this with a list comprehension:

>>> result = [r for r in bmi_ranges if r[2] <= 32 <= r[3]]
>>> print result
[(u'Obese', u'Obese Class I', 30.0, 34.99)]

However it would probably be faster to request the database to do this for you as otherwise you are requesting more data than you need. I don't understand how using a BETWEEN requires using one more data connection. If you could expand on that it would be useful. Are you talking about the pros and cons of caching data versus always asking for live data?

You may also want to create a class for your data so that you don't have to refer to fields as x[2], but instead can use more meaningful names. You could also look at namedtuples.
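A rough sketch of the namedtuple idea, assuming field names of my own choosing (`classification`, `description`, `low`, `high` are hypothetical; the question's tuples are unnamed) and only two of the question's ranges for brevity:

```python
from collections import namedtuple

# Hypothetical field names for the four positions in each tuple.
BmiRange = namedtuple('BmiRange', ['classification', 'description', 'low', 'high'])

bmi_ranges = [
    BmiRange(u'Overweight', u'Overweight', 25.0, 29.99),
    BmiRange(u'Obese', u'Obese Class I', 30.0, 34.99),
]

# Same filter as above, but with named fields instead of r[2] and r[3].
result = [r for r in bmi_ranges if r.low <= 32 <= r.high]
```

A namedtuple is still a tuple, so existing code that indexes with `r[2]` keeps working.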

Mark Byers 2010-09-25 18:46:10

Probably not faster to do a trip to the database to search through only 8 ranges...

Amber 2010-09-25 18:48:14

The roundtrip might be the most expensive part.

Mark Byers 2010-09-25 18:50:57

...which is all the more reason to eliminate the roundtrip entirely.

Amber 2010-09-25 18:53:01

@Amber: If you're fetching the data from the database anyway you should use BETWEEN, if you're not then you are talking about caching rather than the relative speed of each query. Caching has pros but also cons.

Mark Byers 2010-09-25 18:54:56

@Mark: The list of ranges might very well be constant, in which case it's not caching at all, but whether you're talking to a DB or not, period, if the BMI info is coming from the user. (It may not be, but it's a perfectly imaginable scenario.)

Amber 2010-09-25 19:05:33

bmi = 29.9950000 ?

eumiro 2010-09-25 19:06:10

@Amber: The OP says he is using a database and he says the reason he is not using BETWEEN is because that would require an extra connection.

Mark Byers 2010-09-25 19:13:59

@Mark: The OP doesn't actually state what they're using a DB for currently - only that they *could* use a DB.

Amber 2010-09-25 19:22:58

@Amber: Oh, I see. Thanks. OK but now assuming that you are right, I don't at all see the purpose of the unnamed tuple instead of using a class...

Mark Byers 2010-09-25 19:24:37

@Mark: Probably just the first thing that came to the OP's mind. Dicts, named tuples, or a class would work as well.

Amber 2010-09-25 19:58:42

Answer 3

A:

I'm not sure I understand why you can't do this just by iterating over the list (I know there are more efficient data structures, but the list is very short and iteration is easier to understand). What's wrong with:

def check_bmi(bmi, bmi_range):
    for cls, name, a, b in bmi_range:
        if a <= bmi <= b:
            return cls # or name or whatever you need.

wxs 2010-09-25 18:46:38

Er, did you mean `a <= bmi <= b` ?

Amber 2010-09-25 18:47:24

bmi = 29.9950000 ?

eumiro 2010-09-25 19:05:13

I was iterating, but it seemed like a naive way of getting there and I thought I was closer to the "right" way to do it with the listcomp. This solution would be far less attractive were the dataset larger, but BMI ranges are a standard and there aren't that many values, which is why I wanted to avoid DB overhead to begin with.

Jough Dempsey 2010-09-25 19:39:57

Ah right amber. And eumiro, if the bmi is not in one of the given ranges it will return None.

wxs 2010-09-25 19:54:08

Answer 4

A:

zchtodd 2010-09-25 18:57:05

bmi = 29.9950000 ?

eumiro 2010-09-25 19:06:33

Answer 5

A:

If you would like a lighter data structure and a single import from the standard library:

import bisect

bmi_ranges = []
bmi_ranges.append((u'Underweight', u'Severe Thinness', 0, 15.99))
bmi_ranges.append((u'Underweight', u'Moderate Thinness', 16.00, 16.99))
bmi_ranges.append((u'Underweight', u'Mild Thinness', 17.00, 18.49))
bmi_ranges.append((u'Normal Range', u'Normal Range', 18.50, 24.99))
bmi_ranges.append((u'Overweight', u'Overweight', 25.00, 29.99))
bmi_ranges.append((u'Obese', u'Obese Class I', 30.00, 34.99))
bmi_ranges.append((u'Obese', u'Obese Class II', 35.00, 39.99))
bmi_ranges.append((u'Obese', u'Obese Class III', 40.00, 1000.00))

# We take just the minimal BMI value for each class
# to find the limit values between ranges:

limitValues = [line[2] for line in bmi_ranges][1:]
# limitValues = [16.0, 17.0, 18.5, 25.0, 30.0, 35.0, 40.0]

# bisect.bisect(list, value) returns the index of the range
# in the list to which value belongs
# (bmi must already be set, e.g. bmi = 26.2)
bmi_range = bmi_ranges[bisect.bisect(limitValues, bmi)]

More information: bisect
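To illustrate how a single limit per range avoids the gap between 29.99 and 30.0, here is a small self-contained sketch. `classify` and `names` are hypothetical names introduced for this example; the limits are the lower bounds of the ranges listed above:

```python
import bisect

# Lower bound of every range except the first.
limit_values = [16.0, 17.0, 18.5, 25.0, 30.0, 35.0, 40.0]

# One description per range, in the same order as bmi_ranges.
names = [u'Severe Thinness', u'Moderate Thinness', u'Mild Thinness',
         u'Normal Range', u'Overweight', u'Obese Class I',
         u'Obese Class II', u'Obese Class III']

def classify(bmi):
    """Return the description of the range that bmi falls into."""
    # bisect.bisect counts how many limits are <= bmi,
    # which is exactly the index of the matching range.
    return names[bisect.bisect(limit_values, bmi)]
```

Because every value maps to some index, a BMI such as 29.995 is classified as Overweight, and exactly 30.0 falls into Obese Class I.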

eumiro 2010-09-25 18:59:46

This seems overly complex (especially compared with the list comprehension solutions above) and less Pythonic, but it's interesting and may be effective with a larger dataset.

Jough Dempsey 2010-09-25 19:36:35

Answer 6

A:

The built-in filter function exists for this purpose:

bmi = 26.2
# Python 2: filter returns a list, so it can be indexed directly
answer = filter(lambda t: t[2] <= bmi <= t[3], bmi_ranges)[0]
print answer
# (u'Overweight', u'Overweight', 25.0, 29.989999999999998)

Hope this helps

inspectorG4dget 2010-09-25 19:01:26

bmi = 29.9950000 ?

eumiro 2010-09-25 19:04:50

Using the `if` clause in a list comprehension is the preferred way of doing this now; filter remains available but isn't the preferred method.

Amber 2010-09-25 19:07:25

@eumiro: 29.995 will not fall into any range, because of the way @JoughDempsey made the range brackets. 29.995 > 29.99

inspectorG4dget 2010-09-25 20:19:22

@Amber: Can you please explain why the list comprehension's if statement is preferred to filter?

inspectorG4dget 2010-09-25 20:20:10

@inspector: It's considered more Pythonic and easier to read. It can also create a generator instead of a list for lazy evaluation, if so desired.
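A short sketch of the lazy variant, using three of the question's ranges for brevity. Passing a default to `next()` is my own choice here, so that a missing match yields None instead of raising an exception:

```python
bmi_ranges = [
    (u'Normal Range', u'Normal Range', 18.50, 24.99),
    (u'Overweight', u'Overweight', 25.00, 29.99),
    (u'Obese', u'Obese Class I', 30.00, 34.99),
]

bmi = 26.2

# Parentheses make this a generator expression: nothing is evaluated
# until an item is requested, and iteration stops at the first match.
matches = (r for r in bmi_ranges if r[2] <= bmi <= r[3])
answer = next(matches, None)  # None if no range matches

# With the inclusive .99 bounds, 29.995 matches no range at all.
gap = next((r for r in bmi_ranges if r[2] <= 29.995 <= r[3]), None)
```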

Amber 2010-09-25 20:28:52

Answer 7

A:

This is how I would deal with it:

import random

bmi_ranges = [(u'Underweight', u'Severe Thinness', 16.0),
               (u'Underweight', u'Moderate Thinness', 17.0),
               (u'Underweight', u'Mild Thinness', 18.5),
               (u'Normal Range', u'Normal Range', 25.0),
               (u'Overweight', u'Overweight', 30.0),
               (u'Obese', u'Obese Class I', 35.0),
               (u'Obese', u'Obese Class II', 40.0),
               (u'Obese', u'Obese Class III', 1000.0)]

def bmi_lookup(bmi_value):
    return next((classification, description, lessthan)
         for classification, description, lessthan in bmi_ranges
         if bmi_value < lessthan)

for bmi in range(20):
    random_bmi = random.random()*50
    print random_bmi, bmi_lookup(random_bmi)

Tony Veijalainen 2010-09-25 20:29:09
