ansaurus

Question

Pythonic way to implement three similar integer range operators?

Answer 1

+2 A:

Now I am thinking of something such as:

def comparator(lop, rop):
    def comp(i, j, k):
        if j <= k:
            return lop(j, i) and rop(i,k)
        return lop(j, i) or rop(i,k)

    return comp

from operator import le, lt

inRange = comparator(le, lt)
inStrictRange = comparator(lt, lt)
inRange2 = comparator(lt, le)

Which looks better indeed.

NicDumZ 2009-06-12 06:55:07

That's a nice way indeed. I'd recommend the name "comparer" or some such though; "comparator" kind of indicates that the function is a device which compares, rather than just something that compares. Small matter, I know =)

Blixt 2009-06-12 07:05:24

Pretty classy indeed. Unfortunately, le and lt don't jump out at you as well as operator literals do.

Nikhil Chelliah 2009-06-12 07:14:35

Answer 2

+5 A:

The Pythonic way to do it is to choose readability, and therefor keep the 3 methods as they were at the beginning.

It's not like they are HUGE methods, or there are thousand of them, or you would have to dynamically generate them.

e-satis 2009-06-12 06:58:04

+1: Particularly since the docstrings are most of the text

James Hopkin 2009-06-12 08:11:19

maybe. But my question sort of begs for another approach, or at least suggestions :)

NicDumZ 2009-06-13 02:30:11

Answer 3

+4 A:

No higher-order functions, but it's less code, even with the extraneous else.

def exclusive(i, j, k):
    if j <= k:
        return j < i < k
    else:
        return j < i or i < k

def inclusive_left(i, j, k):
    return i==j or exclusive(i, j, k)

def inclusive_right(i, j, k):
    return i==k or exclusive(i, j, k)

I actually tried switching the identifiers to n, a, b, but the code began to look less cohesive. (My point: perfecting this code may not be a productive use of time.)

Nikhil Chelliah 2009-06-12 07:12:45

I like this I guess: concentrate on the Mathematical meaning rather than on code factoring. And /yes/*3 for the productivity remark :)

NicDumZ 2009-06-12 07:17:14

Answer 4

+7 A:

Two Zen of Python principles leap to mind:

Simple is better than complex.
There should be one—and preferably only one—obvious way to do it.

`range`

The Python built-in function range(start, end) generates a list from start to end.¹ The first element of that list is start, and the last element is end - 1.

There is no range_strict function or inclusive_range function. This was very awkward to me when I started in Python. ("I just want a list from a to b inclusive! How hard is that, Guido?") However, the convention used in calling the range function was simple and easy to remember, and the lack of multiple functions made it easy to remember exactly how to generate a range every time.

Recommendation

As you've probably guessed, my recommendation is to only create a function to test whether i is in the range [j, k). In fact, my recommendation is to keep only your existing inRange function.

(Since your question specifically mentions Pythonicity, I would recommend you name the function as in_range to better fit with the Python Style Guide.)

Justification

Why is this a good idea?

The single function is easy to understand. It is very easy to learn how to use it.

Of course, the same could be said for each of your three starting functions. So far so good.
There is only one function to learn. There are not three functions with necessarily similar names.

Given the similar names and behaviours of your three functions, it is somewhat possible that you will, at some point, use the wrong function. This is compounded by the fact that the functions return the same value except for edge cases, which could lead to a hard-to-find off-by-one bug. By only making one function available, you know you will not make such a mistake.
The function is easy to edit.

It is unlikely that you'll need to ever debug or edit such an easy piece of code. However, should you need to do so, you need only edit this one function. With your original three functions, you have to make the same edit in three places. With your revised code in your self-answer, the code is made slightly less intuitive by the operator obfuscation.
The "size" of the range is obvious.

For a given ring where you would use inRange(i, j, k), it is obvious how many elements would be covered by the range [j, k). Here it is in code.
```
if j <= k:
    size = k - j
if j > k:
    size = k - j + MAX
```
So therefore
```
size = (k - j) % MAX
```

Caveats

I'm approaching this problem from a completely generic point of view, such as that of a person writing a function for a publicly-released library. Since I don't know your problem domain, I can't say whether this is a practical solution.

Using this solution may mean a fair bit of refactoring of the code that calls these functions. Look through this code to see if editing it is prohibitively difficult or tedious.

¹: Actually, it is range([start], end, [step]). I trust you get what I mean though.

Wesley 2009-06-12 08:39:00

Wow. Thanks for taking the time to write that answer. But afaik, range() generates a list. And I don't really want to generate a list for each generic comparison: we are talking of generating a list of size O(n), and doing a O(n) operation (_contains__) instead of my three arithmetic operations. It's a very important overhead :)

NicDumZ 2009-06-12 10:29:25

I don't mean to say that you should generate a range for your comparison. I'm saying that you should consider why there is only one `range` function, and then consider only writing one `inRange` function for the same reasons. Your `inRange` function is obviously O(1). However, you should consider omitting the other two functions entirely. If you suddenly need an inclusive `inRange` for *x* between *a* and *b*, you have `inRange(x, a, b+1)`, which is also O(1).

Wesley 2009-06-12 21:41:55

Right, sorry for reading too fast through your answer then. I certainly agree with your suggestion then. in_range2(x, lo, hi) can be obtained through `not in_range(x, hi, lo)`

NicDumZ 2009-06-13 02:39:09

For clarity's sake, I would rewrite a call to `in_range2(x, lo, hi)` as `in_range(x, lo + 1, hi + 1)`. After all, the goal of being pythonic is being easy to understand. Admittedly, adding one to the bounds is not terribly clear either, but it's a little easier to work out with time and a pencil.

Wesley 2009-06-13 03:45:01

If you don't want to generate a list, use xrange() instead. It uses yield http://stackoverflow.com/questions/231767/can-somebody-explain-me-the-python-yield-statement/231855#231855) and generate the values on the fly.

e-satis 2009-06-13 08:24:54

Just to be clear, I'll respond again: the solution I've recommended does not use the `range` function. I merely point to the `range` function as a model for designing the `inRange` function.

Wesley 2009-06-13 10:44:48

Answer 5

+1 A:

To make it more familiar to your users, I would have one main in_range function with the same bounds as range(). This makes it much easier to remember, and has other nice properties as Wesley mentioned.

def in_range(i, j, k):
    return (j <= i < k) if j <= k else (j <= i or i < k)

You can certainly use this one alone for all your use cases by adding 1 to j and/or k. If you find that you're using a specific form frequently, then you can define it in terms of the main one:

def exclusive(i, j, k):
    """Excludes both endpoints."""
    return in_range(i, j + 1, k)

def inclusive(i, j, k):
    """Includes both endpoints."""
    return in_range(i, j, k + 1)

def weird(i, j, k):
    """Excludes the left endpoint but includes the right endpoint."""
    return in_range(i, j + 1, k + 1)

This is shorter than mucking around with operators, and is also much less confusing to understand. Also, note that you should use underscores instead of camelCase for function names in Python.

Kiv 2009-06-12 22:42:50

+1 for the CamelCase remark, and for reminding me that for naturals "i in [lo, hi[" <=> 'i in [lo, hi+1]". It's however a bit unperfect: you need to include MAX modulos, as my ranges are cyclic :)

NicDumZ 2009-06-13 02:25:51

Answer 6

+1 A:

I'd go one step further than Wesley in aping the normal python 'in range' idiom; i'd write a cyclic_range class:

import itertools

MAX = 10 # or whatever

class cyclic_range(object):
 def __init__(self, start, stop):
  # mod so you can be a bit sloppy with indices, plus -1 means the last element, as with list indices
  self.start = start % MAX
  self.stop = stop % MAX
 def __len__(self):
  return (self.stop - self.start) % MAX
 def __getitem__(self, i):
  return (self.start + i) % MAX
 def __contains__(self, x):
  if (self.start < self.stop):
   return (x >= self.start) and (x < self.stop)
  else:
   return (x >= self.start) or (x < self.stop)
 def __iter__(self):
  for i in xrange(len(self)):
   yield self[i]
 def __eq__(self, other):
  if (len(self) != len(other)): return False
  for a, b in itertools.izip(self, other):
   if (a != b): return False
  return True
 def __hash__(self):
  return (self.start << 1) + self.stop
 def __str__(self):
  return str(list(self))
 def __repr__(self):
  return "cyclic_range(" + str(self.start) + ", " + str(self.stop) + ")"
 # and whatever other list-like methods you fancy

You can then write code like:

if (myIndex in cyclic_range(firstNode, stopNode)):
 blah

To do the equivalent of inRange. To do inStrictRange, write:

if (myIndex in cyclic_range(firstNode + 1, stopNode)):

And to do inRange2:

if (myIndex in cyclic_range(firstNode + 1, stopNode + 1)):

If you don't like doing the additions by hand, how about adding these methods:

 def strict(self):
  return cyclic_range(self.start + 1, self.stop)
 def right_closed(self):
  return cyclic_range(self.start + 1, self.stop + 1)

And then doing:

if (myIndex in cyclic_range(firstNode, stopNode).strict()): # inStrictRange
if (myIndex in cyclic_range(firstNode, stopNode).closed_right()): # inRange2

Whilst this approach is, IMHO, more readable, it does involve doing an allocation, rather than just a function call, which is more expensive - although still O(1). But then if you really cared about performance, you wouldn't be using python!

Tom Anderson 2009-06-12 22:58:17

On a philosophic aspect, I disagree with "if you really cared about performance, you wouldn't be using python!". I like Python for its readability. It's great to prototype implementations and improve algorithms (algorithms, not implementations). Then, if you need something fast, porting it to another language is not so hard.

NicDumZ 2009-06-13 02:21:53

Interesting approach too, thanks :) A small remark: __hash__ is fast, but very collusive: ~70% of collisions. That might be unwanted :)

NicDumZ 2009-06-13 02:28:23

Answer 7

+2 A:

I certainly agree that you need only one function, and that the function should use a (Pythonic) half-open range.

Two suggestions:

Use meaningful names for the args: in_range(x, lo, hi) is a big improvement relative to the 2-keystroke cost.
Document the fact that the constraint hi < MAX means that it is not possible to express a range that includes all MAX elements. As Wesley remarked, size = (k - j) % MAX i.e. size = (hi - lo) % MAX and thus 0 <= size < MAX.

John Machin 2009-06-13 00:39:29

Two very useful points, thanks. For most of the answers, there is also a problem with a range of size 0. But fortunately enough, this is not in my scope today :)

NicDumZ 2009-06-13 01:59:08

ansaurus

tags:

views:

answers:

Pythonic way to implement three similar integer range operators?

`range`

Recommendation

Justification

Caveats

related questions