Suppose I have a Python list or a 1-D numpy array, and assume it contains a single contiguous stretch of non-zero elements. How can I find the start and end coordinates (i.e. indices) of that stretch of non-zeros? For example,

a = [0, 0, 0, 0, 1, 2, 3, 4]

nonzero_coords(a) should return [4, 7]. For:

b = [1, 2, 3, 4, 0, 0]

nonzero_coords(b) should return [0, 2].

thanks.

+1  A: 

Actually, nonzero_coords(b) should return [0, 3]. Can multiple holes occur in the input? If so, what should happen then? The naive solution: scan to the first non-zero element, then keep scanning to the end of that non-zero run. The code is below (sorry, I did not test it):

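a = [0, 0, 0, 0, 1, 2, 3, 4, 5, 0, 0, 0]

def nonzero_coords(a):
    size = len(a)
    start = 0
    # Skip the leading zeros to reach the first non-zero element.
    while start < size and a[start] == 0:
        start += 1
    end = start
    # Walk to one past the end of the first non-zero run.
    while end < size and a[end] != 0:
        end += 1
    return (start, end - 1)  # inclusive indices of the first run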
Hamish Grubijan
+2  A: 

Assuming there's a single continuous stretch of nonzero elements...

from numpy import nonzero
x = nonzero(a)[0]  # indices of all non-zero elements
result = [x[0], x[-1]]  # first and last non-zero index
tom10
Yeah, I think that about settles it. Makes my answer look a little silly by comparison...
Peter Milley
This fails with multiple holes. The author was not specific enough. Also, while this seems to be the easiest solution, I am not sure if it is the speediest one.
Hamish Grubijan
@Hamish - Certainly this will be much faster than a pure Python solution since here the loop over the array runs in C and not in Python. Also, it's not really correct to call this a fail when the OP doesn't mention how to treat zeros within the chain, so the best one can do is state the assumptions of a given approach, which I clearly do. (Your answer, btw, just picks the first continuous chain, which, a priori, certainly isn't any better, but it fails to mention this limitation.)
tom10
It's cool, bro.
Hamish Grubijan
@Hamish - Sure... no problem here. I'm just trying to address your points directly. The solution I presented is, from my experience anyway, a good solution to this problem; but your criticism confused this point, so I was just trying to explain why "nonzero" is still a good solution. For people who don't have experience with a particular approach, criticism such as yours can inappropriately diminish their interest in trying it.
tom10
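The disagreement above turns on what should happen when zeros occur inside the stretch, which the question leaves open. Below is a minimal sketch, assuming one wants every run of non-zeros reported separately; the name nonzero_runs is hypothetical and not from this thread. It relies only on standard numpy calls (asarray, concatenate, diff, flatnonzero).

```python
import numpy as np

def nonzero_runs(a):
    """Return a list of (start, end) inclusive index pairs, one per run of non-zeros."""
    mask = np.asarray(a) != 0
    # Pad with False on both sides so runs touching either edge are detected.
    padded = np.concatenate(([False], mask, [False]))
    # +1 where a run begins, -1 one position past where a run ends.
    edges = np.diff(padded.astype(int))
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1) - 1
    return [(int(s), int(e)) for s, e in zip(starts, ends)]
```

With a single stretch this yields one pair, so the first element of the list matches what the question asks for; with holes, the caller can decide whether to take the first run, the longest run, or the overall span.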
A: 

If you've got numpy loaded anyway, go with tom10's answer.

If for some reason you want something that works without loading numpy (can't imagine why, to be honest) then I'd suggest something like this:

from itertools import groupby

def nonzero_coords(iterable):
  start = 0
  for iszero, sublist in groupby(iterable, lambda x:x==0):
    if iszero:
      start += len(list(sublist))
    else:
      return start, start+len(list(sublist))-1
Peter Milley
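For comparison, here is a minimal pure-Python sketch that mirrors the semantics of tom10's answer (first and last non-zero index, ignoring any zeros in between) rather than stopping at the first run. The name nonzero_span and the choice to return None for an all-zero input are assumptions made for illustration.

```python
def nonzero_span(seq):
    """Return [first, last] inclusive indices of non-zero elements, or None if there are none."""
    # Collect the index of every non-zero element in a single pass.
    indices = [i for i, x in enumerate(seq) if x != 0]
    if not indices:
        return None
    return [indices[0], indices[-1]]
```

This builds a full list of indices, so it trades some memory for simplicity; for the inputs in the question it gives the same answer as the numpy version.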