I have a series of data points (tuples) in a list with a format like:
points = [(1, 'a'), (2, 'b'), (2, 'a'), (3, 'd'), (4, 'c')]
The first item in each tuple is an integer, and the list is guaranteed to be sorted by it. The second item in each tuple is an arbitrary string.
I need the second items grouped into lists according to which fixed-width interval the first item falls in, counting from the first point's value. So given an interval of 3, the above list would be broken into:
[['a', 'b', 'a', 'd'], ['c']]
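To make the grouping rule concrete, here is a small sketch of how I think of it: with an interval of 3 and the first value being 1, the windows are [1, 4) and [4, 7). The name `window_index` is just something I made up for illustration, and this assumes the first items are integers and the interval is a positive integer.

```python
points = [(1, 'a'), (2, 'b'), (2, 'a'), (3, 'd'), (4, 'c')]
interval = 3
start = points[0][0]  # the first window begins at the smallest first value

# Window k covers first values in [start + k*interval, start + (k+1)*interval),
# so floor division gives the window each point belongs to.
window_index = [(x - start) // interval for x, _ in points]
# The first four points land in window 0 and the last one in window 1.
```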
I wrote the following function, which works fine on small data sets. However, it is inefficient for large inputs because it rescans the whole list for every interval. Any tips on how to rewrite/optimize/minimize this so I can process large data sets?
def split_series(points, interval):
    series = []
    start = points[0][0]
    finish = points[-1][0]
    marker = start
    next = start + interval
    while marker <= finish:
        series.append([point[1] for point in points if marker <= point[0] < next])
        marker = next
        next += interval
    return series
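For comparison, one possible direction is a single pass that computes each point's window directly instead of rescanning the list. This is only a sketch under some assumptions: the first items are integers, `interval` is a positive integer, and an empty input should return an empty list (the function above raises `IndexError` in that case). The name `split_series_single_pass` is hypothetical.

```python
def split_series_single_pass(points, interval):
    """Group the second items into consecutive windows of width `interval`."""
    if not points:
        # Assumption: an empty input yields an empty result.
        return []
    start = points[0][0]
    finish = points[-1][0]
    # Pre-create one list per window so that empty windows still appear,
    # which matches the output shape of the original loop.
    series = [[] for _ in range((finish - start) // interval + 1)]
    for key, value in points:
        # Floor division maps each key to the index of its window.
        series[(key - start) // interval].append(value)
    return series


points = [(1, 'a'), (2, 'b'), (2, 'a'), (3, 'd'), (4, 'c')]
result = split_series_single_pass(points, 3)
```

Each point is visited once, so the work grows with the number of points plus the number of windows rather than with their product.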