I have a Python list with a number of entries, which I need to downsample using either:
- A maximum number of rows. For example, limiting a list of 1234 entries to 1000.
- A proportion of the original rows. For example, making the list 1/3 its original length.
(I need to support both methods, but only one is used at a time.)
I believe that for the maximum number of rows I can just calculate the proportion needed and pass that to the proportional downsizer:
def downsample_to_max(self, rows, max_rows):
    # Convert the row limit into a proportion and reuse the proportional downsampler.
    return self.downsample_to_proportion(rows, max_rows / float(len(rows)))
...so I really only need one downsampling function. Any hints, please?
EDIT: The list contains objects, not numeric values, so I do not need to interpolate. Dropping objects is fine.
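Since dropping objects is fine, the simplest case (a proportion of the form 1/k, such as 1/3) can probably be handled with plain list slicing. The sketch below is only an illustration; `downsample_every_kth` and its parameter `k` are hypothetical names, not part of my actual code, and it does not cover proportions such as 1000/1234.

```python
def downsample_every_kth(rows, k):
    """Keep every k-th row, starting with the first (hypothetical helper)."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    # Extended slicing returns a new list and leaves `rows` untouched.
    return rows[::k]


rows = list(range(9))
print(downsample_every_kth(rows, 3))  # [0, 3, 6]
```

Note that slicing rounds the result length up: a list of 10 entries with k = 3 yields 4 rows, not 3.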
SOLUTION:
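def downsample_to_proportion(self, rows, proportion):
    counter = 0.0
    last_counter = None
    results = []
    for row in rows:
        counter += proportion
        # Keep a row each time the integer part of the running total changes.
        # The first row is always kept because int(counter) never equals None,
        # so the result can hold one more row than len(rows) * proportion.
        if int(counter) != last_counter:
            results.append(row)
            last_counter = int(counter)
    return results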
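For anyone who needs the row limit to be a hard cap, here is an alternative sketch that goes the other way round: it downsamples to an exact count using integer index arithmetic, and derives the proportional version from that. This is not the method above but a swapped-in technique. The functions are written as plain functions rather than methods, `downsample_to_count` is a hypothetical helper name, and rounding the proportional target to the nearest whole row is an assumption.

```python
def downsample_to_count(rows, target):
    """Return at most `target` rows, spread evenly over `rows` (hypothetical helper)."""
    n = len(rows)
    if target <= 0 or n == 0:
        return []
    if target >= n:
        return list(rows)
    # Integer arithmetic avoids floating-point drift. Because n / target > 1,
    # the index i * n // target is strictly increasing, so no row repeats.
    return [rows[i * n // target] for i in range(target)]


def downsample_to_proportion(rows, proportion):
    # Assumption: round the target length to the nearest whole number of rows.
    return downsample_to_count(rows, int(round(len(rows) * proportion)))


def downsample_to_max(rows, max_rows):
    # A row limit is just a target count that is capped at len(rows).
    return downsample_to_count(rows, max_rows)


print(len(downsample_to_max(list(range(1234)), 1000)))  # 1000
print(downsample_to_proportion(list(range(9)), 1 / 3.0))  # [0, 3, 6]
```

With this arrangement the result never exceeds `max_rows`, and an empty list is handled without a division by zero.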
Thanks.