ansaurus

Question

Answer 1

A:

What exactly are you looking for as your end result? Could you give us an example grouping? If your only goal is for every group to contain a significant number of sufficiently important products, keep in mind that even a perfect algorithm for your current data set may not work with tomorrow's data set. Depending on how many groups you need, I would simply define arbitrary groups that fit your needs instead of using an algorithm, e.g. $1-$25, $25-$100, $100+. As a consumer, my mind naturally sorts products into three price categories: cheap, midrange and expensive.
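As a rough illustration of the fixed-range idea, the sketch below maps a price to one of three hand-picked tiers. The function name `priceTier` and the exact cut-offs are assumptions made for the example (it treats $25 as the top of the cheap tier and $100 as the top of midrange), and it assumes PHP 7 or later for the type declarations.

```php
<?php
// Hypothetical helper: map a price to a hand-picked tier.
// The cut-offs (25 and 100) are illustrative assumptions, not derived from data.
function priceTier(float $price): string
{
    if ($price <= 25) {
        return 'cheap';      // up to $25
    }
    if ($price <= 100) {
        return 'midrange';   // above $25, up to $100
    }
    return 'expensive';      // above $100
}

echo priceTier(19.99), "\n";
echo priceTier(100), "\n";
echo priceTier(349.00), "\n";
```

Deciding up front which tier a boundary price such as $25 belongs to avoids a product showing up in two filters.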

Justin Lucas 2010-07-21 22:54:03

Answer 2

A:

I think you're overthinking this.

If you know your products and you like fine-grained results, I would simply hard-code the price ranges. If you think $1 to $10 makes sense for what you are selling, put it in; you don't need an algorithm. Just add a check so that you only show ranges that have results.
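A minimal sketch of that check, assuming PHP 7 or later: it counts products per hard-coded range and drops the ranges with no results. The function name `nonEmptyRanges`, the labels and the bounds are all made up for the example; each range is treated as including its lower bound and excluding its upper bound.

```php
<?php
// Hypothetical helper: count products per hard-coded range, keeping only
// ranges that contain at least one product.
// Each range is array(min, max) with min inclusive and max exclusive;
// a null max means the range is open-ended.
function nonEmptyRanges(array $prices, array $ranges): array
{
    $counts = array();
    foreach ($ranges as $label => $bounds) {
        list($min, $max) = $bounds;
        $n = 0;
        foreach ($prices as $price) {
            if ($price >= $min && ($max === null || $price < $max)) {
                ++$n;
            }
        }
        if ($n > 0) {
            $counts[$label] = $n; // skip ranges with no results
        }
    }
    return $counts;
}

// Illustrative ranges; pick whatever makes sense for the catalogue.
$ranges = array(
    'Under $10'   => array(0, 10),
    '$10 to $50'  => array(10, 50),
    '$50 to $200' => array(50, 200),
    '$200 and up' => array(200, null),
);

print_r(nonEmptyRanges(array(4.99, 7.50, 12.00, 250.00), $ranges));
```

With these sample prices the '$50 to $200' range has no products, so it is left out of the result.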

If you don't know your products, I would just sort all the products by price and divide them into 4 groups with an equal number of products in each.
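The equal-count split could look roughly like this, assuming PHP 7 or later. The function name `equalCountGroups` and the shape of the returned array are assumptions for the example. Note that when the number of products does not divide evenly the last group is smaller (and there may be fewer groups than requested), and that identical prices can land on both sides of a boundary.

```php
<?php
// Hypothetical helper: sort prices and cut them into groups holding
// (roughly) the same number of products.
function equalCountGroups(array $prices, int $groups = 4): array
{
    if (count($prices) === 0) {
        return array();
    }

    sort($prices);

    // Products per group, rounded up so nothing is left over.
    $size = (int) ceil(count($prices) / $groups);

    $result = array();
    foreach (array_chunk($prices, $size) as $chunk) {
        $result[] = array(
            'min'   => $chunk[0],
            'max'   => $chunk[count($chunk) - 1],
            'count' => count($chunk),
        );
    }
    return $result;
}

print_r(equalCountGroups(array(5, 1, 3, 2, 8, 13, 21, 34)));
```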

nute 2010-07-21 22:55:59

Answer 3

+1 A:

Here is an idea, following the line of thought of my comment:

I assume you have a set of products, each tagged with a price and a sales volume estimate (as a percentage of total sales). First, sort all products by price. Next, start splitting: traverse the ordered list and accumulate sales volume. Each time you reach about 25%, cut there. Doing so 3 times yields 4 subsets with disjoint price ranges and similar sales volume.
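The traversal described above can be sketched as follows, assuming PHP 7 or later. The function name `splitBySalesVolume` and the `price` / `share` array keys are assumptions for the example, and the sketch resets the running total after each cut, which is one possible reading of "each time you reach about 25%".

```php
<?php
// Hypothetical helper: split products (sorted by price) into groups that
// each carry roughly the same share of sales volume.
// Each product is assumed to be array('price' => ..., 'share' => ...),
// where 'share' is its percentage of total sales.
function splitBySalesVolume(array $products, int $groups = 4): array
{
    usort($products, function ($a, $b) {
        return $a['price'] <=> $b['price'];
    });

    // Volume each group should reach before we cut (about 25% for 4 groups).
    $target = array_sum(array_column($products, 'share')) / $groups;

    $result = array(array());
    $g = 0;          // index of the group being filled
    $running = 0.0;  // volume accumulated in the current group

    foreach ($products as $product) {
        $result[$g][] = $product;
        $running += $product['share'];

        // Cut here once the target is reached; the last group takes the rest.
        if ($running >= $target && $g < $groups - 1) {
            ++$g;
            $result[$g] = array();
            $running = 0.0;
        }
    }

    // Drop a trailing empty group, if any.
    return array_values(array_filter($result));
}

$products = array(
    array('price' => 40,  'share' => 20),
    array('price' => 5,   'share' => 10),
    array('price' => 250, 'share' => 10),
    array('price' => 20,  'share' => 25),
    array('price' => 10,  'share' => 15),
    array('price' => 100, 'share' => 15),
    array('price' => 60,  'share' => 5),
);

foreach (splitBySalesVolume($products) as $group) {
    $prices = array_column($group, 'price');
    echo min($prices), ' to ', max($prices), "\n";
}
```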

Eyal Schneider 2010-07-21 23:02:54

Answer 4

A:

Here is an idea: sort the prices and split them into buckets of 10 distinct price points each. Within a bucket, each price is an array key and its value is the number of products at that price point:

// Prices are used as array keys, so this assumes integer prices
// (whole dollars or cents); PHP does not keep float keys intact.
function priceBuckets(array $prices)
{
    sort($prices);

    $buckets = array(array());
    $a = 0; // index of the bucket currently being filled

    foreach ($prices as $price) {
        if (isset($buckets[$a][$price])) {
            // Same price as an earlier product in this bucket: bump its count.
            ++$buckets[$a][$price];
            continue;
        }

        // New price point: open a fresh bucket once the current one holds 10.
        if (count($buckets[$a]) === 10) {
            ++$a;
            $buckets[$a] = array();
        }

        $buckets[$a][$price] = 1;
    }

    return $buckets;
}

//TEST CODE
$prices = array();

for($i = 0; $i !== 50; ++$i) {
    $prices[] = rand(1, 100);
}
var_dump(priceBuckets($prices));

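From the result, you can get the min/max price of each bucket from its keys: since the prices are the array keys, call reset or end and then key (reset and end on their own return the counts, not the prices).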

Kinda brute force, but might be useful...
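To read a price range off each bucket, the keys are what matter, since the prices are stored as array keys and the values are only counts. A small sketch, assuming PHP 7 or later, with sample data in the same shape as the buckets above; the function name `bucketRanges` and the sample numbers are made up for the example.

```php
<?php
// Hypothetical helper: turn buckets of price => count into min/max ranges.
function bucketRanges(array $buckets): array
{
    $ranges = array();
    foreach ($buckets as $bucket) {
        if (count($bucket) === 0) {
            continue; // nothing to report for an empty bucket
        }
        $prices = array_keys($bucket); // the keys are the prices
        $ranges[] = array(
            'min'      => min($prices),
            'max'      => max($prices),
            'products' => array_sum($bucket), // the values are the counts
        );
    }
    return $ranges;
}

// Sample buckets: price => number of products at that price.
$buckets = array(
    array(3 => 2, 7 => 1, 12 => 4),
    array(15 => 1, 40 => 3),
);

print_r(bucketRanges($buckets));
```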

tsgrasser 2010-07-21 23:14:24

This is similar to my approach, where I took the quartiles, except that you chose 10 groups rather than 4. I think this is one of the most promising approaches. My only problem is that it results in odd price ranges, even if they are a good representation of the data. In other words, I might end up with price ranges like $15.47 to $152.87. Each bucket might have an even distribution, but the price boundaries are arbitrary and confusing.

Dave W. 2010-07-22 00:40:38


Price Filter Grouping Algorithm
