ansaurus

Question

Code Golf: Shortest code to find a weighted median?

Answer 1

A:

Just a comment about your code: I really hope I will not have to maintain it, unless you also wrote all the unit tests that are required here :-)

It is not related to your question of course, but usually, the "shortest way to code" is also the "hardest way to maintain". For scientific applications, it is probably not a show stopper. But for IT applications, it is.

I think it has to be said. All the best.

Sylvain 2009-06-08 20:49:41

Sorry for the misunderstanding. I wrote a long answer to somebody else's question (my answer is also linked now) and tried to write very readable code. But now I'm thinking about code golfing, so I am also trying to produce a minimal version.

ilya n. 2009-06-08 20:55:43

@Sylvain: but it's code golf!

Nosredna 2009-06-08 20:56:01

@Sylvain - the point of code golf isn't to produce production-quality code. It's more like a brain-teaser - fun and challenging, but not to be used in real-life projects.

Erik Forbes 2009-06-08 21:42:42

You are right, misplaced answer. Sorry Ilya... :o)

Sylvain 2009-06-09 12:36:21

Answer 2

A:

Something like this? O(n) running time.

// sums[i] holds the cumulative weight w[0] + ... + w[i]
for(int i = 0; i < x.length; i++)
{
    sum += w[i];
    sums.push(sum);
}

median = sum / 2.0;  // half of the total weight

if(median < sums[0])
    return x[0];
for(int i = 0; i < x.length - 1; i++)
{
    if(median > sums[i] && median < sums[i+1])
        return x[i+1];
    if(median == sums[i])
        return (x[i] + x[i+1]) / 2.0;
}

Not sure how you can get two answers for the median. Do you mean the case where sum/2 is exactly equal to a boundary?
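To make the boundary case concrete, here is a small Python sketch. It assumes x is sorted and w holds positive integer weights, with x[i] counted w[i] times; the name weighted_median is made up for illustration. When half of the total weight lands exactly on a cumulative-weight boundary, the two neighbouring values are averaged.

```python
from itertools import accumulate

def weighted_median(x, w):
    """Median of x where x[i] is counted w[i] times (x sorted, w positive ints)."""
    sums = list(accumulate(w))      # sums[i] = w[0] + ... + w[i]
    total = sums[-1]
    for i, s in enumerate(sums):
        if 2 * s > total:           # more than half the weight is at or before x[i]
            return x[i]
        if 2 * s == total:          # exactly half: the median straddles x[i] and x[i+1]
            return (x[i] + x[i + 1]) / 2

print(weighted_median([1, 2, 3, 4], [1, 1, 2, 1]))  # 3
print(weighted_median([1, 2, 3, 4], [1, 1, 1, 1]))  # 2.5
```

Comparing 2 * s against the total instead of s against total / 2 keeps the arithmetic in integers until the final average.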

EDIT: After looking at your formatted code, I see that my code does essentially the same thing. Did you want a MORE efficient method?

EDIT2: The search part can be done using a modified binary search, which would make it slightly faster.

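finalIndex = binarySearch(0, sums.length - 1);

// Returns the smallest index i such that sums[i] >= median.
int binarySearch(int lo, int hi)
{
    if(lo >= hi)
        return lo;
    int mid = (lo + hi) / 2;
    if(sums[mid] < median)
        return binarySearch(mid + 1, hi);  // answer lies to the right of mid
    return binarySearch(lo, mid);          // mid itself may be the answer
}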

Will have to do some checking to make sure it doesn't go on infinitely on edge cases.
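For comparison, Python's standard bisect module can do the same lookup on the cumulative weights. This is only a sketch: the helper name first_crossing is invented, and it assumes non-negative integer weights so that the prefix sums are sorted. Building the prefix sums is still O(n), so only the lookup itself becomes O(log n).

```python
from bisect import bisect_left
from itertools import accumulate

def first_crossing(w):
    """Smallest index i whose cumulative weight reaches half of the total."""
    sums = list(accumulate(w))              # O(n) prefix sums, sorted for non-negative w
    doubled = [2 * s for s in sums]         # doubling avoids fractional halves
    return bisect_left(doubled, sums[-1])   # first i with 2*sums[i] >= total

print(first_crossing([1, 1, 2, 1]))  # 2
print(first_crossing([1, 2, 2, 5]))  # 2
```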

CookieOfFortune 2009-06-08 20:54:34

In my understanding of the tradition of code golfing, I was thinking about the shortest program. That's why mine reuses the variables sum and i for two loops :)

ilya n. 2009-06-08 21:03:38

Are you looking for shortest or fastest?

CookieOfFortune 2009-06-08 21:12:07

Shortest! Anyway, your code and mine appear to be identical in performance, and in general it's not possible to compare performance across different languages beyond O(...)

ilya n. 2009-06-08 21:24:31

Yeah, it's going to have to be O(n) because of the sum. Let me rethink how to write it shorter, though there's not really a practical reason for doing that.

CookieOfFortune 2009-06-08 21:26:34

All of the solutions, including yours, will likely be O(n).

ilya n. 2009-06-08 21:27:47

Exactly, your algorithmic complexity is not helped by the binary search at all because the function as a whole is already O(n).

ephemient 2009-06-09 03:06:26

Answer 3

+3 A:

So, here's how far I could squeeze my own solution, still leaving some whitespace:

    int s = 0, i = 0;
    for (; i < n; s += w[i++]) ;
    while ( (s -= 2*w[--i] ) > 0) ;
    a  =  x[i]  +  x[ !s && (w[i]==w[i-1]) ? i-1 : i ];
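
Unrolled into Python, the backward scan might look roughly like this. It is a sketch of the general idea rather than a line-for-line translation: it assumes strictly positive integer weights, treats every exact tie (s reaching 0) as a two-element median, and returns the sum of the two middle values so the result stays an integer. The name doubled_median is invented here.

```python
def doubled_median(x, w):
    """Twice the weighted median of sorted x with positive integer weights w."""
    s = sum(w)                  # total weight
    i = len(w)
    while s > 0:                # walk back from the end, removing each weight twice
        i -= 1
        s -= 2 * w[i]
    if s == 0:                  # the tail starting at i holds exactly half the weight
        return x[i - 1] + x[i]
    return 2 * x[i]             # x[i] alone is the median

print(doubled_median([1, 2, 3, 4], [1, 2, 2, 5]))  # 7
print(doubled_median([1, 2, 3, 4], [1, 2, 2, 6]))  # 8
```

No prefix-sum array is needed: the running value s plays the role of "total minus twice the tail weight".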

ilya n. 2009-06-08 21:26:26

Answer 4

+2 A:

ephemient 2009-06-08 22:58:08

Looks interesting, I'm parsing it... (I actually wrote that I wanted to see some Haskell from the start, but was snubbed by the functional languages crowd, so I removed the idea...)

ilya n. 2009-06-09 00:43:42

Answer 5

+5 A:

J

Go ahead and type this directly into the interpreter. The prompt is three spaces, so the indented lines are user input.

   m=:-:@+/@(((2*+/\)I.+/)"1@(,:(\:i.@#))@[{"0 1(,:(\:i.@#))@])

The test data I used in my other answer:

   1 1 1 1 m 1 2 3 4
2.5
   1 1 2 1 m 1 2 3 4
3
   1 2 2 5 m 1 2 3 4
3.5
   1 2 2 6 m 1 2 3 4
4

The test data added to the question:

   (>:,:[)i.10
1 2 3 4 5 6 7 8 9 10
0 1 2 3 4 5 6 7 8  9
   (>:m[)i.10
6
   (([+<&6),:>:)i.9
1 2 3 4 5 6 6 7 8
1 2 3 4 5 6 7 8 9
   (([+<&6)m>:)i.9
6.5

   i =: (2 * +/\) I. +/

First index at which double the accumulated sum is greater than or equal to the total sum.

   j =: ,: (\: i.@#)

List and its reverse.

   k =: i"1 @ j @ [

Those first indices (as defined above) for the left argument and for its reverse.

   l =: k {"(0 1) j @ ]

Those indices extracted from the right argument and its reverse, respectively.

   m =: -: @ +/ @ l

Half the sum of the resulting list.
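
For readers who don't speak J, the same strategy can be sketched in Python. This is a loose transliteration, not a claim about how J evaluates it: first_crossing stands in for i, the function m takes weights first and values second like the J verb, and the use of bisect assumes non-negative integer weights so that the running sums are sorted.

```python
from bisect import bisect_left
from itertools import accumulate

def first_crossing(w):
    """First index where twice the running sum reaches the total (the role of i)."""
    doubled = [2 * s for s in accumulate(w)]
    return bisect_left(doubled, sum(w))

def m(w, x):
    """Half the sum of the values picked from the front and from the back."""
    front = x[first_crossing(w)]               # scan the weights left to right
    back = x[::-1][first_crossing(w[::-1])]    # same scan on the reversed lists
    return (front + back) / 2

print(m([1, 1, 1, 1], [1, 2, 3, 4]))  # 2.5
print(m([1, 2, 2, 6], [1, 2, 3, 4]))  # 4.0
```

When the median is a single element, both scans land on it and the average is that element; on a tie the two scans stop on either side of the boundary.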

ephemient 2009-06-09 02:36:14

good one, i've still a lot to learn about J :P

Andrea Ambu 2009-06-09 08:00:14

Cool, very much what I wanted. There's something wrong with the Newton's method: the answer should always be half-integer.

ilya n. 2009-06-09 11:46:26

Hmm, it appears that comparisons are quite long. If there's a simple way to flip the array, it could be easier to find j in the same way as i but from the other side. Then (x[i]+x[j])/2 is always an answer (actually, x[i]+x[j] is a good answer since I modified the rules to stay within integers). Also, I give up on this one: why is the function with two arguments defined with 4 and 0?

ilya n. 2009-06-09 12:05:36

Never mind about m =:4 :0, I read how that's just a syntax for dyads.

ilya n. 2009-06-09 12:10:23

Still, could the third line be made shorter, like the second?

ilya n. 2009-06-09 20:16:03

A change in strategy, so now there's only one line :)

ephemient 2009-06-09 21:42:07

Coooooooooooool

ilya n. 2009-06-09 23:04:23

Answer 6

+1 A:

short, and does what you'd expect. Not particularly space-efficient.

def f(l,i):
   x,y=[],sum(i)
   map(x.extend,([m]*n for m,n in zip(l,i)))
   return (x[y/2]+x[(y-1)/2])/2.

here's the constant-space version using itertools. it still has to iterate sum(i)/2 times so it won't beat the index-calculating algorithms.

from itertools import *
def f(l,i):
   y=sum(i)-1
   return sum(islice(
       chain(*([m]*n for m,n in zip(l,i))),
       y/2,
       (y+1)/2+1
   ))/(y%2+1.)
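
The snippets above rely on Python 2 integer division. A possible Python 3 port of the lazy version is sketched below, assuming l holds the sorted values and i the positive integer weights as in the original; it swaps in chain.from_iterable and repeat so that no expanded list is ever built.

```python
from itertools import chain, islice, repeat

def f(l, i):
    """Weighted median of values l with integer weights i, without expanding the list."""
    y = sum(i) - 1                                          # index of the last expanded element
    expanded = chain.from_iterable(repeat(m, n) for m, n in zip(l, i))
    middle = islice(expanded, y // 2, (y + 1) // 2 + 1)     # one or two middle elements
    return sum(middle) / (y % 2 + 1)                        # average when there are two

print(f([1, 2, 3, 4], [1, 1, 2, 1]))  # 3.0
print(f([1, 2, 3, 4], [1, 2, 2, 5]))  # 3.5
```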

Jimmy 2009-06-19 19:03:10

Yes, using sum(i) amount of space might be an overkill... maybe in the future python arrays will be smart enough to not allocate the space?

ilya n. 2009-06-19 19:27:46

Answer 7

+1 A:

Python:

a=sum([[X]*W for X,W in zip(x,w)],[]);l=len(a);a[l/2]+a[(l-1)/2]
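
Under Python 3 the `/` in the index expressions yields a float, which cannot index a list, so a port needs `//`. A readable sketch of the same expand-and-index idea follows; the function name twice_median_expanded is made up, and like the one-liner it returns the sum of the two middle elements rather than their average.

```python
def twice_median_expanded(x, w):
    """Sum of the two middle elements of x expanded by integer weights w."""
    a = [X for X, W in zip(x, w) for _ in range(W)]   # x[i] repeated w[i] times
    l = len(a)
    return a[l // 2] + a[(l - 1) // 2]                # both indices coincide when l is odd

print(twice_median_expanded([1, 2, 3, 4], [1, 1, 2, 1]))  # 6
print(twice_median_expanded([1, 2, 3, 4], [1, 1, 1, 1]))  # 5
```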

cobbal 2009-06-28 04:12:28
