tags:

views:

352

answers:

4

I have a need to write code that will prorate a value across a list, based on the relative weights of "basis" values in the list. Simply dividing the "basis" values by the sum of the "basis" values and then multiplying the factor by the original value to prorate works to a certain degree:

proratedValue = (basis / basisTotal) * prorationAmount;

However, the result of this calculation must then be rounded to integer values. The effect of the rounding means that the the sum of proratedValue for all items in the list may differ from the original prorationAmount.

Can anyone explain how to apply a "lossless" proration algorithm that proportionately distributes a value across a list as accurately as possible, without suffering from rounding errors?

+4  A: 

Simple algorithm sketch here...

  1. Have a running total which starts at zero.
  2. Do your standard "divide basis by total basis, then multiply by proration amount" for the first item.
  3. Store the original value of the running total elsewhere, then add the amount you just calculated in #2.
  4. Round both the old value and the new value of the running total to integers (don't modify the existing values, round them into separate variables), and take the difference.
  5. The number calculated in step 4 is the value assigned to the current basis.
  6. Repeat steps #2-5 for each basis.

This is guaranteed to have the total amount prorated equal to the input prorate amount, because you never actually modify the running total itself (you only take rounded values of it for other calculations, you don't write them back). What would have been an issue with integer rounding before is now dealt with, since the rounding error will add up over time in the running total and eventually push a value across the rounding threshold in the other direction.

Basic example:

Input basis: [0.2, 0.3, 0.3, 0.2]
Total prorate: 47

----

R used to indicate running total here:

R = 0

First basis:
  oldR = R [0]
  R += (0.2 / 1.0 * 47) [= 9.4]
  results[0] = int(R) - int(oldR) [= 9]

Second basis:
  oldR = R [9.4]
  R += (0.3 / 1.0 * 47) [+ 14.1, = 23.5 total]
  results[1] = int(R) - int(oldR) [23-9, = 14]

Third basis:
  oldR = R [23.5]
  R += (0.3 / 1.0 * 47) [+ 14.1, = 37.6 total]
  results[1] = int(R) - int(oldR) [38-23, = 15]

Fourth basis:
  oldR = R [37.6]
  R += (0.2 / 1.0 * 47) [+ 9.4, = 47 total]
  results[1] = int(R) - int(oldR) [47-38, = 9]

9+14+15+9 = 47
Amber
A: 

The problem you have is to define what an "acceptable" rounding policy is, or in other words, what it is you are trying to minimize. Consider first this situation: you have only 2 identical items in your list, and are trying to allocate 3 units. Ideally, you would want to allocate the same amount to each item (1.5), but that is clearly not going to happen. The "best" you could do is likely to allocate 1 and 2, or 2 and 1. So

  • there might be multiple solutions to each allocation
  • identical items may not receive an identical allocation

Then, I chose 1 and 2 over 0 and 3 because I assume that what you want is to minimize the difference between the perfect allocation, and the integer allocation. This might not be what you consider "a good allocation", and this is a question you need to think about: what would make an allocation better than another one?
One possible value function could be to minimize the "total error", i.e. the sum of the absolute values of the differences between your allocation and the "perfect", unconstrained allocation.
It sounds to me that something inspired by Branch and Bound could work, but it's non trivial.
Assuming that Dav solution always produces an allocation that satisfies the constraint (which I'll trust is the case), I assume that it is not guaranteed to give you the "best" solution, "best" defined by whatever distance/fit metric you end up adopting. My reason for this is that this is a greedy algorithm, which in integer programming problems can lead you to solutions which are really off the optimal solution. But if you can live with a "somewhat correct" allocation, then I say, go for it! Doing it "optimally" doesn't sound trivial.
Best of luck!

Mathias
You are correct that the algorithm I described will not always produce the "best" solution if one is looking to, say, minimize the differences between the "ideal" fractional values and the integer values assigned. It is, however, guaranteed to never be more than +/- 1 from the fractional value that would be assigned to each basis, which is likely the best you can do in an efficient manner.
Amber
Apparently 2 people disliked my answer enough to down-vote it; I'd love to understand why!
Mathias
@Mathias, I actually didn't fully comprehend the problem you were outlining. However, as I tried to implement the algorithm from the OP, I run into this selfsame problem. I'm trying to split (for example, 100 into 3 equal pieces). ONE of those pieces is going to have to be 34 and the other two 33. I am almost certain that the original algorithm (at the very least as implemented in the T-SQL above) doesn't handle that. Modify the original and I'll undo my downvote.
Jay Stevens
+1  A: 

Hello everyone,

Dav,

Thanks for your example and psudo-code. I've converted your code (see below) to run in SQL Query Analyzer but I'm running into a remainder problem. What do you think is the best way to handle the remainder? In the example below, the amount to prorate is 111 but the sum of the amount to adjust equals 108.

Any help would be appreciated.

-- Start of Code --

Drop Table #ProrateList

Create Table #ProrateList ( idno int , amt int , adjustment int )

Insert Into #ProrateList Values (1, 37841, 0)

Insert Into #ProrateList Values (2, 8995, 0)

Insert Into #ProrateList Values (3, 4583, 0)

Insert Into #ProrateList Values (4, 2086, 0)

Insert Into #ProrateList Values (5, 1584, 0)

Insert Into #ProrateList Values (6, 1102, 0)

Insert Into #ProrateList Values (7, 1065, 0)

Insert Into #ProrateList Values (8, 815, 0)

Insert Into #ProrateList Values (9, 809, 0)

Insert Into #ProrateList Values (10, 734, 0)

Insert Into #ProrateList Values (11, 317, 0)

Insert Into #ProrateList Values (12, 231, 0)

Insert Into #ProrateList Values (13, 109, 0)

Insert Into #ProrateList Values (14, 99, 0)

Insert Into #ProrateList Values (15, 38, 0)

Insert Into #ProrateList Values (16, 26, 0)

Insert Into #ProrateList Values (17, 25, 0)

Insert Into #ProrateList Values (18, 25, 0)

Insert Into #ProrateList Values (19, 21, 0)

Insert Into #ProrateList Values (20, 16, 0)

Insert Into #ProrateList Values (21, 15, 0)

Insert Into #ProrateList Values (22, 8, 0)

Declare @ProrateAmount Int

Declare @R Float

Declare @OldR Float

Declare @Results Float

Declare @IntTotal Float

Declare @idno Int

Declare @amt Float

Declare @SumAmt Float

Select @ProrateAmount = 111

Select @R = 0

Select @SumAmt = Sum(amt) From #ProrateList

-- Define the cursor Declare ProrationList Cursor For Select idno, amt From #ProrateList Order By amt Desc

-- Open the cursor

Open ProrationList

-- Assign the values of the first record

Fetch Next From ProrationList Into @idno, @amt

-- Loop through the records

While @@FETCH_STATUS = 0

Begin

Select @OldR = @R

Select @R = @R + (@amt / @SumAmt) * @ProrateAmount

Select @Results = @R - @OldR

If Round(@Results, 0) <> 0

Begin

Update #ProrateList Set adjustment = Round(@Results, 0)

Where idno = @idno

End

-- Assign the values of the next record

Fetch Next From ProrationList Into @idno, @amt

End

-- Close the cursor

Close ProrationList

Deallocate ProrationList

Select * From #ProrateList

Select Sum(adjustment) From #ProrateList

-- End of Code --

jtw
could someone please edit this post and format as code?
Jay Stevens
A: 

Ok. I'm pretty certain that the original algorithm (as written) and the code posted (as written) doesn't quite answer the mail for the test case outlined by @Mathias.

My intended use of this algorithm is a slightly more specific application. Rather than calculating the % using (@amt / @SumAmt) as shown in the original question. I have a fixed $ amount that needs to be split or spread across multiple items based on a % split defined for each of those items. The split % sums to 100%, however, straight multiplication often results in decimals that (when forced to round to whole $) don't add up to the total amount that I'm splitting apart. This is the core of the problem.

I'm fairly certain that the original answer from @Dav doesn't work in cases where (as @Mathias described) the rounded values are equal across multiple slices. This problem with the original algorithm and code can be summed up with one test case:

Take $100 and split it 3 ways using 33.333333% as your percentage.

Using the code posted by @jtw (assuming this is an accurate implementation of the original algorithm), yields you the incorrect answer of allocating $33 to each item (resulting in an overall sum of $99), so it fails the test.

I think a more accurate algorithm might be:

  • Have a running total which starts at 0
  • For each item in the group:
  • Calculate the un-rounded allocation amount as ( [Amount to be Split] * [% to Split] )
  • Calculate the cumulative Remainder as [Remainder] + ( [UnRounded Amount] - [Rounded Amount] )
  • If Round( [Remainder], 0 ) > 1 OR the current item is the LAST ITEM in the list, then set the item's allocation = [Rounded Amount] + Round( [Remainder], 0 )
  • else set item's allocation = [Rounded Amount]
  • Repeat for next item

Implemented in T-SQL, it looks like this:

-- Start of Code --
Drop Table #SplitList
Create Table #SplitList ( idno int , pctsplit decimal(5, 4), amt int , roundedAmt int )

-- Test Case #1
--Insert Into #SplitList Values (1, 0.3333, 100, 0)
--Insert Into #SplitList Values (2, 0.3333, 100, 0)
--Insert Into #SplitList Values (3, 0.3333, 100, 0)

-- Test Case #2
--Insert Into #SplitList Values (1, 0.20, 57, 0)
--Insert Into #SplitList Values (2, 0.20, 57, 0)
--Insert Into #SplitList Values (3, 0.20, 57, 0)
--Insert Into #SplitList Values (4, 0.20, 57, 0)
--Insert Into #SplitList Values (5, 0.20, 57, 0)

-- Test Case #3
--Insert Into #SplitList Values (1, 0.43, 10, 0)
--Insert Into #SplitList Values (2, 0.22, 10, 0)
--Insert Into #SplitList Values (3, 0.11, 10, 0)
--Insert Into #SplitList Values (4, 0.24, 10, 0)

-- Test Case #4
Insert Into #SplitList Values (1, 0.50, 75, 0)
Insert Into #SplitList Values (2, 0.50, 75, 0)

Declare @R Float
Declare @Results Float
Declare @unroundedAmt Float
Declare @idno Int
Declare @roundedAmt Int
Declare @amt Float
Declare @pctsplit Float
declare @rowCnt int

Select @R = 0
select @rowCnt = 0

-- Define the cursor 
Declare SplitList Cursor For 
Select idno, pctsplit, amt, roundedAmt From #SplitList Order By amt Desc
-- Open the cursor
Open SplitList

-- Assign the values of the first record
Fetch Next From SplitList Into @idno, @pctsplit, @amt, @roundedAmt
-- Loop through the records
While @@FETCH_STATUS = 0

Begin
    -- Get derived Amounts from cursor
    select @unroundedAmt = ( @amt * @pctsplit )
    select @roundedAmt = Round( @unroundedAmt, 0 )

    -- Remainder
    Select @R = @R + @unroundedAmt - @roundedAmt
    select @rowCnt = @rowCnt + 1

    -- Magic Happens!  (aka Secret Sauce)
    if ( round(@R, 0 ) >= 1 ) or ( @@CURSOR_ROWS = @rowCnt ) Begin
        select @Results = @roundedAmt + round( @R, 0 )
        select @R = @R - round( @R, 0 )
    End
    else Begin
        Select @Results = @roundedAmt
    End

    If Round(@Results, 0) <> 0
    Begin
        Update #SplitList Set roundedAmt = @Results Where idno = @idno
    End

    -- Assign the values of the next record
    Fetch Next From SplitList Into @idno, @pctsplit, @amt, @roundedAmt
End

-- Close the cursor
Close SplitList
Deallocate SplitList

-- Now do the check
Select * From #SplitList
Select Sum(roundedAmt), max( amt ), 
case when max(amt) <> sum(roundedamt) then 'ERROR' else 'OK' end as Test 
From #SplitList

-- End of Code --

Which yields a final result set for the test case of:

idno   pctsplit   amt     roundedAmt
1      0.3333    100     33
2      0.3333    100     34
3      0.3333    100     33

As near as I can tell (and I've got several test cases in the code), this handles all of these situations pretty gracefully.

Jay Stevens