ansaurus

Question

How can I round a column in a single SQL query without changing the overall sum?

Answer 1

+3 A:

Update:

This solution is explained in more detail in an article on my blog:

Rounding numbers preserving their sum

You need to keep a cumulative offset for each value:

value  rounded value + carried offset  result  rounded sum  actual sum  new offset
1.2    (1 + 0.0)                       ~ 1     1            1.2         +0.2
1.2    (1 + 0.2)                       ~ 1     2            2.4         +0.4
1.2    (1 + 0.4)                       ~ 1     3            3.6         +0.6
1.2    (1 + 0.6)                       ~ 2     5            4.8         -0.2
1.2    (1 - 0.2)                       ~ 1     6            6.0          0.0

This is easily done in MySQL, but in SQL Server you will have to write a cursor or use cumulative subselects (which are less efficient).
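The cumulative-offset idea in the table above can be sketched outside SQL. This is a minimal Python illustration, not the MySQL or SQL Server implementation the answer refers to; the function names are hypothetical, and round-half-up is assumed as the rounding rule.

```python
import math

def round_half_up(x):
    # Assumed rounding rule: halves go up (mirrors the "~" column in the table).
    return math.floor(x + 0.5)

def round_with_carried_offset(values):
    """Round each value, carrying the accumulated rounding error forward."""
    offset = 0.0          # actual running sum minus rounded running sum
    result = []
    for v in values:
        rounded = round_half_up(round_half_up(v) + offset)
        offset += v - rounded
        result.append(rounded)
    return result

rounded = round_with_carried_offset([1.2] * 5)
print(rounded, sum(rounded))
```

Because the carried offset always stays strictly between -1 and +1, the rounded values add up to the original total whenever that total is itself an integer.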

Update:

The query below computes the difference between the sum of the values and the sum of the same values rounded down to the nearest integer (their floors).

This gives us the number (N) of values we should round up.

Then we order the values by their fractional part (ones that are closer to their ceiling go first) and round the first N up, the others down.

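SELECT  value,
        -- Floor every value, then add 1 to the cs rows with the largest fractional parts
        FLOOR(value) + CASE WHEN ROW_NUMBER() OVER (ORDER BY value - FLOOR(value) DESC) <= cs THEN 1 ELSE 0 END AS nvalue
FROM    (
        SELECT  cs, value
        FROM    (
                -- cs = number of values to round up; ROUND guards against FLOAT noise in the sums
                SELECT  ROUND(SUM(value) - SUM(FLOOR(value)), 0) AS cs
                FROM    @mytable
                ) c
        CROSS JOIN
                @mytable
        ) q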
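For readers who want to check the logic away from the database, here is a rough Python equivalent of the query above. The function name is hypothetical, and the sketch assumes the values add up to a whole number, as the test data does.

```python
import math

def round_largest_remainder(values):
    """Floor everything, then round up the N values with the largest fractional parts."""
    floors = [math.floor(v) for v in values]
    # N = how many values must be rounded up; round() absorbs float noise in the sums.
    n_up = round(sum(values) - sum(floors))
    # Row indices ordered by fractional part, largest first (ties keep input order).
    order = sorted(range(len(values)),
                   key=lambda i: values[i] - floors[i],
                   reverse=True)
    result = list(floors)
    for i in order[:n_up]:
        result[i] += 1
    return result

rounded = round_largest_remainder([1.4, 1.4, 1.2])
print(rounded, sum(rounded))
```

With the 1.4/1.4/1.2 example from the comments, one of the 1.4 values is rounded up and 1.2 is rounded down, so a smaller value never ends up above a larger one.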

Here's the script for the test data:

SET NOCOUNT ON
GO
SELECT  RAND(0.20090917)
DECLARE @mytable TABLE (value FLOAT NOT NULL)
DECLARE @cnt INT;
SET @cnt = 0;
WHILE @cnt < 100
BEGIN
        INSERT
        INTO    @mytable
        VALUES  (FLOOR(RAND() * 100) / 10)
        SET @cnt = @cnt + 1
END

INSERT
INTO    @mytable
SELECT  600 - SUM(value)
FROM    @mytable

Quassnoi 2009-09-17 10:39:24

@Quassnoi: thanks for your answer. With your solution, 1.4/1.4/1.2 would be rounded to 1/1/2, which isn't 'fair' since 1.4 is greater than 1.2.

Brann 2009-09-17 10:43:26

@Quassnoi: that's exactly the kind of thing I was looking for. Thanks!

Brann 2009-09-17 12:17:07

@Quassnoi: in my real-world problem, I've got another column, 'ListID', and I want the sum per ListID to remain constant. I've adapted your query using GROUP BY clauses, and it almost works, but the ROW_NUMBER applies to the whole result set instead of to each group. Is there something I can do?

Brann 2009-09-17 12:46:42

@Brann: could you please post some sample data and the desired resultset?

Quassnoi 2009-09-17 12:48:51

@Brann: meanwhile, try replacing `OVER (ORDER BY value)` with `OVER (PARTITION BY listID ORDER BY value)`. This will number the rows within each group.

Quassnoi 2009-09-17 12:50:18
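The per-group variant discussed in these comments can be sketched in Python as well. This is only an illustration of the idea behind PARTITION BY, not the query Brann ended up with: the function name and the sample rows are made up, and both the round-up count and the ranking are computed separately for each list ID.

```python
import math
from collections import defaultdict

def round_per_group(rows):
    """rows: (list_id, value) pairs. Returns (list_id, value, rounded) tuples."""
    groups = defaultdict(list)
    for idx, (list_id, _value) in enumerate(rows):
        groups[list_id].append(idx)

    rounded = [math.floor(value) for _list_id, value in rows]
    for indices in groups.values():
        # Number of rows to round up within this group only.
        n_up = round(sum(rows[i][1] - rounded[i] for i in indices))
        # Rank rows inside the group by fractional part, largest first.
        ranked = sorted(indices, key=lambda i: rows[i][1] - rounded[i], reverse=True)
        for i in ranked[:n_up]:
            rounded[i] += 1
    return [(lid, v, r) for (lid, v), r in zip(rows, rounded)]

rows = [("a", 0.5), ("a", 0.5), ("b", 1.25), ("b", 2.75)]
result = round_per_group(rows)
print(result)
```

Each list keeps its own total: group "a" still sums to 1 and group "b" to 4.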

@Quassnoi: that's what I was looking for; thanks!

Brann 2009-09-17 12:56:20

Answer 2

A:

First get the difference between the rounded sum and the actual sum, and the number of records:

declare @Sum float, @RoundedSum float, @Cnt int

select @Sum = sum(bar), @RoundedSum = sum(round(bar, 0)), @Cnt = count(*)
from #tbFoo

Then you spread the difference equally on all values before rounding:

declare @Offset float

set @Offset = (@Sum - @RoundedSum) / @Cnt

select bar = round(bar + @Offset, 0)
from #tbFoo

Guffa 2009-09-17 10:44:05

@guffa: if I'm not mistaken: 0.2/0.2/0.2/0.2/0.2, sum=1, rounded sum=0, offset=0.2, final result=0/0/0/0/0, final sum=0.

Brann 2009-09-17 10:57:08
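Brann's counterexample is easy to reproduce. The sketch below transcribes the equal-offset approach into Python for checking only; the function names are hypothetical, and round-half-up is assumed to stand in for SQL's ROUND.

```python
import math

def round_half_up(x):
    # Assumed stand-in for SQL ROUND(x, 0) on non-negative values.
    return math.floor(x + 0.5)

def spread_offset_round(values):
    """Spread the rounding deviation equally over all values, then round."""
    rounded_sum = sum(round_half_up(v) for v in values)
    offset = (sum(values) - rounded_sum) / len(values)
    return [round_half_up(v + offset) for v in values]

values = [0.2] * 5
result = spread_offset_round(values)
print(result, sum(result))
```

Every value becomes 0.4 after the offset is added, which still rounds to 0, so the total drops from 1 to 0.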

@Brann: Yes, you are right, this doesn't give the right result. However, the principle of spreading the deviation equally is sound. I'll give it some more thought...

Guffa 2009-09-17 11:32:48

Answer 3

+1 A:

If you have a list of n values whose elements are accurate only to within an integer value (+-0.5), then any sum of those elements will have a cumulative error of up to +-(n*0.5). If you have 6 elements in your list which should add up to some number, then your worst-case scenario is that you're off by 3 if you just add the integer values.

If you find some way of showing 10.2 as 11 in order to make the sum work, you've changed the precision of that element from +-0.5 to +-0.8, which is counterintuitive when looking at integers.

One possible solution to think about is to round your numbers only at display time (using some format string on your output), rather than at the retrieval stage. Each number will be as close as possible to the actual value, and the sum will be more accurate too.

Example: If you have 3 values of 1/3 each, displayed as whole-numbered percentages, then you should be showing 33, 33 and 33. To do anything else is to create a margin of error greater than +-0.5 for any individual value. Your total should still be displayed as 100%, because that is the best possible value (as opposed to working with sums of already rounded values).

Also, be aware that by using a float, you've already introduced a limitation on your precision because you have no way of accurately representing 0.1. For more on that, read What Every Computer Scientist Should Know About Floating-Point Arithmetic
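The point about 0.1 can be seen in a few lines. Python's float is a binary double, which is assumed here to behave like a SQL FLOAT column, while decimal.Decimal plays the role of an exact type such as DECIMAL.

```python
from decimal import Decimal

# The double closest to 0.1 is not exactly 0.1; Decimal(0.1) exposes the stored value.
stored = Decimal(0.1)
print(stored)

# Binary floats accumulate representation error...
float_total = 0.1 + 0.2
print(float_total == 0.3)

# ...while decimal arithmetic on the literal digits stays exact.
decimal_total = Decimal("0.1") + Decimal("0.2")
print(decimal_total == Decimal("0.3"))
```

If the sums must come out exact, storing the values in an exact decimal type instead of a float avoids this source of error.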

Galghamon 2009-09-17 12:16:59

@Galghamon: Yes, showing 10.2 as 11 is exactly what I'm trying to achieve. And that's what the user wants to see, too.

Brann 2009-09-17 12:21:06
