I inherited a project that uses SQL Server 200x, in which a column that always represents a percentage in the problem domain is stored as a whole number rather than as its decimal equivalent. For example, 70% (literally 0.7) is stored as 70, 100% as 100, etc. Aside from the need to remember to * 0.01 on retrieved values and * 100 before persisting values, it doesn't seem to be a problem in and of itself. It does make my head explode though... so is there a good reason for it that I'm missing? Are there compelling reasons to fix it, given that there is a fair amount of code written to work with the pseudo-percentages?

There are a few cases where greater than 100% occurs, but I don't see why the value wouldn't just be stored as 1.05, for example, in those cases.

EDIT: Head feeling better, and slightly smarter. Thanks for all the insights.
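The * 0.01 / * 100 dance the question describes might look like this in application code (a minimal Python sketch; the function names are hypothetical, and division by 100 is used on retrieval to avoid the small drift that multiplying by the inexact float 0.01 introduces):

```python
def from_storage(stored):
    """Convert the persisted whole-number form (e.g. 70) to a true fraction (0.7)."""
    return stored / 100

def to_storage(fraction):
    """Convert a true fraction (0.7) back to the whole-number form (70) for persisting."""
    return fraction * 100

print(from_storage(70))   # 0.7
print(to_storage(0.7))    # 70.0
```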

+5  A: 

If it's a byte field then it takes up less room in the db than floating-point numbers, but unless you have millions and millions of records, you'll hardly see a difference.

Andrew Bullock
That does make some sense given the tendencies of the previous designer. Fortunately, I can't see this thing getting larger than 2-3 million records in 5-10 years so it probably won't be an issue. Good to know though, for peace of mind :-)
I Have the Hat
I would also add that it could depend on the application reading the data. Perhaps there is some reason they don't want to "mess" with decimals?
Nick DeVore
Sounds like a case of premature optimization to me.
Erik Forbes
+2  A: 

A good guess is that anything you do with integers (storing, calculating, stuffing into an edit field for a user, etc.) is marginally easier and more efficient than doing the same with floating-point numbers. And the rounding issues aren't as obvious when you look at the data.

le dorfier
+3  A: 

Floating point numbers are prone to rounding errors and, therefore, can act "funny" in comparisons. If you always want to deal with it as fixed decimal, you could either choose a decimal type, say decimal(5,2), or do the convert and store as int thing that your db does. I'd probably go the decimal route, even though the int would take up less space.
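The same trade-off shows up in application code. A small sketch using Python's decimal module (assumed here as a stand-in for a decimal(5,2) column) illustrates why the fixed-decimal route avoids float drift:

```python
from decimal import Decimal

# Binary float arithmetic drifts:
print(0.1 + 0.2)            # 0.30000000000000004

# Fixed-point decimal stays exact, like a decimal(5,2) column:
pct = Decimal("0.70")
print(pct * 100)            # 70.00
```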

tvanfosson
Ah yes, rounding errors with floating point numbers. I think I would use decimal(x,x) too, as space isn't likely to be an issue. Oddly enough, my predecessor did both: chose a decimal for the column type, *and* stored the percentages as whole numbers :-)
I Have the Hat
+4  A: 

Since floating-point values can't be compared for equality, an integer may have been used to make the SQL simpler.

For example

(0.3==3*.1)

is usually False.

However

abs( 0.3 - 3*.1 )

is a tiny number (5.55e-17). But it's a pain to have to do everything with (column-SomeValue) BETWEEN -0.0001 AND 0.0001 or ABS(column-SomeValue) < 0.0001. You'd rather do column = SomeValue in your WHERE clause.
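The comparison above can be reproduced directly (a minimal Python sketch; the tolerance value mirrors the 0.0001 used in the WHERE-clause workaround):

```python
# Exact equality fails for binary floats:
print(0.3 == 3 * 0.1)              # False

# The difference is tiny but nonzero:
print(abs(0.3 - 3 * 0.1))          # 5.551115123125783e-17

# So comparisons need a tolerance, which is the pain point:
tol = 0.0001
print(abs(0.3 - 3 * 0.1) < tol)    # True
```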

S.Lott
This is only true when mixing bases... i.e. (no pun on IEEE intended) when the floating-point number is stored in binary and the numbers you're trying to represent are decimal, they can't be compared for equality. If the bases are the same, they can.
Charles Bretana
Not sure I get your point. 0.3 != 3*.1 holds for floats on all Intel platforms. That's a fact about floats that can make them undesirable for databases when equality tests are involved.
S.Lott
+1  A: 

If these are numbers that end users are likely to see and interact with, percentages are easier to understand than decimals.

This is one of those situations where a notation aid can help; in the program, be consistent in using a prefix (Hungarian) or postfix to specify values that are percentages vs. those that are decimal. If you can extend a naming convention to the database fields themselves, so much the better.

Mark Ransom
A: 

And to add to the data storage issue: if you can use integer arithmetic for whatever processing you are doing, the performance is much better than with floating-point arithmetic. So storing the percentages as integer values may allow the processing logic to utilize integer arithmetic.

Charles Bretana
A: 

If you're actually using them as a coefficient (or expect users of the database to do this sort of thing in reports), there's a case for storing them as a coefficient - particularly if there's a reason to do calculations involving more than one.

However, if you do this you should be consistent - either all percentages or all coefficients.

ConcernedOfTunbridgeWells
+5  A: 

There are actually four good reasons I can think of that you might want to store—and calculate with—whole-number percentage values rather than floating-point equivalents:

  1. Depending on the data types chosen, the integer value may take up less space.
  2. Depending on the data type, the floating-point value may lose precision (remember that not all languages have a data type equivalent to SQL Server's decimal type).
  3. If the value will be input from or output to the user very frequently, it may be more convenient to keep it in the more user-friendly format (the choice is between converting when you display and converting when you calculate... but see the next point).
  4. If the principal values are also integers, then

    principal * integerPercentage / 100
    

    which uses all integer arithmetic, is usually faster than its floating-point equivalent (likely significantly faster in the case of a floating-point type equivalent to T-SQL's decimal type).
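Point 4 can be sketched briefly (a Python illustration under assumed names; the principal is held as integer cents, a common convention for money, so the whole calculation stays in integers):

```python
principal_cents = 19999    # e.g. $199.99 stored as integer cents (assumed convention)
percentage = 70            # the whole-number percentage as stored in the column

# All-integer arithmetic: multiply first, then floor-divide by 100.
# No floating-point values are involved at any step.
discounted = principal_cents * percentage // 100
print(discounted)          # 13999
```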

P Daddy