views:

292

answers:

4

What are your thoughts on this? I'm working on integrating some new data that's in a tab-delimited text format, and all of the decimal columns are kept as single integers; in order to determine the decimal amount you need to multiply the number by .01. It does this for things like percentages, weight and pricing information. For example, an item's price is expressed as 3259 in the data files, and when I want to display it I would need to multiply it in order to get the "real" amount of 32.59.

Do you think this is a good or bad idea? Should I be keeping my data structure identical to the one provided by the vendor, or should I make the database columns true decimals and use SSIS or some sort of ETL process to automatically multiply the integer columns into their decimal equivalent? At this point I haven't decided if I am going to use an ORM or Stored Procedures or what to retrieve the data, so I'm trying to think long term and decide which approach to use. I could also easily just handle this in code from a DTO or similar, something along the lines of:

public class Product
{
    // ...
    private int _price;
    public decimal Price
    {
        get
        {
            return (this._price * .01);
        }
        set
        {
            this._price = (value / .01);
        }
    }
}

But that seems like extra and unnecessary work on the part of a class. How would you approach this, keeping in mind that the data is provided in the integer format by a vendor that you will regularly need to get updates from.

A: 

Pedantically: they aren't kept as ints either. They are strings that require parsing.

Philisophically: you have information in the file and you should write data into the database. That means transforming the information in any ways necessary to make it meaningful/useful. If you don't do this transform up front, then you'll be doomed to repeat the transform across all consumers of the database.

There are some scenarios where you aren't allowed to transform the data, such as being able to answer the question: "What was in the file?". Those scenarios would require the data to be written as string - if the parse failed, you wouldn't have an accurate representation of the file.

David B
+1  A: 

Personally I would much prefer to have the data stored correctly in my database and just do a simple conversion every time an update comes in.

CodeMonkey1
+4  A: 

"Do you think this is a good or bad idea?"

Bad.

"Should I be keeping my data structure identical to the one provided by the vendor?"

No.

"Should I make the database columns true decimals?"

Yes.

It's so much simpler to do what's right. Currently, the data is transmitted with no "." to separate the whole numbers from the decimals; that doesn't any real significance.

The data is decimal. Decimal math works. Use the decimal math provided by your language and database. Don't invent your own version of Decimal arithmetic.

S.Lott
Very wise words; upvoting and will probably mark this as the answer! :) Normally I try to keep the data matching the vendor's data to minimize the work I have to do in converting it, but this seems like a good case to do, as you said, what's right instead of forcing it to adhere to the data.
Wayne M
Agreed. Also, consider what happens if the vendor decides to "fix" their data format. Then you would have code working around an evil that no longer exists...
RedFilter
A: 

In my mind the most important facet of using Decimal over Int in this scenario is maintainability.

Data stored in the tables should be clearly meaningful without need for arbitrary manipulation. If manipulation is required is should be clearly evident that it is (such as from the field name).

I recently dealt with data where days of the week were stored as values 2-8. You can not imagine the fall out this caused (testing didn't show the problem for a variety of reasons, but live use did cause political explosions).

If you do ever run in to such a situation, I would be absolutely certain to ensure data can not be written to or read from the table without use of stored procedures or views. This enables you to ensure the necessary manipulation is both enforced and documented. If you don't have both of these, some poor sod who follows you in the future will curse your very name.

Dems