views:

130

answers:

8

I was arguing with my friend against his suggestion to store price, value and other similar informations in varchar.

My point of view are on the basis of

  1. Calculations will become difficult as we need to cast back and forth.
  2. Integrity of the data will be lost.
  3. Poor performance of Indexes
  4. Sorting and aggregate functions will also need casting

etc. etc.

But he was saying that in his previous employement everybody used to store such values in varchar, because the communication between DB and the APP will be very effective in this approach. (I still cant accept this)

Are there really some advantages in storing such values in varchar ?

Note : I'm not talking about columns like PhoneNo, IDs, ZIP Code, SSN etc. I know varchar is best suited for those. The columns are value based, and will for sure be involved in calculations some way or other.

+10  A: 

None at all.

Try casting a values back and too and see how much data you lose.

DECLARE @foo TABLE (bar varchar(30))
INSERT @foo VALUES (11.2222222222)
INSERT @foo VALUES (22.3333333333)
INSERT @foo VALUES (33.1111111111)
SELECT CAST(CAST(bar AS float) AS varchar(30)) FROM @foo

I would also mention that his current employment does things differently... he isn't at his previous employment any more....

gbn
+1 Exactly. What an absurd idea!
Josh Stodola
<<< his current employment does things differently... he isn't at his previous employment any more >>> - Almost the same words I told him yesterday. Got curious to know whether am I missing something.
The King
Maybe you should tell him that if he keeps coming up with ideas like this, this job may soon become his "previous employment", if you get my drift ...
Jay
+4  A: 

I think a big part of the reason to use the APPROPRIATE (in this case decimal) data type is to prevent invalid data. There's nothing to stop someone entering "The King" as a price in a varchar field.

JNK
On the lighter side... His counter argument was to make the necessary validations in APP... Also not to give anybody access to enter data directly in the Database...
The King
Thats just silly though. Then you are doing a cast/convert for EVERY entry into and every query out of the db.
JNK
Or even worse than entering "The King" as price imagine someone entering a string that is a valid expression. e.g. an expression that evaluates to a negative value - your app may end up issuing credits AND dispatching products!
Jason Tan
A: 

Data Types are best stored in fields that match the type between two different systems. In this case you are referring from your .Net objects to MS SQL server. You are correct with data integrity loss and with the need to cast/convert data types into useable forms. As for other types such as Phone Number, ZIP Code, SSN and so on; they too would benefit from dedicated data types. The main reason these are stored in VARCHAR/NVARCHAR is due to the number of different possibilities that are not needed in every system. But if you have a type that is commonly used and you want to constrain it you can build custom data types called User-defined types to store that data in SQL server. (Even more fun is CLR defined types see example on code project.)

Matthew Whited
+3  A: 

I can see no advantages, and a whole heap of very severe disadvantages - the most pressing of which is performance (particularly when sorting).

Consider if you want to get a list of the N most expensive products, and you are storing your price as a VARCHAR. Here are some sample values (sorted in descending order)

SELECT Price FROM Table ORDER BY Price DESC

Price
-----

90
600
50
1000

Whoops! The sort order is, well, wrong! (Alphanumerical sorting, rather than value sorting).

If we want to do the sort properly then this means we either need to pad values with zeroes at the start, or convert each value to a double before we sort - but if we have to do a convert on every row this means that SQL server has no way of using statistics to predict what the results will be! This in turn means extremely poor performance, probably a table scan.

Kragen
+2  A: 

As Kragen notes, sorts will not necessarily come out in the right order.

Compares won't necessarily work either. If a field is defined as, say, decimal(8,2) and I give it the value "37.20", and later I write "select ... where price=37.2", the result will be true. But if I store a varchar 37.20 and compare it to 37.2, it will not be equal. Similarly if one or the other has leading zeros.

You could solve these problems by having the application insure that you always store the numbers with a fixed number of decimal places and padded with leading zeros. Oh, and make sure you have a consistent convention about storing minus signs. But then every place in the app that writes to this field must be sure that it follows exactly the same rules. We could do this of course, but why? The database engine will do it for us if we just declare the field numeric. Like, yes, I COULD mow my lawn with a pair of scissors, but why would I want to do this?

I don't understand what your friend is saying the advantage is supposed to be. Easier communication between app and database? How? Maybe he was using some unconventional language or database interface that couldn't read numeric values from the DB. I've never had an issue with this. Actually just saying that gets me to wondering if that isn't what happenned: That at his previous company they were using some language or tool that couldn't read decimals from the database because of an implementation problem, the only way they could get it to work was to declare all the numbers as varchar, and now he walks away thinking that's a generally good idea.

Jay
+1. Even I dont see any problem in passing values as decimal between my .NET app and Sql Server. Something must be fishy about his earlier organisation.
The King
+1  A: 

Ok . One word answer . Dont

You are right about correct data types having impact on performance (SQL Optimizer works differently for INT VS VARCHAR) , data consistency and integrity etc

if all we needed was VARCHAR I dont think we ever invented other types. SQL is not dynamically typed. Static typing makes optimization better , index pages smaller and query operators efficient.

It is not the problem of source that consumer needs all strings as input. it is upto consumer to do type checking and consuming data. A DB should always have correct types .

(Forget about choosing between INT and VARCHAR i would say you should also think whether you should have INT or TINYINT ) these consideration makes a lot of difference

Ashwani Roy
A: 

The only advantage I can see with using any sort of variable-sized string-ish format would be if the field would have to accommodate an unknown amount of additional information. For example, "49.95@1/39.95@5/29.95@20/14.95@100,match=true/24.95@100" to indicate that this particular product has price points at 1, 5, 20, and 100 units, and the best 100-unit price is only available when all items are identical. Using strings to store such things is icky, but if the number of price-points is open-ended, using a variable-sized field might be better than having to create another table with one row per product/price-point combination. If you do go that route, it may be good to use XML serialization for the data, rather than an ad-hoc thing as shown above. An ad-hoc approach might allow faster parsing in some cases, but if things really are open-ended it could become a real pain to maintain.

Addendum: If you want to be able to do any type of sorting or searching based on price, you'll need to have separate columns for that. If you want to allow users to e.g. find the ten cheapest items at 100-piece mix/match quantity, and the database holds 10,000 possible items, the only way to satisfy the query with varchar-stored data would be to read all l0,000 items and evaluate what the best price would be given the restrictions. If users can only query based upon a small number of price/restriction combinations, it may be helpful to have a column for each one to allow direct queries.

supercat
If we had multiple prices depending on quantity or discount program or whatever, I'd create a separate table and record the data cleanly. That's what normalization is all about. Now if the value isn't REALLY a price as in $X, but is really, I don't know, the name of the price list where we can look up the price or something like that, that would be a different story.
Jay
@Jay: I would agree that if it's necessary to do practically anything with price other than set or retrieve the price information associated with a given item, it needs to be stored 'cleanly'. On the other hand, if the necessary operations are restricted to setting or retrieving the price for a given item, those operations are simpler with a text field than with separate tables. Since the question was whether there was ever ANY advantage to text storage, I would answer in the affirmative. It isn't often the best way, but nor is it without advantages.
supercat
A: 

----my answer was wrong but I can't delete it!

Toby Allen