views:

433

answers:

10

Shouldn't the code deal with default values instead of the database?

+4  A: 

That depends on how you think of the "default" value. Think of it this way: What should happen if you change the default? If existing values should be updated, then the default should be in the program code only, but if the existing values should remain, then you should store the default in the database.

Rasmus Kaj
+3  A: 

I generally believe code should take care of default values. This will help keep your DB lean and mean. Although, if it seems your DB has an excessive amount of empty fields you may want to rethink the design.

EXAMPLE

Imagine you have a table with a million or more rows. In this table you have a datetime column that is filled in maybe 5% of the time. As time goes on the amount of space you'll save by storing a NULL default will make the necessary default checks more than worth it.

ChaosPandion
I don't agree - this should definitely be handled by the database. Otherwise, if someone manages to connect to your data some other way than through your app, they could potentially enter invalid or useless data.
marc_s
@marc_s: but this is a generic "security" concern IMO.
jldupont
@marc_s- The question is whether or not people will connect. I've worked on apps where this has definitely been the case and other apps where the database is completely silo'd to a specific app. I guess another issue is whether this will change over time.
RichardOD
WHy would you want a "Lean and Mean" database over one that has data integrity?
HLGEM
How does a null datetime rather than 1900/01/01 cause less integrity?
ChaosPandion
The semantics of NULL are different from 1900/01/01; the former indicates the absence of a value, the latter is a default.
Rob
marc_s, Defaults don't prevent people from entering bad data (at least in SQL Server). They account for data that is not specified at the time of insertion.
Philip Kelley
A default that takes 8 bytes and will most likely be ignored anyway.
ChaosPandion
+15  A: 

Anything you can do in the database is typically more robust. If you handle default values which will be used if no value has been specified only in your app, and someone manages to connect to your database some other way than via your app (and believe me - the users WILL try to connect using Excel or other tools) - then the database is wide open and the user could potentially insert crappy data.

Same goes for referential integrity and check constraints. I believe you're better off if you try to have as many of those constraints on the database - then no matter how the user connects, if he doesn't send you anything, you can put in sensible defaults.

Don't let the application handle your checks - leave that to the database!

Plus it makes your SQL insert statement a lot leaner and meaner if you don't have to specify all the "obvious" defaults, like "getdate()" for a "LastChangedOn" date column etc.

marc_s
I think he means leave things like dates and such NULL in the DB and display default values in client code.
ChaosPandion
go ahead and tie yourself closer to the database... this way switching to another solution (if the situation requires it) will be simpler.... not!
jldupont
jldupont- using database specific features doesn't lock you in. I've seen people not use primary keys because of this very same (flawed) reasoning!
RichardOD
jldupont: how often do you switch databases vs. how often do you change code possibly breaking a validation process?
simon
@jldupont: Default values are portable across virtually every SQL database system there is. If you are that concerned about your application being dependent on the data store (which is a legitimate concern in some cases) you should probably consider rolling your own. But if you're going to use a DBMS, let the DBMS do the work it was designed to do.
Larry Lustig
@Mark- I think also what a lot of people forget is using these features in a DB makes the DB self documenting (I know I can join table x to table y). I know all of the WTF DBs I've seen don't use referential integrity.
RichardOD
@Larry- that's a good point. You can easily use DB specific features and allow for different backends.
RichardOD
And what if you set the column to NOT NULL? Then someone connecting some other way than via my app will also have to insert something. Your last argument is definitely through because that's one of the reasons I'm asking this. thx.
Lieven Cardoen
http://thedailywtf.com/Articles/Slightly-OverSQLd.aspx
Egg
Man, that was poor English. Through should be true...
Lieven Cardoen
+3  A: 

Just the opposite. The database should always provide default values that ensure that records added are valid and meaningful. Remember, you cannot predict at the time you write the application what code will eventually be adding records to the database.

That's not to say that you may not have a second level of default data specified by the application that depends on the application state at the time the data is created.

So, imagine that you have an employee status code that can be "P" (propect), "A" (active), "T" (terminated), or "R" (retired). You should specify at the database level whether people enter the system as "P" or "A", for instance (or perhaps a fifth code for Unassigned). But your application can and should require the user creating the employee record to select one of the options from (for instance) a group of radio buttons and use that value when INSERTing the record.

Larry Lustig
+1  A: 

You may want to consider what is required to change the defaults. If you application is in house it may not be difficult to change, but if it is on your customers site, then making changes to the database can be a VERY difficult process. If you have hundreds or thousands of clients and you need to convince their DBAs to give you update process SA access to the database you will regret putting any application logic in the DB (including defaults).

Jim Blizard
+2  A: 

As with most "code vs database" arguments, it depends.

If your database (or table) is going to be accessed by a lot of different customers (i.e. teams who don't work closely together), then the database should protect itself. Ideally, everything should be done via stored procedures.

If a single application has access to the database, then that application can contain the business logic. In this case, everything should be done by the one application, and that application accesses the database.

Also: If you want a cheap answer - ask the person who is in charge of specifying (and documenting) your data models. If you don't have one, start panicking.

wisty
+1. @Wisty- this is a good point.
RichardOD
With respect, I'm always wary of deigns that begin with "If the simplest case. . .". In my experience you can never know in advance whether only a single application will access the database and, in fact, it's a very good bet that no matter what the plan at the outset is that various utility programs will be added in the future for bulk loading, maintenance, schema migrations, etc. Since there's no _advantage_ to having the default in the code even in the simplest case, why not simultaneously future-proof and simplify the application?
Larry Lustig
+2  A: 

Code defaults are easier for unit testing.

Code defaults support multiple scenarios. DB column defaults are a one size fits all. For example, the DB column defaults may vary depending on customer type.

DB column defaults are often opaque to the maintenance developer because they are far away from the INSERT statement, which is usually in the middle tier code, stored procedure, etc. The presence or absence of defaults can be surprising, either way.

DB column defaults defend the database against clients that are too lazy to fill in defaults at all, which is a form of data corruption.

Both kinds of defaults can be subverted by client developers. It is easier to set up barriers to developers using defective defaults in the middle tier. AFAIK, no database lets you require a field take the default value on INSERT. An example of where this might matter is the various tokens people use for a unknown but future date, or an unknown but past date, or it might matter if they use GETDATE() which includes time or if they use a default date with year, month, day and no time.

I'd recommend making sure defaults exist where they make sense in the DB, but don't actually use them. The DB defaults should be the defaults of last resort, and defaults should be solidly in a middle tier (i.e. stored procs, a data access layer). A DB column default is like an exception handler-- when someone forgot by accident to provide a value, what value should be used to prevent data corruption?

MatthewMartin
+1  A: 

If you are concerned about data integrity (And if you aren't why do you have a database?), you need the default values in the database where they belong. To do this only in code is irresponsible. Data gets into databases from other sources than application code.

HLGEM
+1  A: 

Databases are intended to record assertions of fact.

To allow a user to make incomplete assertions and to have the dbms make silent assumptions about the incomplete part, is plain bad.

If you want robust, reliable systems that can be kept in use for longer times, then don't let any part of that system ever make assumptions about what the user does not say.

Handle defaults at the presentation level and nowhere else. And handle them in such a way that there is no way for the user to not have seen the data he submits (ALL the data !).

Erwin Smout
+1  A: 

Unless the value is completely decoupled from your business logic, I would say yes, it is bad practice. At least according to the tiered model, where the business layer is separate from the data layer.

Remember that:
- Putting a default value on a column does not guarantee data integrity (an application should be able to handle any value put in the column)
- Going to the trouble and expense of somehow duplicating your business logic in the database doesn't guarantee data integrity.

The most sane approach to database security is to have just that -- database security. If a malicious user, for example, has managed to break that security layer, then your data doesn't stand a chance, regardless of the data constraints.

Jon Seigel