views:

235

answers:

8

I'm working through a wide variety of procs that have a WHERE clauses that look like this:

WHERE ... AND ( ( myTbl.myValue = 1234) 
    or (myTbl.myValue = 1235) )-- Value =  No

I've talked this over with some colleagues and this sort of code seems unavoidable. I think it's a good idea to put this sort of code into one (and only one) place. That might be a view, it might be a table etc. I'm thinking a view that selects from the underlying table and has a bit field that says value of 1234 or 1235 is a 0. Or a 'N', etc. That way I can add 'No' values at will without having to change any code. I wouldn't use a UDF for this, too many function calls if you use it in a join.

What are some other options for dealing with special values in your database? Are views a good solution for this? Are there any ways to avoid this sort of thing altogether? I'm thinking that if that value needs to change for whatever reason I don't want to deal with changing hundreds of procs. On the other hand, extra join so it's a performance hit.

Update: if anyone has strategies for just getting rid of the damn things that'd be great too. Like I said, talked it over with colleagues and these things seem unavoidable in organizations that have a lot of business logic in the db layer.

PS: I saw some magic number questions, but nothing specific to a database.

+2  A: 

On Oracle you can set up deterministic functions this kind of functions that notice the RDBMS that only is need to be called once.

create or replace package MAGIC is
  function magic_number return  number DETERMINISTIC;
end;
/

create or replace package body MAGIC is
  function magic_number return  number DETERMINISTIC
  is
  begin
    return 123;
  end;
end;
/

SELECT MAGIC_DATE
  FROM MAGIC_TABLE
 WHERE MAGIC_ID = magic.magic_number;
FerranB
ah, is this problem specific to SQL Server; as in it's not really a problem elsewhere? I've only worked in SQL Server and SQLite.
jcollum
You can also create deterministic UDFs on SQL Server, I've used both this approach and the "Configurations" table approach proposed by Ken White et all. The UDF gives you better performance, but values aren't as straight-forward to change.
Joe Pineda
A: 

As we do queries as Strings this is usually something like

"AND myTbl.iTpe = "+AnEnum.THE_TYPE.ordinal()+"..."
John Nilsson
is this a database specific solution?
jcollum
Nope. It's just the reason why we don't have the problem in the database, and thus no solution for it. This is done in Java, within the application calling the DB.
John Nilsson
This is not a good idea... the ordinal of an enum is subject of change, a simple re-ordering will break all your data in the database. Add a method like #getDatabaseValue() for your enum types for a way better approach.
Malax
A: 

Using SqlServer and probably other dialects you could create functions which return the magic number, so your sql would become

WHERE ... AND ( ( myTbl.myValue = MagicFunction1())     or (myTbl.myValue = MagicFunction2()) )-- Value =  No

Alternatively you could create a single function, or maybe one function/logical set of magic numbers and pass in a parameter.

WHERE ... AND ( ( myTbl.myValue = ProductFunction(1))     or (myTbl.myValue = ProductFunction(2)) )-- Value =  No
MrTelly
If I do a join on this where clause those functions could be running ... a lot. Bad idea. And I addressed this in the question. Let me know if you think I didn't address it thoroughly.
jcollum
A lot of calls...
FerranB
Yep you're right, too many calls, inefficient and the question excludes this approach
MrTelly
+3  A: 

How many magic numbers are we talking about? If it's less than a few thousand, put them in a table and do a join. If they're frequently used in WHERE clauses as a consistent grouping (like the 1234 and 1235 in your example), assign a category column and use IN or EXISTS. (Neither of those will be a meaningful performance hit with a small amount of magic numbers and an appropriate index.)

I detest hard-coded numeric values like those in your example for anything except ad hoc SQL statements. It makes maintenance a real PITA later on.

Ken White
"I detest hard-coded numeric values like those in your example": me too, but I've run into a situation more than a few times where I have to deal with them. Sometimes a bit field just won't cut it.
jcollum
True. I try very, very, very hard to come up with a different way, but once in a while there's just no viable alternative.
Ken White
By far the best answer I think.
Bill K
If you're not careful, doesn't that replace 1234 with "(SELECT number FROM CodeTable WHERE Value = 'MyMeaningfulName)", or a stored procedure call NumberLookup('MyMeaningfulName') that does that behind the scenes? In fact, how do you avoid such? I don't see the sub-query version as an improvement.
Jonathan Leffler
Yeah, it's also the answer that we came up with when we rolled this around. I'll wait a bit and then mark it as the answer.
jcollum
@Leffler: no, the point of this solution is that it allows you to use set based logic to solve the problem, which is ideal. So you join to your MagicValueTable and join to it on an int field that has an index, ideally.
jcollum
@Leffler: What jcollum said. <g> Thanks, jcollum. You got my meaning exactly.
Ken White
I think that this is the logically correct answer, however it can be detrimental to performance where the execution plan ought to be sensitive to data skew in the predicated column (eg. where 99.9% of values are 123 and the remaining ones are 0). Just be aware of that.
David Aldridge
@David: Yeah, I guess I presumed from the tenor of the original question jcollum had the common sense to figure that out on his own. "If 99.9% of the time, it's 1, the other 0.1% it's 0, this may not be the optimal answer" seemed like something he'd grasp. :-)
Ken White
wait let's not assume anything about my intelligence for good or bad :)
jcollum
I see what you mean about data skew, but I think that'd be addressed on an as needed basis. I think the gain in adaptability of code much greater than the loss of having to address why some procs aren't running as fast as they could be. Tradeoffs, always with the tradeoffs.
jcollum
@jcollum: <g> Noted for future reference.
Ken White
A: 

Magic numbers are always avoidable, but avoiding them might not always be ideal (eg the performance hit you mentioned). In perfect code you would have another table in the database with a simple identifier (a string) like myMagicNumber that you reference.

However I've found the best solution is to keep a lookup (function/enum etc) in your program code (or SP if that's what you're using). Make sure you always use that lookup to add or select data from the DB.

DisgruntledGoat
sweet user name!
jcollum
+1  A: 

I agree that magic numbers of this sort should all go in one place. Where that place should be probably depends on the culture around your app. If there is a common config file or area, especially a cascading set of configs, thats a good place.

I guess the first part you should figure is what's so magic about your number. Is it configuration sensitive? is it a mathematical constant? is it a value defined by some outside standard spec? knowing that might help locate the magic number's definition.

TokenMacGuy
A: 

I find that a lot of the older generation of developers like to put comments in their code and provide system documentation to back it up. I don't hold with it myself, mind you -- just let people guess what it all means.

David Aldridge
I guess I qualify as one of that "older generation" (I'm in my 50's). What does that have to do with hard-coding magic numbers in SQL statements?
Ken White
agreed, not relevant
jcollum
Hmm, well you maybe missed the irony in my statement. The relevance is that if you use a magic value then you put a comment in there to identify what it means, and you provide documentation to list the values and what they mean in the context of each column. Btw, I'm in my 40's.
David Aldridge
I'm of the mind that says that variables and column names should be descriptive enough that comments aren't needed 90% of the time; comments can become a second codebase to maintain and they're worthless if they get out of sync with the actual code; I'm not against them, just think they're overused
jcollum
overused some of the time that is; the people who think that you should have comments every third line or something need to work on writing descriptive names and making smaller methods
jcollum
A: 

Often, magic numbers like this can be stored in the database and loaded into static readonly variables when the database is first started.

Robert Lewis
the question is database specific; unfortunately many companies keep most or all of their logic in their data layer
jcollum