Suppose I have a table RULES with three columns A, B, and C. As data enters the system, I want to know whether any row of the RULES table matches it, with the condition that a NULL in a column of the RULES table matches any value for that column. The obvious SQL is:
SELECT * FROM RULES
WHERE (A = :a OR A IS NULL)
AND (B = :b OR B IS NULL)
AND (C = :c OR C IS NULL)
So if I have rules:
RULE  A     B     C
1     50    NULL  NULL
2     51    xyz   NULL
3     51    NULL  123
4     NULL  xyz   456
An input of (50, xyz, 456) will match rules 1 and 4.
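To make the example easy to reproduce, here is a minimal sketch that loads the four rules into an in-memory SQLite database and runs the query above. SQLite, the column types, and the helper name matching_rules are assumptions made purely for illustration; the real system's database and schema may differ.

```python
import sqlite3

# Hypothetical in-memory copy of the RULES table from the example.
# Column types are assumed: A and C numeric, B text.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE RULES (RULE INTEGER PRIMARY KEY, A INTEGER, B TEXT, C INTEGER)"
)
conn.executemany(
    "INSERT INTO RULES (RULE, A, B, C) VALUES (?, ?, ?, ?)",
    [
        (1, 50, None, None),
        (2, 51, "xyz", None),
        (3, 51, None, 123),
        (4, None, "xyz", 456),
    ],
)

# Same predicate as in the question: a NULL rule column acts as a wildcard.
MATCH_SQL = """
SELECT RULE FROM RULES
WHERE (A = :a OR A IS NULL)
  AND (B = :b OR B IS NULL)
  AND (C = :c OR C IS NULL)
ORDER BY RULE
"""


def matching_rules(a, b, c):
    """Return the RULE numbers of every rule the input matches."""
    rows = conn.execute(MATCH_SQL, {"a": a, "b": b, "c": c}).fetchall()
    return [row[0] for row in rows]


print(matching_rules(50, "xyz", 456))  # prints [1, 4]
```

Because the query returns every matching row rather than stopping at the first, the full list of rules comes back in a single round trip.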
Question: Is there a better way to do this? With only 3 fields this is no problem. But the actual table will have 15 columns and I worry about how well that SQL scales.
Speculation: An alternative SQL statement I came up with involves adding an extra column to the table holding a count of how many fields are not null. (So in the example, this column's value for rules 1-4 is 1, 2, 2 and 2 respectively.) With this "col_count" column, the select could be:
SELECT * FROM RULES
WHERE (CASE WHEN A = :a THEN 1 ELSE 0 END)
+ (CASE WHEN B = :b THEN 1 ELSE 0 END)
+ (CASE WHEN C = :c THEN 1 ELSE 0 END)
= COL_COUNT
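As a sanity check that the two statements select the same rules, here is a small sketch that runs both against the four example rules for a handful of inputs. It assumes SQLite via Python's sqlite3 module, and it fills COL_COUNT at insert time; the helper name run and the test values are hypothetical. It says nothing about relative performance, only about equivalence on this data.

```python
import sqlite3
from itertools import product

# The four example rules: (RULE, A, B, C).
rules = [
    (1, 50, None, None),
    (2, 51, "xyz", None),
    (3, 51, None, 123),
    (4, None, "xyz", 456),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE RULES (RULE INTEGER PRIMARY KEY, "
    "A INTEGER, B TEXT, C INTEGER, COL_COUNT INTEGER)"
)
# COL_COUNT = number of non-NULL fields in the rule (assumed to be
# maintained by whatever code inserts or updates rules).
conn.executemany(
    "INSERT INTO RULES VALUES (?, ?, ?, ?, ?)",
    [(r, a, b, c, sum(v is not None for v in (a, b, c))) for r, a, b, c in rules],
)

OR_SQL = """
SELECT RULE FROM RULES
WHERE (A = :a OR A IS NULL)
  AND (B = :b OR B IS NULL)
  AND (C = :c OR C IS NULL)
ORDER BY RULE
"""

COUNT_SQL = """
SELECT RULE FROM RULES
WHERE (CASE WHEN A = :a THEN 1 ELSE 0 END)
    + (CASE WHEN B = :b THEN 1 ELSE 0 END)
    + (CASE WHEN C = :c THEN 1 ELSE 0 END)
    = COL_COUNT
ORDER BY RULE
"""


def run(sql, a, b, c):
    """Run one of the two statements and return the matching RULE numbers."""
    return [row[0] for row in conn.execute(sql, {"a": a, "b": b, "c": c})]


# Compare both statements over a small grid of inputs.
mismatches = [
    (a, b, c)
    for a, b, c in product([50, 51, 52], ["xyz", "abc"], [123, 456, 789])
    if run(OR_SQL, a, b, c) != run(COUNT_SQL, a, b, c)
]
print(run(COUNT_SQL, 50, "xyz", 456), mismatches)
```

The CASE version works because a comparison against a NULL rule column evaluates to NULL, which falls through to ELSE 0, so only the non-NULL columns can contribute to the sum.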
Unfortunately, I don't have enough sample data to find out which of these approaches would perform better. Before I start creating random rules, I thought I'd ask here whether there is a better approach.
Note: Data mining techniques and column constraints are not feasible here. The data must be checked as it enters the system so that it can be flagged pass/fail immediately. Also, the users control the addition and removal of rules, so I can't convert the rules into column constraints or other data definition statements.
One last thing: in the end I need a list of all the rules that the data fails to pass, so the solution cannot abort at the first failure.
Thanks.