views: 163 · answers: 4

In our application we're going to allow users to type in arithmetic expressions (+ - * /) that use other database columns as operands; the expression is then parsed by the application and written into the database as a computed column.

However, problems arise when users create expressions that can throw exceptions on a `select *` against the table, such as divide-by-zero and arithmetic overflow, and possibly others I haven't come across yet (though I think those are the main ones).

Having the database throw an exception on `select *` would be absolutely devastating. I would rather rewrite their expression into something that fails gracefully when the data is error-prone.

For divide-by-zero the solution is pretty straightforward:

add [Col] as case {divisor} when 0 then N'DIVIDE-BY-ZERO' else {expression} end
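
As a concrete sketch of what the generated DDL might look like (the table and column names here are hypothetical, and `[Total] / [Qty]` stands in for the user's expression). Note that if `{expression}` is numeric, returning NULL instead of a string literal avoids a possible type-conversion error inside the CASE itself:

```sql
-- hypothetical table/columns; [Total] / [Qty] is the user's expression
alter table dbo.Orders
    add [UnitPrice] as case [Qty]
        when 0 then null   -- fail gracefully instead of raising divide-by-zero
        else [Total] / [Qty]
    end
```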

My question is: what can I do about arithmetic overflow? Showing bunk or obviously wrong data in the column would not be a problem, but throwing exceptions would.

A: 

How about catching the exceptions? It seems hard to try to detect all possible causes of arithmetic overflows, let alone exceptions in general.

On exceptions, you can return a nonsense result set that is compatible with the original user-defined one:

begin try
    select exp(999)
end try
begin catch
    select 1
end catch
Jeff Meatball Yang
Yes, this is certainly going to be a fun one. The problem with using exceptions is that we've OEMed third-party software that interacts directly with the database (the part of the application we're writing is what builds the database), so I would much rather build a solution right into the table, if at all possible. Thanks for the idea, though!
Tinister
+1  A: 

Since you have parsing technology, you can rewrite the expressions to trap all the potential issues with a bunch of CASE expressions - divide by zero, overflows, upcasting to different types, etc.
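
For integer overflow specifically, one such rewrite (hypothetical names, sketch only) is to do the arithmetic in a wider type and range-check the result before casting back:

```sql
-- hypothetical int columns [A], [B]; compute in bigint, where the product
-- of two ints can never overflow, and return null when the result
-- won't fit back in an int
alter table dbo.Widgets
    add [Product] as case
        when cast([A] as bigint) * cast([B] as bigint)
             between -2147483648 and 2147483647
        then cast(cast([A] as bigint) * cast([B] as bigint) as int)
        else null
    end
```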

But I wouldn't do that, because I don't think putting these into the database as part of the schema is a great idea - unless this tool is a code-generation tool and users are responsible for reviewing and testing the expressions for edge conditions, just as if they were included in the original database design.

If you are already parsing the expression, I would compile it on the client side too, and handle errors on a row-by-row basis.

Catching the exception with TRY/CATCH is all-or-nothing: one failing row aborts the entire statement, so you cannot recover row by row.
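
To illustrate the all-or-nothing behavior (hypothetical table and columns):

```sql
begin try
    -- one bad row aborts the whole statement...
    select [Total] / [Qty] as [UnitPrice] from dbo.Orders
end try
begin catch
    -- ...so the catch block can only replace the entire result set;
    -- it cannot patch up just the individual rows that failed
    select cast(null as int) as [UnitPrice]
end catch
```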

Cade Roux
Hrm. What do you think of compiling it into an AFTER UPDATE trigger instead? Would that be a legitimate use for a trigger in your opinion?
Tinister
Nope. Why is putting this into the database a good idea? I don't want anyone doing a SELECT * anyway. I don't even let users do SELECT if I can avoid it and usually not on a TABLE (maybe a VIEW), since they only have EXEC rights on certain user-exposed SPs. When users are allowed to do arbitrary calculations (and change those definitions on a regular basis), I put that responsibility in a report writer, or put the formula in a configuration space where it is interpreted by the application, where changes to it can be caught with an audit trail.
Cade Roux
Making it part of the database schema means that different customers would have different schemas - a complete nightmare to keep in sync, test and manage.
Cade Roux
I definitely agree with you there, but our application *is* a schema-creation tool (in a very broad sense, anyway), so that can of worms is already open. And the reason we need to allow a `SELECT *` is that that's how the third-party software we OEMed for use on our customers' client machines works. Hence, this question.
Tinister
That being said, we took a closer look at that software, and having a `SELECT *` throw an exception isn't as devastating as we originally thought, so we decided to let the exception bubble up and handle support accordingly.
Tinister
I'm convinced that the answer I was looking for just doesn't exist, so I'll mark this as the answer for the good advice.
Tinister
OK, I know you're in a tough spot. Building a database within a database is always hard, and I'm not sure when it will get easier. It seems like there might be a market for a database-compiler product that still produces a SQL database but adds its own extensions for things like EAV and complex derived columns. I'll add another answer regarding UDFs (which I think are a complete FAIL for large datasets, which is why I didn't suggest them at first).
Cade Roux
+1  A: 

I hate to see you accept an answer which doesn't really get you closer to your goal.

As a separate answer which might help you out a little, you can make computed columns call a (deterministic) scalar UDF.

See, for instance, here

So if you are going to create a computed column, make it pass the columns to a generated UDF (or several UDFs) and do the work there. In a scalar UDF you can have plenty of code to catch the problems, but you still cannot use TRY/CATCH. What you can do in your scalar UDF is handle each case explicitly and return an appropriate answer (probably bubbling up NULLs).
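
A minimal sketch of such a UDF (the function name and types are assumptions). Since TRY/CATCH is unavailable inside the function body, each failure mode has to be guarded explicitly:

```sql
create function dbo.SafeDivide (@num float, @den float)
returns float
with schemabinding  -- required if the computed column is to be persisted
as
begin
    -- no TRY/CATCH allowed here; guard each failure mode by hand
    if @den = 0
        return null  -- bubble up NULL instead of raising divide-by-zero
    return @num / @den
end
```

The generated computed column would then be along the lines of `add [UnitPrice] as dbo.SafeDivide([Total], [Qty])`.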

But performance will be very poor with scalar UDFs (I'm not sure how non-persisted inline computed columns compare to UDFs; we mainly use persisted ones), so you might seriously want to consider making the column persisted. That uses space in the database and makes inserts and updates a little slower - a big tradeoff.

Cade Roux
A: 

If performance is important, use a trigger or an indexed view to store the calculations when the data is inserted, updated, or deleted.

If performance is not important, use a scalar-valued function.
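
A trigger-based sketch of the first option (table, key column, and expression are all hypothetical); the guarded calculation runs once per write instead of once per read:

```sql
create trigger trg_Orders_CalcUnitPrice on dbo.Orders
after insert, update
as
    -- recompute the stored result only for the rows that changed,
    -- guarding divide-by-zero the same way a guarded computed column would
    update o
    set [UnitPrice] = case when i.[Qty] = 0 then null
                           else i.[Total] / i.[Qty] end
    from dbo.Orders o
    join inserted i on i.[OrderId] = o.[OrderId]
```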

Brad