views:

221

answers:

5

I need to allow my users to be able to define formulas which will calculate values based on data. For example

//Example 1
return GetMonetaryAmountFromDatabase("Amount due") * 1.2;
//Example 2
return GetMonetaryAmountFromDatabase("Amount due") * GetFactorFromDatabase("Discount");

I will need to allow / * + - operations, also to assign local variables and execute IF statements, like so

var amountDue = GetMonetaryAmountFromDatabase("Amount due");
if (amountDue > 100000) return amountDue * 0.75;
if (amountDue > 50000) return amountDue * 0.9;
return amountDue;

The scenario is complicated because I have the following structure..

  1. Customer (a few hundred)
  2. Configuration (about 10 per customer)
  3. Item (about 10,000 per customer configuration)

So I will perform a 3 level loop. At each "Configuration" level I will start a DB transaction and compile the forumlas, each "Item" will use the same transaction + compiled formulas (there are about 20 formulas per configuration, each item will use all of them).

This further complicates things because I can't just use the compiler services as it would result in continued memory usage growth. I can't use a new AppDomain per each "Configuration" loop level because some of the references I need to pass cannot be marshalled.

Any suggestions?

--Update-- This is what I went with, thanks! http://blog.peterlesliemorris.com/archive/2010/05/19/embedding-ironpython-into-a-c-application.aspx

+2  A: 

Iron Python Allows you to embed a scripting engine into your application. There are many other solutions. In fact, you can google something like "C# embedded scripting" and find a whole bunch of options. Some are easier than others to integrate, and some are easier than others to code up the scripts.

Of course, there is always VBA. But that's just downright ugly.

dviljoen
Here's a better link:http://www.voidspace.org.uk/ironpython/embedding.shtml
dviljoen
Found another that looks interesting:http://www.csscript.net/
dviljoen
A: 

You could create a simple class at runtime, just by writing your logic into a swting or the like, compile it, run it and make it return the calculations you need. This article shows you how to access the compiler from runtime: http://www.codeproject.com/KB/cs/codecompilation.aspx

H4mm3rHead
As I said in the question text, I can't use the compiler service because it creates an assembly I then cannot unload, and I cannot create a new app domain because .net cannot marshal the types I need to pass to it.
Peter Morris
A: 

You could build two base classes UnaryOperator (if, square, root...) and BinaryOperator (+ - / *) and build a tree from the expression. Then evaluate the tree for each item.

Sylvestre Equy
+1  A: 

I faced a similar problem a few years ago. I had a web app with moderate traffic that needed to allow equations, and it needed similar features to yours, and it had to be fast. I went through several ideas.

The first solution involved adding calculated columns to our database. Our tables for the app store the properties in columns (e.g., there's a column for Amount Due, another Discount, etc.). If the user typed in a formula like PropertyA * 2, the code would alter the underlying table to have a new calculated column. It's messy as far as adding and removing columns. It does have a few advantages though: the database (SQL Server) was really fast at doing the calculations; the database handled a lot of error detection for us; and I could pretend that the calculated values were the same as the non-calculated values, which meant that I didn't have to modify any existing code that worked with the non-calculated values.

That worked for a while until we needed the ability for a formula to reference another formula, and SQL Server doesn't allow that. So I switched to a scripting engine. IronPython wasn't very mature back then, so I chose another engine... I can't remember which one right now. Anyway, it was easy to write, but it was a little slow. Not a lot, maybe a few milliseconds per query, but for a web app the time really added up over all the requests.

That was when I decided to write my own parser for the formulas. That is, I have a PlusToken class to add two values, an ItemToken class that corresponds to GetValue("Discount"), etc. When the user enters a new formula, a validator parses the formula, makes sure it's valid (things like, did they reference a column that doesn't exist?), and stores it in a semi-compiled form that's easy to parse later. When the user requests a calculated value, a parser reads the formula, parses it, figures out what data is needed from the database, and computes the final answer. It took a fair amount of work up front, but it works well and it's really fast. Here's what I learned:

  1. If the user enters a formula that leads to a cycle in the formulas, and you try to compute the value of the formula, you'll run out of stack space. If you're running this on a web app, the entire web server will stop working until you reset it. So it's important to detect cycles at the validation stage.
  2. If you have more than a couple formulas, aggregate all the database calls in one place, then request all the data at once. Much faster.
  3. Users will enter wacky stuff into formulas. A parser that provides useful error messages will save a lot of headaches later on.
Paul Shealy
+1  A: 

If the custom scripts don't get more complex than the ones that you show above, I would agree with Sylvestre: Create your own parser, make a tree and do the logic yourself. You can generate a .Net expression tree or just go through the Syntax tree yourself and make the operations within your own code (Antlr below will help you generate such code).

Then you are in complete control of your references, you are always within C#, so you don't need to worry about memory management (any more than you would normally do) etc. IMO Antlr is the best tool for doing this in C# . You get examples from the site for little languages, like your scenario.

But... if this is really just a beginning and at the end you need almost full power of a proper scripting language, you would need to go into embedding a scripting language to your system. With your numbers, you will have a problem with performance, memory management and probably references as you noted. There are several approaches, but I cannot really give one recommendation for your scenario: I've never done it in such a scale.

Ope