Hi, I was wondering whether there is a known way to remove unnecessary parentheses from a mathematical formula. I am asking because I have to minimize the length of a formula such as this one:

if((-if(([V].[6432])=0;0;(([V].[6432])-([V].[6445]))*(((([V].[6443]))/1000*([V].[6448])
+(([V].[6443]))*([V].[6449])+([V].[6450]))*(1-([V].[6446])))))=0;([V].[6428])*
((((([V].[6443]))/1000*([V].[6445])*([V].[6448])+(([V].[6443]))*([V].[6445])*
([V].[6449])+([V].[6445])*([V].[6450])))*(1-([V].[6446])));

It is part of an SQL select statement. It cannot exceed 255 characters, and I cannot modify the code that produces the formula (it is basically a black box ;) ). As you can see, many of the parentheses are useless, not to mention the fact that:

((a) * (b)) + (c) = a * b + c

So I want to preserve the order of operations: parentheses, then multiplication/division, then addition/subtraction.

I'm working in VB, but a solution in any language will be fine.


The numbers 6432, 6445 and so on are field names. I've already found a way to shorten them by converting them to a different base (using ASCII characters as digits), so that:

6432 = "37*"
6428 = "37$"
and so on....

It's always one char less ;)

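' Converts a non-negative integer n to a string in the given base,
' using the characters of "numerals" as digits.
Public Function ToBaseN(n As Variant, base As Integer) As String
    Dim numerals As String
    numerals = "0123456789abcdefghijklmnopqrstuvwxyz@#$%^&*()"
    If n = 0 Then
        ToBaseN = "0"
    Else
        ' Recurse on the higher digits. The recursion bottoms out at "0",
        ' which lstrip removes so the result has no leading zero.
        ToBaseN = lstrip(ToBaseN(Int(n / base), base), "0") & Mid(numerals, (n Mod base) + 1, 1)
    End If
End Function

' Removes one leading occurrence of strip from s, if present.
Public Function lstrip(s As String, strip As String) As String
    If Left(s, Len(strip)) = strip Then
        lstrip = Mid(s, Len(strip) + 1)
    Else
        lstrip = s
    End If
End Function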
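
For anyone who wants to cross-check the encoding outside VB, here is a rough Python equivalent. It assumes base 45, which is not stated above but is the length of the numerals string and is consistent with the two examples (6432 and 6428). The names `to_base_n` and `from_base_n` are made up for this sketch; the decoder is an addition for round-trip checking only.

```python
# Same digit alphabet as the VB function above (45 characters).
NUMERALS = "0123456789abcdefghijklmnopqrstuvwxyz@#$%^&*()"


def to_base_n(n, base=len(NUMERALS)):
    """Encode a non-negative integer using NUMERALS as digits."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        n, remainder = divmod(n, base)
        digits.append(NUMERALS[remainder])
    # Digits were produced least-significant first.
    return "".join(reversed(digits))


def from_base_n(text, base=len(NUMERALS)):
    """Decode a string produced by to_base_n back to an integer."""
    value = 0
    for ch in text:
        value = value * base + NUMERALS.index(ch)
    return value


print(to_base_n(6432), to_base_n(6428))
```
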

Edit

I found a question about the opposite problem (adding parentheses to an expression).

I really thought this could be accomplished without heavy parsing. But it seems that a parser that walks through the expression and stores it in an expression tree is unavoidable.

A: 

I'm pretty sure that in order to determine which parentheses are unnecessary, you have to evaluate the expressions within them. Because parentheses can be nested, this is the sort of recursive problem that a regular expression can only address in a shallow manner, and it will most likely produce incorrect results. If you're already evaluating the expression, maybe you'd like to simplify the formula as well, if possible. This also gets kind of tricky, and some approaches use techniques that are also seen in machine learning, as in the following paper: http://portal.acm.org/citation.cfm?id=1005298

Robert Elwell
A: 

If your variable names don't change significantly from one query to the next, you could try a series of replace() commands, e.g.

X=replace([QryString],"(([V].[6443]))","[V].[6443]")
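
If the field names do vary, the same idea can be generalised with a pattern instead of one replace() per field. The following is only a sketch in Python: it assumes every field looks like `[V].[6443]` (two bracketed parts joined by a dot), and `strip_field_parens` is a made-up name. It removes parentheses that wrap a single field and nothing else, which is always safe; it does not touch parentheses around larger sub-expressions.

```python
import re

# "(" + one bracketed field + ")", where the "(" is not directly preceded by a
# name character or "]" (so a function call such as if(...) is left alone).
WRAPPED_FIELD = re.compile(
    r"(?<![A-Za-z0-9_\]])\(\s*(\[[^\]]*\]\.\[[^\]]*\])\s*\)"
)


def strip_field_parens(formula):
    """Repeatedly unwrap (field) until nothing changes, so ((field)) works too."""
    previous = None
    while previous != formula:
        previous = formula
        formula = WRAPPED_FIELD.sub(r"\1", formula)
    return formula


print(strip_field_parens("(([V].[6443]))/1000*([V].[6448])"))
```

The loop is needed because each pass only removes one layer of parentheses around a given field.
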

Also, why can't it surpass 255 characters? If you are storing this as a string field in an Access table, then you could try putting half the expression in 1 field and the second half in another.

PowerUser
A: 

You could also try parsing your expression using ANTLR, yacc or similar to build a parse tree. Such trees usually don't store parentheses at all, because the grouping is implied by the tree structure. Then you would just have to regenerate the expression from the tree, emitting parentheses only where they are actually needed.

It might take you more than a few hours to get this working though. But expression parsing is usually the first example on generic parsing, so you might be able to take a sample and modify it to your needs.

bh213
+1  A: 

If you are interested in removing the unnecessary parentheses from your expression, the generic solution consists of parsing your text and building the associated expression tree.

Then, from this tree, you can generate the corresponding text without unnecessary parentheses by applying some rules:

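  • if the node is a "+", no parentheses are required around either child
  • if the node is a "-", parentheses are required around the right child if it is a "+" or "-"
  • if the node is a "*", parentheses are required around the left (right) child only if that child is a "+" or "-"
  • the same applies to "/", except that the right child also needs parentheses if it is a "*" or "/"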
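
The tree-based approach can be sketched in Python roughly as follows. This is an illustration under assumptions, not a finished tool: it assumes fields look like `[V].[6432]`, function arguments are separated by `;` as in the question, and the only operators are `=`, `+`, `-`, `*`, `/` and unary minus. All names (`tokenize`, `parse`, `render`, `minimize`) are made up for the example.

```python
import re

TOKEN = re.compile(
    r"\s*(\[[^\]]*\](?:\.\[[^\]]*\])*|\d+(?:\.\d+)?|[A-Za-z_]\w*|[-+*/=();])"
)
PREC = {"=": 1, "+": 2, "-": 2, "*": 3, "/": 3}  # higher binds tighter


def tokenize(text):
    text, tokens, pos = text.strip(), [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def parse(tokens):
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take(expected=None):
        nonlocal pos
        tok = peek()
        if tok is None or (expected and tok != expected):
            raise ValueError(f"expected {expected!r}, got {tok!r}")
        pos += 1
        return tok

    def primary():
        tok = take()
        if tok == "-":                       # unary minus
            return ("neg", primary())
        if tok == "(":                       # grouping: keep only the inner tree
            node = binary(1)
            take(")")
            return node
        if tok in PREC or tok in ");":
            raise ValueError(f"unexpected token {tok!r}")
        if peek() == "(" and re.match(r"[A-Za-z_]", tok):   # call, e.g. if(a;b;c)
            take("(")
            args = [binary(1)]
            while peek() == ";":
                take(";")
                args.append(binary(1))
            take(")")
            return ("call", tok, args)
        return ("atom", tok)

    def binary(min_prec):                    # precedence climbing, left-associative
        left = primary()
        while peek() in PREC and PREC[peek()] >= min_prec:
            op = take()
            left = ("bin", op, left, binary(PREC[op] + 1))
        return left

    tree = binary(1)
    if peek() is not None:
        raise ValueError(f"trailing token {peek()!r}")
    return tree


def render(node):
    kind = node[0]
    if kind == "atom":
        return node[1]
    if kind == "call":
        return node[1] + "(" + ";".join(render(a) for a in node[2]) + ")"
    if kind == "neg":
        inner = render(node[1])
        return "-(" + inner + ")" if node[1][0] in ("bin", "neg") else "-" + inner
    _, op, left, right = node

    def side(child, is_right):
        text = render(child)
        if child[0] == "bin":
            lower = PREC[child[1]] < PREC[op]
            same = PREC[child[1]] == PREC[op]
            if lower or (same and is_right and op in ("-", "/", "=")):
                return "(" + text + ")"
        elif is_right and text.startswith("-"):
            return "(" + text + ")"          # avoid "a--b", which SQL reads as a comment
        return text

    return side(left, False) + op + side(right, True)


def minimize(formula):
    return render(parse(tokenize(formula)))


print(minimize("((a) * (b)) + (c)"))  # a*b+c
```

One design choice to be aware of: the sketch rewrites `a*(b/c)` as `a*b/c`, which is only equivalent if `/` is ordinary (non-integer) division. If that is not the case for your data, add `"*"` to the tuple of operators whose right child keeps its parentheses.
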

But if your problem is just the 255-character limit, you can probably use intermediate variables to store intermediate results:

T1 = (([V].[6432])-([V].[6445]))*(((([V].[6443]))/1000*([V].[6448])+(([V].[6443]))*([V].[6449])+([V].[6450]))*(1-([V].[6446])))
T2 = etc...
ThibThib
+2  A: 

You could strip the simplest cases:

([V].[6432]) and (([V].[6443]))

become

v.[6432] and v.[6443]

You shouldn't need the [] around the table name or its alias.

You could shorten it further if you can alias the columns:

select v.[6432] as a, v.[6443] as b, ....

Or even put all the tables being queried into a single subquery - then you wouldn't need the table prefix:

if((-if(a=0;0;(a-b)*((c/1000*d
+c*e+f)*(1-g))))=0;h*
(((c/1000*b*d+c*b*
e+b*f))*(1-g));

select [V].[6432] as a, [V].[6445] as b, [V].[6443] as c, [V].[6448] as d, 
    [V].[6449] as e, [V].[6450] as f,[V].[6446] as g, [V].[6428] as h ...
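
The aliasing step can be automated. Below is a small Python sketch that assigns single-letter aliases in order of first appearance and builds the matching column list. It assumes fields always have the form `[V].[digits]` and that there are at most 26 distinct fields; `alias_fields` is a made-up name.

```python
import re
from string import ascii_lowercase

FIELD = re.compile(r"\[V\]\.\[\d+\]")


def alias_fields(formula):
    """Return (shortened formula, select column list) using aliases a, b, c, ..."""
    aliases = {}  # field text -> alias, in order of first appearance

    def replace(match):
        field = match.group(0)
        if field not in aliases:
            # Raises IndexError beyond 26 distinct fields (sketch limitation).
            aliases[field] = ascii_lowercase[len(aliases)]
        return aliases[field]

    short = FIELD.sub(replace, formula)
    columns = ", ".join(f"{field} as {alias}" for field, alias in aliases.items())
    return short, columns


short, columns = alias_fields("([V].[6432])-([V].[6445])*([V].[6432])")
print(short)
print("select " + columns)
```

Applied to the formula in the question, first-appearance order gives the same a to h mapping as the select list above.
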

Obviously this is all a bit pseudo-code, but it should help you simplify the full statement.

Keith
Thank you Keith. I will give it a go.
Pawel