I have a math parser that can evaluate functions like "IntPow(3,2)". If a user pastes "1,000,000" and then adds a plus symbol, making the full equation "1,000,000+IntPow(3,2)", the parser fails because it does not handle numbers that contain commas.

I need to remove the commas from the "1,000,000", but not from the "IntPow(3,2)", because IntPow has two parameters separated by a comma. The final equation will be "1000000+IntPow(3,2)". The equation is stored in one string. How would I remove only the commas that are outside of parentheses? I'm assuming that numbers containing commas will not be placed inside the IntPow parameter list.

When I say "remove commas" I really mean remove "CultureInfo.CurrentCulture.NumberFormat.NumberGroupSeparator", which could be a comma or a period depending on the locale. This part should be easy because I assume a regex will be used and I can just concatenate that value into the regex in place of the comma.

I have this regex: \(.*?\) for finding the parentheses and the values inside them, but I'm not sure how to remove only the commas outside of the regex matches.

Any help would be greatly appreciated.

Thank you,

John

+2  A: 

But what if a user pastes:

1,000+IntPow(3,000,2,000)

Now the 3,000 is inside the parentheses.

Andomar
This presumably will not work because the IntPow() function wouldn't accept four parameters.
womp
The parser would bork on the 1,000 before it reaches the IntPow function. A number inside parentheses could be in comma digit-grouping format too.
Andomar
This was mentioned in the question: "I'm assuming and saying that numbers that contain commas will not be placed inside the IntPow parameters list." I actually meant that if this is the case, then it will not calculate, and I'm okay with that.
John Rennemeyer
+5  A: 

The easiest way is to not try to make a regex do this. Just loop over the string one character at a time. If you read a '(', increment a counter. If you read a ')', decrement that counter. If you read a comma, delete it if the counter is 0; otherwise leave it alone.
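
The counter approach can be sketched in VB.NET roughly as follows. This is only a minimal sketch: the module name SeparatorStripper and the function name RemoveOuterSeparators are made up for illustration, and it assumes the parentheses in the input are balanced. Because NumberGroupSeparator is a String rather than a Char, the sketch compares the whole separator at each position instead of a single character.

```vbnet
Imports System
Imports System.Globalization
Imports System.Text

Module SeparatorStripper

    ' Hypothetical helper (not part of any library): removes every occurrence
    ' of separator that is not inside parentheses.
    ' Assumption: parentheses in the input are balanced.
    Function RemoveOuterSeparators(input As String, separator As String) As String
        If String.IsNullOrEmpty(separator) Then Return input

        Dim result As New StringBuilder(input.Length)
        Dim depth As Integer = 0   ' number of currently open parentheses
        Dim i As Integer = 0

        While i < input.Length
            Dim c As Char = input(i)

            If c = "("c Then
                depth += 1
            ElseIf c = ")"c AndAlso depth > 0 Then
                depth -= 1
            End If

            ' Skip the separator only when we are outside all parentheses.
            If depth = 0 AndAlso
               String.CompareOrdinal(input, i, separator, 0, separator.Length) = 0 Then
                i += separator.Length
            Else
                result.Append(c)
                i += 1
            End If
        End While

        Return result.ToString()
    End Function

    Sub Main()
        ' With an explicit comma this prints: 1000000+IntPow(3,2)
        Console.WriteLine(RemoveOuterSeparators("1,000,000+IntPow(3,2)", ","))

        ' In the real parser the separator would presumably come from the culture.
        Dim separator As String = CultureInfo.CurrentCulture.NumberFormat.NumberGroupSeparator
        Console.WriteLine(RemoveOuterSeparators("1,000,000+IntPow(3,2)", separator))
    End Sub

End Module
```

Note that if the group separator and the parameter separator are the same character, a grouped number inside the parentheses is left untouched, which matches the assumption stated in the question.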

Chad Birch
+1  A: 
Sub Main()

 '
 ' remove Commas from a string containing expression-like syntax
 '  (eg.  1,000,000 + IntPow(3,2) - 47 * Greep(9,3,2) $ 5,000.32 )
 '  should become:  1000000 + IntPow(3,2) - 47 * Greep(9,3,2) $ 5000.32
 '

 Dim tInput As String = "1,000,000 + IntPow(3,2) - 47 * Greep(9,3,2) $ 5,000.32"
 Dim tChar As Char = Nothing
 Dim tResult As New System.Text.StringBuilder(tInput.Length)   ' fully qualified, so no Imports System.Text is needed
 Dim tLevel As Integer = 0

 For Each tChar In tInput
  Select Case tChar
   Case "("c
    tLevel += 1
    tResult.Append(tChar)

   Case ")"c
    tLevel -= 1
    tResult.Append(tChar)

   Case ","c   '  Change this to your separator character.
    If 0 < tLevel Then
     tResult.Append(tChar)
    End If

   Case Else
    tResult.Append(tChar)

  End Select
 Next

 Console.ForegroundColor = ConsoleColor.Cyan
 Console.WriteLine(tInput)
 Console.WriteLine(String.Empty)
 Console.ForegroundColor = ConsoleColor.Yellow
 Console.WriteLine(tResult.ToString)
 Console.WriteLine()
 Console.ResetColor()
 Console.WriteLine(" -- PRESS ANY KEY -- ")
 Console.ReadKey(True)

End Sub
Boo
Chad Birch had the answer and you provided the example code. Too bad I can't select both. In this case, I'll have to go with the explanation as the best answer as it explains how it is done (the code does too, but words are more generic). I'll still vote this answer up a little.
John Rennemeyer
+1  A: 

I don't think this is possible using regular expressions. Distinguishing between inside and outside parentheses is not a regular language. It is a context-free language that can't be decided by a finite-state machine (which is what a regular expression describes). You need a stack machine (i.e. something like the approach described by Chad).
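
For the narrower case described in the question, where parentheses are never nested, a negative lookahead seems to be enough. The point about arbitrary nesting above still stands. The following is a hedged sketch only: the names LookaheadSketch and RemoveOuterSeparatorsFlat are hypothetical, and the pattern will misbehave as soon as one pair of parentheses appears inside another.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module LookaheadSketch

    ' Hypothetical helper: removes separators that are not followed by a ")"
    ' before the next "(" or ")". Assumption: parentheses are never nested.
    Function RemoveOuterSeparatorsFlat(input As String, separator As String) As String
        ' Regex.Escape protects separators such as "." that are regex metacharacters.
        Dim pattern As String = Regex.Escape(separator) & "(?![^()]*\))"
        Return Regex.Replace(input, pattern, String.Empty)
    End Function

    Sub Main()
        ' Prints: 1000000+IntPow(3,2)
        Console.WriteLine(RemoveOuterSeparatorsFlat("1,000,000+IntPow(3,2)", ","))
    End Sub

End Module
```

The lookahead rejects a comma when the text after it reaches a closing parenthesis without first meeting another parenthesis, which is how it tells "inside" from "outside" in the flat case.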

Mouk