Hi, I'm learning Scala, so this is probably pretty noob-irific.

I want to do a multiline regular expression.

In Ruby it would be:

MY_REGEX = /com:Node/m

My Scala looks like:

val ScriptNode =  new Regex("""<com:Node>""")

Here's my match function:

def matchNode( value : String ) : Boolean = value match 
{
    case ScriptNode() => System.out.println( "found" + value ); true
    case _ => System.out.println("not found: " + value ) ; false
}

And I'm calling it like so:

matchNode( "<root>\n<com:Node>\n</root>" ) // doesn't work
matchNode( "<com:Node>" ) // works

I've tried:

val ScriptNode =  new Regex("""<com:Node>?m""")

And I'd really like to avoid having to use java.util.regex.Pattern. Any tips greatly appreciated.

+7  A: 

This is a very common problem when first using Scala Regex.

When you use a Regex in a Scala pattern match, it tries to match the whole string, as if the pattern were wrapped in "^" and "$" (and multi-line mode, which lets ^ and $ also match at each \n, were not activated).
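To make the difference concrete, here is a minimal sketch using the asker's pattern. The object name AnchoredDemo and the helper names are made up for illustration; only Regex, findFirstIn, and the extractor syntax come from the library.

```scala
import scala.util.matching.Regex

object AnchoredDemo {
  val ScriptNode: Regex = new Regex("""<com:Node>""")

  // A Regex used as an extractor succeeds only if the WHOLE input matches.
  def matchesWhole(s: String): Boolean = s match {
    case ScriptNode() => true
    case _            => false
  }

  // findFirstIn searches for the pattern anywhere inside the input.
  def containsNode(s: String): Boolean =
    ScriptNode.findFirstIn(s).isDefined
}
```

With the two inputs from the question, matchesWhole accepts only the bare "<com:Node>" string, while containsNode also accepts the three-line one.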

The way to do what you want would be one of the following:

def matchNode( value : String ) : Boolean = 
  (ScriptNode findFirstIn value) match {    
    case Some(v) => System.out.println( "found" + v ); true    
    case None => System.out.println("not found: " + value ) ; false
  }

Which would find the first instance of ScriptNode inside value, and return that instance as v (if you want the whole string, just print value). Or else:

val ScriptNode =  new Regex("""(?s).*<com:Node>.*""")
def matchNode( value : String ) : Boolean = 
  value match {    
    case ScriptNode() => System.out.println( "found" + value ); true    
    case _ => System.out.println("not found: " + value ) ; false
  }

Which would print the whole of value. In this example, (?s) activates dotall matching (i.e., "." also matches newlines), and the .* before and after the searched-for pattern lets it match any string that contains the pattern, whatever surrounds it. If you wanted "v" as in the first example, you could do this:

val ScriptNode =  new Regex("""(?s).*(<com:Node>).*""")
def matchNode( value : String ) : Boolean = 
  value match {    
    case ScriptNode(v) => System.out.println( "found" + v ); true    
    case _ => System.out.println("not found: " + value ) ; false
  }
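A note on the Ruby comparison, since the flag letters are easy to mix up: Ruby's /m makes "." match newlines, which corresponds to (?s) here, not (?m). The sketch below compares the flags on the input from the question. The names FlagsDemo and wholeMatch are hypothetical helpers, not part of any library.

```scala
import scala.util.matching.Regex

object FlagsDemo {
  val input = "<root>\n<com:Node>\n</root>"

  // No flag: '.' stops at '\n', so .* cannot span the three lines.
  val NoFlag: Regex = """.*<com:Node>.*""".r

  // (?s) is DOTALL: '.' also matches '\n' (what Ruby's /m does).
  val DotAll: Regex = """(?s).*<com:Node>.*""".r

  // (?m) is MULTILINE: ^ and $ also match at line breaks; '.' is unchanged.
  val MultiLine: Regex = """(?m)^<com:Node>$""".r

  // Whole-string match, via the same extractor that `case Re() =>` uses.
  def wholeMatch(re: Regex, s: String): Boolean =
    re.unapplySeq(s).isDefined
}
```

Here only DotAll matches the whole input. MultiLine does not, but MultiLine.findFirstIn(input) still finds the middle line, because ^ and $ are allowed to match around it.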
Daniel
Lovely stuff. Cheers!
ed
+2  A: 

Just a quick and dirty addendum: the .r method on RichString (StringOps in Scala 2.8 and later) converts any string to a scala.util.matching.Regex, so you can do something like this:

"""(?s)a.*b""".r replaceAllIn ( "a\nb\nc\n", "A\nB" )

And that will return

A
B
c

I use this all the time for quick and dirty regex-scripting in the scala console.

Or in this case:

def matchNode( value : String ) : Boolean = {
    // .r builds the Regex; a capitalized val can be used as an extractor
    val ScriptNode = """(?s).*(<com:Node>).*""".r

    value match {
       case ScriptNode(v) => System.out.println( "found" + v ); true
       case _ => System.out.println("not found: " + value ) ; false
    }
}

Just my attempt to reduce the use of the word new in code worldwide. ;)
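In the same spirit, and assuming Scala 2.10 or later (where Regex has an unanchored method), the leading and trailing .* can be dropped as well. This is only a sketch; UnanchoredDemo is a made-up name.

```scala
object UnanchoredDemo {
  // .r builds the Regex; unanchored makes the extractor search for a match
  // instead of requiring the whole input to match (assumes Scala 2.10+).
  val ScriptNode = """<com:Node>""".r.unanchored

  def matchNode(value: String): Boolean = value match {
    case ScriptNode() => println("found " + value); true
    case _            => println("not found: " + value); false
  }
}
```

No (?s) is needed here because the pattern itself contains no "." that has to cross a line break.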

Tristan Juricek