I'm in the process of converting a program from Perl to Java. I have come across the line:

my ($title) = ($info{$host} =~ /^\s*\(([^\)]+)\)\s*$/);

I'm not very good with regular expressions, but from what I can tell this matches the string $info{$host} against the regular expression ^\s*\(([^\)]+)\)\s*$ and assigns the match to $title.

My problem is that I have no clue what the regular expression is doing and what it will match. Any help would be appreciated.

Thanks

+4  A: 

The regular expression matches a string that contains exactly one pair of matching parentheses (more precisely, one opening parenthesis and one closing parenthesis; any number of further opening parentheses may occur between them).

Apart from whitespace at the beginning and end, nothing may appear outside the parentheses. Inside the parentheses, any characters other than a closing parenthesis may occur (at least one).

The following strings should match it:

 (abc)
 (()
   (ab)

By the way, you may simply use the regular expression as-is in Java (after escaping the backslashes), using the Pattern class.
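
As a rough sketch of that translation (the class name PerlPatternPort and the helper title are made up for illustration, and returning null on a failed match is just one possible convention), the pattern can be compiled once and group 1 read back from the Matcher:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PerlPatternPort {
    // The Perl pattern /^\s*\(([^\)]+)\)\s*$/ with every backslash doubled
    // so that it survives as a Java string literal.
    static final Pattern TITLE = Pattern.compile("^\\s*\\(([^\\)]+)\\)\\s*$");

    // Hypothetical helper: returns the text captured by group 1,
    // or null when the input does not match.
    static String title(String info) {
        Matcher m = TITLE.matcher(info);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // The three sample strings listed above.
        for (String s : new String[] {" (abc)", " (()", "   (ab)"}) {
            System.out.println("[" + s + "] -> " + title(s));
        }
    }
}
```

One thing to check when porting: by default Java's \s covers only ASCII whitespace, so input containing other Unicode space characters may behave differently than it does in Perl.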

Konrad Rudolph
OK, so (some string) is a match but (some ) string) isn't. Thanks.
Android
Now that I know what it does, I have found that it is redundant. With all the possibilities of input, the same can be done with trim().
Android
+4  A: 

It will match optional leading whitespace, followed by a left paren, followed by some text (at least one character) not including a right paren, followed by a right paren, followed by optional trailing whitespace.

Matches:

      (some stuff)

Fails:

 (some stuff

     some stuff)

   (some stuff)  asadsad
... and the stuff inside the parens (not including the parens themselves) is returned in the $title variable.
Juha Syrjälä
+1  A: 

Ok, step by step:

/ - open the regex quote

^ - the beginning of the string

\s* - zero or more of any space-like character

\( - an actual ( character

( - begin a capture group

[^)]+ - any character except ), the + indicating at least one

) - end the capture group

\) - an actual ) character

\s* - zero or more space-like characters

$ - the end of the string

/ - close the regex quote

So we are looking for strings like " (abc) " or "(()": optional whitespace around a pair of parentheses enclosing at least one character that is not a ).
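
A hedged sketch of the same breakdown in Java, using Pattern.COMMENTS so that each piece of the pattern can carry its own note. Here [^)]+ is read as a negated character class (one or more characters other than a closing parenthesis). The class name AnnotatedPattern and the method capture are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class AnnotatedPattern {
    // With Pattern.COMMENTS, whitespace in the pattern is ignored and
    // everything from # to the end of the line is treated as a comment.
    static final Pattern TITLE = Pattern.compile(
        "^        # start of the string\n" +
        "\\s*     # zero or more whitespace characters\n" +
        "\\(      # a literal opening parenthesis\n" +
        "(        # start of capture group 1\n" +
        "  [^)]+  # one or more characters that are not a closing parenthesis\n" +
        ")        # end of capture group 1\n" +
        "\\)      # a literal closing parenthesis\n" +
        "\\s*     # zero or more whitespace characters\n" +
        "$        # end of the string\n",
        Pattern.COMMENTS);

    // Returns the contents of group 1, or null when there is no match.
    static String capture(String input) {
        Matcher m = TITLE.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(capture(" (^) ")); // ^ is just an ordinary non-) character here
        System.out.println(capture("())"));   // nothing but ) after the opening (, so no match
    }
}
```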

James Anderson
[^)] means any but )
larelogio
Thanks -- you (re-)learn something new every day.
James Anderson
A: 
my ($title) = ($info{$host} =~ /^\s*\(([^\)]+)\)\s*$/);

First, m// in list context returns the captured matches. my ($title) puts the right hand side in list context. Second, $info{$host} is matched against the following pattern:

/^ \s* \( ( [^\)]+) \) \s* $/x

Yes, I used the x flag so I could insert some spaces. ^\s* skips any leading whitespace. Then we have an escaped parenthesis (therefore no capture group is created). Then we have a capture group containing [^\)]+. That character class can be better written as [^)], because the right parenthesis is not special in a character class; it means anything but a right parenthesis.

If the opening parenthesis is followed by one or more characters other than a closing parenthesis and then by a closing parenthesis, with the whole thing optionally surrounded by whitespace, that sequence of characters is captured and put into $title.
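
Java has no direct counterpart to a list-context match, so one way to mirror the "first capture or nothing" behaviour of my ($title) = (...) is to return an Optional. This is only a sketch: the names ListContextMatch and firstCapture are hypothetical, and Optional.empty() stands in for Perl's undef.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ListContextMatch {
    // Same pattern, with the character class written as [^)].
    private static final Pattern TITLE = Pattern.compile("^\\s*\\(([^)]+)\\)\\s*$");

    // Returns the first capture group when the input matches,
    // and an empty Optional when it does not.
    static Optional<String> firstCapture(String input) {
        Matcher m = TITLE.matcher(input);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // A match yields the captured text; a non-match falls back to the default.
        System.out.println(firstCapture("  (example title) ").orElse("<no title>"));
        System.out.println(firstCapture("example title").orElse("<no title>"));
    }
}
```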

Sinan Ünür