Hi, I am learning how to use the split function from some sample code.

Sample code.

#!C:\Perl\bin\perl.exe
use strict;
use warnings;

my $info = "Caine:Michael:Actor:14, Leafy Drive";
my @personal = split(/:/, $info);
# @personal = ("Caine", "Michael", "Actor", "14, Leafy Drive");

If I change $info to "Caine Michael Actor /* info data */", how can I use split(/ /, $info) to get the result below?

# @personal = ("Caine", "Michael", "Actor", "info data");

Thank you.

A: 

Cooked something up :). It works only for your example; it does not generalize.

use strict;
use warnings;

my $info = "Caine Michael Actor /* info data */";
if($info=~m{/\*\s*(.*?)\s*\*/})
{
    my $temp = $1;
    $temp=~s{\s+}{##}g;
    $info=~s{/\*\s*(.*?)\s*\*/}{$temp};
}
my @personal = split(/ /, $info);
foreach(@personal)
{
    s{##}{ }g;
    print "$_\n";
}

Output:

C:\>perl a.pl
Caine
Michael
Actor
info data

codaddict
@codaddict, thanks a lot for your detailed reply. It solves my case. It's MAGIC.
Nano HE
+2  A: 

It really is better to use a regex for this:

$info = "Caine Michael Actor /* info data */";
$info =~ /(\w+)\s+(\w+)\s+(\w+).*\/\*(.+)\*\//;
@personal = ($1, $2, $3, $4);

Mainly because your input string has ambiguities related to word separators not easily handled by split.
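To see the ambiguity concretely, here is a small sketch of what a plain split on a single space produces for this input. The spaces inside the comment are indistinguishable from the field separators (the `@naive` name is just for illustration):

```perl
use strict;
use warnings;

my $info  = "Caine Michael Actor /* info data */";
my @naive = split / /, $info;

# The comment is cut into four pieces: "/*", "info", "data", "*/".
print scalar(@naive), " fields: ", join( '|', @naive ), "\n";
# prints: 7 fields: Caine|Michael|Actor|/*|info|data|*/
```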

In case you're wondering how to read the regex:

/
    (\w+)   # CAPTURE a sequence of one or more word characters into $1
    \s+     # MATCH one or more white space
    (\w+)   # CAPTURE a sequence of one or more word characters into $2
    \s+     # MATCH one or more white space
    (\w+)   # CAPTURE a sequence of one or more word characters into $3
    .*      # MATCH zero or more of anything
    \/\*    # MATCH the opening of C-like comment /*
    (.+)    # CAPTURE a sequence of one or more of anything into $4
    \*\/    # MATCH the closing of C-like comment */
/x
slebetman
Avoid leaning toothpick syndrome by using a different delimiter, and assign the match to `@personal`. Don't forget to check whether `@personal` was populated: `if ( @personal = $info =~ m!...! )`. You should also anchor the pattern.
Sinan Ünür
You don't really want to match \w+ there. You don't care what the characters are as long as they aren't whitespace (that is, you don't care if they are Perl identifier characters), so you should match \S+
brian d foy
Better would be `if (@personal = $info =~ /.../) { ... }`. **Never use `$1` and friends unconditionally!**
Greg Bacon
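Putting the three comments above together, a minimal sketch might look like the following. The `parse_info` name is hypothetical, and the pattern assumes exactly three whitespace-separated fields followed by a single C-style comment:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the four fields, or an empty list on no match.
sub parse_info {
    my ($info) = @_;
    # m{...}x avoids escaping the slashes, \S+ accepts any run of
    # non-whitespace, and ^ ... \z anchor the pattern to the whole string.
    return $info =~ m{
        ^ \s* (\S+) \s+ (\S+) \s+ (\S+) \s+
        /\* \s* (.*?) \s* \*/
        \s* \z
    }x;
}

my $info = "Caine Michael Actor /* info data */";

# The captures are only used when the match succeeded.
if ( my @personal = parse_info($info) ) {
    print join( ', ', @personal ), "\n";    # Caine, Michael, Actor, info data
}
else {
    warn "Unexpected input: $info\n";
}
```

Because the match is assigned in list context, there is no need to touch `$1` and friends at all.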
+4  A: 

Alternative approach:

Have you considered using the 3-parameter version of split?

$info = "Caine Michael Actor /* info data */";
@personal= split(' ',$info,4);

resulting in

@personal=('Caine','Michael','Actor','/* info data */');

Then you would have to remove the comment delimiters (slash-asterisk and asterisk-slash) to get your result.
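The remaining cleanup step can be sketched as follows. This is only one way to do it, and it assumes the fourth field, when present, is a single comment spanning the rest of the string:

```perl
use strict;
use warnings;

my $info = "Caine Michael Actor /* info data */";

# A limit of 4 stops splitting after three fields; the rest stays in one piece.
my @personal = split ' ', $info, 4;

# Strip the comment delimiters and the whitespace just inside them.
# If the field is not a comment, the substitution simply does nothing.
$personal[3] =~ s{^/\*\s*(.*?)\s*\*/\s*\z}{$1} if defined $personal[3];

print join( ', ', @personal ), "\n";    # Caine, Michael, Actor, info data
```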

lexu
sigh, I can't get the slash-asterisk and asterisk-slash to show up..
lexu
Hi Lexu, thank you for your reply. I had never considered using the 3-parameter version of split before. You taught me more about split().
Nano HE
+1  A: 

Since there isn't an answer yet that handles the general case, here goes:

split isn't your best bet here. Since a space can be either a delimiter or part of a field, it is clearest to invert the problem and describe what you do want to match, which in this case is either a string of non-space characters or the contents of a C-style comment.

use strict;
use warnings;
use feature 'say';    # say is not enabled by default

my $info = "Caine Michael Actor /* info data */";
my @personal = grep {defined} $info =~ m! /\* \s* (.+?) \s* \*/ | (\S+) !xg;

say join ', ' => @personal;

That will return a list of words and comment contents, in whatever order they appear in the input. The syntax highlighter doesn't highlight the above regex properly; the regex is everything between the ! delimiters.
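To illustrate the ordering point, here is the same pattern applied to a made-up input with a comment in the middle. The `tokens` helper and the sample string are hypothetical, and the sketch assumes comments are set off from words by whitespace and do not span lines:

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the pattern above.
sub tokens {
    my ($text) = @_;
    # Each match fills either $1 (comment body) or $2 (bare word);
    # grep drops the undef left by the alternative that did not match.
    return grep { defined } $text =~ m! /\* \s* (.+?) \s* \*/ | (\S+) !xg;
}

my @fields = tokens("Caine /* first name */ Michael Actor /* info data */");
print join( ', ', @fields ), "\n";
# prints: Caine, first name, Michael, Actor, info data
```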

Eric Strom