ansaurus

Question

How can I make $1 return alternatives without a substitution regex?

Answer 1

A:

I don't fully understand your constraints. Are you limited to supplying a single regex that will always be processed using the code in your first excerpt? If so, you cannot do what you are trying to do. You are trying to extract two separate parts of the entry string, and you simply can't return two values in a single scalar return value unless you can add code to concatenate them.

Can you add Perl code at all? For example, can you define the logic to be:

if ( $entry =~ /$regex/ ) { $req_value = "$1 $2"; }

where your $regex = qr/(\d.*?)\s+(?:.*)?(SMP)/; ?
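As a rough, runnable sketch of that logic: the entry string below is a hypothetical stand-in (the real input is not shown in this thread), and the replacement value is built with double quotes so that $1 and $2 interpolate.

```perl
use strict;
use warnings;

# Hypothetical entry string; the real input is not shown in this thread.
my $entry = 'Linux 2.6.9-78.1.6.ELsmp #1 SMP';
my $regex = qr/(\d.*?)\s+(?:.*)?(SMP)/;

my $req_value;
if ( $entry =~ /$regex/ ) {
    # Double quotes are required here; single quotes would store
    # the literal text '$1 $2' instead of the captured substrings.
    $req_value = "$1 $2";
}
print defined $req_value ? "$req_value\n" : "no match\n";
```

Note that this pattern requires SMP to be present, so an entry without it does not match at all.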

Barring the ability to define some new Perl code, you can't accomplish this.

Regarding part two, substitutions: I interpret your question to ask whether you can compile both the PATTERN and REPLACEMENT parts of s/PATTERN/REPLACEMENT/ into a single qr//. If so, you cannot. qr// only compiles a matching pattern, and a qr// variable can only be used in the PATTERN portion of s///. In other words, to use s///, you'll need to write Perl code that runs s///. I'm guessing that if you could write new Perl code, you'd use the above solution.
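To illustrate the split, here is a small sketch in which the qr// object supplies only the PATTERN half and the REPLACEMENT is written in code at the s/// call site. The pattern and the entry string are assumptions made up for this example, not taken from the question.

```perl
use strict;
use warnings;

# qr// can hold only the matching half of a substitution.
my $pattern = qr/(\d+(?:[-.]\w+)*).*?(SMP)/;

# Hypothetical entry string, for illustration only.
my $entry = 'Linux 2.6.9-78.1.6.ELsmp #1 SMP';

# The replacement text has to live in code, next to the s/// operator.
# Copy first, then substitute, so $entry itself is left untouched.
( my $rewritten = $entry ) =~ s/$pattern/$1 ($2)/;
print "$rewritten\n";
```

The replacement part cannot be stored in $pattern, which is why this route needs new code wherever the substitution runs.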

One more thought: In your current architecture, can you define fields in terms of other fields? In other words, could you extract the version string with one regex, the SMP string with another regex, and define a third field that combines the two?
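A sketch of what such a composite field might look like follows. Everything here is hypothetical: the %field_regex and %derived tables, the field names, the extract_fields helper, and the entry strings are invented for illustration, since the actual architecture is not shown.

```perl
use strict;
use warnings;

# Hypothetical field table: each plain field keeps the existing
# "one regex, take $1" contract.
my %field_regex = (
    version => qr/(\d+(?:[-.]\w+)*)/,
    smp     => qr/\b(SMP)\b/,
);

# Hypothetical derived field, defined only in terms of other fields.
my %derived = ( kernel => [qw(version smp)] );

sub extract_fields {
    my ($entry) = @_;
    my %value;
    for my $name ( keys %field_regex ) {
        $value{$name} = $1 if $entry =~ /$field_regex{$name}/;
    }
    for my $name ( keys %derived ) {
        # Join whichever component fields actually matched.
        $value{$name} = join ' ',
            grep { defined } map { $value{$_} } @{ $derived{$name} };
    }
    return \%value;
}

my $with_smp    = extract_fields('Linux 2.6.9-78.1.6.ELsmp #1 SMP');
my $without_smp = extract_fields('Linux 2.6.9-78.0.5.ELsmp #1');
print "$with_smp->{kernel}\n$without_smp->{kernel}\n";
```

Each regex still returns only $1, so existing patterns would not need to change; the cost is the extra table that later maintainers have to understand.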

2010-01-22 17:05:39

@Jason Clark: I can add code, but this means not using this method for this particular element. { $req_value = '$1 $2'; } won't work because other regexes expect $1. The second part of my regex takes care of where SMP does not exist. The idea of defining fields in terms of other fields could work, but is unwieldy and could also confuse other developers after me, don't you think? You're right about s///. I tried it by splitting a (regex) string into 2 parts, but perl barfed, because qr// must hold a valid regex which the second part cannot be, not to mention the switch and modifiers. Ugh!

Dee 2010-01-22 18:01:51

@ all readers: If something cannot be done, then it simply cannot be done. I just thought there may be a solution, but it may be there's none, given the architecture?

Dee 2010-01-22 18:05:30

Answer 2

+1 A:

You're stuck unless you can talk the folks who control the code you're using into generalizing it somehow. The good news is you need only a bit more, perhaps

if (my @fields = $_ =~ /$pat/) {
  $req_value = join " " => grep defined($_), @fields;
}

This works because a successful regular-expression match in list context returns all captured substrings, i.e., $1, $2, $3, and so on as appropriate.

With a single pattern,

qr/(\d+(?:[-.]\w+)*)(?:.*(SMP))?/

the code above puts 2.6.9-78.1.6.ELsmp SMP in $req_value for the SMP entry and 2.6.9-78.0.5.ELsmp for the non-SMP entry. The grep defined($_) filters out captures for subpatterns not taken. Without it, you get undefined value warnings (and a trailing space) for the non-SMP case.
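Putting the pieces together, a self-contained sketch might look like this. The req_value_for wrapper and the two entry strings are assumptions chosen to reproduce the two results quoted above; the original entries are not shown in the thread.

```perl
use strict;
use warnings;

my $pat = qr/(\d+(?:[-.]\w+)*)(?:.*(SMP))?/;

# Join every capture group that took part in the match.
sub req_value_for {
    my ($entry) = @_;
    if ( my @fields = $entry =~ /$pat/ ) {
        # Groups that did not participate come back as undef; drop them.
        return join ' ', grep { defined } @fields;
    }
    return;    # no match at all
}

# Hypothetical entry strings, for illustration only.
my $smp    = req_value_for('Linux 2.6.9-78.1.6.ELsmp #1 SMP');
my $no_smp = req_value_for('Linux 2.6.9-78.0.5.ELsmp #1');
print "$smp\n$no_smp\n";
```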

The downside is every regular expression would need to be reviewed to be sure that all capturing groups really ought to go in $req_value. For example, say someone is using the pattern

qr/(XYZ) OS (version \d+|v-\d+)/

As it is now, only XYZ would go into $req_value, but using the above generalization would also include the version number. If that's undesired, the regular expression should be

qr/(XYZ) OS (?:version \d+|v-\d+)/

because (?:...) does not capture (that is, it does not produce a $2 for the pattern above): it's for grouping only.
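The difference is easy to check by counting what a list-context match returns. The input string below is a hypothetical example.

```perl
use strict;
use warnings;

my $entry = 'XYZ OS version 5';    # hypothetical input

# Capturing alternation: produces $1 and $2.
my @capturing = $entry =~ /(XYZ) OS (version \d+|v-\d+)/;

# Grouping-only alternation: produces $1 alone.
my @non_capturing = $entry =~ /(XYZ) OS (?:version \d+|v-\d+)/;

print scalar(@capturing),     " capture(s): @capturing\n";
print scalar(@non_capturing), " capture(s): @non_capturing\n";
```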

Greg Bacon 2010-01-22 18:09:43

You and Jason Clark are right. I think the best solution is to write a new method. The answer it seems cannot be found in this one. But in your answer above, how would you deliver the parenthesized SMP '(SMP)'? You still need some further processing after obtaining the matches.

Dee 2010-01-22 18:21:52

The generalization would handle it. I added clarification in the updated answer.

Greg Bacon 2010-01-22 18:44:03

This is the sort of thing I might solve with beer and pizza for the right people. Seriously.

brian d foy 2010-01-22 19:02:24

@ gbacon: I like it, TMTOWTDI.

Dee 2010-01-22 19:17:16

@ brian d foy: May I declare myself one of the right people? I AM one of the right people! I have pizza and beer. How can I get it to you? Are you currently in Europe perchance? :o)

Dee 2010-01-22 19:22:15

Not for brian: for the people in charge of the inflexible code you'd like changed!

Greg Bacon 2010-01-22 19:31:14

Ah gbacon! With this comment, I can't refer them to this discussion for their benefit.

Dee 2010-01-25 09:14:36

@ gbacon: You still missed out parenthesizing the SMP. That's got to require processing outside the regular expression to achieve hasn't it? In any case, I found a number of regexes in the database that will need changing for this solution, so I don't think they'll go for it. I'm also exploring brian's suggestions of moving the processing (?{...}) and checking Regex::Grammars. I will revert when I have a solution. Bless...

Dee 2010-01-25 10:26:40

Answer 3

A:

As of 5.10.0, (?|pattern) is available to allow alternatives to use the same capture numbering. As you pointed out that you're still using 5.8, this may not be useful directly but perhaps as further incentive to your project to start moving to a modern Perl.
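For readers who can use 5.10 or later, a minimal sketch of the branch-reset construct is shown below. The pattern and inputs are made up for illustration. Note that it only lets alternatives share capture numbers; it does not by itself concatenate two separate captures into $1.

```perl
use strict;
use warnings;
use 5.010;    # (?|...) needs Perl 5.10.0 or later

# Inside (?|...), each alternative restarts capture numbering,
# so both branches store their number in $1.
my $version = qr/(?|version (\d+)|v-(\d+))/;

my @numbers;
for my $entry ( 'XYZ OS version 12', 'XYZ OS v-7' ) {    # hypothetical inputs
    push @numbers, $1 if $entry =~ $version;
}
print "@numbers\n";
```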

masto 2010-02-07 00:51:42

Agreed, but the client is a storage company that delivers hardware with certain configurations, so I can't really affect this. Named backreferences in 5.10 would have solved this problem instantly. :-)

Dee 2010-02-18 10:31:46
