If I had a text file with the following:

    Today (is|will be) a (great|good|nice) day.

Is there a simple way I can generate a random output like:

    Today is a great day.
    Today will be a nice day.

Using Perl or UNIX utils?

+3  A: 
  1. Use a regex to match each parenthetical (and the text inside it).
  2. Use a string split operation (pipe delimiter) on the text inside of the matched parenthetical to get each of the options.
  3. Pick one randomly.
  4. Return it as the replacement for that capture.
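
The four steps above can be sketched in Python roughly as follows. This is only an illustration, not code from this answer: the names `pick_random` and `fill_template` are made up here, and the sketch assumes the parentheticals are never nested.

```python
import random
import re

# Matches one parenthetical and captures the text inside it (no nesting).
GROUP = re.compile(r"\(([^()]+)\)")

def pick_random(match):
    # Step 2: split the captured text on the pipe delimiter.
    options = match.group(1).split("|")
    # Step 3: pick one option at random.
    return random.choice(options)

def fill_template(template):
    # Steps 1 and 4: find every parenthetical and substitute the pick.
    return GROUP.sub(pick_random, template)

if __name__ == "__main__":
    print(fill_template("Today (is|will be) a (great|good|nice) day."))
```

Passing a function as the replacement lets `re.sub` call it once per match, so each parenthetical gets its own independent random choice.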
Amber
+7  A: 

Code:

#!/usr/bin/perl

use strict;
use warnings;

my $template = 'Today (is|will be) a (great|good|nice) day.';

for (1..10) {
    print pick_one($template), "\n";
}

exit;

sub pick_one {
    my ($template) = @_;
    $template =~ s{\(([^)]+)\)}{get_random_part($1)}ge;
    return $template;
}

sub get_random_part {
    my $string = shift;
    my @parts = split /\|/, $string;
    return $parts[rand @parts];
}

Logic:

  1. Define the output template (my $template = ...)
  2. Loop to print random output several times (for ...)
  3. Call pick_one to do the work
  4. Find all "(...)" substrings and replace each with a random part ($template =~ s...)
  5. Print the generated string

Getting a random part is simple:

  1. Receive the extracted substring (my $string = shift)
  2. Split it on the | character (my @parts = ...)
  3. Return a random part (return $parts[...)

That's basically all. Instead of using a function, you could put the same logic directly in s{}{}, but it would be a bit less readable:

$template =~  s{\( ( [^)]+ ) \)}
               { my @parts = split /\|/, $1;
                 $parts[rand @parts];
               }gex;
depesz
+1 It seems like I posted the exact same answer without seeing yours. I'll delete mine. By the way, no need for `scalar` because `rand` evaluates its first argument in scalar context.
Sinan Ünür
I know about it, but I prefer to insert it (scalar) to be sure that it's 100% readable and not ambiguous.
depesz
@depesz Got it. I am the opposite.
Sinan Ünür
Great! Even I can follow this!
DBMarcos99
+1 Quick and simple. And I don't know perl, so thanks for the explanation too :)
Lin
+2  A: 

Smells like a recursive algorithm

Edit: misread and thought you wanted all possibilities

#!/usr/bin/python
import re, random

def expand(line, results):
    # Find the first parenthesized group, if any
    result = re.search(r'\([^\)]+\)', line)
    if result:
        # Strip the parentheses and split into alternatives
        variants = result.group(0)[1:-1].split("|")
        for v in variants:
            expand(line[:result.start()] + v + line[result.end():], results)
    else:
        # No groups left: this is a fully expanded line
        results.append(line)
    return results

line = "Today (is|will be) a (great|good|nice) day."

possibilities = expand(line, [])

# choose a random possibility at the end:
print(random.choice(possibilities))

A similar construct that produces a single random line:

def expand_rnd(line):
    result = re.search(r'\([^\)]+\)', line)
    if result:
        variants = result.group(0)[1:-1].split("|")
        choice = random.choice(variants)
        return expand_rnd(
                line[:result.start()] + choice + line[result.end():])
    else:
        return line

Both versions will fail on nested constructs such as (a|(b|c)), however, because the regex stops at the first closing parenthesis.
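
One possible way to cope with nesting is to resolve the innermost groups first and repeat until no group is left. The sketch below is an assumption-laden illustration rather than part of the original answer: `expand_nested` is a made-up name, and parentheses are assumed to be balanced. Note that inner groups get resolved even when the outer choice later discards them, which is wasteful but harmless.

```python
import random
import re

# An innermost group: parentheses whose content contains no parentheses.
INNERMOST = re.compile(r"\(([^()]*)\)")

def expand_nested(line):
    # Resolve innermost groups first; outer groups become innermost
    # on a later pass, so the loop ends once no group is left.
    while True:
        line, count = INNERMOST.subn(
            lambda m: random.choice(m.group(1).split("|")), line)
        if count == 0:
            return line

if __name__ == "__main__":
    print(expand_nested("Today is (a (great|good)|one fine) day."))
```

Each pass removes at least one pair of parentheses, so the loop always terminates.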

Otto Allmendinger
Come on guys, just because it isn't perl? It's the algorithm that is interesting
Otto Allmendinger
I didn't downvote, but you are right, the OP asked for any *nix solution, so I'll give you my vote.
Leonardo Herrera
+1 for a working Python solution. I do most of my work in Python these days, so this is very helpful for me. :)
Lin
+9  A: 

Closures are fun:

#!/usr/bin/perl

use strict;
use warnings;

my @gens = map { make_generator($_, qr~\|~) } (
    'Today (is|will be) a (great|good|nice) day.',
    'The returns this (month|quarter|year) will be (1%|5%|10%).',
    'Must escape %% signs here, but not here (%|@).'
);

for ( 1 .. 5 ) {
    print $_->(), "\n" for @gens;
}

sub make_generator {
    my ($tmpl, $sep) = @_;
    my @lists;

    while ( $tmpl =~ s{\( ( [^)]+ ) \)}{%s}x ) {
        push @lists, [ split $sep, $1 ];
    }

    return sub {
        sprintf $tmpl, map { $_->[rand @$_] } @lists
    };
}

Output:

C:\Temp> h
Today will be a great day.
The returns this month will be 1%.
Must escape % signs here, but not here @.
Today will be a great day.
The returns this year will be 5%.
Must escape % signs here, but not here @.
Today will be a good day.
The returns this quarter will be 10%.
Must escape % signs here, but not here %.
Today is a good day.
The returns this month will be 1%.
Must escape % signs here, but not here %.
Today is a great day.
The returns this quarter will be 5%.
Must escape % signs here, but not here @.
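
For readers who find the Perl hard to follow, the same idea can be sketched in Python: parse the template once, then return a function that remembers the parsed pieces and builds a fresh random line on every call. This is a loose translation, not the code above. The name `make_generator` is reused for familiarity, but instead of `sprintf` it joins literal pieces directly, so no `%` escaping is needed. It assumes groups are not nested.

```python
import random
import re

# Matches one parenthetical and captures the text inside it (no nesting).
GROUP = re.compile(r"\(([^()]+)\)")

def make_generator(template, sep="|"):
    # re.split with a capturing group returns literal text at even
    # indexes and the captured group contents at odd indexes.
    pieces = GROUP.split(template)
    literals = pieces[0::2]
    option_lists = [p.split(sep) for p in pieces[1::2]]

    def generate():
        # The closure keeps access to literals and option_lists,
        # so the template is never parsed again.
        out = [literals[0]]
        for options, literal in zip(option_lists, literals[1:]):
            out.append(random.choice(options))
            out.append(literal)
        return "".join(out)

    return generate

if __name__ == "__main__":
    gen = make_generator("Today (is|will be) a (great|good|nice) day.")
    for _ in range(3):
        print(gen())
```

The parsing cost is paid once per template, and each call to the returned function only does the random picks and a join.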
Sinan Ünür
This is a nice answer, though I don't think anybody who hasn't read mjd's book understands it...
Leonardo Herrera
*Higher Order Perl* (which is now available for free: http://hop.perl.plover.com/ -- I did pay for it and it was worth every penny) is the work of a genius. I was just trying to improve on my first answer ;-)
Sinan Ünür
+1 I can't follow this, but it's great for helping me learn Perl. Thanks!
Lin
+7  A: 

Sounds like you may be looking for Regexp::Genex. From the module's synopsis:

#!/usr/bin/perl -l

use Regexp::Genex qw(:all);

$regex = shift || "a(b|c)d{2,4}?";

print "Trying: $regex";
print for strings($regex);
# abdd
# abddd
# abdddd
# acdd
# acddd
# acdddd
Dave Sherohman
Well, this doesn't work too well. "(is|will be)" works, but "Today (is|will be)" fails immediately. So you still need to identify the "()" parts and process them, which made using this module pretty much pointless.
Leonardo Herrera