Hello my second family, :)

I'm wondering how to apply several preg_replace rules without a later rule being applied to the output of an earlier one. It's a bit complicated, so let me explain with an example.

Input:

$string = 'The quick brown fox jumps over the lazy freaky dog'; 

Rules:

  • Replace a, i, o with u (if not at the beginning of a word & if not before/after a vowel)
  • Replace e, u with i (if not at the beginning of a word & if not before/after a vowel)
  • Replace ea with i (if not at beginning of a word)
  • Replace whole words, e.g. dog with cat and fox with wolf (without applying the rules above)

Output:

Thi quick bruwn wolf jimps over thi luzy friky cat




I started with something like this: (Edited thanks to Ezequiel Muns)

$patterns = array();
$replacements = array();

$patterns[] = "/(?<!\b|[aeiou])[aio](?![aeiou])/";
$replacements[] = "u";

$patterns[] = "/(?<!\b|[aeiou])[eu](?![aeiou])/";
$replacements[] = "i";

$patterns[] = '/ea/';
$replacements[1] = 'i';

$patterns[] = '/dog/';
$replacements[0] = 'cat';

echo preg_replace($patterns, $replacements, $string);

Output:

Thi qiick briwn fix jimps ivir thi lizy friiky dig



Edited:

As you can see, the problem is that the result of each rule gets overwritten by the rule that follows it.

Example 'fox':

  1. rule: turns fox into fux
  2. rule: turns fux into fix

Is there a way to skip the following rule(s) if the character has already been affected by a previous rule?

Does this make sense?

+1  A: 

First, you need to be explicit about the replacement conditions. Your rules say 'not at the beginning of a word and not before/after a vowel', but you have not implemented that in the regex. You can do this using negative lookahead/lookbehind. For example:

  1. Replace a, i, o with u (if not at the beginning of a word & if not before/after a vowel)

Can be implemented with:

$patterns[] = "/(?<!\b|[aeiou])[aio](?![aeiou])/";
$replacements[] = "u";

This method can be used to implement the first 3 rules.

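The next problem is that 'fox' and 'dog' will also be affected by the first 3 rules, so the whole-word rules have to match the already-changed versions and replace those with 'wolf' and 'cat'. Keep in mind that if the e/u rule runs after the a/i/o rule, 'dug' has already become 'dig' by then, so the pattern must match whatever the final form is. So for dog => cat (with only the first rule applied):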

$patterns[] = "/\bdug\b/";
$replacements[] = "cat";

Note: Because of the way preg_replace works with arrays, it's much better to not use indexes in the $patterns and $replacements arrays, since these can be misleading. Use the [] operator in pairs like I did above, so you always know what goes with what.
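To illustrate the pairing idea, here is a minimal sketch that keeps each pattern next to its replacement in one associative array. The `$rules` name and the sample sentence are just assumptions for the example, and only the first vowel rule plus the dog rule are included.

```php
<?php
// Hypothetical rule map: each pattern sits right next to its replacement,
// so the two can never drift apart the way separately indexed arrays can.
$rules = [
    '/(?<!\b|[aeiou])[aio](?![aeiou])/' => 'u',   // a, i, o => u
    '/\bdug\b/'                         => 'cat', // 'dog' has become 'dug' by now
];

// preg_replace applies the patterns one after another, in array order.
$result = preg_replace(array_keys($rules), array_values($rules), 'The lazy dog');

echo $result, "\n"; // prints: The luzy cat
```

The rules still run sequentially here, so the order of the entries matters.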

Part 2:

Aha, I see. You need to make the replacements exclusive.

You could use one regex that matches both of the first two cases, which are the problematic ones. Then you can use an interesting, slightly weird feature of preg_replace: when you add the e modifier, the replacement string is evaluated as PHP code instead. Combined with a capturing group, this lets you decide whether to output a u or an i according to what you matched.

$patterns[] = "/(?<!\b|[aeiou])([aeiou])(?![aeiou])/e";
$replacements[] = '("$1" == "e" || "$1" == "u")? "i":"u"';

Note the /e modifier and the parentheses around the vowel character class.
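One caveat: the /e modifier was deprecated in PHP 5.5 and removed in PHP 7.0, so on current PHP versions the same idea is usually written with preg_replace_callback. A rough equivalent, assuming the same pattern and the sample string from the question:

```php
<?php
$string = 'The quick brown fox jumps over the lazy freaky dog';

// Same pattern as above, just without the /e modifier.
$pattern = '/(?<!\b|[aeiou])([aeiou])(?![aeiou])/';

$result = preg_replace_callback(
    $pattern,
    function (array $m) {
        // $m[1] is the captured vowel: e, u => i; a, i, o => u.
        return ($m[1] === 'e' || $m[1] === 'u') ? 'i' : 'u';
    },
    $string
);

echo $result, "\n"; // prints: Thi quick bruwn fux jimps ovir thi luzy freaky dug
```

Each vowel is matched once against the original string, so a character that one rule has replaced is never picked up again by the other rule.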

Ezequiel Muns
Thanks dude, that works like a charm. :) But please see my edited post above.
Mayko
Oh, and another thing: you also need a lookbehind in your `/ea/` pattern, because you said *not at the beginning of a word*.
Ezequiel Muns
You are amazing! Thank you so much
Mayko
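Putting the thread together, all four rules can probably be handled in a single pass, so that no rule ever sees the output of another one. This is only a sketch under a few assumptions: the `$words` map and the way the pattern is assembled are made up for the example, and preg_replace_callback is used instead of the /e modifier.

```php
<?php
$string = 'The quick brown fox jumps over the lazy freaky dog';

// Hypothetical whole-word map (rule 4).
$words = ['dog' => 'cat', 'fox' => 'wolf'];

// One alternation; the first branch that matches at a position wins.
$pattern = '/\b(?:' . implode('|', array_map('preg_quote', array_keys($words))) . ')\b'
         . '|(?<!\b)ea'                            // rule 3: ea, not at the start of a word
         . '|(?<!\b|[aeiou])[aeiou](?![aeiou])/';  // rules 1 and 2: a single vowel

$result = preg_replace_callback($pattern, function (array $m) use ($words) {
    $hit = $m[0];
    if (isset($words[$hit])) {
        return $words[$hit];                       // whole word, left otherwise untouched
    }
    if ($hit === 'ea') {
        return 'i';
    }
    return ($hit === 'e' || $hit === 'u') ? 'i' : 'u';
}, $string);

echo $result, "\n"; // prints: Thi quick bruwn wolf jimps ovir thi luzy friky cat
```

Following the rules literally turns 'over' into 'ovir', whereas the expected output in the question keeps 'over'. If 'over' really should stay unchanged, the rules would need one more condition.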