How can I use preg_replace to replace characters that are enclosed in quotes? I need to replace all special characters that appear inside href="" attributes. Example:

<a href="ööbik">ööbik</a> should become <a href="oobik">ööbik</a>
A: 

Have you considered using an HTML parser instead?

Amber
Like what kind of parser? Can you point me in the right direction, please? :)
jevgeni
Something like this, potentially: http://simplehtmldom.sourceforge.net/
Amber
+3  A: 

To replace the "special chars", you need to use iconv: $str = iconv('UTF-8', 'ASCII//TRANSLIT', $str);

As for getting the values in between the quotes, see the other answers. Use preg_replace_callback to execute the conversion above on the matches.

EDIT: spoon-feeding everything together:

<?php
$input = '<a href="ööbik">ööbik</a>';
$expected = '<a href="oobik">ööbik</a>';

// Set the locale of your input here. //TRANSLIT depends on the locale,
// so pick a UTF-8 one that is installed on your system.
setlocale(LC_ALL, 'en_US.UTF-8');

// Convert using a callback.
$output = preg_replace_callback('/href="([^"]+)"/', function ($matches) {
    return iconv('UTF-8', 'ASCII//TRANSLIT', $matches[0]);
}, $input);

echo "Input:    $input\n";
echo "Expected: $expected\n";
echo "Output:   $output\n";

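This example assumes PHP 5.3 or later, which introduced anonymous functions. Use a named function (or "create_function", which was deprecated in PHP 7.2 and removed in 8.0) if you are stuck on PHP 5.2 or below.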

janmoesen
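For readers on PHP 5.2, the named-function variant mentioned above might look roughly like this. It is only a sketch: the function name transliterate_href is made up for illustration, the 'en_US.UTF-8' locale is assumed to be installed, and the exact transliteration (for example "o" versus "?") depends on the iconv implementation PHP was built against.

```php
<?php
// Hypothetical helper: transliterates the value of one href="..." match.
function transliterate_href($matches)
{
    // $matches[1] is the attribute value without the surrounding quotes.
    $ascii = iconv('UTF-8', 'ASCII//TRANSLIT', $matches[1]);

    // iconv() returns false on failure; keep the original match in that case.
    if ($ascii === false) {
        return $matches[0];
    }

    return 'href="' . $ascii . '"';
}

// Assumption: this locale exists on the machine running the script.
setlocale(LC_ALL, 'en_US.UTF-8');

$html = '<a href="ööbik">ööbik</a>';

// Pass the callback by name instead of using an anonymous function.
$result = preg_replace_callback('/href="([^"]+)"/', 'transliterate_href', $html);

echo $result, "\n";
```

Only the text inside href="..." is passed to iconv, so the link text after the attribute is left untouched.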
BTW, I wholeheartedly agree with not using regular expressions for parsing HTML. For instance, this code does not work for single-quoted href='' attributes. Use DOMDocument::loadHTML, for instance.
janmoesen
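A rough sketch of the DOMDocument::loadHTML route suggested in the comment above. The sample markup is invented for illustration, and the '<?xml encoding="UTF-8">' prefix is a commonly used workaround (not something from this thread) to make libxml treat the input as UTF-8. As with the regex version, what iconv produces for each character depends on the platform.

```php
<?php
// Sample input with both double- and single-quoted href attributes.
$html = '<p><a href="ööbik">ööbik</a> <a href=\'käsi\'>käsi</a></p>';

// Collect libxml parse warnings internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
// The XML declaration tells libxml that the input is UTF-8.
$doc->loadHTML('<?xml encoding="UTF-8">' . $html);

foreach ($doc->getElementsByTagName('a') as $link) {
    if (!$link->hasAttribute('href')) {
        continue;
    }

    $ascii = iconv('UTF-8', 'ASCII//TRANSLIT', $link->getAttribute('href'));

    // Only overwrite the attribute when the conversion succeeded.
    if ($ascii !== false) {
        $link->setAttribute('href', $ascii);
    }
}

echo $doc->saveHTML();
```

Because the parser handles the quoting, both attribute styles are covered without a second pattern. Note that saveHTML() returns the whole document, including the wrapper elements that loadHTML() adds.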
I just love it when new users come here to get a quick answer and then vamoose! Also, I love the word "vamoose".
janmoesen
A: 

This question may help you find quoted text: http://stackoverflow.com/questions/2148587/regex-quoted-string-with-escaped-quotes-in-c/2150049

However, I think the better solution is to parse the HTML string and work with its DOM.

Kamarey
I do agree that using regexp on HTML is generally a bad idea, but when you only need to fetch a very specific string from an HTML document, like a single attribute, a regexp is fine.
Atli
Agreed, it depends on the specific case.
Kamarey
A: 

I'd ask first: why do you need such a conversion?
And wouldn't urlencode() fit better?

Col. Shrapnel
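For comparison, here is roughly what the urlencode() alternative would produce. This is a sketch that assumes the script file itself is saved as UTF-8; the variable names are made up for illustration.

```php
<?php
// Article name as it appears in the link text (UTF-8 source file assumed).
$name = 'ööbik';

// urlencode() percent-encodes every byte of each non-ASCII character.
$href = urlencode($name);

echo '<a href="' . $href . '">' . htmlspecialchars($name, ENT_QUOTES, 'UTF-8') . "</a>\n";
// prints: <a href="%C3%B6%C3%B6bik">ööbik</a>
```

Unlike transliteration, this keeps the original characters recoverable, but the resulting link only works if the target resource is actually stored under the name "ööbik" rather than "oobik".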
Because they are links for the articles. The whole reason I need this is that the data for the system was converted from a CD version (XML data files), and since that version was done in Flash, it magically worked for some reason.
jevgeni