Hi Guys,

How can I convert all single quotes to double quotes in all HTML tags only? Is there an easier way to do it? Thanks :)

For example: How can I convert this string (actual data from my work):

<TEXTFORMAT LEADING='2'><P ALIGN='LEFT'><FONT FACE='Verdana' style='font-size:10' COLOR='#0B333C'>My name's Mark</FONT></P></TEXTFORMAT>

To this:

<TEXTFORMAT LEADING="2"><P ALIGN="LEFT"><FONT FACE="Verdana" style="font-size:10" COLOR="#0B333C">My name's Mark</FONT></P></TEXTFORMAT>
+1  A: 

I'm not really sure what you are trying to accomplish... You can replace pieces of a string in PHP with the str_replace function. Note that it returns the new string rather than changing the original:

$yourString = str_replace("'", "\"", $yourString);
Daan
Building on this, you can use PHP's output buffering with a callback to capture the entire body and run str_replace on it as if it were a string.
Sam152
They want it to only apply inside HTML tags.
garrow
What happens to "My name's Mark"?
NinethSense
@NinethSense - Ah, THAT's what he means...he wants to replace all occurrences of ' with ", but only if they are inside an HTML tag. Perhaps a smart regex can do the trick, but that's not really my expertise.
Daan
Hi Daan, if I do this str_replace("'", "\"", $yourString); then any single quote outside the HTML tag will also get affected, so "My name's Mark" will become "My name"s Mark"
marknt15
A: 

Use Tidy which can fix your HTML soup and output clean XHTML. It does other nice things too, like fixing nesting problems, lowercasing tags, etcetera, etcetera.

Sander Marechal
+2  A: 

I'm assuming that when you say "in all HTML tags", you mean all single quotes that enclose an attribute value. You wouldn't want <a onclick="alert('hi')"> converted, because that would break the code.

Any regular expression is going to be fragile. If you know your input will be a particular set of simple cases, you might be OK with a regex. Otherwise, you'll want a DOM parser that understands complex HTML markup like onmouseover="(function () { document.getElementById(''); alert(\"...\")...})()" (for example). Add to that the fact that an attribute can span multiple lines. ;)

I haven't had to tackle this particular problem recently, but maybe there's a good way to do it with HTML Tidy (more here: http://devzone.zend.com/article/761) or a parser like this one http://sourceforge.net/projects/simplehtmldom/

Keith Bentrup
@Keith: In my HTML tags, I don't have any JavaScript-related code like document.getElementById(''); so I am OK with any regular expression as long as it solves my problem :D Thanks, I will check the links you posted.
marknt15
any chance that you'll have CSS? such as style="background: url('/images/bg.gif');"
Keith Bentrup
@Keith: Nope, I will not have a style attribute.
marknt15
wait, i see one in your example ;)
Keith Bentrup
Um, I just converted the 'size="10"' attribute to style="font-size:10px;" but I will not use any single quote inside my HTML tags :)
marknt15
I guess I would start with two regexes: one to find everything inside the HTML tags, like /<([^>]+)>/, and then for each of those matches do something like preg_replace("/='([^']*)'/", '="$1"', $tag) ... as a start, hope that helps ... no guarantees, though, and I'm out for the night ;)
Keith Bentrup
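The two-step idea from the last comment (match each tag first, then rewrite ='...' only inside that match) can be sketched with preg_replace_callback. This is a minimal sketch rather than Keith's own code: the function name doubleQuoteAttributes is made up for illustration, and it assumes simple markup with no ">" inside attribute values. PHP's preg functions have no g flag; they already replace every match.

```php
<?php
// Hypothetical helper name, not from the thread.
function doubleQuoteAttributes(string $html): string
{
    // Step 1: find each tag, i.e. "<", then anything up to the next ">".
    return preg_replace_callback(
        '/<[^<>]+>/',
        function (array $m): string {
            // Step 2: inside that one tag, turn ='value' into ="value".
            // Values that contain a double quote are skipped so they are not broken.
            return preg_replace('/=\'([^\'"]*)\'/', '="$1"', $m[0]);
        },
        $html
    );
}

$in = "<P ALIGN='LEFT'><FONT FACE='Verdana' COLOR='#0B333C'>My name's Mark</FONT></P>";
echo doubleQuoteAttributes($in), "\n";
```

Because only the ='...' pattern is rewritten, a tag such as <a onclick="alert('hi')"> passes through unchanged, and text between tags is never looked at.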
+1  A: 

If you don't care about the JavaScript and CSS issues mentioned elsewhere, try this:

$text = "<TEXTFORMAT LEADING='2'><P ALIGN='LEFT'><FONT FACE='Verdana' style='font-size:10' COLOR='#0B333C'>My name's Mark</FONT></P></TEXTFORMAT>";
echo preg_replace('/<([^<>]+)>/e', '"<" . str_replace("\\\\\'", \'"\', "$1") . ">"', $text);

This is taken from a thread by someone with exactly the same problem as you over at devshed.com.

Xiaofu
Hi Xiaofu, I tried it but it did not work? Hhhmmm I'll try again :)
marknt15
My code example is slightly off though, remember that the updated string is the return value from preg_replace. (updated the answer to reflect this)
Xiaofu
@xiaofu: it worked only using this example code: $texts = "<p class='essay_caption'>This is Bob's test</p>"; $zzz = preg_replace('/<([^<>]+)>/e', '"<" . str_replace("\\\\\'", \'"\', "$1") . ">"', $texts); echo htmlspecialchars($zzz);
marknt15
If you've already taken that into account and it still doesn't work, please let me know and I'll delete the answer.
Xiaofu
@xiaofu: It's working now, I tested it again. Thanks a lot :)
marknt15
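A note for anyone trying the answer above on a current PHP: the /e modifier was deprecated in PHP 5.5 and removed in PHP 7.0, so that preg_replace call no longer runs. The same pattern works with preg_replace_callback, and the odd-looking "\\\\\'" search string goes away because the callback receives the match unescaped. A minimal sketch of that conversion, using the sample string from the answer:

```php
<?php
$text = "<TEXTFORMAT LEADING='2'><P ALIGN='LEFT'><FONT FACE='Verdana' style='font-size:10' COLOR='#0B333C'>My name's Mark</FONT></P></TEXTFORMAT>";

// Same pattern as the answer, without the removed /e modifier.
$fixed = preg_replace_callback(
    '/<([^<>]+)>/',
    function (array $m): string {
        // $m[1] is the tag content without the angle brackets.
        return '<' . str_replace("'", '"', $m[1]) . '>';
    },
    $text
);

echo $fixed, "\n";
```

Like the original, this converts every single quote inside a tag, so the JavaScript and CSS caveats mentioned elsewhere in the thread still apply.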
A: 

I know I could have used a regex, but give this a try. Assign the contents to $string using fopen(), fread(), etc...

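// Split on ">" so each piece looks like "text<tag ..." (explode drops the ">" itself)
$array = explode('>', $string);
foreach ($array as $key => $value) {
    $pos = strpos($value, '<');
    if ($pos !== false) {
        // Convert quotes only from the "<" onward; the text before it stays untouched
        $array[$key] = substr($value, 0, $pos) . str_replace("'", '"', substr($value, $pos));
    }
}
$string = implode('>', $array);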
Babiker
A: 

I would go with either a DOM parser or roll my own simple tag parser that understands quoting as well as escaped quote characters, so that it doesn't read "he said \"blah\"" as three pieces: he said \, then blah\, then an empty string.

Such a parser could easily detect whether the quote to be modified is inside a tag. Over many years I have learned that regular expressions are way too fragile for such tasks.

macbirdie
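A hand-rolled tag scanner along these lines could look like the sketch below. It is only an illustration under stated assumptions: the function name requoteTags is made up, comments and <script> blocks are not handled, and since HTML attribute values have no backslash escapes it only tracks which quote character opened the current value.

```php
<?php
// Hypothetical function name; a sketch, not a full HTML parser.
function requoteTags(string $html): string
{
    $out = '';
    $inTag = false;   // true between "<" and the ">" that closes the tag
    $quote = '';      // quote character that opened the current value, or ''
    $len = strlen($html);

    for ($i = 0; $i < $len; $i++) {
        $c = $html[$i];

        if (!$inTag) {
            if ($c === '<') {
                $inTag = true;
            }
            $out .= $c;            // text outside tags is copied unchanged
        } elseif ($quote === '') {
            if ($c === "'" || $c === '"') {
                $quote = $c;       // an attribute value starts here
                $out .= '"';       // always open it with a double quote
            } else {
                if ($c === '>') {
                    $inTag = false;
                }
                $out .= $c;
            }
        } elseif ($c === $quote) {
            $quote = '';           // the value ends with the quote that opened it
            $out .= '"';
        } elseif ($c === '"') {
            $out .= '&quot;';      // a double quote inside a '...' value gets escaped
        } else {
            $out .= $c;            // includes ">" and "'" inside a quoted value
        }
    }
    return $out;
}

echo requoteTags("<P ALIGN='LEFT'>My name's Mark</P>"), "\n";
```

Because the scanner knows when it is inside a quoted value, <a onclick="alert('hi')"> comes out unchanged, and a ">" inside an attribute value does not end the tag early.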