Long before I knew anything - not that I know much even now - I designed a web app in PHP which inserted data into my MySQL database after running the values through htmlentities(). I eventually came to my senses, removed this step, moved the encoding to the output rather than the input, and went on my merry way.

However, I've since had to revisit some of this old data and unfortunately I have an issue: when it's displayed on the screen I'm getting values that have effectively been htmlentitied twice, so an & that was stored as &amp; goes out to the browser as &amp;amp; and shows up on the page as the literal text &amp;.

So, is there a MySQL or phpMyAdmin way of changing all the older, affected rows back to their original characters, or will I have to write a script to read, decode and update each row across all 17 million rows in 12 tables?

EDIT:

Thanks for the help everyone. I wrote my own answer down below with some code in; it's not pretty, but it worked on the test data earlier, so barring someone pointing out a glaring error in my code while I'm in bed, I'll be running it on a backup DB tomorrow and then on the live one if that works out alright.

+1  A: 

Since PHP was the method of encoding, you'll want to use it to decode. You can use html_entity_decode to convert them back to their original characters. Gotta loop!

Just be careful not to decode rows that don't need it. Not sure how you'll determine that.
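
For example, a quick standalone sketch (it assumes the data was originally encoded as UTF-8 with ENT_QUOTES; adjust the flags to match whatever your old input code actually used):

// htmlentities() and html_entity_decode() are inverses when given the same flags/charset
$original = "Fish & chips for £3.50";
$stored   = htmlentities($original, ENT_QUOTES, 'UTF-8');      // what the old code wrote: Fish &amp; chips for &pound;3.50
$restored = html_entity_decode($stored, ENT_QUOTES, 'UTF-8');  // back to the original text
var_dump($restored === $original);                             // bool(true)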

webbiedave
Yeah, I'm aware of the function and if I have to update each row I'll be using it, but I wanted to know if there was a shorter way of doing it in MySQL or phpMyAdmin, i.e. a mass update on the affected rows. Some obscure function they hid away from me.
TooManyCooks
@webbie as for your edited point, yeah, I'm lucky: I have the old backups of the source I wrote and the log files, so I know precisely when the code was altered, and nosing around at the rows in the DB from around that time confirms it too.
TooManyCooks
Whew. Good thing you did that!
webbiedave
+1  A: 

I think writing a PHP script is a good thing to do in this situation. As Dave said, you can use the html_entity_decode() function to convert your text back.

Try your script on a table with a few entries first; this will save you a lot of testing time. And of course, remember to back up your table(s) before running the PHP script.

I'm afraid there is no shorter way. The computation for millions of rows remains quite expensive no matter how you convert the data back, so go for a PHP script... it's the easiest way.
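
For the small-table test, one quick way to get a disposable copy is something like the following (just a sketch; the table name and LIMIT are only examples, and it assumes a MySQL connection is already open):

// Make a throwaway copy of a slice of one table to test the decode script against
mysql_query("CREATE TABLE forum_posts_test LIKE forum_posts");
mysql_query("INSERT INTO forum_posts_test SELECT * FROM forum_posts LIMIT 100");
// ...run the decode script against forum_posts_test, check the output,
// then DROP TABLE forum_posts_test once you're satisfied.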

Simon
Yeah, it's what I suspected; I'd been hoping it was the sort of helpful function phpMyAdmin might have hidden away somewhere, rather than having to do it myself. At least if I write it I can share it, I guess.
TooManyCooks
I see... What I wanted to say is that even if phpMyAdmin had such a function (it may have), it would just execute a MySQL query via PHP, so you wouldn't save anything in terms of execution time and/or resources. But I think writing a script shouldn't be that hard in this case, and you'll end up with clean data :)
Simon
A: 

I recommend that you don't make changes to your database, but to your PHP code instead.

You can try this code, adjusted for your database's encoding:

// UTF-8 for this example
$var = stripslashes(htmlentities($row["var"], ENT_QUOTES, 'UTF-8'));
Amirouche Douda
Thanks for the input, but I'm not sure exactly what you mean. I'm already using htmlentities; the problem is that some of the rows hold data that was stored after being run through htmlentities, and some hold unencoded data that gets run through htmlentities when I select and display it. I need to standardise on one or the other rather than have both situations, so some changes to the data in the DB are inevitable.
TooManyCooks
You're adding *more* entity-encoding, and for some bizarre reason hacking at slashes? Why? This is the opposite of what's wanted (and will mangle some content).
bobince
That line is just for correct display on the page; apparently you want to reverse the encoding in your database, so I think html_entity_decode is the solution. Have a look at the User Contributed Notes at http://php.net/manual/en/function.htmlentities.php or http://php.net/manual/en/function.html-entity-decode.php
Amirouche Douda
A: 

It's a bit kludgy but I think the mass update is the only way to go...

$Query = "SELECT row_id, html_entitied_column FROM table";
$result = mysql_query($Query, $connection);
while($row = mysql_fetch_array($result)){
    $updatedValue = html_entity_decode($row['html_entitied_column']);
    $Query = "UPDATE table SET html_entitied_column = '" . $updatedValue . "' ";
    $Query .= "WHERE row_id = " . $row['row_id'];
    mysql_query($Query, $connection);
}

This is simplified with no error handling, etc. I'm not sure what the processing time would be on millions of rows, so you might need to break it up into chunks to avoid script timeouts.
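
If timeouts do bite, one possible way to chunk it (again just a sketch with the same made-up table/column names, keyed on the primary key so each batch resumes where the previous one stopped):

set_time_limit(0);   // lift PHP's time limit (or run the script from the CLI)
$lastId = 0;
$batchSize = 1000;
do {
    $Query = "SELECT row_id, html_entitied_column FROM table " .
             "WHERE row_id > $lastId ORDER BY row_id LIMIT $batchSize";
    $result = mysql_query($Query, $connection);
    $count = 0;
    while($row = mysql_fetch_array($result)){
        $updatedValue = mysql_real_escape_string(
            html_entity_decode($row['html_entitied_column'], ENT_QUOTES, 'UTF-8'),
            $connection);
        mysql_query("UPDATE table SET html_entitied_column = '" . $updatedValue . "' " .
                    "WHERE row_id = " . $row['row_id'], $connection);
        $lastId = $row['row_id'];
        $count++;
    }
} while($count == $batchSize);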

plowe
A: 

I ended up using this. It's not pretty, but I'm tired, it's 2am, and it did its job! (Edit: on test data)

// The tables that contain rows written while the old encode-on-input code was live
$tables = array('users', 'users_more', 'users_extra', 'forum_posts', 'posts_edits', 'forum_threads', 'orders', 'product_comments', 'products', 'favourites', 'blocked', 'notes');

foreach($tables as $table)
{
    // Only touch rows stored before the code change ($encode_cutoff is that timestamp)
    $sql = "SELECT * FROM {$table} WHERE data_date_ts < '{$encode_cutoff}'";
    $rows = $database->query($sql);

    while($row = mysql_fetch_assoc($rows))
    {
        // Decode every column and re-escape it ready to go back into the UPDATE
        $new = array();
        foreach($row as $key => $data)
        {
            $new[$key] = $database->escape_value(html_entity_decode($data, ENT_QUOTES, 'UTF-8'));
        }

        // Drop the first column (the id) so it isn't included in the SET clause
        array_shift($new);

        // Build the "col1='...', col2='...'" portion of the UPDATE
        $new_string = "";
        $i = 0;
        foreach($new as $new_key => $new_data)
        {
            if($i > 0) { $new_string .= ", "; }
            $new_string .= $new_key . "='" . $new_data . "'";
            $i++;
        }

        $sql = "UPDATE {$table} SET " . $new_string . " WHERE id='" . $row['id'] . "'";
        $database->query($sql);
        // plus some code to check that all out
    }
}
TooManyCooks