Dear Santa,

I hope you're reading this!

I have to insert some data into a MySQL table. The data will be retrieved and then (at present) unserialized, at which point the correct display language will be selected...

I've managed to munge the data (text encoded with markdown) into a set of PHP statements, roughly along the following lines:

<?php
$component_data = array();
$component_data[65] =
  array( // reformatted to avoid side-scrolling
      "en"=>"* Student welfare is our top priority.\n* We 
               have 30 years of experience of running successful 
               courses for Young Learners.",
      "es"=>"* El bienestar de nuestros estudiantes es nuestra 
               principal prioridad.\n* Contamos con experiencia de 
               30 años de exitosa realización de cursos para jóvenes.",
      "de"=>"* Das Wohl des Lernenden ist unsere oberste Priorität.\n 
               *Wir organisieren seit 30 Jahren erfolgreich 
               Sprachkurse für Jugendliche",
      "it"=>"* Il benessere degli studenti è la nostra priorità 
               assoluta.\n* Abbiamo 30 anni di esperienza nei corsi 
               per ragazzi.",
      "fr"=>"* Le bien-être de l’élève a pour nous la priorité absolue.
             \n* Nous avons 30 ans d'expérience dans la gestion de cours 
             réussis pour jeunes étudiants");
?>

and I was hoping to use the following to get it into a format ready for import into the MySQL table:

<?php
    foreach ($component_data as $id => $value) {
      echo "UPDATE `components` SET `component_content`='".
        mysql_real_escape_string(serialize($value)).
        "' WHERE `id` = '$id';\n";
    }
?>

It does go in, but unfortunately the result on the page is mangled: it just shows the serialised string rather than the array (which is the default behaviour if it can't manage to unserialise the string fetched from MySQL).

I've tried a number of permutations of the PHP string cleaning functions, and my head is frankly spinning.

Ideally, I'd like to be able to reformat the PHP-styled data for insertion into the MySQL db, so that when fetched it can still be unserialized...

... and for bonus points, if you can convert the UTF-8 foreign-language characters to HTML entities and the markdown into HTML :-)))

I'm sure someone with a clear head (go easy on the mince-pies) will be able to spot it straight away.

Here's hoping :-)

A: 

Have you tried removing the 'mysql_real_escape_string' to see if the unserialize works?

Another thing you could try is base64 encoding on the serialised array.

<?php
    foreach ($component_data as $id => $value) {
      echo "UPDATE `components` SET `component_content`='".
        base64_encode(serialize($value)).
        "' WHERE `id` = '$id';\n";
    }
?>

And then base64_decode it and unserialise when you retrieve it.
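
To make the retrieval side concrete, here is a minimal sketch of the full round trip. The sample array is made up for illustration, and the database read and write are left out; it only shows that the encoded value survives intact and contains nothing that needs SQL escaping.

<?php
// Hypothetical sample row; the real data lives in $component_data.
$value = array(
    "en" => "* Student welfare is our top priority.\n* It's been 30 years.",
    "de" => "* Das Wohl des Lernenden ist unsere oberste Priorität.",
);

// Writing: base64 output only uses A-Z, a-z, 0-9, '+', '/' and '=',
// so it contains no quotes, backslashes or multi-byte characters.
$stored = base64_encode(serialize($value));

// Reading: decode first (strict mode returns false on bad input),
// then unserialize the decoded bytes.
$decoded  = base64_decode($stored, true);
$restored = ($decoded === false) ? false : unserialize($decoded);

var_dump($restored === $value); // bool(true) when the round trip is lossless

The cost is size: the base64 string is roughly a third longer than the serialised data it wraps.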

xil3
I just updated my answer.
xil3
you should never remove `mysql_real_escape_string` from the query. if you just want to check if some function works as expected - `var_dump()` it.
zerkms
@zerkms: Why the down vote? It was just a debugging suggestion - this is pathetic...
xil3
@Dycey: Please take a look at my second suggestion (answer) - if you base64_encode it, it should fix your problem. I've used the same on some of my previous projects.
xil3
@zerkms you should always remove `mysql_real_escape_string` and switch to something non-shitty enough to have placeholders ;)
hobbs
Makes me wonder sometimes why I should bother even helping anyone when you get people like 'zerkms' criticising any efforts with their unhelpful opinions. @zerkms: If you have a better suggestion, please reply with an answer, so that I can criticise just as you did mine. My answer was purely meant for debugging - I never implied that they should permanently remove 'mysql_real_escape_string'. If you want my true opinion? You should use Zend_Db, which will do all the escaping for you, but I wanted to keep this simple and just help them debug the problem at hand.
xil3
"My answer was purely meant for debugging" // again: if you just want to check if some function works as expected - `var_dump()` it. is it clear enough?
zerkms
@hobbs: yeah, it is another solution, indeed ;-)
zerkms
Yes, I use var_dump() a lot, and it's very useful in most circumstances, but in this circumstance it's easier just to remove 'mysql_real_escape_string' for a quick test to see if that's the problem. It may also be a problem with how the DB saves the data - sometimes it gets jumbled up when you serialise and save to a database. Anyways - either way, the best solution is to base64_encode the serialised data.
xil3
@xil3: Works, but wastes space. I wouldn't call it the best solution.
Amadan
A: 

Thanks to everyone for their useful suggestions.

Actually, the problem turned out to be the mixture of stored Textile (not markdown!) and multilingual UTF-8. The solution for squeezing it into MySQL was a bit crufty: first run Textile over the dataset to get it into HTML, then munge it through the following to encode the foreign characters:

<?php
include 'data.php'; // contains component data similar to above.

foreach ($component_data as $id => $value) {
  foreach ($value as $language => $translation) {
    $value[$language] = str_replace(
      array("&lt;","&gt;"),
      array('<','>'), 
      htmlentities($translation, ENT_NOQUOTES, "UTF-8")
      );
  }
  echo "UPDATE `components` SET `component_content`='".mysql_real_escape_string(serialize($value))."' WHERE `id` = '$id';\n";
}

?>

The important bit was ENT_NOQUOTES, which meant a simple str_replace could deal with opening and closing tags (no maths in the text, thankfully), while mysql_real_escape_string could handle the single quotes. Glad that's over.
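
For anyone wanting to see the per-string step in isolation, here is a small sketch. The function name encode_keep_tags and the sample string are made up for illustration; the body is the same htmlentities plus str_replace combination as in the loop above, and it assumes the source file is saved as UTF-8.

<?php
// Hypothetical helper: encode non-ASCII characters as HTML entities
// while leaving the markup produced by Textile usable.
function encode_keep_tags($html) {
    // ENT_NOQUOTES leaves both ' and " alone, so attribute quotes survive.
    $encoded = htmlentities($html, ENT_NOQUOTES, "UTF-8");
    // htmlentities also encoded the tag brackets; turn those back.
    return str_replace(array("&lt;", "&gt;"), array("<", ">"), $encoded);
}

$sample = '<a href="/courses">Le bien-être de l\'élève</a>';
echo encode_keep_tags($sample), "\n";
// prints: <a href="/courses">Le bien-&ecirc;tre de l'&eacute;l&egrave;ve</a>

Note that this restores every &lt; and &gt;, so it is only safe when the text has no literal angle brackets of its own, and any & already in the text comes out as &amp;.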

Dycey