ansaurus

Question

Problem with parsing data via PHP and storing it in a MySQL database

Answer 1

A:

You could use explode() and split on a space or any other delimiter.
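
A minimal sketch of that idea. The sample string and variable names are made up for illustration and are not taken from the question:

```php
<?php
// Hypothetical sample input; the real data would come from the asker's file.
$text = "ni3 hao3";

// Split on a single space; any other delimiter string works the same way.
$parts = explode(' ', $text);

print_r($parts);
```

Note that explode() splits on the exact delimiter string, so consecutive delimiters produce empty elements.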

Phill Pafford 2009-10-12 12:47:47

Answer 2

A:

Your data fields are on separate lines, so Phill's explode() call would split on the newline character. The basic data-field acquisition looks something like this:

$content = file_get_contents('myfile.txt');
if ($content === false)
{
  die("could not read myfile.txt\n");
}

foreach (explode("\n", $content) as $line)
{
  $line = trim($line);  // remove leading and trailing whitespace, including "\r"
  if ($line === '')
  {
    continue;  // skip empty lines
  }
  switch (substr($line, 0, 4)) // examine first four characters
  {
    case '[m1]':
      // square brackets and the slash are escaped in the regular expression
      if (preg_match('/^\[m1\](.+)\[\/m\]$/', $line, $matches))
      {
        $field = $matches[1];
        echo "pinyin: '$field'\n";
      }
      break;

    case '[m2]':
      if (preg_match('/^\[m2\](.+)\[\/m\]$/', $line, $matches))
      {
        $field = $matches[1];
        echo "translation: '$field'\n";
      }
      break;

    default:
      $field = $line;  // untagged line: a character field
      echo "character: '$field'\n";
      break;
  }
}

Here, I have not attempted to identify (a) the start of a new record, or (b) which character fields are simplified and which are traditional. These issues can probably be addressed by counting character-field identifications: the first one is simplified, the second is traditional, and the first one seen after a run of tagged lines indicates a new record. But that's your job.
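
The counting idea above can be sketched roughly as follows. This is only an illustration under an assumed record layout (a simplified line, a traditional line, then [m1] and [m2] lines); the function name groupRecords, the array keys, and the sample data are all hypothetical:

```php
<?php
// Group lines into records by counting untagged (character) lines.
// Assumed layout per record: simplified, traditional, [m1]..., [m2]...
function groupRecords(array $lines): array
{
    $records = [];
    $current = null;
    $sawTagged = true; // so the very first character line starts a record

    foreach ($lines as $line) {
        $line = trim($line);
        if ($line === '') {
            continue; // skip empty lines
        }
        if (preg_match('/^\[m([12])\](.+)\[\/m\]$/', $line, $m)) {
            if ($current !== null) {
                $key = ($m[1] === '1') ? 'pinyin' : 'translation';
                $current[$key][] = $m[2];
            }
            $sawTagged = true;
            continue;
        }
        if ($sawTagged) {
            // First character line after tagged lines: a new record begins.
            if ($current !== null) {
                $records[] = $current;
            }
            $current = ['simplified' => $line, 'traditional' => null,
                        'pinyin' => [], 'translation' => []];
            $sawTagged = false;
        } elseif ($current['traditional'] === null) {
            // Second consecutive character line: the traditional form.
            $current['traditional'] = $line;
        }
        // Any further consecutive character lines are ignored in this sketch.
    }
    if ($current !== null) {
        $records[] = $current;
    }
    return $records;
}

// Hypothetical sample data, not taken from the asker's file.
$sample = ["你好", "你好", "[m1]ni3 hao3[/m]", "[m2]hello[/m]",
           "谢谢", "謝謝", "[m1]xie4 xie5[/m]", "[m2]thanks[/m]"];
$records = groupRecords($sample);
print_r($records);
```

Each element of the result holds one record, which could then be written to the database one row at a time. If the real file deviates from the assumed layout, the counting rules would need adjusting.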

Nor have I assessed any issues relating to the non-ASCII character set. I assume you are on top of that stuff.

I have taken the opportunity to separate the content from presentational markup (like the [b] tags). It's just good practice to keep those semantics separate from the data proper.
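
If presentational tags also appear inside a field, one possible way to drop them is sketched below. The helper name stripPresentationTags is hypothetical, and covering [i] and [u] alongside [b] is an assumption about the data:

```php
<?php
// Remove presentational tags such as [b]...[/b], keeping the text inside.
// Assumption: only [b], [i] and [u] count as presentational here.
function stripPresentationTags(string $field): string
{
    // preg_replace returns null on error; fall back to the original text.
    return preg_replace('/\[\/?(?:b|i|u)\]/', '', $field) ?? $field;
}

// Hypothetical sample field.
$clean = stripPresentationTags('[b]hello[/b] world');
echo $clean, "\n";
```

Applying this to each extracted $field before storing it keeps the markup out of the database.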

Ewan Todd 2009-10-12 13:54:34

Thank you! That's what I needed.

Josh 2009-10-14 19:25:26
