tags:

views:

37

answers:

1

Hi all, I am trying to figure out a way to read a CSV with returns in it in PHP. The problem is when you read the file like this:

if (($handle = fopen($file, "r")) !== FALSE) {
 while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {
        row data...
    }
}

If you have a retun in the CSV it does not work, it just sees the returns as a new row.

My idea was to have a regex to check for unclosed quotes, but I dont know of anything like that. Any ideas?

+2  A: 

Since it seems the built-in fgetcsv does not correctly handle the CSV standard, there are suggestions for alternatives on the PHP man page for fgetcsv - here's one of them:

From http://www.php.net/manual/en/function.fgetcsv.php

The PHP's CSV handling stuff is non-standard and contradicts with RFC4180, thus fgetcsv() cannot properly deal with files like this example ...

There is a quick and dirty RFC-compliant realization of CSV creation and parsing:

<?php 
function array_to_csvstring($items, $CSV_SEPARATOR = ';', $CSV_ENCLOSURE = '"', $CSV_LINEBREAK = "\n") { 
  $string = ''; 
  $o = array(); 

  foreach ($items as $item) { 
    if (stripos($item, $CSV_ENCLOSURE) !== false) { 
      $item = str_replace($CSV_ENCLOSURE, $CSV_ENCLOSURE . $CSV_ENCLOSURE, $item); 
    } 

    if ((stripos($item, $CSV_SEPARATOR) !== false) 
     || (stripos($item, $CSV_ENCLOSURE) !== false) 
     || (stripos($item, $CSV_LINEBREAK !== false))) { 
      $item = $CSV_ENCLOSURE . $item . $CSV_ENCLOSURE; 
    } 

    $o[] = $item; 
  } 

  $string = implode($CSV_SEPARATOR, $o) . $CSV_LINEBREAK; 

  return $string; 
} 

function csvstring_to_array(&$string, $CSV_SEPARATOR = ';', $CSV_ENCLOSURE = '"', $CSV_LINEBREAK = "\n") { 
  $o = array(); 

  $cnt = strlen($string); 
  $esc = false; 
  $escesc = false; 
  $num = 0; 
  $i = 0; 
  while ($i < $cnt) { 
    $s = $string[$i]; 

    if ($s == $CSV_LINEBREAK) { 
      if ($esc) { 
        $o[$num] .= $s; 
      } else { 
        $i++; 
        break; 
      } 
    } elseif ($s == $CSV_SEPARATOR) { 
      if ($esc) { 
        $o[$num] .= $s; 
      } else { 
        $num++; 
        $esc = false; 
        $escesc = false; 
      } 
    } elseif ($s == $CSV_ENCLOSURE) { 
      if ($escesc) { 
        $o[$num] .= $CSV_ENCLOSURE; 
        $escesc = false; 
      } 

      if ($esc) { 
        $esc = false; 
        $escesc = true; 
      } else { 
        $esc = true; 
        $escesc = false; 
      } 
    } else { 
      if ($escesc) { 
        $o[$num] .= $CSV_ENCLOSURE; 
        $escesc = false; 
      } 

      $o[$num] .= $s; 
    } 

    $i++; 
  } 

//  $string = substr($string, $i); 

  return $o; 
} 
?> 
Peter Boughton
The question is will a server be able to handle 50,000 rows? The server I am making this for will need to be able to handle that.Thanks for the quick response!
Nitroware
Well, that will depend on the server specs and what else it might be doing and so on. The code itself seems simple enough that it shouldn't have any issues - but I'm certainly not even close to a PHP performance expert, so don't take that as a guarantee.
Peter Boughton
@Nitroware 50k rows is not a problem. You'll need to set high enough memory_limit. The CSV will consume many times more memory when serialized to array than what it did as a flat file. Start with 64M.
jmz