My client periodically receives a set of CSV text files in which the elements of each row follow a consistent order and format, but the commas that separate them are inconsistent. Sometimes one comma separates two elements, and other times it is two or four commas, etc.

The PHP application I am writing attempts to do the following things:

PSEUDO-CODE: 
1. Upload csv.txt file from client's local directory.
2. Create new HTML table. 
3. Insert the first three fields FROM csv.txt into HTML table row.
4. Repeat STEP 3 while the FIRST field equals the FIRST field of the row below it.
5. If they do not equal, CLOSE HTML table.
6. Check to see if FIRST field is NOT NULL, IF TRUE, GOTO step 2, Else close HTML table.

I have no trouble with steps 1 and 2. Step 3 is where it gets tricky, since the fields in the csv.txt files are not always separated by the same number of commas. They are, however, always in the same relative order and format. I am also having issues with step 4: I don't know how to check whether the first field in a row matches the first field in the row below it. Step 5 should be relatively simple. For step 6, I need to find an equivalent of a "GOTO" function in PHP.

Please let me know if any part of the question is unclear. I appreciate your help.

Thank you in advance!

+1  A: 

Why not simply start by going through the file and replacing any run of multiple commas with a single comma? E.g.:

abc,def,,ghi,,,,jkl

becomes:

abc,def,ghi,jkl

and then just continue normally.
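
A minimal sketch of that normalization step, assuming no field ever contains a comma of its own and that empty fields carry no meaning. The helper name collapseCommas is made up for illustration:

```php
<?php
// Hypothetical helper: collapse runs of commas into one comma, then strip
// leading/trailing commas, spaces and line breaks.
function collapseCommas(string $line): string
{
    // ",{2,}" matches two or more consecutive commas
    $line = preg_replace('/,{2,}/', ',', $line);
    return trim($line, ", \r\n");
}

$fields = explode(',', collapseCommas('abc,def,,ghi,,,,jkl'));
print_r($fields); // four elements: abc, def, ghi, jkl
```

If a blank column can be meaningful in the data, this approach loses that information, so it only fits the case where the extra commas are pure noise.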

grahamrb
That, or a regex. About the GOTO: I'm sure you don't need a goto. Isn't this in a loop?
Vincent
Exactly... If you have broken data you should fix it before you work with it, not try to work with the broken data.
Greg
A: 

If you mean that there are different numbers of commas on each line, then as far as I can see it is actually impossible to do what you want to do by looking at the commas alone. For example:

ab,c,d,ef // could group columns a-f in that way, but
a,bc,de,f // could also group columns a-f

... and you would have no way of knowing which was the proper arrangement, unless you're given some other instructions or the type of data is identifiable by regular expression as someone else said.

If on the other hand you just mean that sometimes there are blanks, but there are still the same number of columns, like this:

a,b,,d,e,f
a,,c,d,e,f

... then you can still form the table correctly. I would recommend using explode(',', $line) in that case and then doing your processing on the elements of the exploded array without worrying about what is inside them.
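
As a rough illustration of that second case, assuming every row has the same number of columns and no field contains a quoted comma (str_getcsv() would be the safer choice if one might). The sample lines are the ones from this answer:

```php
<?php
$lines = ['a,b,,d,e,f', 'a,,c,d,e,f'];
foreach ($lines as $line) {
    // explode() keeps empty strings, so column positions stay aligned
    $cols = explode(',', $line);
    echo count($cols), ' columns, third = "', $cols[2], '"', "\n";
}
```

Both lines yield six elements; the blank column simply shows up as an empty string at its usual index.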

Doug Treadwell
+1  A: 

If you want to group the rows by their first element you can try something like:

  • read the next row via fgetcsv()
  • filter empty elements (a,,b,c -> a,b,c)
  • if the row still contains fields (i.e. it is not empty), append the row to "its" group

That's not exactly what you've described but it may be what you want ;-)

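<?php
$fp = fopen('test.csv', 'rb') or die('!fopen');
$groups = array();
// fgetcsv() returns false at end of file (or on error), which ends the loop
while ( ($fields = fgetcsv($fp)) !== false ) {
  // drop empty fields (a,,b,c -> a,b,c) but keep values such as "0",
  // then reindex so that $row[0] is always the first non-empty field
  $row = array_values(array_filter($fields, function ($f) {
    return $f !== null && $f !== '';
  }));
  if ( !empty($row) ) {
    // PHP creates the nested array on first use, so no @ is needed
    $groups[$row[0]][] = $row;
  }
}
fclose($fp);

// one table per group, one table row per CSV row
foreach( $groups as $g ) {
  echo '
    <table>';
  foreach( $g as $row ) {
    echo '
      <tr>
        <td>', join('</td><td>', array_map('htmlentities', $row)), '</td>
      </tr>
    ';
  }
  echo '</table>';
}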
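
The code above collects all rows with the same first field into one group, wherever they appear in the file. The question instead asks for a new table whenever the first field changes from one row to the next, showing only the first three fields. That can be done in a plain loop without any goto. A minimal sketch, assuming the lines are already read into an array and no field contains a comma; the function name renderTables is invented for this example:

```php
<?php
// Hypothetical helper: builds one <table> per run of consecutive rows
// that share the same first field, using only the first three fields.
function renderTables(array $lines): string
{
    $html = '';
    $current = null; // first field of the table currently open, null if none
    foreach ($lines as $line) {
        // split on runs of commas and discard empty pieces
        $row = preg_split('/,+/', trim($line), -1, PREG_SPLIT_NO_EMPTY);
        if (empty($row)) {
            continue; // blank line or commas only
        }
        if ($row[0] !== $current) {
            // first field changed: close the open table (if any), start a new one
            if ($current !== null) {
                $html .= "</table>\n";
            }
            $html .= "<table>\n";
            $current = $row[0];
        }
        $cells = array_map('htmlspecialchars', array_slice($row, 0, 3));
        $html .= '  <tr><td>' . implode('</td><td>', $cells) . "</td></tr>\n";
    }
    if ($current !== null) {
        $html .= "</table>\n"; // close the last table
    }
    return $html;
}

echo renderTables(['x,,1,2', 'x,3,,,4', 'y,5,6']);
```

Remembering the previous row's first field in $current is what replaces the "compare with the row below" step: each row is compared with the one before it instead.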
VolkerK