I have a PHP variable, populated from a form, that needs tidying up. I hope you can help.

The variable contains a list of items (an item may be two or three words separated by spaces). I want to convert it to a comma-separated list with no superfluous whitespace. Items should be divided only at commas, semicolons or newlines, and a blank item is not allowed.

Here's a comprehensive example (with a deliberately messy input):

Variable In:  "dog, cat         ,car,tea pot,,  ,,, ;;(++NEW LINE++)fly,     cake"
Variable Out: "dog,cat,car,tea pot,fly,cake"

Can anyone help?

+10  A: 

You can start by splitting the string into "useful" parts with preg_split, and then implode those parts back together:

$str_in = "dog, cat         ,car,tea pot,,  ,,, ;;
fly,     cake";

$parts = preg_split('/[,;\s]/', $str_in, -1, PREG_SPLIT_NO_EMPTY);

$str_out = implode(',', $parts);

var_dump($parts, $str_out);

(Here, the regex splits on ',', ';' and '\s', which matches any whitespace character, and PREG_SPLIT_NO_EMPTY keeps only the non-empty parts.)

This gives, for $parts:

array
  0 => string 'dog' (length=3)
  1 => string 'cat' (length=3)
  2 => string 'car' (length=3)
  3 => string 'tea' (length=3)
  4 => string 'pot' (length=3)
  5 => string 'fly' (length=3)
  6 => string 'cake' (length=4)

And, for $str_out :

string 'dog,cat,car,tea,pot,fly,cake' (length=28)



Edit after the comment : sorry, I didn't notice that one ^^

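In that case, you can't split on whitespace :-( I would probably split on ',', ';' or a line break, iterate over the parts, use trim to remove whitespace characters at the beginning and end of each item, and keep only those that are not empty:

$useful_parts = array();
// Split on commas, semicolons and line breaks; spaces inside an item are kept.
$parts = preg_split('/[,;\r\n]/', $str_in, -1, PREG_SPLIT_NO_EMPTY);
foreach ($parts as $part) {
    // Strip leading and trailing whitespace.
    $part = trim($part);
    // Compare against '' rather than using empty(), which would also drop an item such as "0".
    if ($part !== '') {
        $useful_parts[] = $part;
    }
}
var_dump($useful_parts);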


Executing this portion of code gets me :

array
  0 => string 'dog' (length=3)
  1 => string 'cat' (length=3)
  2 => string 'car' (length=3)
  3 => string 'tea pot' (length=7)
  4 => string 'fly' (length=3)
  5 => string 'cake' (length=4)


And imploding it all back together, this time I get:

string 'dog,cat,car,tea pot,fly,cake' (length=28)

Which is better ;-)
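
The split, trim and filter steps can also be wrapped in a single function. This is only a minimal sketch: the helper name tidy_list is made up for the example, it uses array_map and array_filter in place of the explicit loop, and it assumes line breaks should act as separators alongside commas and semicolons.

```php
<?php
// Hypothetical helper: the name tidy_list is an assumption, not part of the original answer.
function tidy_list($str_in) {
    // Split on commas, semicolons and line breaks only; spaces inside items survive.
    $parts = preg_split('/[,;\r\n]/', $str_in);

    // Trim whitespace from both ends of every part.
    $parts = array_map('trim', $parts);

    // Keep only non-empty parts; 'strlen' keeps an item such as "0", which empty() would drop.
    $parts = array_filter($parts, 'strlen');

    return implode(',', $parts);
}

echo tidy_list("dog, cat         ,car,tea pot,,  ,,, ;;\nfly,     cake");
// prints: dog,cat,car,tea pot,fly,cake
```

Filtering after trimming matters here: a part that holds only spaces is not empty until it has been trimmed.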

Pascal MARTIN
That's a great answer. How would you keep "tea pot" as one item on the final list?
Patrick Beardmore
@Patrick : I edited my answer with a second idea -- didn't quite notice that case ^^
Pascal MARTIN
This works perfectly. Thank you very much.
Patrick Beardmore
You're welcome :-) Have fun !
Pascal MARTIN
+1  A: 

You could use explode, trim and str_replace: split the string into an array, manually remove specific characters from each item, and then turn it back into a string.

function getCleanerStringFromString($stringIn) {
 // Turn the string into an array, with a comma as the delimiter.
 $myarray = explode(",", $stringIn);
 $cleaned = array();

 foreach ($myarray as $item) {
  // Remove semicolons; repeat this line for any other character to strip.
  // Note: the character is deleted, not split on, so "a;b" becomes "ab".
  $item = str_replace(";", "", $item);

  // Remove surrounding whitespace, including newlines.
  $item = trim($item);

  // Skip blank items.
  if ($item !== "") {
   $cleaned[] = $item;
  }
 }

 // Then turn it back into a string.
 return implode(",", $cleaned);
}
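
One caveat with stripping characters: deleting a semicolon glues its neighbours together, so "cat;dog" comes out as "catdog". A possible variation, sketched under the assumption that semicolons and line breaks should act as separators (the function name normalizeThenSplit is made up for this example), replaces those characters with commas before exploding:

```php
<?php
// Hypothetical variation: the function name is an assumption for illustration.
function normalizeThenSplit($stringIn) {
    // Turn every alternative delimiter into a comma instead of deleting it.
    $normalized = str_replace(array(";", "\r\n", "\n", "\r"), ",", $stringIn);

    $items = array();
    foreach (explode(",", $normalized) as $item) {
        $item = trim($item);
        // Blank entries are not allowed in the result.
        if ($item !== "") {
            $items[] = $item;
        }
    }

    return implode(",", $items);
}

echo normalizeThenSplit("cat;dog,  tea pot\nfly");
// prints: cat,dog,tea pot,fly
```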
mjdth
+1  A: 

Explode the entire string on the comma, then walk through that array: first strip all characters that are not a-zA-Z0-9 (or space), then trim any remaining leading/trailing spaces. If an item ends up empty, remove it from the array. Implode back to a string.

Ideally, this allows for more messy characters than just ,;\s\n etc.

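$strIn = "dog, cat         ,car,tea pot,,  ,,, ;;\nfly,     cake";
$firstArray = explode(",", $strIn);

// Strip everything except letters, digits and spaces, then trim.
// $item is taken by reference so the change is written back to the array.
function removeViolators(&$item, $key) {
    $item = trim(preg_replace("/[^A-Za-z0-9 ]+/", "", $item));
}

array_walk($firstArray, 'removeViolators');

// array_walk cannot remove elements, so drop the empty items afterwards.
$firstArray = array_filter($firstArray, 'strlen');
$strOut = implode(",", $firstArray);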
Tegeril
+1  A: 

Split then grep, seems to give the expected output:

$array = preg_split('/\s*[;,\n]\s*/', $string);
$array = preg_grep('/^\s*$/', $array, PREG_GREP_INVERT);
$string = implode(',', $array);

EDIT: actually grep isn't necessary:

$array = preg_split('/\s*[;,\n]\s*/', $string, -1, PREG_SPLIT_NO_EMPTY);
$string = implode(',', $array);
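
One edge case: the pattern only consumes whitespace next to a delimiter, so whitespace at the very start of the input (or at the very end, when no delimiter follows it) stays attached to the first or last item. A small sketch, assuming the input may carry such padding (the sample string and the $result variable are illustrative), trims the string before splitting:

```php
<?php
$string = "  dog, cat ;\n tea pot  \n";

// Trim first so the outer items carry no stray whitespace.
$array = preg_split('/\s*[;,\n]\s*/', trim($string), -1, PREG_SPLIT_NO_EMPTY);
$result = implode(',', $array);

echo $result;
// prints: dog,cat,tea pot
```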
kemp