ansaurus

Question

Answer 1

A:

Use strip_tags, unless I'm misunderstanding the question.

    $string = '<option value="abc" >Test - 123</option>
    <option value="def" >Test - 456</option>
    <option value="ghi" >Test - 789</option>';

    $string = strip_tags($string);

Update: Missed that you loosely specify an array in your question. In this case, and I'm sure there's a cleaner method, I'd do something like:

$teststring = '<option value="abc" >Test - 123</option>
<option value="def" >Test - 456</option>
<option value="ghi" >Test - 789</option>';

// split() was removed in PHP 7; explode() does the same job for a literal delimiter.
$stringarray = explode("\n", strip_tags($teststring));
print_r($stringarray);

Update 2: And just to top and tail it, to present it as you originally asked (not as an array, as we may have been misled to believe), try the following:

$teststring = '<option value="abc" >Test - 123</option>
<option value="def" >Test - 456</option>
<option value="ghi" >Test - 789</option>';

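// split() was removed in PHP 7; explode() does the same job for a literal delimiter.
$stringarray = explode("\n", strip_tags($teststring));

// implode() takes the separator first; the reversed argument order no longer works in PHP 8.
$newstring = implode("','", $stringarray);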
echo "'" . $newstring . "'\n";

Gav 2009-07-12 20:38:37

Answer 2

+1 A:

This code would load the values into an array, assuming you have line breaks in between the option tags like you showed:

// Load your HTML into a string.
$html = <<<EOF
<option value="abc" >Test - 123</option>
<option value="def" >Test - 456</option>
<option value="ghi" >Test - 789</option>
EOF;

// Break the values into an array.
$vals = explode("\n", strip_tags($html));
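
If the real input might contain Windows line endings, indentation, or blank lines (an assumption, since the question only shows a clean sample), a slightly more defensive variant could look like the sketch below. The sample string is made up for illustration.

```php
<?php
// Sample input; the \r\n line endings and the blank line are assumptions
// added here to exercise the clean-up steps.
$html = "<option value=\"abc\" >Test - 123</option>\r\n"
      . "\r\n"
      . "  <option value=\"def\" >Test - 456</option>\r\n"
      . "<option value=\"ghi\" >Test - 789</option>\r\n";

// Remove the tags, then split on any kind of line break (\n, \r\n or \r).
$lines = preg_split('/\R/', strip_tags($html));

// Trim each line, drop the empty ones, and reindex the result from 0.
$vals = array_values(array_filter(array_map('trim', $lines), 'strlen'));

print_r($vals);
```

This still relies on there being exactly one option per line, just like the explode() version.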

James Skidmore 2009-07-12 20:41:46

Answer 3

+3 A:

There are many ways; which one is best depends on more details than you've provided in your question.
One possibility: DOMDocument and DOMXPath

<?php
$doc = new DOMDocument;
$doc->loadHTML('<html><head><title>???</title></head><body>
  <form method="post" action="?" id="form1">
      <div>
        <select name="foo">
        <option value="abc" >Test - 123</option>
        <option value="def" >Test - 456</option>
        <option value="ghi" >Test - 789</option>
      </select>
    </div>
  </form>
</body></html>');

$xpath = new DOMXPath($doc);
foreach( $xpath->query('//form[@id="form1"]//option') as $o) {
    echo 'option text: ', $o->nodeValue, "  \n";
}

prints

option text: Test - 123  
option text: Test - 456  
option text: Test - 789
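
If an array is what's wanted rather than printed lines, the same loop can collect the results. This is only a sketch: keying the array by the value attribute and selecting on the name "foo" are my assumptions about what would be useful, not something the question specifies.

```php
<?php
$doc = new DOMDocument;
$doc->loadHTML('<select name="foo">
  <option value="abc" >Test - 123</option>
  <option value="def" >Test - 456</option>
  <option value="ghi" >Test - 789</option>
</select>');

$xpath = new DOMXPath($doc);

// Map each option's value attribute to its (trimmed) text content.
$options = array();
foreach ($xpath->query('//select[@name="foo"]/option') as $o) {
    $options[$o->getAttribute('value')] = trim($o->nodeValue);
}

print_r($options);
```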

VolkerK 2009-07-12 20:42:38

Answer 4

+1 A:

If you have more than just a fragment like the one mentioned, use a real parser like DOMDocument that you can walk through with DOMXPath.

Otherwise try this regular expression together with preg_match_all:

<option(?:[^>"']+|"[^"]*"|'[^']*')*>([^<]+)</option>
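
A minimal sketch of how that pattern could be used. The `~` delimiter, the variable names, and the sample input (including the quoted `>` in the first value) are my own choices for illustration.

```php
<?php
// Sample input; the first option has a ">" inside its quoted value on purpose.
$html = '<option value=">abc" >Test - 123</option>
<option value=\'def\' >Test - 456</option>';

// "~" is used as the delimiter because the pattern itself contains "/".
$pattern = '~<option(?:[^>"\']+|"[^"]*"|\'[^\']*\')*>([^<]+)</option>~';

preg_match_all($pattern, $html, $matches);

// $matches[1] holds the text captured between the tags for every match.
print_r($matches[1]);
```

Because quoted attribute values are consumed as a unit, a `>` inside the quotes does not end the tag early.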

Gumbo 2009-07-12 20:43:31

Answer 5

A:

http://networking.ringofsaturn.com/Web/removetags.php

preg_match_all("s/<[a-zA-Z\/][^>]*>//g", $data, $out);

Bassel Safadi 2009-07-12 20:44:05

This may be a valid pattern for sed but not for php's preg_match_all.
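
For what it's worth, the closest PHP counterpart to that sed substitution is probably preg_replace with only the pattern between the delimiters. A sketch, with made-up sample data:

```php
<?php
$data = "<option value=\"abc\" >Test - 123</option>\n"
      . "<option value=\"def\" >Test - 456</option>\n"
      . "<option value=\"ghi\" >Test - 789</option>";

// sed's s/PATTERN//g becomes preg_replace('/PATTERN/', '', ...);
// preg_replace already replaces every occurrence, so no "g" flag is needed.
$out = preg_replace('/<[a-zA-Z\/][^>]*>/', '', $data);

echo $out, "\n";
```

Like the sed original, this treats the first `>` as the end of a tag, so it has the same limits as the other simple regex approaches here.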

VolkerK 2009-07-12 21:37:48

Answer 6

A:

If we're doing regex stuff, I like this perl-like syntax:

$test = "<option value=\"abc\" >Test - 123</option>\n" .
    "<option value=\"abc\" >Test - 456</option>\n" .
    "<option value=\"abc\" >Test - 789</option>\n"; 

for ($offset=0; preg_match("/<option[^>]*>([^<]+)/",$test, $matches, 
                        PREG_OFFSET_CAPTURE, $offset); $offset=$matches[1][1])
   print($matches[1][0] . "\n");

Guss 2009-07-12 20:46:51

The value attribute of an option element is defined as CDATA. If I'm not mistaken, that allows <option value=">abc " in HTML 4.01 (validator.w3.org agrees). Your code then prints 'abc" >Test - 123'.

VolkerK 2009-07-12 21:34:53

Yes, it does :-) With regular expressions it's easy to write something simple that handles common use cases (and is also easy to read), but it's very hard to write something that parses a structured language like XML correctly. If you need a strict "handles anything you throw at it" parser, use something that understands the language, like DOM or SAX. The downside is that for simple cases DOM and SAX are harder to write and harder to read.
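
To make the trade-off concrete, here is a small side-by-side sketch. The input is my reading of the edge case from the comment above (a `>` inside the quoted value), wrapped in a select element so the parser sees a plausible fragment.

```php
<?php
$html = '<select><option value=">abc" >Test - 123</option></select>';

// Simple regex: [^>]* stops at the ">" inside the attribute value.
preg_match('/<option[^>]*>([^<]+)/', $html, $m);
$fromRegex = $m[1];

// DOM parser: the quoted attribute value is handled as part of the tag.
libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$doc = new DOMDocument;
$doc->loadHTML($html);
$fromDom = $doc->getElementsByTagName('option')->item(0)->nodeValue;

echo "regex: ", $fromRegex, "\n";
echo "DOM:   ", $fromDom, "\n";
```

The regex version returns part of the attribute along with the text, while the DOM version returns only the text.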

Guss 2009-07-20 14:50:23


How do I strip data from HTML tags
