I have fooled around with regex but can't seem to get it to work. I have a file called includes/header.php, and I am reading the whole file into one big string so that I can pull out a certain portion of the code and paste it into the HTML of my document.

$str = file_get_contents('includes/header.php');

From here I am trying to return only the part of the string that starts with <ul class="home"> and ends with </ul>.

Try as I may to figure out an expression, I am still confused.

Once I trim down the string I can just print it on my page, but I can't figure out the trimming part.

A: 

You probably want an XML parser such as the built-in one. Here is an example you might want to take a look at: http://www.php.net/manual/en/function.xml-parse.php#90733
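As a rough illustration of the parser route, here is a minimal sketch that uses PHP's DOM extension (DOMDocument and DOMXPath) rather than the xml_parse example linked above. The helper name extract_home_list is hypothetical, the sample markup is invented, and the type hints assume PHP 7.1 or later with the DOM extension enabled. If header.php also contains PHP code, the parser will see it as raw text, so this fits best when the file is mostly plain HTML.

```php
<?php
// Hypothetical helper; the name and signature are illustrative only.
function extract_home_list(string $html): ?string
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid XML, so collect parser warnings quietly.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    // Matches class="home" exactly; an element with several classes would need contains().
    $nodes = $xpath->query('//ul[@class="home"]');
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    // saveHTML($node) serializes just that node, tags included.
    return $doc->saveHTML($nodes->item(0));
}

// Invented sample markup with two lists; only the second has class="home".
$html = '<div><ul class="nav"><li>x</li></ul><ul class="home"><li>Home</li></ul></div>';
echo extract_home_list($html), "\n";
```

Unlike a regex, this keeps working if the list contains nested <ul> elements, because the parser tracks which closing tag belongs to which opening tag.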

If you want to use regex, then something along the lines of

$str = file_get_contents('includes/header.php');
$found = preg_match("<place your pattern here>", $str, $matches);

You probably want the pattern

'/<ul class="home">.*?<\/ul>/s'

preg_match returns 1 if the pattern matched and 0 if it did not; the matched text goes into the $matches array, so you can grab it with

$matches[0];

which holds the text matched by the whole pattern. Then output that.

But I'd be a bit wary: regular expressions tend to match surprising edge cases, and you need to feed them actual data to find out reliably when they fail. However, if you are just parsing templates it should be OK; just do some tests and see if it all works. If not, I'd still recommend using the PHP XML Parser.

Hope that helps.

toofarsideways
".*" is greedy, so it will match more than a single <ul>. Using ".*?" makes it more likely to work, but as you and Peter suggest using a parser is a better approach.
jamessan
Nice catch, have corrected my example :)
toofarsideways
By default, the DOT does not match line breaks, so unless the `<ul class="home"> ... </ul>` resides on the same line, the regex `'/<ul class="home">.*?<\/ul>/'` will not match anything. You can add the s-modifier (aka DOT-ALL modifier): `'/<ul class="home">.*?<\/ul>/s'` so that the DOT will match any character (including line breaks).
Bart Kiers
Thank you, I've tested it so I'll amend my code :)
toofarsideways
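The two points raised in the comments above (greedy versus lazy matching, and the s modifier) can be seen side by side in a small sketch. The sample string is invented for illustration: the target list spans several lines and is followed by a second, unrelated list.

```php
<?php
// Invented sample input: a multi-line target list followed by another list.
$str = "<ul class=\"home\">\n<li>Home</li>\n</ul>\n<ul class=\"other\"><li>Other</li></ul>";

// No s modifier: the dot stops at the first line break, so nothing matches.
$noDotAll = preg_match('/<ul class="home">.*?<\/ul>/', $str, $m1);

// Greedy with s: the match runs on to the last </ul> in the string.
preg_match('/<ul class="home">.*<\/ul>/s', $str, $m2);

// Lazy with s: the match stops at the first </ul> after the opening tag.
preg_match('/<ul class="home">.*?<\/ul>/s', $str, $m3);

var_dump($noDotAll);        // int(0)
var_dump($m2[0] === $str);  // bool(true): the greedy match swallowed both lists
echo $m3[0], "\n";          // only the first list
```

Note that even the lazy version would stop too early if the target list contained a nested <ul>, which is one reason the parser approach is suggested.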
A: 

If you need something really hardcore, http://www.php.net/manual/en/book.xmlreader.php.

If you just want to rip out the text that fits that pattern try something like this.

$string = "stuff<ul class=\"home\">alsdkjflaskdvlsakmdf<another></another></ul>stuff";

if (preg_match('/<ul class="home">(.*?)<\/ul>/s', $string, $match)) {
    // do stuff with $match[0] (the whole block) or $match[1] (its contents only)
}

Kendall Hopkins
This worked, but I ended up using http://simplehtmldom.sourceforge.net/ to parse my file. It was much easier once it was all set up. Thanks a ton!
Ben4Himv
A: 

I'm assuming that the difficulty you're having has to do with escaping the regex special characters in the string(s) you're using as a delimiter. If so, try using the preg_quote() function:

$start = preg_quote('<ul class="home">', '/');

$end = preg_quote('</ul>', '/');

preg_match('/' . $start . '.*?' . $end . '/s', $str, $matching_html_snippets);

The HTML you want should be in $matching_html_snippets[0].
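To make the effect of preg_quote() concrete, here is a minimal sketch that wraps the same idea in a function. The name extract_between is hypothetical, the sample input is invented, and the type hints assume PHP 7.1 or later. It uses a lazy quantifier and the s modifier so that the match stops at the first closing tag and may span line breaks.

```php
<?php
// Hypothetical helper: returns the first block that starts with $startTag
// and ends with $endTag (delimiters included), or null if there is none.
function extract_between(string $str, string $startTag, string $endTag): ?string
{
    // '/' is the pattern delimiter below, so preg_quote() must escape it too.
    $pattern = '/' . preg_quote($startTag, '/') . '.*?' . preg_quote($endTag, '/') . '/s';
    return preg_match($pattern, $str, $m) === 1 ? $m[0] : null;
}

// Shows what preg_quote() does to the closing tag.
echo preg_quote('</ul>', '/'), "\n";

// Invented sample input with a second, empty list after the target.
$sample = 'a<ul class="home"><li>1</li></ul>b<ul></ul>';
echo extract_between($sample, '<ul class="home">', '</ul>'), "\n";
```

Passing '/' as the second argument matters for the closing tag: without it, the slash in </ul> would end the pattern early.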

violoncello
A: 

If you feel like not using regexes you could use string finding, which I think the PHP manual implies is quicker:

function substrstr($orig, $startText, $endText) {
 //get first occurrence of the start string
 $start = strpos($orig, $startText);
 if($start === FALSE)
  return $orig;
 //skip past the whole start string, not just its first character
 $start += strlen($startText);
 //get first occurrence of the end string after the start string
 $end = strpos($orig, $endText, $start);
 if($end === FALSE)
  return $orig;
 return substr($orig, $start, $end - $start);
}

$substr = substrstr($string, '<ul class="home">', '</ul>');

You'll need to make some adjustments if you want to include the terminating strings in the output, but that should get you started!
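One possible version of that adjustment is sketched below. The function name substr_inclusive is hypothetical and the sample string is invented; unlike the function above, this variant returns null instead of the original string when either delimiter is missing, and its type hints assume PHP 7.1 or later.

```php
<?php
// Hypothetical variant that keeps both delimiters in the result.
function substr_inclusive(string $orig, string $startText, string $endText): ?string
{
    $start = strpos($orig, $startText);
    if ($start === false) {
        return null;
    }
    // Look for the end text only after the start text.
    $end = strpos($orig, $endText, $start + strlen($startText));
    if ($end === false) {
        return null;
    }
    // Extend the length so the closing delimiter is part of the result.
    return substr($orig, $start, $end + strlen($endText) - $start);
}

// Invented sample input with a second list after the target.
$string = 'stuff<ul class="home"><li>Home</li></ul>stuff<ul></ul>';
echo substr_inclusive($string, '<ul class="home">', '</ul>'), "\n";
```

Because the result starts at the opening tag and ends after the first closing tag, it can be printed straight into the page.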

PeterJCLaw
A: 

Here's a novel way to do it; I make no guarantees about this technique's robustness or performance, other than it does work for the example given:

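$prefix = '<ul class="home">';
$suffix = '</ul>';
// array_pop() and array_shift() take their argument by reference, so store each explode() result first
$after = explode($prefix, $str);
$before = explode($suffix, array_pop($after));
$result = $prefix . array_shift($before) . $suffix;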
ken