ansaurus

Question

Need a regex to add css class to first and last list item

Answer 1

+3 A:

Jamie Zawinski would have something to say about this...

Do you have a proper HTML parser? I don't know if there's anything like hpricot available for PHP, but that's the right way to deal with it. You could at least employ hpricot to do the first cleanup for you.
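For PHP, one parser-based option is the bundled DOM extension. The following is only a minimal sketch under a few assumptions: the function names `addFirstLastClasses()` and `appendClass()` and the `wrapper` id are made up for illustration, and the input is assumed to be a small HTML fragment in a single-byte-safe encoding (`loadHTML()` does not assume UTF-8 on its own).

```php
<?php
// Hypothetical helper: append a class name without losing existing ones.
function appendClass(DOMElement $el, string $class): void
{
    $existing = trim($el->getAttribute('class'));
    $el->setAttribute('class', $existing === '' ? $class : $existing . ' ' . $class);
}

// Hypothetical helper: mark the first and last <li> of every list in a fragment.
function addFirstLastClasses(string $html): string
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    // Wrap the fragment so its nodes can be found again after parsing.
    $doc->loadHTML('<div id="wrapper">' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    foreach ($xpath->query('//ul | //ol') as $list) {
        // Direct <li> children only, so nested lists are treated separately.
        $items = $xpath->query('./li', $list);
        if ($items->length === 0) {
            continue;
        }
        appendClass($items->item(0), 'first');
        appendClass($items->item($items->length - 1), 'last');
    }

    // Serialize only what was inside the wrapper, not the added <html>/<body>.
    $wrapper = $xpath->query('//div[@id="wrapper"]')->item(0);
    $out = '';
    foreach ($wrapper->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo addFirstLastClasses("<ul>\n<li>Coffee</li>\n<li>Tea</li>\n<li>Water</li>\n</ul>");
```

A list with a single item ends up with `class="first last"` on that item rather than two separate attributes.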

If you're actually generating the HTML -- do it there. It looks like you want to generate some navigation and have a .first and .last kind of thing on it. Take a step back and try that.
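If the navigation is built in PHP, adding the classes at generation time could look roughly like this. It is a sketch only: `renderNav()` and the plain array of labels are assumptions, since the original post does not show how the list is produced.

```php
<?php
// Hypothetical generator: emits the classes while building the list,
// so no post-processing of the markup is needed.
function renderNav(array $labels): string
{
    $labels = array_values($labels);   // make sure keys run 0..n-1
    $last = count($labels) - 1;
    $html = "<ul>\n";
    foreach ($labels as $i => $label) {
        $classes = [];
        if ($i === 0) {
            $classes[] = 'first';
        }
        if ($i === $last) {
            $classes[] = 'last';
        }
        $attr = $classes ? ' class="' . implode(' ', $classes) . '"' : '';
        $html .= '<li' . $attr . '>' . htmlspecialchars($label) . "</li>\n";
    }
    return $html . '</ul>';
}

echo renderNav(['Coffee', 'Tea', 'Milk']);
```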

Dustin 2008-11-30 08:17:59

Thanks for the input. Yes, I do have an HTML parser I can use. Having read the "never parse HTML with regex" stance in various places, I rather expected to be pointed in this direction. I'm just glad everyone's suggestions were made tastefully rather than flaming me ;-) More info in the original post.

greaterweb 2008-12-01 02:12:05

Answer 2

+1 A:

You wrote:

$patterns = array('/<ul+([^<]*)<li/m','/<([^<]*)(?<=<li)(.*)<\/ul>/s');

First pattern:
ul+ => this matches u followed by one or more l (ul, ull, ulll...), which is not what you meant.
The m modifier is useless here, since you use neither ^ nor $.

Second pattern:
Using .* along with s is "dangerous", because the match can run over the whole document up to the last /ul of the page...
I would just drop the s modifier and use: (<li\s)(.*?</li>\s*</ul>) with the replacement: '$1class="last" $2'

In view of the above remarks, I would write the first expression as: <ul.*?>\s*<li
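The warning about .* combined with the s modifier can be checked with a small test. This is a sketch on made-up input with two lists on one page; the patterns are simplified for the demonstration and are not the ones from the question.

```php
<?php
// Made-up page fragment containing two separate lists.
$html = "<ul>\n<li>A</li>\n</ul>\n<p>text</p>\n<ul>\n<li>B</li>\n</ul>";

// Greedy .* with s: runs from the first <li to the LAST </ul> in the input.
preg_match('/<li.*<\/ul>/s', $html, $greedy);

// Lazy .*? with s: stops at the first </ul> it can reach.
preg_match('/<li.*?<\/ul>/s', $html, $lazy);

echo strlen($greedy[0]), " characters matched greedily\n";
echo strlen($lazy[0]), " characters matched lazily\n";
```

The greedy match swallows the paragraph and the second list as well, while the lazy one ends with the first list.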

Although I am tired of seeing the Jamie Zawinski quote each time there is a regex question, Dustin is right to point you to an HTML parser (or to just generating the right HTML from the start!). Regexes and HTML don't mix well, because HTML syntax is complex: unless you are working on well-known, machine-generated output with a very predictable shape, something is likely to break in some cases.

PhiLho 2008-11-30 09:16:08

Answer 3

+2 A:

+1 to generating the right html as the best option.

But here is a completely different approach, which may or may not be acceptable to you: you could use JavaScript.

This uses jQuery to make it easy ...

$(document).ready(
    function() {
        // select the first and last <li> children of the list
        $('#id-of-ul > li:first-child').addClass('first');
        $('#id-of-ul > li:last-child').addClass('last');
    }
);

As I say, it may or may not be of any use in this case, but I think it's a valid solution to the problem in some cases.

PS: You say ordered list, then give ul in your example. ol = ordered list, ul = unordered list

benlumley 2008-11-30 09:29:20

Answer 4

A:

You could load the navigation into a SimpleXML object and work with that. This prevents you from breaking your markup with some crazy regex :)
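A rough sketch of the SimpleXML route, assuming the navigation is well-formed XML with a single <ul> root and that the list items carry no class attribute yet. The helper name `markFirstLast()` is invented for this example.

```php
<?php
// Hypothetical helper built on SimpleXML; not part of the extension itself.
function markFirstLast(string $xml): string
{
    $ul = simplexml_load_string($xml);
    if ($ul === false) {
        throw new InvalidArgumentException('Navigation is not well-formed XML');
    }

    $count = count($ul->li);
    if ($count === 1) {
        // A single item is both first and last; one attribute, two classes.
        $ul->li[0]->addAttribute('class', 'first last');
    } elseif ($count > 1) {
        $ul->li[0]->addAttribute('class', 'first');
        $ul->li[$count - 1]->addAttribute('class', 'last');
    }

    // asXML() would prepend an XML declaration, so serialize the
    // element through DOM to get only the <ul> markup back.
    $node = dom_import_simplexml($ul);
    return $node->ownerDocument->saveXML($node);
}

echo markFirstLast("<ul>\n<li>Coffee</li>\n<li>Tea</li>\n<li>Milk</li>\n</ul>");
```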

Endlessdeath 2008-11-30 10:51:44

Answer 5

A:

I don't know if anyone cares any longer, but I have a solution that works in my simple test case (and I believe it should work in the general case).

First, let me point out two things. PhiLho is right that the s modifier is "dangerous", since the dot can then match everything up to the end of the document, but this may very well be what you want. It only becomes a problem with pages that are not well formed. Be careful with any such regex on large, hand-written pages.

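Second, PHP gives backslashes a special meaning even in single-quoted strings, where \\ produces one backslash and \' a quote. Most regexes work either way, because an unrecognized sequence such as \s is left untouched, but you should always double-escape them, just in case.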
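A quick way to see the backslash behaviour in single-quoted strings (a small sketch; the variable names are arbitrary):

```php
<?php
// In a single-quoted PHP string only \\ and \' are escape sequences;
// any other backslash is kept literally.
$same = ('\s' === '\\s');   // both spell the two characters \ and s
$len  = strlen('\\s');      // two characters, not three

// Where it matters: a literal backslash in the subject needs \\ in the
// regex, which takes four backslashes in single-quoted PHP source.
$found = preg_match('/\\\\/', 'a\b');

var_dump($same, $len, $found);
```

This prints bool(true), int(2) and int(1). Writing the last pattern with only two backslashes would escape the closing delimiter instead and make the pattern invalid.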

Now, here's my code:

<?php
$navigation='<ul>
<li>Coffee</li>
<li>Tea</li>
<li>Milk</li>
<li>Beer</li>
<li>Water</li>
</ul>';

$patterns = array('/<ul.*?>\\s*<li/',
                  '/<li((.(?<!<li))*?<\\/ul>)/s');
$replace = array('$0 class="first"',
                 '<li class="last"$1');
$navigation = preg_replace($patterns, $replace, $navigation);
echo $navigation;
?>

This will output

<ul>
<li class="first">Coffee</li>
<li>Tea</li>
<li>Milk</li>
<li>Beer</li>
<li class="last">Water</li>
</ul>

This assumes no line feeds inside the opening <ul...> tag. If there are any, use the s modifier on the first expression too.

The magic happens in (.(?<!<li))*?. This will match any character (the dot) as long as it does not complete the string <li (the negative lookbehind), repeated any number of times (the *) in a non-greedy fashion (the ?). The match therefore cannot run past another <li on its way to </ul>.

Of course, the whole thing would have to be expanded if there is a chance the list items already have a class attribute set. Also, if there is only one list item, both patterns will match it, giving it two class attributes. At least for XHTML, this would break validation.
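The single-item case is easy to reproduce. The sketch below reuses the same patterns and replacements, wrapped in a function whose name, `addClasses()`, is made up for the test:

```php
<?php
// Same patterns and replacements as in the code above.
function addClasses(string $navigation): string
{
    $patterns = array('/<ul.*?>\\s*<li/',
                      '/<li((.(?<!<li))*?<\\/ul>)/s');
    $replace = array('$0 class="first"',
                     '<li class="last"$1');
    return preg_replace($patterns, $replace, $navigation);
}

// Only one item: the second pattern matches the <li that the first
// pattern has already modified.
echo addClasses("<ul>\n<li>Water</li>\n</ul>");
// The item comes out as <li class="last" class="first">Water</li>
```

One possible workaround, not part of the answer above, is to count the list items first and emit a single class="first last" attribute in that case.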

Pianosaurus 2008-12-10 03:18:14

Answer 6

A:

Check out http://www.regexlib.com. It has most things you've thought of and thousands of things you haven't.

Frustrating Developments 2008-12-10 03:27:42
