I already have some regex logic that looks for a div tag with class=something. However, this div might occur more than once (one after another). You can't simply put square brackets around that complex regex and add a quantifier (e.g. [:some complicated regex logic already existing:]*), so how do you repeat it in regex? I want to avoid having to use the programming language to append that regex to itself if I can...

Thanks

+1  A: 

Don't parse HTML with regexen! Seriously, it's literally impossible in the general case.

To answer your regex question: if you have some arbitrarily complex regex R, you can do the following things with it:

  • (R) matches R and stores it in a capturing group.
  • (?:R), if supported by your regex engine, matches R without storing it in a capturing group.

In other words, parentheses group, so a quantifier placed after the closing parenthesis applies to the whole of R: (?:R)* matches zero or more consecutive copies of R, and (?:R)+ matches one or more. Square brackets, on the other hand, are for character classes only. You probably want something like (?:<div class="something">\s*)+ (with a better regex for your div): match the div followed by any amount of whitespace, and find that one or more times. But please reconsider using regexen for this. While they're a handy tool for many things, HTML is not one of them.
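As a rough illustration of the grouping idea, here is a minimal Python sketch. The pattern R and the sample HTML are hypothetical stand-ins for the asker's real regex and input, and the div pattern is deliberately simplistic (it would not cope with nested divs or extra attributes).

```python
import re

# Hypothetical stand-in for the existing "complicated" regex R.
# Assumption: the div has exactly this opening tag and no nested divs.
R = r'<div class="something">.*?</div>'

# Wrap R (plus optional trailing whitespace) in a non-capturing group
# and apply + to the whole group: one or more consecutive copies.
repeated = re.compile(r'(?:' + R + r'\s*)+')

# Made-up sample input with two adjacent matching divs.
html = (
    '<p>intro</p>'
    '<div class="something">a</div>\n'
    '<div class="something">b</div>'
    '<p>end</p>'
)

match = repeated.search(html)
run = match.group(0)               # the whole run of adjacent divs
count = len(re.findall(R, run))    # how many copies of R the run contains

print(run)
print(count)
```

The concatenation here only wraps R once in (?:...)+; the repetition itself is done by the regex engine, not by appending R to itself in the host language.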

Antal S-Z