ansaurus

Question

Regex conditional

Answer 1

A:

^[^<>]*>

If you need the corresponding < as well,

^[^<>]*>[^<]*<

If there is a possibility of tags before the first >,

^[^<>]*(?:<[^<>]+>[^<>]*)*>

Note that the last pattern can give false positives, e.g.

<!-- > -->

is valid HTML, but the regex will flag it.

KennyTM 2010-08-17 14:48:21

It seems as if this won't catch a line with: <tag1> badtag2>

Mark Wilkins 2010-08-17 15:22:12
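To make the behaviour of the three patterns in this answer concrete, here is a minimal sketch in Python (chosen only for illustration; the thread itself does not name a language for this answer). The dictionary keys, the helper `flagged_by`, and the sample lines are made up for the example; only the regexes come from the answer.

```python
import re

# The three patterns from the answer, in the order they were given.
# The dictionary keys are illustrative names, not from the original post.
PATTERNS = {
    "stray_gt": re.compile(r"^[^<>]*>"),
    "stray_gt_then_lt": re.compile(r"^[^<>]*>[^<]*<"),
    "tags_then_stray_gt": re.compile(r"^[^<>]*(?:<[^<>]+>[^<>]*)*>"),
}


def flagged_by(line):
    """Return the names of the patterns that match the given line."""
    return [name for name, rx in PATTERNS.items() if rx.search(line)]


samples = [
    "oops > <b>x</b>",   # stray > before any tag
    "<tag1> badtag2>",   # stray > after a well-formed tag
    "<p>fine</p>",       # no stray >
    "<!-- > -->",        # valid comment containing >
]

for line in samples:
    print(repr(line), "->", flagged_by(line))
```

With these samples, only the third pattern catches the `<tag1> badtag2>` line from the comment above, and it is also the one that flags the valid `<!-- > -->` comment.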

Answer 2

+1 A:

Would this work?

string =~ /^[^<]*>/

This should anchor at the beginning of the line, skip every character that isn't an opening '<', and then match if it finds a closing '>'.

spig 2010-08-17 14:51:29

What happens if the > closes a tag that was opened on the line above?

Alexander Kjäll 2010-08-17 14:54:12

I think that's a problem with the question. This will do what he asked it to do. Taking the previous lines into account opens up the can of worms of using a regular expression to check a non-regular language.

spig 2010-08-17 15:06:25

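In Perl, Ruby, and other languages, check what the "m" modifier actually does before relying on it: in Perl it makes ^ match at the start of every line, and in Ruby (where ^ always matches at line starts) it makes . match newlines. To treat the entire string as one line regardless of line breaks, anchor with \A instead; [^<]* already matches across newlines. I re-read his question and he doesn't necessarily specify that it would be all on one line. `string =~ /\A[^<]*>/`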

spig 2010-08-17 15:32:42
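The effect of line anchoring on this answer's pattern can be sketched in Python (used here only for illustration; the thread uses Perl/Ruby syntax). In Python, `^` without flags matches only at the start of the string, and `re.MULTILINE` makes it match at the start of every line, which mirrors Perl's behaviour rather than Ruby's. The helper `has_stray_gt`, its `per_line` parameter, and the sample strings are assumptions made up for the example.

```python
import re

# Pattern from the answer: anything that is not '<', followed by '>'.
PATTERN = r"^[^<]*>"


def has_stray_gt(text, per_line=False):
    """Check text for a '>' that appears before any '<'.

    per_line=False anchors only at the start of the whole string;
    per_line=True (re.MULTILINE) re-anchors at the start of every line.
    """
    flags = re.MULTILINE if per_line else 0
    return re.search(PATTERN, text, flags) is not None


# A valid tag whose closing '>' sits on the next line.
wrapped_tag = '<a href="x"\n>link</a>'

# A stray '>' on the second line, with no '<' before it.
broken = "some text\nmore text > oops"

print(has_stray_gt(wrapped_tag))                 # whole-string check
print(has_stray_gt(wrapped_tag, per_line=True))  # per-line check
print(has_stray_gt(broken))
print(has_stray_gt(broken, per_line=True))
```

Checked as one string, the wrapped tag is not flagged and the broken text is, because the negated class `[^<]` also matches newlines. Checked per line, the wrapped tag becomes a false positive, which is the case raised in the comment about a `>` closing a tag from the line above.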

Answer 3

+1 A:

It's a pretty bad idea to try to parse HTML with a regex, or even to try to detect broken HTML with one.

What happens, for example, when a line break puts the > character first on a line (which is valid HTML)?

You might get some mileage from reading the answers to this question also: http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags

Alexander Kjäll 2010-08-17 14:53:14
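As one possible alternative to a regex, a tolerant HTML parser can report `>` characters that end up in text rather than in markup. The sketch below uses Python's standard `html.parser` module; the class `StrayGtFinder` and the function `find_stray_gt` are hypothetical names for this example, and nothing in the thread prescribes this approach. Note that the contents of `script` and `style` elements are also delivered as text, so a `>` there would be reported too.

```python
from html.parser import HTMLParser


class StrayGtFinder(HTMLParser):
    """Collect text chunks that contain a literal '>' character."""

    def __init__(self):
        # convert_charrefs=False keeps entities such as &gt; out of
        # handle_data, so only literal '>' characters are reported.
        super().__init__(convert_charrefs=False)
        self.stray = []

    def handle_data(self, data):
        # Text between tags lands here; tags and comments do not.
        if ">" in data:
            self.stray.append(data)


def find_stray_gt(html):
    """Return the text chunks of html that contain a literal '>'."""
    finder = StrayGtFinder()
    finder.feed(html)
    finder.close()  # flush any buffered trailing text
    return finder.stray


print(find_stray_gt("<tag1> badtag2>"))          # stray > after a tag
print(find_stray_gt("<!-- > -->"))               # > inside a comment
print(find_stray_gt('<a href="x"\n>text</a>'))   # > on its own line
print(find_stray_gt("a &gt; b"))                 # escaped >
```

Because the parser tracks tags and comments across line breaks, the comment and the wrapped tag are not reported, while the stray `>` after `<tag1>` is.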
