ansaurus

Question

Regexp - search for text which doesn't contain whole word

Answer 1

+3 A:

You need a non-greedy match:

<html>.*?</p>
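
As a rough illustration of the difference, here is a minimal Python sketch. The sample markup is made up, since the question's actual input is not shown, and `re.DOTALL` is only an assumption that the real input may span several lines.

```python
import re

# Hypothetical sample input; the asker's real markup is not shown in the thread.
html = "<html><body><p>first</p><p>second</p></body></html>"

# Greedy: .* consumes as much as possible, so the match ends at the LAST </p>.
greedy = re.search(r"<html>.*</p>", html, re.DOTALL).group(0)

# Non-greedy: .*? consumes as little as possible, so the match ends at the FIRST </p>.
lazy = re.search(r"<html>.*?</p>", html, re.DOTALL).group(0)

print(greedy)  # <html><body><p>first</p><p>second</p>
print(lazy)    # <html><body><p>first</p>
```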

Also, you might want to consider using an HTML parser instead of regular expressions for this task.
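
For cases where a parser is an option, the idea could look roughly like the sketch below, which uses Python's standard-library `html.parser`. The class `ParagraphCollector` and the function `extract_paragraphs` are hypothetical names chosen for this example, not something from the thread.

```python
from html.parser import HTMLParser


class ParagraphCollector(HTMLParser):
    """Collects the text of each <p> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._buf = None  # None while we are outside any <p>

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            # A new <p> implicitly ends an unclosed previous one.
            self._flush()
            self._buf = []

    def handle_endtag(self, tag):
        if tag == "p":
            self._flush()

    def handle_data(self, data):
        # Keep text only while inside a paragraph; nested tags are ignored.
        if self._buf is not None:
            self._buf.append(data)

    def _flush(self):
        if self._buf is not None:
            self.paragraphs.append("".join(self._buf))
            self._buf = None


def extract_paragraphs(html):
    parser = ParagraphCollector()
    parser.feed(html)
    parser.close()
    parser._flush()  # pick up a trailing paragraph that was never closed
    return parser.paragraphs


print(extract_paragraphs("<p>one <b>bold</b></p><p>two"))  # ['one bold', 'two']
```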

Mark Byers 2010-02-06 19:31:38

Nice to see a regex answer to an HTML question :-) You know, you should really be using an HTML parser for this instead.

Mike Sherov 2010-02-06 19:36:52

Cool, it's working, thanks for the help (I can't use an HTML parser in this case, otherwise I would).

lennyd 2010-02-06 19:39:49

@Mike: Yeah, my reputation is ruined now! ;-)

Mark Byers 2010-02-06 19:58:27

Answer 2

+2 A:

Dominik 2010-02-06 19:34:59

Answer 3

A:

To capture the data between para tags, you can use a regexp with a positive look-ahead assertion, /<p>(.*)(?=<\/p>)/, which is greedier than .*? (the capture extends to the last </p> on the line) and runs slower, but may be helpful for you. Also make sure that your HTML is valid, which means:

All para tags are closed. Browsers implicitly close a para tag when they encounter another block element.
Para tags are not nested :) Otherwise you will have problems with any regex.
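
To see how the look-ahead behaves, here is a small Python sketch on made-up input. The pattern is the one above with the JavaScript-style delimiters and the escaped slash dropped; the non-greedy variant is added only for comparison and is not part of the original suggestion.

```python
import re

# Hypothetical input with two paragraphs on one line.
html = "<p>first</p><p>second</p>"

# Greedy capture with look-ahead: .* backtracks only as far as the LAST </p>,
# so both paragraphs end up in a single capture.
greedy = re.findall(r"<p>(.*)(?=</p>)", html)

# Same look-ahead with a non-greedy capture: one result per paragraph.
lazy = re.findall(r"<p>(.*?)(?=</p>)", html)

print(greedy)  # ['first</p><p>second']
print(lazy)    # ['first', 'second']
```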

dma_k 2010-02-06 20:41:06

Answer 4

A:

Silly question, still using pure regex: why not just strip any <..> inside paragraphs? THEN grab the phrases using something like [^<]?
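
One possible reading of this suggestion, sketched in Python: grab each paragraph body first, then strip any remaining tags from it. The function name `paragraph_texts` and the sample input are hypothetical, and the sketch assumes paragraphs are closed and not nested.

```python
import re


def paragraph_texts(html):
    """Return the tag-free text of each <p>...</p> block (illustrative only)."""
    texts = []
    # Non-greedy capture of everything between <p ...> and the next </p>.
    for inner in re.findall(r"<p\b[^>]*>(.*?)</p>", html, re.DOTALL):
        # Strip any <...> left inside the paragraph, keeping only the text.
        texts.append(re.sub(r"<[^>]*>", "", inner))
    return texts


print(paragraph_texts("<p>Hello <b>big</b> world</p><p>bye</p>"))
# ['Hello big world', 'bye']
```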

Luxvero 2010-02-07 01:18:55
