ansaurus

Question

Answer 1

A:

Use:

^<tr><td>\d</td><td>(.*?)</td>

(insert obligatory comment about not using regex to parse xml)

Senseful 2010-09-01 18:25:13

Answer 2

+2 A:

You have to make the .* lazy instead of greedy. Read more about lazy vs greedy here.
Your end of string anchors ($) also don't make sense. Try:

<tr><td>\d<\/td><td>(.*?)<\/td>

(As seen on rubular.)
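To see the greedy/lazy difference concretely, here is a small Python sketch. The sample row is hypothetical (the question's actual input isn't shown here), and Python's `re` module doesn't need the `\/` escapes.

```python
import re

# Hypothetical sample row; the original input from the question is not shown.
row = "<tr><td>1</td><td>apple</td><td>red</td></tr>"

# Greedy: .* runs to the end of the line, then backtracks to the LAST </td>.
greedy = re.search(r"<tr><td>\d</td><td>(.*)</td>", row)

# Lazy: .*? stops at the FIRST </td> that lets the match succeed.
lazy = re.search(r"<tr><td>\d</td><td>(.*?)</td>", row)

greedy_cell = greedy.group(1)
lazy_cell = lazy.group(1)

print(greedy_cell)  # apple</td><td>red
print(lazy_cell)    # apple
```

The greedy version swallows the second cell's closing tag and everything up to the last one, which is probably the behaviour that prompted the question.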

NOTE: I don't advocate using regex to parse HTML. But sometimes the task at hand is simple enough to be handled by regex, for which a full-blown XML parser is overkill (for example: this question). Knowing how to pick the "right tool for the job" is an important skill in programming.

NullUserException 2010-09-01 18:26:03

Explain the downvote.

NullUserException 2010-09-01 18:31:12

I'm just going to say it wasn't me (even though I did downvote another post for saying HTML isn't regular and should not be parsed with regex). You're actually answering the question. (EDIT: +1 for you)

Platinum Azure 2010-09-01 18:34:01

+1 Good answer and thanks for catching my mistake.

Senseful 2010-09-01 20:12:15

Answer 3

A:

Your leading $ should be a ^.

If you don't want to match all the way to the end of the string, don't use a $ at the end. However, since * is greedy, it'll grab as much as it can. Some regex implementations have a non-greedy version (*?) which would work, but you probably just want to change (.*) to ([^<]*).
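A rough Python sketch of the ([^<]*) approach, using made-up rows since the question's input isn't shown. One thing to be aware of: the negated class refuses to cross any <, so a cell containing nested markup won't match at all.

```python
import re

# Hypothetical sample rows; not taken from the original question.
plain_row = "<tr><td>1</td><td>apple</td><td>red</td></tr>"
nested_row = "<tr><td>2</td><td><b>pear</b></td></tr>"

# ^ anchors at the start of the string; [^<]* can never run past a tag,
# so no lazy quantifier is needed.
pattern = re.compile(r"^<tr><td>\d</td><td>([^<]*)</td>")

plain_match = pattern.search(plain_row)
nested_match = pattern.search(nested_row)

plain_cell = plain_match.group(1)
print(plain_cell)    # apple
print(nested_match)  # None, because the cell content starts with <b>
```

If the cells might contain tags of their own, the lazy (.*?) form will still capture them; the negated class is the stricter option.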

dash-tom-bang 2010-09-01 18:26:59

Indeed, I'm curious what was wrong enough about this answer to demand a downvote. Alas.

dash-tom-bang 2010-09-02 00:26:40


How to get this regex working?
