ansaurus

Question

Regular Expression: Start from second one

Answer 1

A:

The usual solution to this sort of problem is to use a "capturing group". Most regular expression systems allow you to extract not only the entire matching sequence, but also sub-matches within it. This is done by grouping a part of the expression within ( and ). For instance, if I use the following expression (this is in JavaScript; I'm not sure what language you want to be working in, but the basic idea works in most languages):

var string = "<BR>like <BR>Abdurrahman<BR><SMALL>Fathers Name</SMALL>";
var match = string.match(/<BR>.*?<BR>([a-zA-Z]*)/);

Then I can get either everything that matched using match[0], which is "<BR>like <BR>Abdurrahman", or I can get only the part inside the parentheses using match[1], which gives me "Abdurrahman".
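
A small sketch of the same idea with a guard for the no-match case, since match returns null when the pattern is not found. The helper name textAfterSecondBr is made up for illustration and is not part of the original answer.

```javascript
// Hypothetical helper: returns the letters that follow the second <BR>,
// or null when the input does not contain two <BR> tags.
function textAfterSecondBr(input) {
  // Group 1 captures the run of ASCII letters right after the second <BR>.
  const match = input.match(/<BR>.*?<BR>([a-zA-Z]*)/);
  return match === null ? null : match[1];
}

const sample = "<BR>like <BR>Abdurrahman<BR><SMALL>Fathers Name</SMALL>";
console.log(textAfterSecondBr(sample));    // Abdurrahman
console.log(textAfterSecondBr("no tags")); // null
```

Without the null check, reading match[1] on an input that lacks the tags would throw a TypeError.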

Brian Campbell 2010-01-08 07:36:13

Are you sure this is working properly?

uzay95 2010-01-08 07:45:11

I'm not sure exactly what you are looking for. You might want to clarify your question. This shows you how to find two `<BR>` tags, followed by whatever else you put in the parentheses. For instance, if you are looking for "Father", the search would be `<BR>.*?<BR>.*(Father)`, and the first substring match would refer to where it found `Father`. http://rubular.com/regexes/12836

Brian Campbell 2010-01-08 08:06:30

Answer 2

A:

Assuming you are using PHP, you can split your string on <BR> using explode:

$str='<BR>like <BR>Abdurrahman<BR><SMALL>Fathers Name</SMALL>';
$s = explode("<BR>",$str,3);
$string = end($s);
print $string;

output

$  php test.php
Abdurrahman<BR><SMALL>Fathers Name</SMALL>

You can then use the $string variable and do whatever you want with it.

The steps above can be done in other languages as well, using whatever string-splitting methods your programming language has.
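
As a rough sketch of the same splitting approach in JavaScript: the limit argument of split discards the remainder instead of keeping it in the last element the way explode does, so this version splits fully, drops the first two pieces, and joins the rest back together. The function name afterSecondSeparator is an assumption for illustration.

```javascript
// Returns everything after the second occurrence of `separator`.
// If there are fewer than two separators, the result is an empty string.
function afterSecondSeparator(input, separator) {
  const parts = input.split(separator);
  // parts[0] is the text before the first separator and parts[1] the text
  // between the first and second; everything from parts[2] on is kept.
  return parts.slice(2).join(separator);
}

const str = "<BR>like <BR>Abdurrahman<BR><SMALL>Fathers Name</SMALL>";
console.log(afterSecondSeparator(str, "<BR>"));
// Abdurrahman<BR><SMALL>Fathers Name</SMALL>
```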

ghostdog74 2010-01-08 08:00:46

Answer 3

+1 A:

Prepend <BR>[^<]*(?=<BR>) to your regex, or remove the lookahead part if you want to start after the second <BR>, such as: <BR>[^<]*<BR>.

Find text after the second <BR> but before the third: <BR>[^<]*<BR>([^<]*)<BR>

This finds "waldo" in <BR>404<BR>waldo<BR>.

Note: I specifically used the above instead of the non-greedy .*? because once the above starts not working for you, you should stop parsing HTML with regex, and .*? will hide when that happens. However, the non-greedy quantifier is also not as well-supported, and you can always change to that if you want.
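
A quick sketch of the "after the second tag, before the third" pattern in JavaScript, assuming the tags that the page stripped from the patterns above were <BR>. The variable names are illustrative only.

```javascript
// Assumption: the tags missing from the answer's patterns were <BR>.
// Group 1 holds the text between the second and third <BR>.
const secondSegment = /<BR>[^<]*<BR>([^<]*)<BR>/;

const waldoMatch = "<BR>404<BR>waldo<BR>".match(secondSegment);
console.log(waldoMatch[1]); // waldo

const questionMatch =
  "<BR>like <BR>Abdurrahman<BR><SMALL>Fathers Name</SMALL>".match(secondSegment);
console.log(questionMatch[1]); // Abdurrahman
```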

Roger Pate 2010-01-08 08:13:13

Note that `<BR>[^<]*<BR>` is not the same as `<BR>.*?<BR>`.

Gumbo 2010-01-08 08:17:25

Very good answer, thank you, but I want to ask one more question. `>[^<]*` produces the result '>like', but I want to remove the '>' from the result so that I get just 'like'. How can I do this?

uzay95 2010-01-08 08:18:15

@Gumbo, but they give the same result.

uzay95 2010-01-08 08:19:16

uzay95: I don't understand what you mean.

Roger Pate 2010-01-08 08:19:48

uzay95: No, they are different, and I believe you should use what I answered, for the stated reason.

Roger Pate 2010-01-08 08:20:21

@Roger Pate, first, I've edited my first comment to express myself better: I want to get the word "like". And could you please tell me why they are different?

uzay95 2010-01-08 08:31:58

uzay95: I still don't understand what you mean. Could you give example input, actual behavior, and desired behavior? --- They are different when you try to parse HTML, such as this input: ` abcdef ghijkl `.
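
One way to see the difference, as a sketch: the sample input in the comment above lost its tags on this page, so the nested string below is a made-up stand-in with another tag between the two <BR>s, not the original example.

```javascript
// Hypothetical input with a nested tag between the two <BR> tags.
const nested = "<BR>abc<B>def</B>ghi<BR>";
// Plain input with nothing but text between the tags.
const simple = "<BR>like <BR>Abdurrahman<BR>";

const strict = /<BR>[^<]*<BR>/; // refuses to cross any "<"
const lazy = /<BR>.*?<BR>/;     // crosses other tags silently

console.log(strict.exec(simple)[0]); // <BR>like <BR>
console.log(lazy.exec(simple)[0]);   // <BR>like <BR>

console.log(strict.test(nested));    // false
console.log(lazy.exec(nested)[0]);   // <BR>abc<B>def</B>ghi<BR>
```

On plain input both patterns agree, which is why they can look identical; the strict form simply stops matching once other markup appears between the tags.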

Roger Pate 2010-01-08 08:35:47

Look, this is my target string: `<BR>like <BR>Abdurrahman<BR><SMALL>Fathers Name</SMALL>`, and when I write `>[^<]*` the result is '>like'. As you can see, it includes an undesired character, '>'. I don't want that. All I am asking is, where am I making a mistake? How can I get my code to match just the word "like" and nothing else?

uzay95 2010-01-08 08:42:09

To get "like " and nothing else from `<BR>like <BR>Abdurrahman<BR><SMALL>Fathers Name</SMALL>`, use: `<BR>([^<]*)<BR>`.

Roger Pate 2010-01-08 08:46:20

To get "Abdurrahman" from `<BR>like <BR>Abdurrahman<BR><SMALL>Fathers Name</SMALL>`, use: `<BR>[^<]*<BR>([^<]*)<BR>`.

Roger Pate 2010-01-08 08:46:58

Roger, I really thank you for your patient comments. I've just tried the code you suggested and it seems to return/highlight/include the and codes as well. So, I was trying to get rid of ">" character but now I have even more to get rid of. So unfortunately it didn't do what I wanted it to do. I apologize for repeating it again and again but isn't there a way to just highlight the word "like" ?

uzay95 2010-01-08 09:00:34

There are multiple levels of matching: your program is showing the complete matched text, while you're interested in the first group here (the part between the parentheses). Get your program to show you the difference between those.

Roger Pate 2010-01-08 09:32:05

Answer 4

A:

This regular expression should match the first two <br>s:

/(\s*<br\s*\/?>\s*){2}/i

So you should either replace them with nothing, or use preg_match or String.prototype.match to extract the part you want.

In JavaScript:

var afterReplace = str.replace( /(\s*<br\s*\/?>\s*){2}/i, '' );

In PHP:

$afterReplace = preg_replace( '/(\s*<br\s*\/?>\s*){2}/i', '', $str );

I'm only sure it'll work in PHP / JavaScript, but it should work in everything...
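
A short sketch of what the replacement does and does not match. The sample strings are assumptions chosen for illustration: the first has only whitespace between the two tags, the second has text between them.

```javascript
// Two <br> tags (any case, optional slash) separated only by whitespace.
const leadingBreaks = /(\s*<br\s*\/?>\s*){2}/i;

// Only whitespace between the tags: both tags are removed.
console.log("<br> <BR/>Abdurrahman".replace(leadingBreaks, ""));
// Abdurrahman

// Text between the tags: the pattern does not match, so nothing changes.
const withText = "<BR>like <BR>Abdurrahman";
console.log(withText.replace(leadingBreaks, "") === withText); // true
```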

Dan Beam 2010-01-08 08:17:58

Would you please tell me the meaning of this regex, '/(\s*<br\s*/?>\s*){2}/i'? I just want to learn.

uzay95 2010-01-08 08:21:02

Dan: That won't match given input text of `<br>anything here<br>`, because you don't allow for anything but `\s` between the tags.

Roger Pate 2010-01-08 08:27:11

To explain /(\s*<br\s*/?>\s*){2}/i:

/ # start regex
( # start group
\s # whitespace
* # any number of previous (inc. zero)
<br\s*/?> # literal <br, optional whitespace, optional slash, then >
\s # whitespace
* # zero or more of the previous
) # end group
{2} # 2 of the group
/ # end regex
i # match non-case sensitively

ternaryOperator 2010-01-08 14:55:52
