I want to find the second <BR>
tag and start the search from there. How can I do this using regular expressions?
<BR>like <BR>Abdurrahman<BR><SMALL>Fathers Name</SMALL>
The usual solution to this sort of problem is to use a "capturing group". Most regular expression systems allow you to extract not only the entire matching sequence, but also sub-matches within it. This is done by grouping a part of the expression within (
and )
. For instance, if I use the following expression (this is in JavaScript; I'm not sure what language you want to be working in, but the basic idea works in most languages):
var string = "<BR>like <BR>Abdurrahman<BR><SMALL>Fathers Name</SMALL>";
var match = string.match(/<BR>.*?<BR>([a-zA-Z]*)/);
Then I can get either everything that matched using match[0]
, which is "<BR>like <BR>Abdurrahman"
, or I can get only the part inside the parentheses using match[1]
, which gives me "Abdurrahman"
.
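One thing to watch for: in JavaScript, match returns null when the pattern does not match at all, so indexing into the result directly can throw. A minimal sketch of a guard around the same expression (the function name textAfterSecondBr is just a made-up helper for illustration):

```javascript
// Hypothetical helper: returns the letters that follow the second <BR>,
// or null if the input does not contain two <BR> tags.
function textAfterSecondBr(html) {
  // Same expression as above: skip lazily to the second <BR>,
  // then capture the run of letters that follows it.
  const match = html.match(/<BR>.*?<BR>([a-zA-Z]*)/);
  return match === null ? null : match[1];
}

const sample = "<BR>like <BR>Abdurrahman<BR><SMALL>Fathers Name</SMALL>";
console.log(textAfterSecondBr(sample));   // Abdurrahman
console.log(textAfterSecondBr("<BR>x"));  // null (only one <BR>)
```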
Assuming you are using PHP, you can split your string on <BR>
using explode:
$str='<BR>like <BR>Abdurrahman<BR><SMALL>Fathers Name</SMALL>';
$s = explode("<BR>",$str,3);
$string = end($s);
print $string;
Output:
$ php test.php
Abdurrahman<BR><SMALL>Fathers Name</SMALL>
You can then use the $string variable and do whatever you want with it.
The steps above can be done in other languages as well, using whatever string-splitting functions your programming language provides.
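As a rough illustration of the same idea in JavaScript: split's limit argument discards the remainder instead of keeping it in the last element the way explode does, so this sketch uses indexOf twice instead. The function name afterSecond is made up for the example.

```javascript
// Hypothetical helper: returns everything after the second occurrence
// of `tag` in `str`, or null if `tag` occurs fewer than two times.
function afterSecond(str, tag) {
  const first = str.indexOf(tag);
  if (first === -1) return null;
  // Resume the search just past the first occurrence.
  const second = str.indexOf(tag, first + tag.length);
  if (second === -1) return null;
  return str.slice(second + tag.length);
}

const str = "<BR>like <BR>Abdurrahman<BR><SMALL>Fathers Name</SMALL>";
console.log(afterSecond(str, "<BR>"));
// Abdurrahman<BR><SMALL>Fathers Name</SMALL>
```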
Prepend <BR>[^<]*(?=<BR>)
to your regex, or remove the lookahead part if you want to start after the second <BR>
, such as: <BR>[^<]*<BR>
.
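To make the difference between the two prefixes concrete, here is a small JavaScript sketch. The "rest of your regex" part is invented for the example: with the lookahead it is (<BR>[a-zA-Z]+), since the second tag is still unconsumed, and without it just ([a-zA-Z]+).

```javascript
const input = "<BR>like <BR>Abdurrahman<BR><SMALL>Fathers Name</SMALL>";

// With the lookahead, the second <BR> is checked but not consumed,
// so the rest of the pattern starts AT the second <BR>.
const atSecond = new RegExp("<BR>[^<]*(?=<BR>)" + "(<BR>[a-zA-Z]+)");

// Without the lookahead, the second <BR> is consumed,
// so the rest of the pattern starts AFTER it.
const afterSecondTag = new RegExp("<BR>[^<]*<BR>" + "([a-zA-Z]+)");

const m1 = input.match(atSecond);
const m2 = input.match(afterSecondTag);
console.log(m1 && m1[1]); // <BR>Abdurrahman
console.log(m2 && m2[1]); // Abdurrahman
```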
Find text after the second <BR>
but before the third: <BR>[^<]*<BR>([^<]*)<BR>
This finds "waldo" in <BR>404<BR>waldo<BR>
.
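A quick JavaScript check of that pattern, as a sketch (the function name betweenSecondAndThirdBr is hypothetical):

```javascript
// Hypothetical helper: returns the text between the second and third <BR>,
// or null when the input does not have that shape.
function betweenSecondAndThirdBr(html) {
  const match = html.match(/<BR>[^<]*<BR>([^<]*)<BR>/);
  return match === null ? null : match[1];
}

console.log(betweenSecondAndThirdBr("<BR>404<BR>waldo<BR>")); // waldo
console.log(betweenSecondAndThirdBr(
  "<BR>like <BR>Abdurrahman<BR><SMALL>Fathers Name</SMALL>"
)); // Abdurrahman
```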
Note: I specifically used the above instead of the non-greedy .*?
because once the above starts not working for you, you should stop parsing HTML with regex, and .*?
will hide when that happens. The non-greedy quantifier is also less widely supported, though you can always switch to it if you want.
This regular expression should match the first two <br />
tags, as long as nothing but whitespace separates them:
/(\s*<br\s*\/?>\s*){2}/i
so you should either replace them with nothing or use preg_match
or String.prototype.match
to extract the parts you need.
In JavaScript:
var afterReplace = str.replace( /(\s*<br\s*\/?>\s*){2}/i, '' );
In PHP:
$afterReplace = preg_replace( '/(\s*<br\s*\/?>\s*){2}/i', '', $str );
I'm only sure it works in PHP and JavaScript, but it should work in most other regex flavors too.
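A small JavaScript sketch to try the replacement out (the function name stripLeadingBrPair and the sample strings are invented for the example). Note that the pattern only matches when the two tags sit next to each other, whitespace aside, so a string with text between the first two tags is left unchanged:

```javascript
// Hypothetical helper: removes the first pair of adjacent <br> tags
// (any case, optional self-closing slash, surrounding whitespace).
function stripLeadingBrPair(str) {
  return str.replace(/(\s*<br\s*\/?>\s*){2}/i, "");
}

console.log(stripLeadingBrPair("<br /> <BR>rest")); // rest

// Text between the first two tags prevents a match, so nothing is removed.
const sample = "<BR>like <BR>Abdurrahman<BR><SMALL>Fathers Name</SMALL>";
console.log(stripLeadingBrPair(sample) === sample); // true
```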