I've been investigating an issue that seems to get worse the deeper I dig.
I started, innocently enough, by trying to use this expression to split a string on HTML 'br' tags:
T = captions.innerHTML.split(/<br.*?>/g);
This works in every browser I tried (FF, Safari, Chrome) except IE7 and IE8, with example input text like this:
is invariably subjective. <br />
The less frequently used warnings (Probably/Possibly) <br />
Please note that each tag in the example text has a space before the '/' and is followed by a newline.
Both of the following will match all HTML tags in every browser:
T = captions.innerHTML.split(/<.*?>/g);
T = captions.innerHTML.split(/<.+?>/g);
However, surprisingly (to me at least), this does not work in FF and Chrome:
T = captions.innerHTML.split(/<br.+?>/g);
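A quick way to check whether the regex itself or the text returned by innerHTML is responsible is to run the same pattern against literal strings. This is a minimal sketch, assuming (not confirmed here) that a browser may re-serialize '<br />' as '<br>' when innerHTML is read; both sample strings and the countMatches helper are hypothetical:

```javascript
// Hypothetical sample strings. What innerHTML actually returns may differ per browser.
var asWritten = "is invariably subjective. <br />\n" +
                "The less frequently used warnings (Probably/Possibly) <br />\n";
var reserialized = "is invariably subjective. <br>\n" +
                   "The less frequently used warnings (Probably/Possibly) <br>\n";

// Hypothetical helper: number of matches of a global pattern in text.
function countMatches(text, pattern) {
  var found = text.match(pattern);
  return found ? found.length : 0;
}

var lazyPlusOnSource = countMatches(asWritten, /<br.+?>/g);
var lazyPlusOnReserialized = countMatches(reserialized, /<br.+?>/g);

console.log(lazyPlusOnSource, lazyPlusOnReserialized);
```

The first string yields two matches and the second none, because '.+?' requires at least one character between 'br' and '>', and '.' does not match a newline. Logging captions.innerHTML in each browser would show which form is actually being split.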
Edit:
This (suggested several times in the responses below) does not work on IE 7 or 8:
T = captions.innerHTML.split(/<br[^>]*>/g);
(It did work on Chrome and FF.)
My question is: does anyone know an expression that works in all current browsers to match the 'br' tags above (but not other HTML tags)? And can anyone confirm that the /<br.+?>/ example above should be a valid match, since two characters are present in the example text before the '>'?
PS - my doctype is HTML transitional.
Edit:
I think I have evidence this is specific to the string.split() behavior on IE, and not regex in general. You have to use split() to see this issue. I have also found a test matrix that shows a failure rate of about 30% for split() test cases when I ran it on IE. The same tests passed 100% on FF and Chrome:
http://stevenlevithan.com/demo/split.cfm
So far, I still have not found a solution for IE, and the library provided by the author of that test matrix did not fix this case.
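Since the problem appears tied to split(), one possible workaround is to avoid passing a regex to split() at all and to collect the pieces with exec() instead. This is only a sketch: splitOnBr is a hypothetical helper, and the 'i' flag is there on the assumption that some browsers report the tag in uppercase (for example '<BR>') from innerHTML.

```javascript
// Hypothetical helper: split html on 'br' tags without using String.prototype.split.
function splitOnBr(html) {
  // \b keeps the pattern from matching other tag names that merely start with "br".
  // The 'i' flag is an assumption: it tolerates '<BR>' as well as '<br />'.
  var pattern = /<br\b[^>]*>/gi;
  var parts = [];
  var start = 0;
  var match;
  // With the 'g' flag, exec() resumes from pattern.lastIndex on each call.
  while ((match = pattern.exec(html)) !== null) {
    parts.push(html.slice(start, match.index));
    start = match.index + match[0].length;
  }
  // Text after the final tag (possibly an empty string).
  parts.push(html.slice(start));
  return parts;
}

var sample = splitOnBr("a<br />b<BR>c<br>");
console.log(sample);
```

Usage would be T = splitOnBr(captions.innerHTML). Because it relies only on exec() and slice(), it keeps empty pieces between adjacent tags and after a trailing tag.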