I am an amateur in JavaScript. I saw this other question, and it made me wonder.
Can you tell me exactly what the regular expression below means?
split(/\|(?=\w=>)/)
Does it split the string on "|"?
It splits the string on |, but only if it's followed by a char in [a-zA-Z0-9_] and then =>.
Example:
It will split a|b=> on the |.
It will not split a|b on the |.
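The two cases above can be checked directly. This is a minimal sketch; the variable names are only illustrative.

```javascript
// The pattern from the question: a literal pipe, but only when it is
// immediately followed by one word character and then "=>".
const pattern = /\|(?=\w=>)/;

// The pipe is followed by "b=>", so the string is split there.
const followed = "a|b=>".split(pattern);
console.log(followed); // ["a", "b=>"]

// The pipe is followed by "b" but no "=>", so nothing is split.
const notFollowed = "a|b".split(pattern);
console.log(notFollowed); // ["a|b"]
```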
It splits the string on every '|' that is followed by a single word character (\w, shorthand for [a-zA-Z0-9_]) and then the character sequence '=>'.
Here's a link that can help you understand regular expressions in JavaScript.
The regular expression is contained in the slashes. It means:
\|      # A pipe symbol. It needs to be escaped with a backslash
        # because otherwise it means "OR"
(?=     # a so-called lookahead group. It checks if its contents match
        # at the current position without actually advancing in the string
\w=>    # a word character (a-z, A-Z, 0-9, _) followed by =>
)       # end of lookahead group.
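To see that the lookahead checks its contents "without actually advancing in the string", it can help to compare what each pattern actually matches. This is a small sketch; the sample string "x|y=>z" is made up for illustration.

```javascript
const text = "x|y=>z"; // hypothetical sample input

// With the lookahead, only the pipe itself is part of the match.
const withLookahead = /\|(?=\w=>)/.exec(text);
console.log(withLookahead[0], withLookahead.index); // matched text "|" at index 1

// Without the lookahead, the word character and "=>" are consumed too.
const withoutLookahead = /\|\w=>/.exec(text);
console.log(withoutLookahead[0]); // matched text "|y=>"
```

Because split removes whatever the pattern matches, the first form keeps "y=>" in the output and the second would drop it.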
Breakdown of the regular expression:
/ is the regular expression literal start delimiter.
\| matches | in the string. | is a special character in regex, so \ is used to escape it.
(?= starts a lookahead expression. It checks whether the pattern inside it follows the current position, without making that text part of the match.
\w=> matches a single alphanumeric character (or _), followed by =>.
)/ marks the end of the lookahead expression and the end of the regex.
In short, the string will be split on | if it is followed by any alphanumeric character or underscore and then =>.
In this case, the pipe character is escaped so it's treated as a literal pipe. The split occurs on pipes that are followed by a single alphanumeric character (or underscore) and then '=>'.
The '|' is also used in regular expressions as a sort of OR operator. For example:
split(/k|i|tt|y/)
would split on any 'k', 'i', 'tt' or 'y' in the string.
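A quick sketch of the difference between the unescaped and the escaped pipe. The input strings here are made up for illustration.

```javascript
// Unescaped | means alternation: split on "k", "i", "tt" or "y".
const byAlternation = "akbicttdye".split(/k|i|tt|y/);
console.log(byAlternation); // ["a", "b", "c", "d", "e"]

// Escaped \| means a literal pipe character.
const byLiteralPipe = "a|b|c".split(/\|/);
console.log(byLiteralPipe); // ["a", "b", "c"]
```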
Trimming the delimiting characters, we get \|(?=\w=>)
| is a special character in regex, so it should be escaped with a backslash as \|
(?=REGEX) is the syntax for a positive lookahead: it matches only if REGEX matches, but doesn't consume the substring that matches REGEX. The match to the REGEX doesn't become part of the matched result. Had it been merely \|\w=>, the parent string would be split around |a=> instead of |.
Thus /\|(?=\w=>)/ matches only those | characters that are followed by \w=>. It matches the | in |a=> but not the one in |a>, || etc.
Consider the example string from the linked question: a=>aa|b=>b||b|c=>cc. If it weren't for the lookahead, split would yield an array of [a=>aa, b||b, cc]. With the lookahead, you'll get [a=>aa, b=>b||b, c=>cc], which is the desired output.
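The comparison can be reproduced with a short sketch like the following. The variable names are only illustrative.

```javascript
const input = "a=>aa|b=>b||b|c=>cc";

// Without the lookahead, "|b=>" and "|c=>" are consumed as separators,
// so the keys are lost from the result.
const withoutLookahead = input.split(/\|\w=>/);
console.log(withoutLookahead); // ["a=>aa", "b||b", "cc"]

// With the lookahead, only the pipe is consumed; the key and "=>" stay.
const withLookahead = input.split(/\|(?=\w=>)/);
console.log(withLookahead); // ["a=>aa", "b=>b||b", "c=>cc"]
```

Note that the pipes inside b||b are left alone in both cases, because neither is followed by a word character and then =>.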