I am an amateur in JavaScript. I saw this other question, and it made me wonder.

Can you tell me exactly what the regular expression below means?

split(/\|(?=\w=>)/)

Does it split the string on "|"?

+4  A: 

It splits the string on |, but only if it's followed by a single character in [a-zA-Z0-9_] and then =>

Example:

It will split a|b=> on the |

It will not split a|b on the |
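
The two cases above can be checked directly. This is only a minimal sketch; the variable names are made up for illustration:

```javascript
const arrowPattern = /\|(?=\w=>)/;

// The pipe is followed by "b=>", so the string is split at the pipe.
const withArrow = "a|b=>".split(arrowPattern);
console.log(withArrow); // [ 'a', 'b=>' ]

// The pipe is followed by "b" but no "=>", so there is nothing to split on.
const withoutArrow = "a|b".split(arrowPattern);
console.log(withoutArrow); // [ 'a|b' ]
```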

codaddict
+2  A: 

It splits the string on every '|' that is followed (checked by the (?=...) lookahead) by a word character (\w, shorthand for [a-zA-Z0-9_]) and then the character sequence '=>'.

Here's a link that can help you understand regular expressions in JavaScript

KooiInc
+6  A: 

The regular expression is everything between the slashes. It means:

\|        # A pipe symbol. It needs to be escaped with a backslash
          # because otherwise it means "OR"
(?=       # a so-called lookahead group. It checks if its contents match 
          # at the current position without actually advancing in the string
   \w=>   # a word character (a-z, A-Z, 0-9, _) followed by =>
)         # end of lookahead group.
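
To see that the lookahead does not advance in the string, one can inspect the match itself instead of calling split. A small sketch, with an input string invented for the purpose:

```javascript
const lookaheadPattern = /\|(?=\w=>)/;
const sample = "x=>1|y=>2";

const match = lookaheadPattern.exec(sample);

// Only the pipe is part of the match; "y=>" was checked but not consumed.
console.log(match[0]);    // "|"
console.log(match.index); // 4

// Everything after the matched pipe is still there, including "y=>".
const remainder = sample.slice(match.index + match[0].length);
console.log(remainder);   // "y=>2"
```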
Jens
+1  A: 

Breakdown of the regular expression:

  • / regular expression literal start delimiter
  • \| match | in the string, | is a special character in regex, so \ is used to escape it
  • (?= starts a lookahead expression; it checks whether the pattern inside it matches at the current position, without consuming any characters
  • \w=> matches a single alphanumeric character or underscore, followed by =>
  • )/ marks the end of the lookahead expression and the end of the regex

In short, the string will be split on | if it is followed by any alphanumeric character or underscore and then =>.
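
Note that \w stands for exactly one character here. A short sketch with made-up inputs shows what does and does not count:

```javascript
const singleCharPattern = /\|(?=\w=>)/;

// Two word characters before "=>": the lookahead fails, so no split.
const twoChars = "a|bc=>d".split(singleCharPattern);
console.log(twoChars); // [ 'a|bc=>d' ]

// An underscore counts as a word character, so this one is split.
const underscore = "a|_=>d".split(singleCharPattern);
console.log(underscore); // [ 'a', '_=>d' ]

// A hyphen is not a word character, so no split.
const hyphen = "a|-=>d".split(singleCharPattern);
console.log(hyphen); // [ 'a|-=>d' ]
```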

Andy E
A: 

In this case, the pipe character is escaped so it's treated as a literal pipe. The split occurs on pipes that are followed by a single alphanumeric character (or underscore) and then '=>'.

The '|' is also used in regular expressions as a sort of OR operator. For example:

split(/k|i|tt|y/)

This would split on a 'k', an 'i', a 'tt' or a 'y'.
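
For comparison, here is a small sketch of the unescaped and the escaped pipe side by side. The input strings are invented for illustration:

```javascript
// Unescaped pipes mean "or": split on "k", "i", "tt" or "y".
const alternation = "1k2i3tt4y5".split(/k|i|tt|y/);
console.log(alternation); // [ '1', '2', '3', '4', '5' ]

// A single "t" is not one of the alternatives, so nothing is split.
const singleT = "1t2".split(/k|i|tt|y/);
console.log(singleT); // [ '1t2' ]

// An escaped pipe is a literal "|" character.
const literalPipe = "a|b".split(/\|/);
console.log(literalPipe); // [ 'a', 'b' ]
```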

Quick Joe Smith
A: 

Trimming the delimiting characters, we get \|(?=\w=>)

  • | is a special character in regex, so it should be escaped with a backslash as \|
  • (?=REGEX) is the syntax for a positive lookahead: it matches only if REGEX matches at that position, but doesn't consume the substring that matches REGEX, so that substring doesn't become part of the matched result. Had the pattern been merely \|\w=>, the parent string would be split around |a=> instead of |.

Thus /\|(?=\w=>)/ matches only those | characters that are followed by \w=>. It matches the | in |a=> but not the one in |a>, nor either pipe in ||.

Consider the example string from the linked question: a=>aa|b=>b||b|c=>cc. If it weren't for the lookahead, split would yield an array of [a=>aa, b||b, cc]. With the lookahead, you get [a=>aa, b=>b||b, c=>cc], which is the desired output.
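
Both results can be reproduced with the following sketch (the variable names are only illustrative):

```javascript
const input = "a=>aa|b=>b||b|c=>cc";

// Without the lookahead, "|b=>" and "|c=>" are consumed as separators,
// so the keys are lost.
const consumed = input.split(/\|\w=>/);
console.log(consumed); // [ 'a=>aa', 'b||b', 'cc' ]

// With the lookahead, only the pipe itself is the separator.
const preserved = input.split(/\|(?=\w=>)/);
console.log(preserved); // [ 'a=>aa', 'b=>b||b', 'c=>cc' ]
```

The pipes inside b||b survive in both cases, because neither of them is followed by a word character and then =>.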

Amarghosh