This varies enormously from one regex flavor to the next. For example, JavaScript only lets you use the forward-slash (or solidus) as a delimiter for regex literals, but in Perl you can use just about any punctuation character, including, in more recent versions, non-ASCII characters like « and ». When you use characters that come in balanced pairs like braces, parentheses, or the double-arrow quotes above, they have to be properly balanced:
m«\d+»
s{foo}{bar}
Ruby also lets you choose different delimiters if you use the %r prefix, and that extends to balanced bracket pairs such as %r{...} or %r(...); I don't know whether it accepts non-ASCII characters as well. Many languages don't support regex literals at all; you just write the regexes as string literals, for example:
r'\d+' // Python
@"\d+" // C#
"\\d+" // Java
Note the double backslash in the Java version. That's necessary because the string gets processed twice: once by the Java compiler and once by the compile() method of the Pattern class. Most other languages provide a "raw" or "verbatim" form of string literal that all but eliminates such backslash-itis.
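To make the double-processing point concrete, here is a small Python sketch (Python only because it has both string forms side by side; the variable names are just for illustration). It shows that a raw string and an ordinary string with a doubled backslash hand the regex engine exactly the same three characters:

```python
import re

raw = r'\d+'      # raw string: the backslash reaches the regex engine untouched
escaped = '\\d+'  # ordinary string: the backslash must be doubled to survive

same = raw == escaped   # both forms produce the identical string
chars = len(raw)        # backslash, 'd', '+'

# Either form works as the pattern argument.
matches = re.findall(escaped, 'a1b22c333')

print(same, chars, matches)
```

The Java literal "\\d+" corresponds to the second form: the compiler collapses the doubled backslash before the regex engine ever sees the pattern.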
And then there's PHP. Its preg regex functions are built on top of the PCRE library, which closely imitates Perl's regexes, including the wide variety of delimiters. However, PHP itself doesn't support regex literals, so you have to write them as if they were regex literals embedded in string literals, like so:
'/\d+/i' // match modifiers go after the closing delimiter but inside the quotes (PHP has no g modifier; use preg_match_all or preg_replace instead)
"{\\d+}" // in double quotes an unrecognized escape like \d survives as-is, but doubling the backslash is the safe habit
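As a rough illustration of what "a regex literal embedded in a string literal" means, the following Python sketch pulls such a string apart into its pattern body and its modifiers. The helper split_delimited is hypothetical, not part of PHP or PCRE, and it is deliberately simplified: it assumes the last occurrence of the closing delimiter ends the pattern and does not validate the modifier letters.

```python
# Hypothetical helper for illustration only; not a PHP or PCRE API.
# Bracket-style delimiters close with their matching partner.
PAIRS = {'(': ')', '[': ']', '{': '}', '<': '>'}

def split_delimited(s):
    """Split a PHP-style delimited pattern into (body, modifiers)."""
    if not s:
        raise ValueError('empty pattern')
    opener = s[0]
    # A delimiter may not be alphanumeric, a backslash, or whitespace.
    if opener.isalnum() or opener == '\\' or opener.isspace():
        raise ValueError('invalid delimiter: %r' % opener)
    closer = PAIRS.get(opener, opener)
    end = s.rfind(closer)  # simplification: last closer ends the pattern
    if end <= 0:
        raise ValueError('no closing delimiter')
    return s[1:end], s[end + 1:]

print(split_delimited(r'/\d+/i'))
print(split_delimited(r'{\d+}'))
```

Everything between the delimiters is the regex proper; whatever follows the closing delimiter is the modifier list, which is why both have to sit inside the same pair of quotes.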
Finally, note that even those languages which do support regex literals don't usually offer anything like Perl's s/…/…/ construct. The closest equivalent is a function call that takes a regex literal as the first argument and a string literal as the second, like so:
s = s.replace(/foo/i, 'bar') // JavaScript
s.gsub!(/foo/i, "bar") // Ruby
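For a language with no regex literals at all, the same substitution looks like this in Python; the sample string is made up for the example. One detail worth watching: JavaScript's replace without the g flag changes only the first match, as Perl's s/// does without /g, whereas Ruby's gsub! changes every match. Python's re.sub replaces all matches unless you pass count.

```python
import re

s = 'Foo and foo'  # sample input, invented for this example

# First match only, like the JavaScript call or Perl's s/foo/bar/i.
first_only = re.sub(r'foo', 'bar', s, count=1, flags=re.IGNORECASE)

# Every match, like Ruby's gsub! or Perl's s/foo/bar/gi.
all_matches = re.sub(r'foo', 'bar', s, flags=re.IGNORECASE)

print(first_only)
print(all_matches)
```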