Is there a way to use a back reference within a regular expression pattern?

Example input string:

Here is "some quoted" text.

Say I want to pull out the quoted text. I could create the following expression:

"([^"]+)"

This regular expression would capture some quoted (the text between the quotes).

Say I also want it to support single quotes. I could change the expression to:

["']([^"']+)["']

But what if the input string has mismatched quotes, for example: Here is 'some quoted" text. I would not want the regex to match that, yet the regex in the second example still does.
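The behaviour described above can be reproduced with a small sketch. The question does not name a language, so Python's re module is used here as an assumption, and the helper name first_quoted is made up for illustration.

```python
import re

# Character-class version from the question:
# any quote may open the span and any quote may close it.
LOOSE = re.compile(r"""["']([^"']+)["']""")

def first_quoted(text):
    """Return the text captured between the first pair of quotes, or None."""
    m = LOOSE.search(text)
    return m.group(1) if m else None

# Properly paired double quotes.
print(first_quoted('Here is "some quoted" text.'))
# Mismatched quotes: a single quote opens, a double quote closes.
print(first_quoted("Here is 'some quoted\" text."))
```

Both calls print some quoted, which shows that the character class alone cannot tie the closing quote to the opening one.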

What I would like is this: if the opening quote is a double quote, the closing quote must also be a double quote, and if the opening quote is a single quote, the closing quote must also be a single quote.

Can I use a back reference to achieve this?


My other related question: http://stackoverflow.com/questions/2723325/getting-text-between-quotes-using-regular-expression

A: 

/(["']).*?\1/ should work, but whether back references are supported, and with what syntax, depends on the regular expression engine.

knittl
+2  A: 

// Returns 0 here because the quotes in the subject string are mismatched.
preg_match('/(["\'])([^"\']+)\1/', 'Here is \'quoted text" some quoted text.');

Explanation: (["'])([^"']+)\1 I placed the first quote in parentheses. Because this is the first group, its back reference number is 1. Then, where the closing quote would be, I placed \1, which means whichever character was matched by group 1. The quoted text itself ends up in group 2.

webbiedave
+2  A: 

You can make use of the regex:

(["'])[^"']+\1
  • () : used for grouping (and capturing)
  • [..] : a character class, so ["'] matches either " or ' (equivalent to "|')
  • [^..] : a negated character class; it matches any character not listed after the ^
  • + : quantifier for one or more
  • \1 : back reference to the first group, which is (["'])
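The pieces listed above can be checked with a short sketch. It is written with Python's re module purely as an assumption for illustration (the answer itself uses PHP), and the helper name find_quoted is hypothetical.

```python
import re

# Group 1 captures the opening quote; \1 requires the same character to close.
PAIRED = re.compile(r"""(["'])[^"']+\1""")

def find_quoted(text):
    """Return the matched span including its quotes, or None if no pair matches."""
    m = PAIRED.search(text)
    return m.group(0) if m else None

print(find_quoted('Here is "some quoted" text.'))   # paired double quotes
print(find_quoted("Here is 'some quoted' text."))   # paired single quotes
print(find_quoted("Here is 'some quoted\" text."))  # mismatched quotes
```

The first two calls return the quoted span with its quotes, while the mismatched input returns None because \1 can only match the character that opened the span.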

In PHP you'd use this as:

preg_match('#(["\'])[^"\']+\1#', $str);


codaddict