I am using Yahoo Pipes to take a Twitter feed and filter information out of it. Pipes' regex function works as a "replace ______ with ______" operation.

My example is:

Testbedots: happy "twins"

I am trying to find a regex that will select everything except what is within the double quotation marks. I am assuming there will only be one pair of quotation marks. On the replace side of the regex, I have seen people use $1, $2, $3 to refer to something captured in the first part of the regex function. The idea is to pull the word twins, or whatever is between the quotes, out of the line and have it replace the whole line.

Any recommendations? I am obviously new to regexes but have been reading the online tutorials for hours without making any headway.

Thank you for your help,

Skyler

A: 

Try this regular expression:

(\w+:.*?) "

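This will "get a word followed by a ':' character, then the shortest character sequence that ends just before a space followed by a double quotation mark". The sequence is the shortest one, not the biggest, because .*? is a lazy quantifier.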

Rubens Farias
A: 

Not sure about the Pipes syntax, but generally with Perl-compatible regex syntax I think you could do something like:

s/[^"]*"([^"]+)"[^"]*/$1/
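As a quick check outside of Pipes, the same substitution can be sketched in Python. This is only an illustration: Python's `re.sub` writes the first captured group as `\1` where Perl and Pipes write `$1`, and the variable names `line` and `result` are made up for the example.

```python
import re

# Sample line from the question.
line = 'Testbedots: happy "twins"'

# Match everything before the quotes, the quoted text (captured), and
# everything after, then replace the whole match with the captured group.
# Note: Python uses \1 for the first group; Perl and Pipes use $1.
result = re.sub(r'[^"]*"([^"]+)"[^"]*', r'\1', line)

print(result)  # twins
```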
Devin Ceartas
A: 

I'd probably write the regular expression as:

/"([^"]*)"/

In other words, start matching at a double quote, then match non-double-quote characters until you get to another double quote. The parentheses indicate the part you're interested in. If you want at least one character between the quotes (so that an empty pair of quotes doesn't match), use a + instead of a *.

This would put the bit you're interested in into $1 or whatever your particular syntax is for the first captured match.
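A small Python sketch of that capture, assuming Python's `re` module as a stand-in for whatever engine Pipes uses. Here the captured text is read with `group(1)` rather than `$1`, and the names `line`, `match`, and `captured` are hypothetical.

```python
import re

line = 'Testbedots: happy "twins"'

# Find the first pair of double quotes and capture what is between them.
match = re.search(r'"([^"]*)"', line)
captured = match.group(1) if match else None
print(captured)  # twins

# With * an empty pair of quotes still matches and captures an empty string;
# with + it does not match at all.
empty_star = re.search(r'"([^"]*)"', 'say ""')
empty_plus = re.search(r'"([^"]+)"', 'say ""')
```

In this sketch `empty_star.group(1)` is an empty string, while `empty_plus` is `None` because + requires at least one character between the quotes.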

Anthony Mills
A: 

In Yahoo Pipes you can use this expression to replace the whole line with the quoted text:

^.*"(.*)".*$

and replace it with

$1

For your example, it would replace Testbedots: happy "twins" with twins.

I assume there are always exactly two quotes (") in the text.
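To see why the two-quote assumption matters, here is a rough Python sketch of the same replacement. It assumes Python's `re` module behaves like the Pipes engine for this pattern, which I have not verified, and it uses `\1` where Pipes uses `$1`. The second input string is invented for the illustration.

```python
import re

pattern = r'^.*"(.*)".*$'

# Exactly two quotes: the whole line is replaced by the quoted text.
two_quotes = re.sub(pattern, r'\1', 'Testbedots: happy "twins"')
print(two_quotes)  # twins

# Four quotes: the leading .* is greedy, so it swallows the first quoted
# word and only the last quoted word is captured.
four_quotes = re.sub(pattern, r'\1', 'a "b" c "d"')
print(four_quotes)  # d
```

If a line can contain more than one quoted phrase, a pattern built on [^"] instead of . keeps each match confined to a single pair of quotes.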

Also note that your question is a bit confusing. You said you want an expression "that will select everything but what is within the double quotations". That sounds like you want the whole line except the quoted text, which is the opposite of replacing the line with the quoted text.

Arend