ansaurus

Question

Regex for getting content between $ chars from a text

Answer 1

+1 A:

The regex below non-greedily captures everything between each pair of $ characters:

\$(.*?)\$

ennuikiller 2009-12-30 23:46:56

Answer 2

+3 A:

You can use re.findall:

>>> re.findall(r'\$(.*?)\$', s)
['es membres', 'separat existentie es un']

Mark Byers 2009-12-30 23:47:05

why the downvote?

Michael Krelin - hacker 2009-12-30 23:48:45

@Michael, some might think an answer like this deserves a link to the docs (I do), but it's succinct and correct so it certainly doesn't deserve a downvote for the lack. I'll counteract it with an upvote.

Peter Hansen 2010-01-01 15:55:37
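One detail worth checking with the pattern from Answers 1 and 2: by default "." does not match a newline, so a pair of dollar signs that spans a line break is skipped unless re.DOTALL is passed. The sketch below illustrates this; the input text is a made-up variation on the question's sample, not the OP's actual data.

```python
import re

# Hypothetical input: the second pair of dollar signs spans a line break.
text = "Li Europan $es membres$ del sam familie. Lor $separat\nexistentie$ myth"

# By default "." does not match a newline, so the second pair is skipped.
default_matches = re.findall(r'\$(.*?)\$', text)

# With re.DOTALL, "." also matches newlines, so both pairs are found.
dotall_matches = re.findall(r'\$(.*?)\$', text, flags=re.DOTALL)

print(default_matches)  # ['es membres']
print(dotall_matches)   # ['es membres', 'separat\nexistentie']
```

A character class such as [^$]* (used in Answer 3) matches newlines without any flag, which may be preferable if the input can contain multi-line content.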

Answer 3

A:

import re
# [^$]* matches any run of characters other than $, so a match cannot cross a dollar sign
m = re.findall(r'\$([^$]*)\$', 'Li Europan lingues $es membres$ del sam familie. Lor $separat existentie es un$ myth')

Michael Krelin - hacker 2009-12-30 23:47:12

You don’t need to escape the `$` inside a character class.

Gumbo 2009-12-30 23:57:15

Although the OP didn't say his input could include empty pairs of dollar signs (no characters between), the use of "+" instead of "*" means this would get out of sync if that did occur. More importantly, without a group (using parentheses), the output includes the dollar signs.

Peter Hansen 2010-01-01 15:58:52

True. Both of you are right. edited.

Michael Krelin - hacker 2010-01-01 21:23:51
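The two points raised in the comments on Answer 3 ("+" versus "*", and the missing group) can be checked with a short sketch. The input string here is hypothetical and chosen only because it contains an empty "$$" pair.

```python
import re

# Hypothetical input with an empty pair of dollar signs ("$$") in the middle.
text = "a $one$ b $$ c $two$ d"

# "*" allows an empty match, so the pairs stay in sync.
star_matches = re.findall(r'\$([^$]*)\$', text)

# "+" needs at least one character, so "$$" cannot match as a pair and
# the scan resynchronises on the wrong dollar sign.
plus_matches = re.findall(r'\$([^$]+)\$', text)

# Without a capturing group, findall returns the whole match,
# dollar signs included.
ungrouped_matches = re.findall(r'\$[^$]*\$', text)

print(star_matches)       # ['one', '', 'two']
print(plus_matches)       # ['one', ' c ']
print(ungrouped_matches)  # ['$one$', '$$', '$two$']
```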

Answer 4

A:

Valid regex demo in Perl:

my $a = 'Li Europan lingues $es membres$ del sam familie. Lor $separat existentie es un$ myth.';
my @res;
while ($a =~ /\$([^\$]+)\$/gos)
{
 push(@res, $1);
}

foreach my $item (@res)
{
 print "item: $item\n";
}

flags: g - global (find all matches), o - compile the pattern only once, s - treat all input text as a single line

UncleMiF 2009-12-30 23:47:18

The question was tagged "Python" and included an explicit request for a Python snippet in the answer.

Peter Hansen 2009-12-31 00:28:00

Well, it was a "would-be-great" type of request. I don't think the lack of a Python snippet justifies a downvote. Naturally, I wouldn't upvote it either.

Michael Krelin - hacker 2010-01-01 22:50:41

Answer 5

A:

Alternative without regexes which works for this simple case:

>>> s="Li Europan lingues $es membres$ del sam familie. Lor $separat existentie es un$"
>>> s.split("$")[1::2]
['es membres', 'separat existentie es un']

Just split the string on '$' (this gives you a Python list) and then take every second element of that list, starting at index 1.

ChristopheD 2009-12-30 23:47:31

-1 It DOESN'T work. Did you compare your answer with what the OP expected? Hint: try it again with [1::2] instead of [::2]

John Machin 2009-12-31 01:52:15

True (must have typed/answered too fast). Edited accordingly.

ChristopheD 2009-12-31 07:09:02
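The "simple case" caveat in Answer 5 matters when the dollar signs are not balanced. As a rough sketch, using a made-up input with one unmatched trailing "$", the split approach and the regex approach disagree:

```python
import re

# Hypothetical input: the last dollar sign has no closing partner.
text = "a $b$ c $d"

# split() cannot tell an opening "$" from a closing one, so the
# unterminated tail "d" is reported as if it were enclosed.
split_result = text.split("$")[1::2]

# The regex requires a closing "$", so the tail is ignored.
regex_result = re.findall(r'\$(.*?)\$', text)

print(split_result)  # ['b', 'd']
print(regex_result)  # ['b']
```

Which behaviour is preferable depends on how unbalanced input should be treated, which the question does not specify.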

Answer 6

+2 A:

Import the re module, and use findall():

>>> import re
>>> p = re.compile(r'\$(.*?)\$')
>>> s = "apple $banana$ coconut $delicious ethereal$ funkytown"
>>> p.findall(s)
['banana', 'delicious ethereal']

The pattern p represents a dollar sign (\$), then a non-greedy capturing group ((.*?)) which matches characters (.) of which there must be zero or more (*), followed by another dollar sign (\$).

John Feminella 2009-12-30 23:50:47
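To see why the non-greedy "?" in Answer 6 matters, here is a small sketch that runs the same string through a greedy and a non-greedy version of the pattern. The variable names are illustrative only.

```python
import re

s = "apple $banana$ coconut $delicious ethereal$ funkytown"

# Non-greedy: each match stops at the nearest closing "$".
non_greedy = re.findall(r'\$(.*?)\$', s)

# Greedy: ".*" runs to the last "$" in the string, so everything
# between the first and last dollar sign becomes one match.
greedy = re.findall(r'\$(.*)\$', s)

print(non_greedy)  # ['banana', 'delicious ethereal']
print(greedy)      # ['banana$ coconut $delicious ethereal']
```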
