How can I match the three words in the following string with a Perl compatible regular expression?

word1#$word2#$word3

I don't know the actual words "word1, word2 and word3" in advance. I only know the separator, which is #$.

I also can't use the word boundary, because I have a multibyte encoding. This means, for instance, that the string can contain non-ASCII characters such as umlauts, which the \w character class does not match.

+2  A: 

Try this regular expression:

/(\w+)#\$(\w+)#\$(\w+)/


Edit: After you provided us with more information (see the comments on this answer):

/((?:[^#]+|#[^$])*)#\$((?:[^#]+|#[^$])*)#\$((?:[^#]+|#[^$])*)/
Gumbo
I can't use the word boundary, as a word could be virtually anything that does not contain the two consecutive characters #$.
prinzdezibel
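The requirement in the comment above (a word is anything that does not contain #$) can also be written with a negative lookahead. This is a different technique from the alternation in the answer, shown only as a minimal sketch. It uses Python's `re` module as a stand-in, since the asker's actual host language is not stated; the constructs used, `(?:...)` and `(?!...)`, are also valid PCRE syntax. The names `FIELD`, `THREE_WORDS` and `three_words` are made up for illustration.

```python
import re

# One field: consume a character only if "#$" does not start at that position.
FIELD = r"((?:(?!#\$).)*)"

# Three fields joined by the literal separator "#$" (the "$" must be escaped).
# re.DOTALL lets a field span line breaks; drop it if that is not wanted.
THREE_WORDS = re.compile(FIELD + r"#\$" + FIELD + r"#\$" + FIELD, re.DOTALL)


def three_words(text):
    """Return the three fields as a tuple, or None if there are not two separators."""
    match = THREE_WORDS.search(text)
    return match.groups() if match else None


print(three_words("word1#$word2#$word3"))
print(three_words("wör#d1#$wörd2#$wörd3"))
```

Because the lookahead only rejects the exact pair #$, a field may contain a lone # or a lone $, and no \w or word boundary is involved, so non-ASCII characters need no special handling.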
+1  A: 
#!/usr/bin/perl

use strict;
use warnings;

my $x = 'word1#$word2#$word3';
print $_, "\n" for split /#\$/, $x;
Sinan Ünür
I guess he is asking about a regular expression.
joe
@Kirsh the right Perl tool to use here is `split`
Sinan Ünür
I need a regular expression here as this serves as subexpression only.
prinzdezibel
@mixedpickles: No, you cannot use split because you are not using Perl, which is fine, but you should identify your question correctly.
Sinan Ünür
A: 

This will work for any string that has two # characters:

/([^#]+)\#\$([^#]+)\#\$([^#]+)/
Brad Gilbert
This does not work as it matches the dollar sign as well.
prinzdezibel
A: 
/([^#]*?)#\$([^#]*?)#\$([^#]*)/
cdm9002
This does not work for wor#d1#$word2#$word3, which would be valid, as the separator is always a hash character followed by a dollar character.
prinzdezibel
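The objection in this comment can be checked directly by running two patterns quoted in this thread against the input it mentions. The sketch below uses Python's `re` module purely as a test harness (an assumption, since the asker's host language is not stated); both patterns are copied unchanged from the answers and use only syntax that Python shares with PCRE. The variable names are illustrative.

```python
import re

text = "wor#d1#$word2#$word3"

# Pattern built on [^#]: a word can never contain '#'.
negated_class = re.compile(r"([^#]*?)#\$([^#]*?)#\$([^#]*)")

# Pattern built on alternation: a word may contain '#' unless '$' follows it.
alternation = re.compile(
    r"((?:[^#]+|#[^$])*)#\$((?:[^#]+|#[^$])*)#\$((?:[^#]+|#[^$])*)"
)

negated_result = negated_class.search(text).groups()
alternation_result = alternation.search(text).groups()

print(negated_result)      # first word is cut off after the stray '#'
print(alternation_result)  # first word keeps the stray '#'
```

With an unanchored search, the `[^#]` version still finds a match, but it starts after the stray #, so the first capture is only the tail of the first word.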
A: 

A split function might be useful, although it depends on what you want to do with the line.

Here is an example, though:

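my $line = 'word1#$word2#$word3';  # single quotes, so $word2 and $word3 are not interpolated
my @words = split /#\$/, $line;    # split takes a pattern, so escape the $ to match it literally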
gnomed
+1  A: 
$str = explode('#$', $str);

Regex is overkill for this.

eyelidlessness