import re

a = "aaaaaa password: GOD hello world password is G0D hello"
match = re.match(r"^(?:.*(?:password\sis\s|password:\s)([a-zA-Z]*)\s.*)*$", a)

print(match.groups())

I want the output to be ('GOD','G0D'), but all I get is ('G0D'). I am trying to solve this with a regex only. The number of times "password" appears in the text can vary. Help would be very much appreciated.

A: 

The ([a-zA-Z]*) subexpression does not accept digits. You might have meant ([a-zA-Z0-9]+); another choice would be (\S+).

You have already used \s, are you aware of \S? Because you are using \s as the "delimiter" of your password token, you might as well be consistent and define the password as consisting of any characters which are not delimiters.

You could also simplify your regular expression overall as follows:

^(?:.*password(\sis|:)\s(\S+)\s.*)*$

As pointed out by codaddict's analogy to PHP's preg_match_all, you also need to call re.findall. To do so, you will need to change the regular expression to one which is not overlapping, such as:

password(\sis|:)\s(\S+)

and then re.findall() will return a list of matches, each one a tuple of the groups matched.
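
As a rough illustration (a sketch in Python 3 syntax that reuses the string from the question; the variable names and the small "1,2,3" example are made up for the demonstration), a capturing group inside a repeated group keeps only its last iteration, while re.findall with the two-group pattern above returns one tuple per match:

```python
import re

a = "aaaaaa password: GOD hello world password is G0D hello"

# A capturing group inside a repeated group keeps only its final iteration.
last_only = re.match(r"(?:(\d),?)+", "1,2,3").groups()
print(last_only)  # ('3',)

# With two capturing groups, findall returns one tuple per match.
pairs = re.findall(r"password(\sis|:)\s(\S+)", a)
print(pairs)  # [(':', 'GOD'), (' is', 'G0D')]

# Keep only the second group (the password) from each tuple.
passwords = [pw for _, pw in pairs]
print(passwords)  # ['GOD', 'G0D']
```

Making the first group non-capturing, as in (?:\sis|:), should give a flat list of passwords directly.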

Heath Hunnicutt
Why are you talking to someone whose avatar picture you don’t like anyway?
Gumbo
How am I going to convince this person who had score 1 to pick a more programming-related avatar? By interacting with them.
Heath Hunnicutt
Heath Hunnicutt, I'm sorry, Stack Overflow must have auto-assigned my Gravatar avatar. I'm sorry (again), perhaps my question was not clear, so I will try to explain it again. This is the full Python code I have so far (the password is just an example): data=" Hello Mars password: WORLD random words password: HELLO python"; match=re.match("REGEX",data); if match<>None: print match.groups(). I want match.groups() to print the tuple ('WORLD','HELLO') by just using a regex. I know about \S, \s and +, however these do not add additional matches. Only the last match is placed in the tuple.
Nick Hermans
Lol, I'm sorry, I was being rude anyway. But if Stack Overflow gave you that avatar, then I am sorry for them. codaddict makes an additional point that you have to call re.findall() rather than re.match(), which you also need to do. You will also need to modify your regexp to find non-overlapping matches.
Heath Hunnicutt
A: 

I think you'll have to match the first occurrence and then continue matching any further occurrences using the global matching feature of Python (not sure how to do it, I know very little Python).

In PHP for example we can use a preg_match_all to solve this:

$a="aaaaaa password: GoD hello world password is G0D hello";
if(preg_match_all('/.*?(?:password\sis\s|password:\s)(\w+)/',$a,$matches)) {
    var_dump($matches[1]); // dumps an array containing "GoD" and "G0D"
}
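
A possible Python counterpart to the "match, then keep matching" idea is re.finditer, which walks the string and yields one match object per occurrence. This is only a sketch under that assumption: it reuses the pattern from the PHP snippet without the leading .*? (which is not needed when scanning), and the names pattern and found are illustrative.

```python
import re

a = "aaaaaa password: GoD hello world password is G0D hello"
pattern = re.compile(r"(?:password\sis\s|password:\s)(\w+)")

# finditer resumes scanning after each match, so every occurrence is visited.
# Each match object exposes the captured text and where it was found.
found = [(m.start(1), m.group(1)) for m in pattern.finditer(a)]

for position, password in found:
    print(position, password)
```

Compared with re.findall, this keeps the match objects around, which helps if the position of each password matters.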
codaddict
Right, but you changed the regexp to use \w as I suggested in my community wiki answer... The problem is that his regexp does not accept a zero, and he is also misreading his output, seeing a zero where the letter after N is printed.
Heath Hunnicutt
You have a point, though -- in python, the equivalent is called re.findall()
Heath Hunnicutt
+1  A: 

I'd use re.findall, and simplify the regex a bit.

>>> re.findall(r"(?:password\sis\s+|password\:\s+)(\S+)", a)
['GOD', 'G0D']

Edit: Changed from \w to \S in order to also capture punctuation, and removed the list expression.
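
To make the \w versus \S difference concrete, here is a small sketch. The sample string with a punctuated password is invented for the illustration and does not come from the question.

```python
import re

# Hypothetical input: the first password contains punctuation.
text = "login password: p@ss! then password is G0D end"

# \w+ stops at the first character that is not a letter, digit or underscore.
with_w = re.findall(r"(?:password\sis\s+|password\:\s+)(\w+)", text)
print(with_w)  # ['p', 'G0D']

# \S+ runs until the next whitespace, so punctuation is kept.
with_S = re.findall(r"(?:password\sis\s+|password\:\s+)(\S+)", text)
print(with_S)  # ['p@ss!', 'G0D']
```

One trade-off: \S+ will also pick up punctuation that merely follows the password, such as a sentence-ending period.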

Ryan Ginstrom
Thank you very much Ryan, I tried findall several times but it always returned a single string. However, the way you use it returns the correct values. The messy regex was the result of hours of trial and error trying to find the solution to my problem, I'm sorry about that. Thanks a lot to everyone for helping me out.
Nick Hermans
I found that re.findall(r"(?:password\sis\s+|password\:\s+)(\S+)", a) suited me better in the long run, since I don't actually need the matches of 'password is' and 'password:', and it also removes the need for the loop.
Nick Hermans
Nice point, I've edited my reply to reflect that.
Ryan Ginstrom