Suppose you have this string (one line):
10.254.254.28 - - [06/Aug/2007:00:12:20 -0700] "GET /keyser/22300/ HTTP/1.0" 302 528 "-" "Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US; rv:1.8.1.4) Gecko/20070515 Firefox/2.0.0.4"
and you want to extract the part between GET and HTTP (i.e., the URL), but only if it contains the word 'puzzle'. How would you do that using regular expressions in Python?
Here's my solution so far.
import re
match = re.search(r'GET (.*puzzle.*) HTTP', my_string)
It works, but I suspect I should change the first, the second, or both of the .* to .*? so that they are non-greedy. Does it actually matter in this case?
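To make the question concrete, here is a minimal sketch comparing the greedy pattern, the non-greedy variant, and a third variant that uses \S* in place of .* (that last one is not part of my attempt, just an alternative included for comparison). The two log lines and the extract helper are hypothetical, made up for illustration: one has 'puzzle' in the URL with a second " HTTP" later in the line, and the other has 'puzzle' only in the referrer.

```python
import re

# Hypothetical log lines, invented for illustration (not taken from real logs).
# 'puzzle' is in the URL, and " HTTP" appears again in the user agent.
line_url = ('10.254.254.28 - - [06/Aug/2007:00:12:20 -0700] '
            '"GET /puzzle/a.jpg HTTP/1.0" 200 528 "-" "Some HTTP client"')
# 'puzzle' is only in the referrer, not in the requested URL.
line_referrer = ('10.254.254.28 - - [06/Aug/2007:00:12:20 -0700] '
                 '"GET /index.html HTTP/1.0" 200 528 '
                 '"http://example.com/puzzle" "Some HTTP client"')

greedy = re.compile(r'GET (.*puzzle.*) HTTP')       # pattern from the question
lazy = re.compile(r'GET (.*?puzzle.*?) HTTP')       # both quantifiers non-greedy
no_spaces = re.compile(r'GET (\S*puzzle\S*) HTTP')  # alternative: no whitespace allowed


def extract(pattern, line):
    """Return the captured group, or None if the pattern does not match."""
    m = pattern.search(line)
    return m.group(1) if m else None


# Greedy runs on to the last " HTTP" in the line, so it captures too much.
print(extract(greedy, line_url))
# Non-greedy stops at the first " HTTP" after 'puzzle': /puzzle/a.jpg
print(extract(lazy, line_url))
# Both .* variants still match when 'puzzle' is only in the referrer.
print(extract(lazy, line_referrer))
# \S* cannot cross the space after the URL, so this prints None.
print(extract(no_spaces, line_referrer))
```

On these made-up lines, greediness changes how much is captured, but neither .* variant restricts the search for 'puzzle' to the URL itself. The \S* variant does, under the assumption that the requested URL never contains whitespace.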