I am trying to parse an output of about a hundred lines. The output has the following format:

<random text> STATION "STATION_NAME_ONE": <random text>
<random text> IP Address: 0.0.0.0 <random text>
<SEVERAL LINES OF RANDOM TEXT>

<random text> STATION "STATION_NAME_TWO": <random text>
<random text> IP Address: 1.1.1.1 <random text>
<SEVERAL LINES OF RANDOM TEXT>

... and so on

I know the IP address of the station I am looking for. Using that IP address, I am trying to construct a regex that will find the station name. The station name can be any length and can contain any number of digits, letters, and underscores. The station name will always be preceded by STATION and will always be followed by a colon. The IP address will always be on the line following the station name and will always be preceded by IP Address:.

Note there are several stations with different station names and IP Addresses. The 'random text' can be of any length and contain any symbol/number/letter.

So far my attempts have been:

re.search('(?<=STATION ).*?(?=:.*IP Address: %s)' % sta_ip, output, re.DOTALL)

but this returns the first station name every time, because with re.DOTALL the .* in the lookahead can skip across any number of lines to reach the IP address.
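To make the failure concrete, here is a small reproduction. The sample text is invented for illustration and only mimics the format described above; the pattern is the one from the attempt.

```python
import re

# Made-up sample in the format described above (not real output).
output = (
    'xx STATION "STATION_NAME_ONE": yy\n'
    'xx IP Address: 0.0.0.0 yy\n'
    'more text\n'
    '\n'
    'xx STATION "STATION_NAME_TWO": yy\n'
    'xx IP Address: 1.1.1.1 yy\n'
    'more text\n'
)

sta_ip = '1.1.1.1'

# With re.DOTALL, the '.*' inside the lookahead may cross newlines, so the
# lookahead already succeeds at the colon after the FIRST station name.
match = re.search('(?<=STATION ).*?(?=:.*IP Address: %s)' % sta_ip,
                  output, re.DOTALL)
print(match.group(0))  # prints "STATION_NAME_ONE" (quotes included)
```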

How would you make a regex that can search for the specified station name? Is this possible?

A: 
STATION\s*"(.*?)":\s*.*?(?:\r|\n)<.*?>\s*IP Address:\s*IPHERE\s*<

Replace IPHERE with the IP address. To get the station name, extract the first capturing group.

SimpleCoder
I cannot get this to return a group. :(
help_paul
The regex works, so it must be how you are attempting to get the group. Make sure you are using the right regex function.
SimpleCoder
Quick question, what do the quotes do?
help_paul
They match the literal quote characters outside the capture group, so the regex doesn't capture them as part of the station name... you'll get StationName instead of "StationName"
SimpleCoder
A: 

Try /STATION\s*?"(.*?)"\s*?:.*?IP Address:\s*?%s/
The trick is not to be greedy about matching. After a successful match, the name you want will be in the first capture group.

Borealid
Hmmm, I cannot get this to return a group.
help_paul
A: 

Edit: I've got it. The key is that the station name and IP are only separated by one newline, so we can hardcode that newline.

re.search('STATION(?P<StationName>.*?):.*?\n.*?IP Address: %s' % sta_ip, output).group("StationName")
Josiah
This unfortunately still returns the first station name no matter the IP.
help_paul
Try my edited code, it should work as advertised.
Josiah
Hmmm yea this worked. Thanks.
help_paul
Sure thing :) Glad to have helped
Josiah
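For later readers, the one-newline idea from the last answer can be tightened a little. The following is only a sketch under some assumptions: the station name is wrapped in double quotes as in the sample format, the function name find_station_name and the sample text are made up for illustration, and the IP is passed through re.escape so its dots are matched literally. A negative lookahead keeps 1.1.1.1 from matching inside 1.1.1.10.

```python
import re

def find_station_name(output, sta_ip):
    """Return the station name on the line directly above sta_ip, or None.

    Hypothetical helper: assumes the name is double-quoted after STATION.
    """
    pattern = (
        r'STATION\s*"(?P<name>\w+)":'   # quoted name followed by a colon
        r'[^\n]*\n'                     # rest of the station line, one newline
        r'[^\n]*IP Address:\s*'         # next line, up to the IP label
        + re.escape(sta_ip)             # literal IP (dots are escaped)
        + r'(?!\d)'                     # do not stop in the middle of a longer IP
    )
    match = re.search(pattern, output)
    return match.group('name') if match else None

# Made-up sample in the format from the question (not real output).
sample = (
    'xx STATION "STATION_NAME_ONE": yy\n'
    'xx IP Address: 0.0.0.0 yy\n'
    'more text\n'
    '\n'
    'xx STATION "STATION_NAME_TWO": yy\n'
    'xx IP Address: 1.1.1.1 yy\n'
    'more text\n'
    '\n'
    'xx STATION "STATION_NAME_THREE": yy\n'
    'xx IP Address: 1.1.1.10 yy\n'
)

print(find_station_name(sample, '1.1.1.1'))   # prints STATION_NAME_TWO
print(find_station_name(sample, '1.1.1.10'))  # prints STATION_NAME_THREE
print(find_station_name(sample, '9.9.9.9'))   # prints None
```

Because re.DOTALL is not used, nothing in the pattern except the explicit \n can carry the match from the station line to the IP line, which is what stops the first station from matching every time.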