I have a list of variables:

variables = ['VariableA', 'VariableB','VariableC']

which I'm going to search for in a file, line by line:

ifile = open("temp.txt",'r')

d = {}

match = zeros(len(variables))
for line in ifile:
    emptyCells=0
    for i in range(len(variables)):
        regex = r'('+variables[i]+r')[:|=|\(](-?\d+(?:\.\d+)?)(?:\))?'
        pattern_variable = re.compile(regex)
        match[i] = re.findall(pattern_variable, line)

        if match[j] == []:
            emptyCells = emptyCells+1

    if emptyCells == 0:
        for k, v in match[j]:
            d.setdefault(k, []).append(v)

The requirement is that I only keep the lines where all of the regexes match!

I want to collect all results for each variable in a dictionary where the variable name is the key, and the value becomes a list of all matches.

The code above is only what I've worked out so far, and it is not working properly yet...

+1  A: 

Can you edit your question to give an example of the source file, so we could test our solutions against it?

Anyway here's a quick hack:

from collections import defaultdict
import re

variables = ['VariableA', 'VariableB', 'VariableC']
# Inside a character class '|' is a literal pipe, so [:=(] is enough for the separator.
regexes = [re.compile(r'(%s)[:=(](-?\d+(?:\.\d+)?)\)?' % (re.escape(variable),))
           for variable in variables]
d = defaultdict(list)

with open("temp.txt") as f:
    for line in f:
        results = [regex.search(line) for regex in regexes]
        if all(results):
            for m in results:
                k, v = m.groups()
                d[k].append(v)

print(d)
nosklo
This works perfectly! Very nice solution, thanks! A typical line in the source file: VariableA(2) 00:29:10 VariableB=0.221 VariableC:12.600 sensI=0.000
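
For anyone who wants to try this without the original file, here is a minimal sketch that runs the same approach against the sample line from the comment above. The `collect` helper and the second sample line (which lacks VariableC) are made up for illustration; only the first line comes from the thread.

```python
import re
from collections import defaultdict

variables = ['VariableA', 'VariableB', 'VariableC']

# One compiled regex per variable: name, then ':', '=' or '(', then a number,
# then an optional closing parenthesis.
regexes = [re.compile(r'(%s)[:=(](-?\d+(?:\.\d+)?)\)?' % re.escape(v))
           for v in variables]


def collect(lines):
    """Hypothetical helper: keep only lines where every variable matches."""
    d = defaultdict(list)
    for line in lines:
        results = [rx.search(line) for rx in regexes]
        if all(results):  # a failed search returns None, which is falsy
            for m in results:
                name, value = m.groups()
                d[name].append(value)
    return dict(d)


sample = [
    # Line quoted in the comment above.
    "VariableA(2) 00:29:10 VariableB=0.221 VariableC:12.600 sensI=0.000",
    # Invented line with no VariableC, so it should be skipped entirely.
    "VariableA(3) 00:29:11 VariableB=0.300 sensI=0.000",
]

result = collect(sample)
print(result)
```

Note that `search` only returns the first match per variable on each line, and the values are kept as strings. If a variable can occur several times on one line, `findall` would be needed instead.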