I have data coming in from a machine (via pexpect), and I parse it with regexes into a dictionary like this:

for line in stream:
    if '/' in line:
        # some matching etc which results in getting the
        # machine name, an interface and the data for that interface
        key=str(hostname)+":"+r.groups()[0][0:2]+r.groups()[2]
        dict[key]=str(line[3])

And that all works ok, I get lots of lines like this when I read it back

machine1:fe0 <data>  

<data> is one string or integer

I now realise that multiple data values can exist for an interface, and in that case I am overwriting the value for the key every time I encounter it. What I would like is to make the key unique in a way that highlights the fact that multiple values exist for that interface. E.g. if fe0 has 3 instances and fe1 has 4:

machine1:fe0:3 <data> <data> <data>
machine1:fe1:4 <data> <data> <data> <data>

To that end, I don't mind if a single instance has a 1 after it to tell me that.
I hope this is clear and someone can point me in the right direction. Many thanks.

A: 
for (lineno, line) in enumerate(stream):
    if '/' in line:
        # some matching etc which results in getting the
        # machine name, an interface and the data for that interface
        key=str(hostname)+":"+r.groups()[0][0:2]+r.groups()[2]
        # lineno is an int, so convert it before concatenating
        dict[key + ":" + str(lineno)]=str(line[3])

You won't end up with smoothly increasing numbers this way, but each dictionary key will be unique, and the numbers associated with each hostname+interface pair will be increasing. You could make the keys lexically sortable by changing the last line to dict[key + ":" + ('%06d' % (lineno,))] = str(line[3])
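
To illustrate why the zero padding helps with sorting, here is a small sketch. The `machine1:fe0` prefix and the line numbers are made up for the example and are not taken from the real data:

```python
# Hypothetical line numbers at which the same interface was seen.
linenos = [2, 10, 100]

plain = ["machine1:fe0:" + str(n) for n in linenos]
padded = ["machine1:fe0:%06d" % n for n in linenos]

# Plain suffixes sort as text, so "10" and "100" come before "2".
print(sorted(plain))

# Zero-padded suffixes sort in the same order as the numbers themselves.
print(sorted(padded))
```

The padding width of 6 only works as long as the stream has fewer than a million lines; a wider format would be needed beyond that.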

Omnifarious
This works OK. It doesn't give a count of the instances, but it does give multiple lines which are easy to spot once sorted in Excel (the data has to go to other people). I'm trying to get the other answer working so I can try that too.
+3  A: 

You can create a list for each key, holding all values for that key:

import collections

d = collections.defaultdict(list)
for line in stream:
    if '/' in line:
        #.....
        key =  str(hostname)+":"+r.groups()[0][0:2]+r.groups()[2]
        value = str(line[3])
        d[key].append(value)

Edit: If you want the keys/values exactly as specified in your question, you can then do something like:

d2 = {}
# .items() works on both Python 2 and Python 3
for key, values in d.items():
    d2['%s:%d' % (key, len(values))] = ' '.join(str(v) for v in values)

I used ' '.join() here to join the values into a single string - it isn't really clear from your question if that's what you want.

I don't recommend doing things this way, as it will make accessing individual values more difficult.
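
One possible way around that is to keep the lists in the dictionary and only build the counted form when writing the output. This is a minimal sketch under assumptions: `parsed` is made-up sample data standing in for the pairs produced by the regex parsing, and `counted_lines` is a helper name invented for this example.

```python
import collections

# Made-up (key, value) pairs standing in for what the parsing loop yields.
parsed = [
    ("machine1:fe0", "10"),
    ("machine1:fe1", "20"),
    ("machine1:fe0", "11"),
    ("machine1:fe0", "12"),
]

# Group every value under its key instead of overwriting it.
d = collections.defaultdict(list)
for key, value in parsed:
    d[key].append(value)


def counted_lines(groups, sep="\t"):
    """Return one 'key:count<sep>values...' string per key, sorted by key."""
    lines = []
    for key in sorted(groups):
        values = groups[key]
        lines.append("%s:%d%s%s" % (key, len(values), sep, sep.join(values)))
    return lines


for line in counted_lines(d):
    print(line)
```

Individual values stay reachable as `d["machine1:fe0"][1]`, and the separator can be changed without touching the grouping step.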

interjay
I think this is cleaner than what I did, but it changes the type of thing stored in the dictionary which may require other parts of the program to change to accommodate it.
Omnifarious
I added a way you can do exactly what you asked, though I'm not sure how you want the values stored.
interjay
OK, just trying to understand how to put this into my script. The data goes into Excel later, so I've been separating data with tabs.
Hi interjay, as an infrequent Python user I'm having real difficulty using this code in mine. Can you help?
I would use `d.popitem()` in the second loop because I tend to become overly worried about memory consumption.
Omnifarious
@household3, what do you need help with? To separate the data with tabs, use `'\t'.join` instead of `' '.join`.
interjay
`d = collections.defaultdict(list)` `for key, value in items: d[key].append(value)` Apologies, but I can't figure out how that fits in with how I populate the dictionary. Where did `list` and `items` come from?
`list` is a built-in type. Instead of `for key,value in items`, use your own for loop, where `key` would be `str(hostname)+":"+r.groups()[0][0:2]+r.groups()[2]` and `value` would be `str(line[3])`.
interjay
Sorry I feel such a plank. What are items???
I was just using that as a general example. I've edited now to make it match your code.
interjay
Apologies for my density, I'm a network guy. Many thanks for your patience and advice, that is exactly what I want.