I'm using an application that produces timestamped output, one line per event. I want to copy and paste that output into my program and have it count how many events happen in each minute.

An example output is this:

13:48 An event happened.
13:48 Another event happened.
13:49 A new event happened.
13:49 A random event happened.
13:49 An event happened.

So, the program would need to understand that 2 things happened at 13:48, and 3 at 13:49. I'm not sure how the information should be stored, but afterwards I need to average the counts to determine how often events happen per minute. Sorry for being so complicated!

+4  A: 

You could just use the time as a key for a dictionary and point it to a list of event messages. The length of that value would give you the number of events, while still letting you get at the specific events themselves:

>>> from pprint import pprint
>>> from collections import defaultdict
>>> events = defaultdict(list)
>>> with open('log.txt') as f:
...     for line in f:
...         time, message = line.strip().split(None, 1)
...         events[time].append(message)
... 
>>> pprint(dict(events)) # pprint handles defaultdicts poorly
{'13:48': ['An event happened.', 'Another event happened.'],
 '13:49': ['A new event happened.',
           'A random event happened.',
           'An event happened.']}

If you want to be extra fancy, you could parse the time into a time object.
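A minimal sketch of that parsing step, assuming the timestamps are always in 24-hour `HH:MM` form as in the sample. The `parse_minute` helper is a made-up name for illustration, not part of the answer above:

```python
from datetime import datetime

def parse_minute(stamp):
    """Turn an 'HH:MM' string into a datetime.time (assumes 24-hour format)."""
    return datetime.strptime(stamp, '%H:%M').time()

t = parse_minute('13:48')
print(t.hour, t.minute)

# time objects compare chronologically, which plain strings only do
# when every timestamp is zero-padded.
print(parse_minute('13:49') > parse_minute('13:48'))
```

The resulting `time` objects are hashable, so they can be used as dictionary keys in place of the raw strings.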

Edit: Incorporated Mike Graham's suggestions.

Callahad
In modern versions of Python, using `events = collections.defaultdict(list)` instead of `dict`'s `setdefault` method provides a slightly nicer API. Also, it is good form always to use a context manager (`with open('log.txt') as f: for line in f:`) for ensuring that files get closed.
Mike Graham
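For readers unfamiliar with the `with` statement mentioned in this comment, here is a small sketch of what it buys you. The temporary file is created only so the example runs on its own; the variable names are illustrative:

```python
import os
import tempfile

# Create a throwaway log file so the sketch is self-contained.
path = os.path.join(tempfile.mkdtemp(), 'log.txt')
with open(path, 'w') as f:
    f.write('13:48 An event happened.\n')

# With a context manager, the file is closed when the block exits,
# even if an exception is raised inside it.
with open(path) as f:
    first = f.readline()
closed_after_with = f.closed

# Roughly the same thing written by hand with try/finally.
g = open(path)
try:
    second = g.readline()
finally:
    g.close()

print(closed_after_with, g.closed)
```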
Depending on the final needs and the amount of data, it may also be sensible to simply store a count (integer) as the value instead of a list. Default to zero and increment the current key on each line.
Ipsquiggle
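A sketch of the count-only approach described in this comment, assuming each non-blank line starts with the timestamp followed by whitespace. `count_per_minute` and `sample` are hypothetical names used for illustration:

```python
from collections import defaultdict

def count_per_minute(lines):
    """Map each timestamp to the number of lines that start with it."""
    counts = defaultdict(int)  # missing keys start at 0
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        time = line.split(None, 1)[0]
        counts[time] += 1
    return dict(counts)

sample = [
    '13:48 An event happened.',
    '13:48 Another event happened.',
    '13:49 A new event happened.',
    '13:49 A random event happened.',
    '13:49 An event happened.',
]
counts = count_per_minute(sample)
print(counts)
```

The times do not need to be known in advance: a key is created the first time its timestamp is seen.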
@Answer~ How would I count the number of objects in a dictionary? And how would I count them dynamically? By that I mean I won't know the times beforehand, so I'll need to be able to count them all anyway, and eventually have the program add them together. @Mike~ I'm curious to learn more about this context manager you speak of, hehe. I don't understand the statement you use as an example. @Ipsquiggle~ That's interesting, care to explain more?
Mister X
You can use `len(events['13:48'])` to count the number of events.
Glorphindale
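Extending that idea to every minute and to the average the question asks for, a rough sketch might look like this (written for Python 3). The `events` dictionary is typed in by hand here to mirror the answer's output. Note the assumption that the average is taken only over minutes that appear in the log; minutes with no events at all are not counted:

```python
events = {
    '13:48': ['An event happened.', 'Another event happened.'],
    '13:49': ['A new event happened.',
              'A random event happened.',
              'An event happened.'],
}

# Number of events for each minute, whatever minutes happen to be present.
per_minute = {time: len(messages) for time, messages in events.items()}

# Total events, and the average over the minutes that were logged.
total = sum(per_minute.values())
average = total / len(per_minute)

print(per_minute, total, average)
```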
+1  A: 

If you don't need to know what happened, only how many times, then:

$ python3.1 -c'from collections import Counter
import fileinput
c = Counter(line.split(None, 1)[0] for line in fileinput.input() if line.strip())
print(c)' events.txt 

Output:

Counter({'13:49': 3, '13:48': 2})
J.F. Sebastian
In Perl: `$ perl -lane'$h{$F[0]}++ if $F[0]; END{$,=" "; print @r while (@r = each %h)}' events.txt`
J.F. Sebastian
`Counter` is also available in Python 2.7.
gnibbler
+3  A: 

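If you just want a count of how many events happen each minute, then you don't really need Python; you can do it from bash. Note that `uniq -c` only counts adjacent duplicate lines, which works here because the log is in time order: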

 cut -d ' ' -f1 filename | uniq -c

gives

  2 13:48
  3 13:49
Dave Kirby
+1  A: 

You can also use the `groupby` function from the `itertools` module, with the time as the grouping key.

>>> import itertools
>>> from operator import itemgetter
>>> lines = (line.strip().split(None, 1) for line in open('log.txt'))
>>> for key, group in itertools.groupby(lines, key=itemgetter(0)):
...     print('%s - %s' % (key, list(map(itemgetter(1), group))))
... 
13:48 - ['An event happened.', 'Another event happened.']
13:49 - ['A new event happened.', 'A random event happened.', 'An event happened.']
Ruslan Spivak
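One caveat worth illustrating: `itertools.groupby` only merges adjacent items with equal keys, so it relies on the log already being in time order. If the pasted lines might be out of order, sorting by the same key first is one way around that. The `pairs` data below is invented for the example:

```python
import itertools
from operator import itemgetter

# (time, message) pairs deliberately out of chronological order.
pairs = [
    ('13:49', 'A new event happened.'),
    ('13:48', 'An event happened.'),
    ('13:49', 'A random event happened.'),
    ('13:48', 'Another event happened.'),
]

by_time = itemgetter(0)

# sorted() is stable, so messages within a minute keep their original order.
grouped = {
    key: [message for _, message in group]
    for key, group in itertools.groupby(sorted(pairs, key=by_time), key=by_time)
}

for key in sorted(grouped):
    print('%s - %s' % (key, grouped[key]))
```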
A: 
awk '{_[$1]++}END{for(i in _) print i,_[i]}' filename
ghostdog74