10
5
-1
-1
-1
1
1
0
2
...
If I want to count the number of occurrences of each number in a file, how do I use python to do it?
10
5
-1
-1
-1
1
1
0
2
...
If I want to count the number of occurrences of each number in a file, how do I use python to do it?
Use dictionary where every line is a key, and count is value. Increment count for every line, and if there is no dictionary entry for line initialize it with 1 in except clause -- this should work with older versions of Python.
def count_same_lines(fname):
line_counts = {}
for l in file(fname):
l = l.rstrip()
if l:
try:
line_counts[l] += 1
except KeyError:
line_counts[l] = 1
print('cnt\ttxt')
for k in line_counts.keys():
print('%d\t%s' % (line_counts[k], k))
Read the lines of the file into a list l
, e.g.:
l = [int(line) for line in open('filename','r')]
Starting with a list of values l
, you can create a dictionary d
that gives you for each value in the list the number of occurrences like this:
>>> l = [10,5,-1,-1,-1,1,1,0,2]
>>> d = dict((x,l.count(x)) for x in l)
>>> d[1]
2
EDIT: as Matthew rightly points out, this is hardly optimal. Here is a version using defaultdict:
from collections import defaultdict
d = defaultdict(int)
for line in open('filename','r'):
d[int(line)] += 1
I think what you call map is, in python, a dictionary.
Here is some useful link on how to use it: http://docs.python.org/tutorial/datastructures.html#dictionaries
For a good solution, see the answer from Stephan or Matthew - but take also some time to understand what that code does :-)
l = [10,5,-1,-1,-1,1,1,0,2]
d = {}
for x in l:
d[x] = (d[x] + 1) if (x in d) else 1
There will be a key in d for every distinct value in the original list, and the values of d will be the number of occurrences.
This is almost the exact same algorithm described in Anurag Uniyal's answer, except using the file as an iterator instead of readline()
:
from collections import defaultdict
try:
from io import StringIO # 2.6+, 3.x
except ImportError:
from StringIO import StringIO # 2.5
data = defaultdict(int)
#with open("filename", "r") as f: # if a real file
with StringIO("10\n5\n-1\n-1\n-1\n1\n1\n0\n2") as f:
for line in f:
data[int(line)] += 1
for number, count in data.iteritems():
print number, "was found", count, "times"
Counter is your best friend:)
http://docs.python.org/dev/library/collections.html#counter-objects
for(Python2.5 and 2.6) http://code.activestate.com/recipes/576611/
>>> cnt = Counter()
>>> for word in ['red', 'blue', 'red', 'green', 'blue', 'blue']:
... cnt[word] += 1
>>> cnt
Counter({'blue': 3, 'red': 2, 'green': 1})
for this :
print Counter(int(line.strip()) for line in open("foo.txt", "rb"))
##output
Counter({-1: 3, 1: 2, 0: 1, 2: 1, 5: 1, 10: 1})
#!/usr/bin/env python
import fileinput
from collections import defaultdict
frequencies = defaultdict(int)
for line in fileinput.input():
frequencies[line.strip()] += 1
print frequencies
Example:
$ perl -E'say 1*(rand() < 0.5) for (1..100)' | python counter.py
defaultdict(<type 'int'>, {'1': 52, '0': 48})
New in Python 3.1:
from collections import Counter
with open("filename","r") as lines:
print(Counter(lines))