Hi, I have two files which I loaded into lists. The content of the first file is something like this:

d.complex.1
23
34
56
58
68
76
.
.
.
etc
d.complex.179
43
34
59
69
76
.
.
.
etc

The second file has the same layout but different numerical values. Please treat everything from one d.complex.* line up to the next d.complex.* line as one set.

Now I am interested in comparing each numerical value from one set of the first file with each numerical value of the sets in the second file. I would like to record the number of times each number appears in the second file overall.

For example, the number 23 from d.complex.1 could have appeared 5 times in file 2 under different sets. All I want to do is record the number of occurrences of number 23 in file 2 including all sets of file 2.

My initial approach was to load them into lists and compare, but I am not able to achieve this. I searched Google and came across sets, but being a Python noob, I need some guidance. Can anyone help me?

If you feel the question is not clear, please let me know. I have also pasted the complete file 1 and file 2 here:

http://pastebin.com/mwAWEcTa http://pastebin.com/DuXDDRYT

+1  A: 

First, create a function that loads a given file. Since you may want to keep the individual sets and also count the occurrences of each number, the best approach is to build one dict for the whole file whose keys are the set names (e.g. d.complex.1). For each set, keep another dict that maps each number to its count. The code below shows the idea:

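def file_loader(f):
    # Maps each set name (e.g. 'd.complex.1') to a dict of {number: count}.
    file_dict = {}
    current_set = None
    for line in f:
        # Drop the trailing newline so keys match when reading a real file.
        line = line.strip()
        if line.startswith('d.complex'):
            file_dict[line] = current_set = {}
            continue

        if current_set is not None:
            # Increment the count for this number within the current set.
            current_set[line] = current_set.get(line, 0) + 1

    return file_dict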

Now you can easily write a function that counts how often a number occurs across all sets of a given file_dict:

def count_number(file_dict, num):
    # Sum the per-set counts of num over every set in the file.
    count = 0
    for number_set in file_dict.values():
        count += number_set.get(num, 0)

    return count

Here is a usage example:

s = """d.complex.1
10
11
12
10
11
12"""

file_dict = file_loader(s.split("\n"))
print(file_dict)
print(count_number(file_dict, '10'))

output is:

{'d.complex.1': {'11': 2, '10': 2, '12': 2}}
2

You may need to improve the file loader, e.g. skip empty lines, convert the values to int, etc.
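As a rough sketch of those improvements, a loader along the following lines would skip blank lines and convert the values to int. The name load_sets and the sample data are made up for illustration, and the sketch assumes every value is a non-negative integer on its own line; anything else that is not a d.complex header (such as the "." and "etc" placeholders) is ignored.

```python
def load_sets(lines):
    """Map each 'd.complex.N' header to a dict of {number: count}."""
    file_dict = {}
    current_set = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith('d.complex'):
            # Start (or reuse) the dict for this set name.
            current_set = file_dict.setdefault(line, {})
            continue
        if current_set is None or not line.isdigit():
            continue  # ignore text before the first header and non-numeric lines
        num = int(line)
        current_set[num] = current_set.get(num, 0) + 1
    return file_dict


# Hypothetical sample lines, shaped like the files in the question.
sample = ["d.complex.1\n", "10\n", "\n", "11\n", "10\n",
          "d.complex.179\n", "10\n", "43\n"]
loaded = load_sets(sample)
print(loaded)
```

Since a file object yields its lines when iterated, the same function should also accept an open file directly.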

Anurag Uniyal
Well if it was one file, my task would have been easy but I have to compare two lists :(
forextremejunk
I do not understand. Why can't you load both files, get a dict out of each, and do whatever you want with those dicts: compare, count integers, intersect sets, etc.?
Anurag Uniyal
+2  A: 

Open the file using Python's open function, then iterate over all its lines. Check whether each line contains a number; if so, increase its count in a collections.defaultdict instance.

Repeat this for the other file and compare the resulting dicts.
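A minimal sketch of this idea, applied to the question's goal of looking up each number from file 1 in file 2, might look like the following. The function names and the sample lists are hypothetical, and the sketch assumes each value is a non-negative integer on its own line.

```python
from collections import defaultdict


def count_numbers(lines):
    """Count how often each numeric line occurs, ignoring set headers."""
    counts = defaultdict(int)
    for line in lines:
        line = line.strip()
        if line.isdigit():
            counts[int(line)] += 1
    return counts


def occurrences_in_other(lines_1, counts_2):
    """For each set in file 1, map each number to its total count in file 2."""
    result = {}
    current = None
    for line in lines_1:
        line = line.strip()
        if line.startswith('d.complex'):
            current = result.setdefault(line, {})
        elif line.isdigit() and current is not None:
            num = int(line)
            # .get avoids inserting missing keys into the defaultdict.
            current[num] = counts_2.get(num, 0)
    return result


# Hypothetical stand-ins for the lines of file 1 and file 2.
file_1 = ["d.complex.1", "23", "34", "56"]
file_2 = ["d.complex.1", "23", "99", "d.complex.7", "23", "34"]

counts_2 = count_numbers(file_2)
report = occurrences_in_other(file_1, counts_2)
print(report)
```

With real files, the lists could be replaced by file objects opened with open, since iterating over a file yields its lines.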

jellybean