I am a beginner in Python. I have recently been learning to use dictionaries, but my knowledge of them is still limited. I have an idea, but I am not sure whether it is workable in Python.
I have 3 documents that look like this:
DOCNO= 5
nanofluids :0.6841
introduction:0.2525
module :0.0000
to :0.0000
learning :0.0000
DOCNO= 1
nanofluids :0.0000
introduction:0.2372
module :0.0000
to :0.0000
learning :0.1185
DOCNO= 12
nanofluids :0.0000
introduction:0.0000
module :0.5647
to :0.0000
learning :0.2084
I know how to store a single value in a dictionary. For example:
data={5: 0.67884, 1:0.1567, 12:3455}
But what I want to do now is store an array under each document number, like this:
import array
data={ 5:array([0.6841,0.2525,0.0000,0.0000,0.0000]), 1:array([0.0000,0.2372,0.0000,0.0000,0.1185]), 12:array([0.0000,0.0000,0.5647,0.0000,0.2084])}
My Python v2.6.5 doesn't seem to let me do this.
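A minimal sketch of how the dictionary above could be written, assuming the standard-library array module is what is intended. Two things differ from the attempt: `import array` only imports the module, so the class has to be imported by name (or called as `array.array`), and the constructor needs a typecode such as 'd' (double-precision float) as its first argument. The names `weights_for_doc_5` and `first_weight` are only illustrative.

```python
from array import array  # import the class itself, not just the module

# Each document number maps to an array of term weights.
# 'd' is the typecode for double-precision floats and is required.
data = {
    5: array('d', [0.6841, 0.2525, 0.0000, 0.0000, 0.0000]),
    1: array('d', [0.0000, 0.2372, 0.0000, 0.0000, 0.1185]),
    12: array('d', [0.0000, 0.0000, 0.5647, 0.0000, 0.2084]),
}

# Look up a whole array by document number, or a single weight in it.
weights_for_doc_5 = data[5]
first_weight = data[5][0]
```

Plain lists would work the same way as dictionary values; the array module mainly saves memory and does not provide matrix operations by itself.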
Assuming the above works, I want to compute a dot product or matrix product to find the similarity between pairs of documents. My idea is to arrange the arrays in a 3x5 matrix and multiply it by its transpose, which is 5x3. This returns a 3x3 matrix that tells me the relationship between each pair of documents. For example:
[ 5:[0.6841,0.2525,0.0000,0.0000,0.0000],
1:[0.0000, 0.2372,0.0000,0.0000,0.1185],
12:[0.0000,0.0000,0.5647,0.0000,0.2084] ]
and multiply it by its transpose (I am not sure how to do that), and the result will be a 3x3 matrix indexed by "DOCNO" by "DOCNO".
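The matrix-times-transpose idea can be sketched in plain Python, assuming the rows are kept as lists in a fixed order. Entry [i][j] of the product is simply the dot product of row i and row j, so the transpose never has to be built explicitly. The names `doc_ids`, `matrix`, `dot` and `similarity` are hypothetical; with NumPy installed, `numpy.dot(m, m.T)` on an array `m` would compute the same product.

```python
# Document numbers in a fixed order; row i of the matrix belongs to doc_ids[i].
doc_ids = [5, 1, 12]
matrix = [
    [0.6841, 0.2525, 0.0000, 0.0000, 0.0000],
    [0.0000, 0.2372, 0.0000, 0.0000, 0.1185],
    [0.0000, 0.0000, 0.5647, 0.0000, 0.2084],
]

def dot(u, v):
    """Dot product of two equal-length sequences of numbers."""
    return sum(a * b for a, b in zip(u, v))

# 3x5 matrix times its 5x3 transpose gives a 3x3 matrix:
# similarity[i][j] is the dot product of row i and row j.
similarity = [[dot(row_i, row_j) for row_j in matrix] for row_i in matrix]

# Map a pair of document numbers back to matrix indices.
i, j = doc_ids.index(5), doc_ids.index(1)
sim_5_1 = similarity[i][j]
```

Keeping `doc_ids` next to the matrix is what ties each row and column back to its DOCNO.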
The bottom line is that I need to be able to retrieve the DOCNO. For example, (5,1) shows the relationship between documents 5 and 1, and (1,12) shows the relationship between documents 1 and 12. I am not sure whether this is possible in Python, but other similar solutions would be appreciated. Thanks for your time.
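One possible way to get lookups like (5,1) directly is a dictionary keyed by tuples of document numbers. This is only a sketch under the assumption that the similarity is the plain dot product of the two weight lists; `pair_similarity` and `dot` are hypothetical names.

```python
# Term weights per document number (plain lists are enough here).
data = {
    5: [0.6841, 0.2525, 0.0000, 0.0000, 0.0000],
    1: [0.0000, 0.2372, 0.0000, 0.0000, 0.1185],
    12: [0.0000, 0.0000, 0.5647, 0.0000, 0.2084],
}

def dot(u, v):
    """Dot product of two equal-length sequences of numbers."""
    return sum(a * b for a, b in zip(u, v))

# Store the similarity for every ordered pair of document numbers,
# so both (5, 1) and (1, 5) can be looked up.
pair_similarity = {}
for doc_a in data:
    for doc_b in data:
        pair_similarity[(doc_a, doc_b)] = dot(data[doc_a], data[doc_b])

sim_5_1 = pair_similarity[(5, 1)]
sim_1_12 = pair_similarity[(1, 12)]
```

Storing both orderings costs a little extra memory but avoids having to normalise the key before each lookup.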