I'm working with biological data - namely groups of genes. For example:
group 1: geneA geneB geneC
group 2: geneD geneE
group 3: geneF geneG geneH
For each pair of genes, geneX and geneY, I have a score telling how similar the two genes are (actually, I have two scores, since I used BLAST, which is 'directional': I first searched geneX against all the other genes, then geneY against all the other genes, so I have two geneX--geneY scores, but I guess I can take the lower of the two, or the average).
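For concreteness, collapsing the two directional scores into one per pair might look like the sketch below. The function names, the dictionary layout and the score values are made up for illustration; real BLAST output would have to be parsed into this shape first. If a pair was only hit in one direction, this sketch simply uses the one score that exists, which is an assumption rather than a recommendation.

```python
def symmetrize(directional, combine=min):
    """Collapse {(query, subject): score} into one score per unordered gene pair."""
    grouped = {}
    for (query, subject), score in directional.items():
        if query == subject:
            continue  # ignore self-hits
        pair = frozenset((query, subject))
        grouped.setdefault(pair, []).append(score)
    # combine() sees one or two scores, depending on how many directions had a hit
    return {pair: combine(scores) for pair, scores in grouped.items()}


def mean(scores):
    return sum(scores) / len(scores)


# Hypothetical directional scores, not real BLAST results.
directional = {
    ("geneA", "geneB"): 120.0,
    ("geneB", "geneA"): 100.0,
    ("geneA", "geneD"): 40.0,  # only one direction produced a hit
}

lowest = symmetrize(directional)                 # lower of the two scores
average = symmetrize(directional, combine=mean)  # average of the two scores
```

Keying on `frozenset((x, y))` makes geneX--geneY and geneY--geneX the same entry, which is what turns the directional scores into undirected ones.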
So, let's suppose I have only one score for each pair of genes. My data can then be viewed as an undirected graph, with genes as nodes and each edge carrying the score attached to that pair.
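To make the graph view explicit, here is a minimal sketch of the data as an adjacency map, using only the standard library. The `build_graph` name and the scores are hypothetical, and the input is assumed to already hold one score per gene pair.

```python
from collections import defaultdict


def build_graph(pair_scores):
    """Build an undirected weighted graph as {gene: {neighbour: score}}."""
    graph = defaultdict(dict)
    for (gene_x, gene_y), score in pair_scores.items():
        # store the edge in both directions so lookups are symmetric
        graph[gene_x][gene_y] = score
        graph[gene_y][gene_x] = score
    return graph


# Hypothetical scores, one per gene pair.
pair_scores = {
    ("geneA", "geneB"): 100.0,
    ("geneA", "geneD"): 40.0,
    ("geneD", "geneE"): 85.0,
}

graph = build_graph(pair_scores)
neighbours_of_d = sorted(graph["geneD"])
```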
Now, what I would like to do is:
Visualize my data interactively: being able to click on gene nodes and open a link attached to them, show only edges above/below some threshold, control how the network is "spread", etc.
Cluster together groups which are similar, i.e. groups that have similar genes.
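A rough sketch of what I mean by the threshold filter and by clustering groups. Everything here is an assumption for illustration: group similarity is taken as the mean score over all cross-group gene pairs (missing pairs count as 0), groups are merged single-linkage style whenever that mean reaches a cutoff, and the scores and cutoffs are invented.

```python
def filter_edges(pair_scores, threshold):
    """Keep only the edges whose score is at or above the threshold."""
    return {pair: s for pair, s in pair_scores.items() if s >= threshold}


def group_similarity(group_a, group_b, pair_scores):
    """Mean score over all cross-group gene pairs; missing pairs count as 0."""
    total = sum(pair_scores.get(frozenset((x, y)), 0.0)
                for x in group_a for y in group_b)
    return total / (len(group_a) * len(group_b))


def cluster_groups(groups, pair_scores, threshold):
    """Merge groups whose similarity reaches the threshold (single linkage)."""
    names = list(groups)
    parent = {name: name for name in names}

    def find(name):
        # union-find lookup with path halving
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if group_similarity(groups[a], groups[b], pair_scores) >= threshold:
                parent[find(a)] = find(b)

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


groups = {
    "group 1": ["geneA", "geneB", "geneC"],
    "group 2": ["geneD", "geneE"],
    "group 3": ["geneF", "geneG", "geneH"],
}

# Hypothetical scores, one per unordered gene pair.
pair_scores = {
    frozenset(("geneA", "geneD")): 90.0,
    frozenset(("geneB", "geneE")): 80.0,
    frozenset(("geneC", "geneD")): 70.0,
    frozenset(("geneA", "geneF")): 10.0,
}

strong_edges = filter_edges(pair_scores, 75.0)
clusters = cluster_groups(groups, pair_scores, threshold=30.0)
```

With these invented numbers, group 1 and group 2 have a mean cross-group score of 40, so they end up in one cluster, while group 3 stays on its own.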
Any ideas on how I can do that? I guess it's basic clustering, and I would appreciate any hints on packages/software that could help here.
Thank you.