I have a data frame detailing edge weights among N nodes. Is there a package for working with this sort of data?

For example, I would like to plot the following information as a network:

  p1 p2 counts
1  a  b    100
2  a  c    200
3  a  d    100
4  b  c     80
5  b  d     90
6  b  e    100
7  c  d    100
8  c  e     40
9  d  e     60
+7  A: 

One option is the network package, part of the statnet family of R packages for statistical social network analysis. It handles network data in a sparse way, which is nice for larger data sets.

Below, I do the following:

  • load the edge list (the first two columns) into a network object
  • assign the counts to an edge attribute called weight
  • plot the network with gplot from the sna package (see its help page for changing the thickness of the edges)
  • plot a sociomatrix (a 5x5 grid of blocks representing the adjacency matrix, where the (i,j) cell is shaded by the relative count)

A = read.table(file="so.txt", header=TRUE)
A
      p1 p2 counts
    1  a  b    100
    2  a  c    200
    3  a  d    100
    4  b  c     80
    5  b  d     90
    6  b  e    100
    7  c  d    100
    8  c  e     40
    9  d  e     60

library(network)
library(sna)  # gplot() and plot.sociomatrix() come from sna, not network
net = network(A[,1:2])
# Get summary information about your network
net
     Network attributes:
      vertices = 5 
      directed = TRUE 
      hyper = FALSE 
      loops = FALSE 
      multiple = FALSE 
      bipartite = FALSE 
      total edges= 9 
        missing edges= 0 
        non-missing edges= 9 
        Vertex attribute names: 
        vertex.names 
     adjacency matrix:
      a b c d e
    a 0 1 1 1 0
    b 0 0 1 1 1
    c 0 0 0 1 1
    d 0 0 0 0 1
    e 0 0 0 0 0

set.edge.attribute(net,"weight",A[,3])
gplot(net)

## Another cool feature
s = as.sociomatrix(net,attrname="weight")
plot.sociomatrix(s)
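
As a rough sketch of the edge-thickness idea mentioned above: gplot in sna takes an edge.lwd argument, and passing it a matrix of widths should set each edge's width from the matching cell. The data frame is rebuilt inline here instead of being read from so.txt, and the maximum width of 5 is an arbitrary choice for illustration.

```r
library(network)  # network objects and as.sociomatrix()
library(sna)      # gplot()

# Same edge list as in the question, built inline for a self-contained example
A <- data.frame(
  p1 = c("a", "a", "a", "b", "b", "b", "c", "c", "d"),
  p2 = c("b", "c", "d", "c", "d", "e", "d", "e", "e"),
  counts = c(100, 200, 100, 80, 90, 100, 100, 40, 60),
  stringsAsFactors = FALSE
)

net <- network(A[, 1:2])
set.edge.attribute(net, "weight", A$counts)

# Weighted adjacency matrix, with the counts in the cells
s <- as.sociomatrix(net, attrname = "weight")

# Rescale the counts so the heaviest edge is drawn with line width 5 (assumed scale)
w <- 5 * s / max(s)

gplot(s, label = rownames(s), edge.lwd = w)
```

Rescaling matters because the raw counts (40 to 200) would be far too thick if used as line widths directly.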
Christopher DuBois
+1  A: 

Here's how to make a network plot of the data in igraph:

d <- data.frame(p1=c('a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'd'),
                p2=c('b', 'c', 'd', 'c', 'd', 'e', 'd', 'e', 'e'),
                counts=c(100, 200, 100,80, 90,100, 100,40,60))

library(igraph)
# graph_from_data_frame() is the current name for graph.data.frame()
g <- graph_from_data_frame(d, directed=TRUE)
print(g, e=TRUE, v=TRUE)
tkplot(g, vertex.label=V(g)$name)
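
To make the counts visible in the plot, one possibility is to map them to edge width with igraph's static plot function. This is only a sketch: the extra columns of the data frame are stored as edge attributes, so the counts are available as E(g)$counts, and the divisor of 40 is an arbitrary scaling picked for this data.

```r
library(igraph)

d <- data.frame(
  p1 = c("a", "a", "a", "b", "b", "b", "c", "c", "d"),
  p2 = c("b", "c", "d", "c", "d", "e", "d", "e", "e"),
  counts = c(100, 200, 100, 80, 90, 100, 100, 40, 60),
  stringsAsFactors = FALSE
)

# The first two columns define the edges; "counts" becomes an edge attribute
g <- graph_from_data_frame(d, directed = TRUE)

# Draw heavier edges thicker; dividing by 40 is an assumed scale, so the
# lightest edge (40) gets width 1 and the heaviest (200) gets width 5
plot(g, vertex.label = V(g)$name, edge.width = E(g)$counts / 40)
```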
ars
A: 

I've also been working in igraph. One way to create a graph is to write a list of all "from" and "to" node pairs to a text file and read it back in as a graph object. The graph object can then be run through many graph-theoretic algorithms, and igraph can handle quite large networks.
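
A minimal sketch of that round trip, assuming the file holds one "from to weight" triple per line so that igraph's "ncol" reader can parse it. The temporary file name and the variable names are just for illustration.

```r
library(igraph)

d <- data.frame(
  p1 = c("a", "a", "a", "b", "b", "b", "c", "c", "d"),
  p2 = c("b", "c", "d", "c", "d", "e", "d", "e", "e"),
  counts = c(100, 200, 100, 80, 90, 100, 100, 40, 60),
  stringsAsFactors = FALSE
)

# Write one space-separated "from to weight" line per edge, no header or quotes
edge_file <- tempfile(fileext = ".txt")
write.table(d, edge_file, quote = FALSE, row.names = FALSE, col.names = FALSE)

# Read the file back as a directed graph; the third column becomes E(g)$weight
g <- read_graph(edge_file, format = "ncol", directed = TRUE)
```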

kpierce8
A: 

igraph is my favorite package for large, graph-theoretic work. It is memory efficient and has some very good algorithms. Internally, igraph uses an edgelist-like data structure.
For simpler or smaller things I tend to use the 'sna' package ("social network analysis"). It's great for interactive work and for plotting smaller networks. sna relies more on an adjacency-matrix data structure.
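
To illustrate the difference between the two representations, here is a sketch in plain R. It shows the idea only and is not how either package stores its data internally: the edge list keeps one row per edge, while the adjacency matrix keeps one cell per ordered pair of nodes.

```r
# Edge-list form: one row per edge
d <- data.frame(
  p1 = c("a", "a", "a", "b", "b", "b", "c", "c", "d"),
  p2 = c("b", "c", "d", "c", "d", "e", "d", "e", "e"),
  counts = c(100, 200, 100, 80, 90, 100, 100, 40, 60),
  stringsAsFactors = FALSE
)

# Adjacency-matrix form: one cell per ordered node pair, 0 where there is no edge
nodes <- sort(unique(c(d$p1, d$p2)))
adj <- matrix(0, nrow = length(nodes), ncol = length(nodes),
              dimnames = list(nodes, nodes))
adj[cbind(d$p1, d$p2)] <- d$counts  # fill cells by (from, to) name

nrow(d)      # 9 edges stored
length(adj)  # 25 cells stored, most of them zero
```

The matrix grows with the square of the number of nodes, which is why the edge-list style tends to scale better for large, sparse networks.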

Peter McMahan
The network package also has a sparse implementation.
Christopher DuBois