ansaurus

Question

Follow-up to question on iterating over a graph using XML minidom

Answer 1

A:

I'm unsure about how to connect authors to each other.

You need to generate (author, otherauthor) pairs so you can add them as edges. The typical way to do that would be a nested iteration:

for thing in things:
    for otherthing in things:
        add_edge(thing, otherthing)

This is a naïve implementation that includes self-loops (an edge connecting an author to himself), which you may or may not want; it also includes both (1,2) and (2,1), which is redundant if you're building an undirected graph. (In Python 2.6, itertools.permutations also yields both orderings, though it does skip the self-pairs.) Here's a generator that fixes both problems:

def pairs(l):
    for i in range(len(l)-1):
        for j in range(i+1, len(l)):
            yield l[i], l[j]
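
As a quick check of what that generator yields, here is a small sketch run on a made-up list of names (the names are placeholders, not data from the question). It also compares the result against itertools.combinations, which was added in Python 2.6 alongside permutations and should give the same unordered pairs:

```python
from itertools import combinations, permutations

def pairs(l):
    """Yield each unordered pair of distinct positions in l exactly once."""
    for i in range(len(l) - 1):
        for j in range(i + 1, len(l)):
            yield l[i], l[j]

authors = ["Ann", "Bob", "Cy"]  # hypothetical sample names

unordered = list(pairs(authors))
print(unordered)

# combinations(authors, 2) yields the same pairs in the same order
print(list(combinations(authors, 2)))

# permutations(authors, 2) yields both orderings of every pair
print(len(list(permutations(authors, 2))))
```

If you don't need to support anything older than 2.6, combinations(auth_names, 2) could probably replace the hand-written generator.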

I've not used NetworkX, but looking at the doc it seems to say you can call add_node on the same node twice (with nothing happening the second time). If so, you can discard the dict you were using to try to keep track of what nodes you'd inserted. Also, it seems to say that if you add an edge to an unknown node, it'll add that node for you automatically. So it should be possible to make the code much shorter:

for conference in dom.getElementsByTagName('conference'):
    conf_name = conference.getAttribute('name')
    for paper in conference.getElementsByTagName('paper'):
        authors = paper.getElementsByTagName('author')
        auth_names = [author.firstChild.data.split('(')[0] for author in authors]

        # Note each author's conference attendance
        for auth_name in auth_names:
            G.add_edge(auth_name, conf_name)

        # Note each combination of authors working on the same paper
        for auth_name, other_name in pairs(auth_names):
            G.add_edge(auth_name, other_name)
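
To try the loop without NetworkX installed, here is a self-contained sketch under a few assumptions: the XML sample is invented (only the conference/paper/author tags and the name attribute come from the code above), a plain set of frozensets stands in for the graph G, itertools.combinations stands in for the pairs generator, and a strip() is added to drop the space left before the "(".

```python
from itertools import combinations
from xml.dom import minidom

# Hypothetical input; the real file from the question may be laid out differently.
SAMPLE = """<conferences>
  <conference name="ConfA">
    <paper>
      <author>Ann (Univ X)</author>
      <author>Bob (Univ Y)</author>
      <author>Cy</author>
    </paper>
    <paper>
      <author>Ann (Univ X)</author>
    </paper>
  </conference>
</conferences>"""

def collect_edges(xml_text):
    """Return undirected edges as a set of frozensets (a stand-in for G.add_edge)."""
    dom = minidom.parseString(xml_text)
    edges = set()
    for conference in dom.getElementsByTagName('conference'):
        conf_name = conference.getAttribute('name')
        for paper in conference.getElementsByTagName('paper'):
            authors = paper.getElementsByTagName('author')
            # Keep the text before any "(affiliation)" and trim whitespace (assumption)
            names = [a.firstChild.data.split('(')[0].strip() for a in authors]

            # Author-to-conference edges
            for name in names:
                edges.add(frozenset((name, conf_name)))

            # Author-to-author edges for co-authors of the same paper
            for name, other in combinations(names, 2):
                edges.add(frozenset((name, other)))
    return edges

edges = collect_edges(SAMPLE)
for edge in sorted(sorted(e) for e in edges):
    print(edge)
```

Because the edges live in a set, adding the same edge twice has no effect, which mirrors the NetworkX behaviour described above.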

bobince 2009-10-02 16:53:15

Answer 2

A:

I'm not entirely sure what you're looking for, but based on your description I threw together a graph that I think captures the relationships you describe.

http://imgur.com/o2HvT.png

I used OpenFst to do this. I find it much easier to lay out the graph relationships clearly before plunging into the code for something like this.

Also, do you actually need to generate an explicit edge between authors? This seems like a traversal issue.
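
One way to read the traversal idea, as a rough sketch only: keep just the author-to-paper links and find co-authors by walking author -> paper -> author when you need them. The paper IDs, names and the coauthors helper below are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical data: paper id -> list of author names
papers = {
    "P1": ["Ann", "Bob", "Cy"],
    "P2": ["Ann", "Dee"],
}

# Reverse index: author -> set of paper ids
papers_by_author = defaultdict(set)
for paper_id, names in papers.items():
    for name in names:
        papers_by_author[name].add(paper_id)

def coauthors(author):
    """Walk author -> papers -> authors; no author-to-author edges are stored."""
    found = set()
    for paper_id in papers_by_author.get(author, ()):
        found.update(papers[paper_id])
    found.discard(author)  # an author is not their own co-author
    return found

print(sorted(coauthors("Ann")))
print(sorted(coauthors("Dee")))
```

Whether this beats explicit edges likely depends on how often you query co-authorship versus how large the graph gets.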

blackkettle 2009-10-02 16:59:43
