I've written an algorithm that calculates and stores all paths of a DAG. It works nicely on small graphs, but now I'm looking to improve its efficiency so that it can run over larger graphs. The core logic of the algorithm is in createSF() and makePathList() below; the other methods are helpers, and I can see that append() is a bottleneck. However, I suspect the biggest help would be a data structure that can store paths in a dictionary, since many of the paths are composed of other paths. This is the crux of my question.
private Multiset<String> paths = new Multiset<String>();

public Multiset<String> createSF(DAGNode n) {
    for (DAGNode succ : n.getSuccessors()) {
        createSF(succ);
    }
    if (!n.isVisited()) {
        for (String s : makePathList(n)) {
            paths.put(s);
        }
    }
    n.setVisited(true);
    return paths;
}

private List<String> makePathList(DAGNode n) {
    List<String> list = new ArrayList<String>();
    list.add(n.getLabel());
    for (DAGNode node : n.getSuccessors()) {
        list.addAll(append(n.getLabel(), makePathList(node)));
    }
    return list;
}

private List<String> append(String s, List<String> src) {
    List<String> ls = new ArrayList<String>();
    for (String str : src) {
        ls.add(s + "/" + str);
    }
    return ls;
}
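
To make the "paths composed of other paths" idea concrete, here is a minimal sketch of what such a structure might look like. It is only an illustration under assumptions: Node, SharedPath and PathIndex are hypothetical names that do not appear in the code above, and Node stands in for DAGNode. Each path is a linked list whose tail is an existing path object, and a dictionary keyed by node remembers the paths already computed for that node.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for DAGNode: just a label and a successor list.
final class Node {
    final String label;
    final List<Node> successors = new ArrayList<>();

    Node(String label) {
        this.label = label;
    }
}

// Immutable path stored as a linked list: a head label plus a shared tail.
final class SharedPath {
    final String head;
    final SharedPath tail; // null marks the end of the path

    SharedPath(String head, SharedPath tail) {
        this.head = head;
        this.tail = tail;
    }

    // Build the "a/b/c" string only when it is actually needed.
    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder(head);
        for (SharedPath p = tail; p != null; p = p.tail) {
            sb.append('/').append(p.head);
        }
        return sb.toString();
    }
}

final class PathIndex {
    // Dictionary from a node to every path that starts at that node.
    private final Map<Node, List<SharedPath>> cache = new HashMap<>();

    List<SharedPath> pathsFrom(Node n) {
        List<SharedPath> known = cache.get(n);
        if (known != null) {
            return known; // shared sub-DAG: reuse instead of recomputing
        }
        List<SharedPath> result = new ArrayList<>();
        result.add(new SharedPath(n.label, null));
        for (Node succ : n.successors) {
            for (SharedPath suffix : pathsFrom(succ)) {
                // Constant-time prepend; the suffix object is shared, not copied.
                result.add(new SharedPath(n.label, suffix));
            }
        }
        cache.put(n, result);
        return result;
    }
}
```

In this sketch each node's path list is computed once, and prepending a label does not copy the rest of the path. The number of paths in a DAG can still grow exponentially with its size, so this reduces the cost per path rather than the number of paths.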
EDIT:
I'm now using a Path object to represent paths and have pinpointed the bottleneck to these two methods:
public List<Path> createPathList(Tree n) {
    List<Path> list = new ArrayList<Path>();
    list.add(new Path(n.getNodeName()));
    for (Tree node : n.getSuccessors()) {
        list.addAll(append(n.getNodeName(), createPathList(node)));
    }
    return list;
}

public List<Path> append(String s, List<Path> src) {
    List<Path> ls = new ArrayList<Path>();
    for (Path path : src) {
        ls.add(new Path(path, s));
    }
    return ls;
}
The trouble is that when a graph has size M, these methods are called M times, which means a lot of lists are being created here. Is there a more efficient way to build up the return value of createPathList()?
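
For illustration, one possible direction is to pass a single output list down the recursion instead of returning a fresh list from every call. The following is only a sketch under assumptions: SimpleNode and PathCollector are hypothetical names, SimpleNode stands in for Tree, and plain strings stand in for Path so that the example is self-contained.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Tree: a name and a list of successors.
final class SimpleNode {
    final String name;
    final List<SimpleNode> successors = new ArrayList<>();

    SimpleNode(String name) {
        this.name = name;
    }
}

final class PathCollector {
    // Entry point: allocates the one list that every recursive call appends to.
    static List<String> createPathList(SimpleNode root) {
        List<String> out = new ArrayList<>();
        collect(root, "", out);
        return out;
    }

    // Depth-first walk that carries the path built so far as a prefix.
    private static void collect(SimpleNode n, String prefix, List<String> out) {
        String path = prefix.isEmpty() ? n.name : prefix + "/" + n.name;
        out.add(path);
        for (SimpleNode succ : n.successors) {
            collect(succ, path, out);
        }
    }
}
```

This sketch yields the same paths, in the same order, as the recursive version above, but allocates one list in total rather than two per call. Each path string is still built by concatenation, so the cost per path remains proportional to its length.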