views:

781

answers:

3

How would one implement this data structure in C. It is a structure similar to the DAWG, that is more efficient than the trie which only compresses prefixes. The CDAWG is a data structure twice as space efficient as the DAWG.

A: 

Well like any DAG, I think you would have to choose between a matrix implementation or a linked list implementation. The matrix implementation is only appropriate for very dense graphs.

I'm unfamiliar with the particular data structure you are describing, and a quick Google for cdawg algorithm turns up lots of gated academic articles. Do you have a good description of how the algorithm works?

Kip
+2  A: 

From what I can see from this paper

It's a trie with suffix compression to reduce the final state changes for a match, since I'd worked on something similar that, I had also considered doing that to save space. This was the solution I had thought of for the data structure, I'm interested to see if there are other approaches:

struct cdawg
{
 int issuffix:1;
 int length:31;
 char *s; // suffix if issuffix == 1, else array of valid transition chars
 struct cdawg *trans; // array of next states based on the index of trans char in s, null if suffix
};
sfossen
A: 

Here's and algorithm to construct CDAWG

On-Line Construction of Compact Directed Acyclic Word Graphs

lalitm