How would one implement this data structure in C. It is a structure similar to the DAWG, that is more efficient than the trie which only compresses prefixes. The CDAWG is a data structure twice as space efficient as the DAWG.
views:
781answers:
3Well like any DAG, I think you would have to choose between a matrix implementation or a linked list implementation. The matrix implementation is only appropriate for very dense graphs.
I'm unfamiliar with the particular data structure you are describing, and a quick Google for cdawg algorithm turns up lots of gated academic articles. Do you have a good description of how the algorithm works?
From what I can see from this paper
It's a trie with suffix compression to reduce the final state changes for a match, since I'd worked on something similar that, I had also considered doing that to save space. This was the solution I had thought of for the data structure, I'm interested to see if there are other approaches:
struct cdawg
{
int issuffix:1;
int length:31;
char *s; // suffix if issuffix == 1, else array of valid transition chars
struct cdawg *trans; // array of next states based on the index of trans char in s, null if suffix
};
Here's and algorithm to construct CDAWG
On-Line Construction of Compact Directed Acyclic Word Graphs