I am parsing with a fairly large grammar (1.1 GB; it comes from data-oriented parsing). The parser I use (bitpar) is said to be optimized for highly ambiguous grammars. I'm getting this error:
terminate called after throwing an instance of 'std::bad_alloc'
what(): St9bad_alloc
dotest.sh: line 11: 16686 Aborted bitpar -p -b 1 -s top -u unknownwordsm -w pos.dfsa /tmp/gsyntax.pcfg /tmp/gsyntax.lex arbobanko.test arbobanko.results
Is there hope? Does it mean that it has run out of memory? It uses about 15 GB before it crashes. The machine I'm using has 32 GB of RAM, plus swap. It crashes before outputting a single parse tree; I think it crashes after reading the grammar, while trying to construct the chart for the first sentence.
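Since the crash comes at about 15 GB on a 32 GB machine, one thing worth ruling out is a per-process address-space limit (what `ulimit -v` reports). A minimal sketch of how that could be checked from Python, assuming a Unix system and that it is run from the same shell as dotest.sh so the limits are inherited; the helper names are mine, not part of bitpar:

```python
import resource


def describe_limit(name, value):
    """Format one rlimit value in GiB, or as 'unlimited'."""
    if value == resource.RLIM_INFINITY:
        return f"{name}: unlimited"
    return f"{name}: {value / 2**30:.2f} GiB"


def address_space_limits():
    """Return the (soft, hard) address-space limits of this process."""
    return resource.getrlimit(resource.RLIMIT_AS)


if __name__ == "__main__":
    soft, hard = address_space_limits()
    print(describe_limit("soft", soft))
    print(describe_limit("hard", hard))
```

If both come back as unlimited, the limit is not the cause and the allocation is failing for some other reason.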
The parser is an efficient CYK chart parser using bit vector representations; I presume it is already pretty memory efficient. If it really requires too much memory I could sample from the grammar rules, but this will decrease parse accuracy of course.
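To get a feel for the numbers, here is a back-of-envelope sketch of how large a bit-vector CYK chart would be. It assumes one bit per (span, non-terminal) pair and nothing else; that is my simplification, not bitpar's actual memory layout, and the non-terminal counts below are made-up examples rather than figures from my grammar.

```python
def chart_bytes(sentence_length: int, num_nonterminals: int) -> int:
    """Bytes for a CYK chart holding one bit per (span, non-terminal).

    Assumed model: n * (n + 1) / 2 spans, each with a bit vector
    rounded up to whole bytes.
    """
    spans = sentence_length * (sentence_length + 1) // 2
    bytes_per_cell = (num_nonterminals + 7) // 8
    return spans * bytes_per_cell


if __name__ == "__main__":
    # Hypothetical grammar sizes, for illustration only.
    for n_nt in (10_000, 1_000_000, 10_000_000):
        gib = chart_bytes(40, n_nt) / 2**30
        print(f"{n_nt:>10} non-terminals, 40 words: {gib:.3f} GiB")
```

Under this simple model the bit chart by itself stays well below the memory I am seeing, so the bulk may be going somewhere else (the loaded grammar, or whatever is stored for extracting the best parse), but that is only a guess.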
I think the problem is probably that I have a very large number of non-terminals. Should I look for a different parser instead? Any suggestions?