I have to tag sentences using the Stanford Parser.

For each sentence I load the EnglishPCFGrammer file and find the tags using the Stanford Parser. It works fine for a single sentence, but when I give it multiple sentences I get this exception. Can someone help?


Loading parser from serialized file englishPCFG.ser.gz ... done [7.7 sec].
Exception in thread "AWT-EventQueue-1" java.lang.OutOfMemoryError: Java heap space
        at edu.stanford.nlp.parser.lexparser.ExhaustivePCFGParser.createArrays(ExhaustivePCFGParser.java:2056)
        at edu.stanford.nlp.parser.lexparser.ExhaustivePCFGParser.considerCreatingArrays(ExhaustivePCFGParser.java:2027)
        at edu.stanford.nlp.parser.lexparser.ExhaustivePCFGParser.parse(ExhaustivePCFGParser.java:315)
        at edu.stanford.nlp.parser.lexparser.LexicalizedParser.parse(LexicalizedParser.java:375)
        at edu.stanford.nlp.parser.lexparser.LexicalizedParser.apply(LexicalizedParser.java:279)
        at finalproj.logic.SentenceTagger.tagSentence(SentenceTagger.java:25)
        at finalproj.logic.SentenceToXMLWriter.addSentence(SentenceToXMLWriter.java:62)
        at finalproj.logic.SentenceSplitter.splitFile(SentenceSplitter.java:24)
        at finalproj.ui.LoadReqFile.jBtnSentenceSplitActionPerformed(LoadReqFile.java:126)
        at finalproj.ui.LoadReqFile.access$200(LoadReqFile.java:21)
        at finalproj.ui.LoadReqFile$3.actionPerformed(LoadReqFile.java:70)
        at javax.swing.AbstractButton.fireActionPerformed(AbstractButton.java:1995)
        at javax.swing.AbstractButton$Handler.actionPerformed(AbstractButton.java:2318)
        at javax.swing.DefaultButtonModel.fireActionPerformed(DefaultButtonModel.java:387)
        at javax.swing.DefaultButtonModel.setPressed(DefaultButtonModel.java:242)
        at javax.swing.plaf.basic.BasicButtonListener.mouseReleased(BasicButtonListener.java:236)
        at java.awt.Component.processMouseEvent(Component.java:6038)
        at javax.swing.JComponent.processMouseEvent(JComponent.java:3260)
        at java.awt.Component.processEvent(Component.java:5803)
        at java.awt.Container.processEvent(Container.java:2058)
        at java.awt.Component.dispatchEventImpl(Component.java:4410)
        at java.awt.Container.dispatchEventImpl(Container.java:2116)
        at java.awt.Component.dispatchEvent(Component.java:4240)
        at java.awt.LightweightDispatcher.retargetMouseEvent(Container.java:4322)
        at java.awt.LightweightDispatcher.processMouseEvent(Container.java:3986)
        at java.awt.LightweightDispatcher.dispatchEvent(Container.java:3916)
        at java.awt.Container.dispatchEventImpl(Container.java:2102)
        at java.awt.Component.dispatchEvent(Component.java:4240)
        at java.awt.EventQueue.dispatchEvent(EventQueue.java:599)
        at java.awt.EventDispatchThread.pumpOneEventForFilters(EventDispatchThread.java:273)
        at java.awt.EventDispatchThread.pumpEventsForFilter(EventDispatchThread.java:183)
        at java.awt.EventDispatchThread.pumpEventsForHierarchy(EventDispatchThread.java:173)
A: 

You need to start Java with extra parameters to increase the heap space. The default maximum heap on older JVMs is 64 MB, which is evidently not sufficient here:

java -Xms128m -Xmx256m

This starts the JVM with an initial heap of 128 MB and allows it to grow to a maximum of 256 MB. You may need to experiment to find values that work.
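
To confirm that the heap settings actually took effect, you can ask the JVM for its maximum heap at runtime. This is only a minimal sketch: the class name `HeapInfo` and the helper `maxHeapMegabytes` are made up for illustration, while `Runtime.getRuntime().maxMemory()` is the standard call.

```java
// Hypothetical helper class, not part of the Stanford Parser.
class HeapInfo {
    /** Returns the maximum heap the JVM will attempt to use, in megabytes. */
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Run with e.g. -Xmx256m and compare the printed value.
        System.out.println("Max heap: " + maxHeapMegabytes() + " MB");
    }
}
```

The reported value is usually a little below the -Xmx figure, since the JVM reserves part of the heap for its own bookkeeping.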


Note: the 'about' page of the Stanford Parser contains a short Java command-line example (very last line) that uses a heap size of 200 MB. I'd take this as a hint that the parser itself requires more than 64 MB of memory. (Additional note: that command-line example has a syntax error; my example above is correct.)

Andreas_D
A: 

for each sentence i load the EnglishPCFGrammer file

You definitely don't want to load the file for each sentence. You only need to load it once, at the beginning of your parsing process, and then reuse the loaded parser for each sentence.
Reloading the file for every sentence could well be what causes the OutOfMemoryError.
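
The load-once idea can be sketched roughly as follows. Note that `Model` here is a hypothetical stand-in for the expensive-to-load parser, not the Stanford API, and `tagAll` and `loadCount` are names invented for this illustration; in your code the construction step would be whatever call currently loads englishPCFG.ser.gz.

```java
import java.util.ArrayList;
import java.util.List;

class LoadOnceSketch {
    /** Hypothetical stand-in for an expensive-to-load model such as a serialized grammar. */
    static final class Model {
        static int loadCount = 0; // counts how often the "file" is loaded

        Model(String path) {
            loadCount++; // a real loader would read and deserialize the file here
        }

        String tag(String sentence) {
            return "[tagged] " + sentence; // placeholder for real tagging
        }
    }

    /** Loads the model once, then reuses the same instance for every sentence. */
    static List<String> tagAll(String modelPath, List<String> sentences) {
        Model model = new Model(modelPath); // loaded once, outside the loop
        List<String> tagged = new ArrayList<>();
        for (String sentence : sentences) {
            tagged.add(model.tag(sentence));
        }
        return tagged;
    }

    public static void main(String[] args) {
        List<String> sentences = new ArrayList<>();
        sentences.add("First sentence.");
        sentences.add("Second sentence.");
        System.out.println(tagAll("englishPCFG.ser.gz", sentences));
        System.out.println("Model loads: " + Model.loadCount);
    }
}
```

In the question's code, this would mean creating the parser outside `tagSentence` (for example in a field or constructor) and passing or reusing that one instance.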

Shakedown