microth / pathlstm
Neural SRL model
Hi, Michael,
For example, for the sentence "My cat is sitting on my book.", the output labels "cat" and "book" as the predicates cat.01 and book.01. But "cat" and "book" are nouns in this sentence (which the POS column correctly identifies).
Shouldn't each predicate be a VB*? I am quite confused.
Bill
Hey microth,
After fixing the previous bugs, I ran into a new issue.
My command:
java -Xmx40g -cp "libs/anna-3.3.jar:target/classes" se.lth.cs.srl.CompletePipeline eng -lemma models/CoNLL2009-ST-English-ALL.anna-3.3.lemmatizer.model -tagger models/CoNLL2009-ST-English-ALL.anna-3.3.postagger.model -parser models/CoNLL2009-ST-English-ALL.anna-3.3.parser.model -srl models/srl-ACL2016-eng.model -tokenize -reranker -externalNNs -test models/text_to_parse.txt
The output:
9.40.461 is2.data.ParametersFloat 121:read -> read parameters 134217727 not zero 296071
9.40.469 is2.data.Cluster 113:<init> -> Read cluster with 0 words
9.40.469 is2.lemmatizer.Lemmatizer 192:readModel -> Loading data finished.
9.40.469 is2.lemmatizer.Lemmatizer 194:readModel -> number of params 134217727
9.40.469 is2.lemmatizer.Lemmatizer 195:readModel -> number of classes 92
9.44.359 is2.data.ParametersFloat 121:read -> read parameters 134217727 not zero 1613201
9.44.359 is2.data.Cluster 113:<init> -> Read cluster with 0 words
9.44.360 is2.tag.Lexicon 103:<init> -> Read lexicon with 0 words
9.44.360 is2.tag.Tagger 141:readModel -> Loading data finished.
9.44.462 is2.parser.Parser 188:readModel -> Reading data started
9.44.518 is2.data.Cluster 113:<init> -> Read cluster with 0 words
9.49.103 is2.parser.ParametersFloat 101:read -> read parameters 134217727 not zero 19957525
9.49.103 is2.parser.Parser 201:readModel -> parsing -- li size 134217727
9.49.108 is2.parser.Parser 211:readModel -> Stacking false
9.49.108 is2.parser.Extractor 56:initStat -> mult (d4)
Used parser class is2.parser.Parser
Creation date 2012.11.02 14:33:53
Training data CoNLL2009-ST-English-ALL.txt.crossannotated
Iterations 10 Used sentences 10000000
Cluster null
9.49.110 is2.parser.Parser 240:readModel -> Reading data finnished
9.49.110 is2.parser.Extractor 56:initStat -> mult (d4)
Loading pipeline from models/srl-ACL2016-eng.model
Loading reranker from models/srl-ACL2016-eng.model
Writing corpus to out.txt...
Exception in thread "main" java.lang.NoClassDefFoundError: edu/stanford/nlp/process/PTBTokenizer
at se.lth.cs.srl.preprocessor.tokenization.StanfordPTBTokenizer.tokenizeplus(StanfordPTBTokenizer.java:35)
at se.lth.cs.srl.preprocessor.Preprocessor.tokenizeplus(Preprocessor.java:37)
at se.lth.cs.srl.CompletePipeline.parse(CompletePipeline.java:73)
at se.lth.cs.srl.CompletePipeline.parseNonSegmentedLineByLine(CompletePipeline.java:165)
at se.lth.cs.srl.CompletePipeline.main(CompletePipeline.java:138)
Caused by: java.lang.ClassNotFoundException: edu.stanford.nlp.process.PTBTokenizer
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 5 more
Concerning the downloaded files, stanford-corenlp-full-2016-10-31 is unzipped into the libs/ sub-directory.
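For what it's worth, this NoClassDefFoundError usually means the Stanford CoreNLP jars are not on the classpath: unzipping the distribution into libs/ is not enough, because the -cp above lists only anna-3.3.jar and target/classes. A minimal sketch (the directory name is taken from the message above; adjust it to your layout):

```shell
# Build a classpath that also includes every jar from the unzipped CoreNLP
# distribution, then pass it to java via -cp (a sketch, not the official script).
CP="libs/anna-3.3.jar:target/classes"
for jar in libs/stanford-corenlp-full-2016-10-31/*.jar; do
  if [ -e "$jar" ]; then   # the glob may match nothing on other setups
    CP="$CP:$jar"
  fi
done
echo "$CP"
# java -Xmx40g -cp "$CP" se.lth.cs.srl.CompletePipeline eng ... (same flags as above)
```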
Best,
Julien
Where can I download the models listed below?
LEMMA_MODEL=models/lemma-eng.model
POS_MODEL=models/tagger-eng.model
PARSER_MODEL=models/parse-eng.model
SRL_MODEL=models/srl-ACL2016-eng.model
Hello,
I didn't really understand how to use the parse.sh script for FrameNet SRL. I have all the models and libraries, including the BIST parser and NLP4J, and FrameNet 1.5 (downloaded using NLTK).
From here, I'm pretty lost. How do I retrain the BIST parser and NLP4J with the 10-fold jackknifing to recreate your results? Do I even have to do that?
Sorry if it's a newbie question, and thank you for your time!
I got the error above in two cases: when there are empty lines in the input file (so I got rid of them), and again immediately after the message ERROR: sentence length mismatches token number in Stanford annotation; maybe it has something to do with one of the words in that sentence being "voilà", with an accented letter.
Is there a flag I can pass so that the pipeline silently ignores such errors? On the same note, I have 23M sentences to label: do you think it's better to split them into N files and run N processes of parse_fn.sh, or should I stick with my current single file of 23M sentences?
Thanks!
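On the 23M-sentence question: sharding the input is straightforward with GNU split, and N shards let N independent pipeline processes run in parallel (the model-loading cost is paid once per process either way). A sketch with placeholder file names, assuming GNU split:

```shell
# Shard a large one-sentence-per-line file into fixed-size chunks so that
# several parse_fn.sh processes can run in parallel, one per shard.
printf 'a\nb\nc\nd\ne\n' > input.txt   # stand-in for the real 23M-line file
split -l 2 -d input.txt shard_         # for the real file: split -l 1000000
ls shard_*                             # shard_00 shard_01 shard_02
```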
Hello Mike,
I have trouble with the parse method once the model is loaded; it throws a NullPointerException in LibLinearModel.java (line 48 in the trace). Here is my stack trace:
`java.lang.NullPointerException
at uk.ac.ed.inf.srl.ml.liblinear.LibLinearModel.classify(LibLinearModel.java:48)
at se.lth.cs.srl.pipeline.ArgumentStep.classifyInstance(ArgumentStep.java:143)
at se.lth.cs.srl.pipeline.ArgumentIdentifier.parse(ArgumentIdentifier.java:43)
at se.lth.cs.srl.pipeline.Pipeline.parse(Pipeline.java:104)
at se.lth.cs.srl.SemanticRoleLabeler.parseSentence(SemanticRoleLabeler.java:12)
at se.lth.cs.srl.CompletePipeline.parse(CompletePipeline.java:79)
at scroll.mate.MateAnnotations.parse(MateAnnotations.java:36)`
Thanks in advance,
Daniel
I tried using "pathlstm.jar" directly, as I was unable to compile through "mvn compile", and I am getting an error. Can you please tell me if I am doing something wrong?
java -Xmx40g -cp libs/anna-3.3.jar:target/pathlstm.jar se.lth.cs.srl.CompletePipeline eng -lemma models/CoNLL2009-ST-English-ALL.anna-3.3.lemmatizer.model -tagger models/CoNLL2009-ST-English-ALL.anna-3.3.postagger.model -parser models/CoNLL2009-ST-English-ALL.anna-3.3.parser.model -srl models/srl-ACL2016-eng.model -tokenize -reranker -externalNNs -test sample.txt
54.21.744 is2.data.ParametersFloat 121:read -> read parameters 134217727 not zero 296071
54.21.763 is2.data.Cluster 113:&lt;init&gt; -> Read cluster with 0 words
54.21.764 is2.lemmatizer.Lemmatizer 192:readModel -> Loading data finished.
54.21.764 is2.lemmatizer.Lemmatizer 194:readModel -> number of params 134217727
54.21.765 is2.lemmatizer.Lemmatizer 195:readModel -> number of classes 92
54.26.6 is2.data.ParametersFloat 121:read -> read parameters 134217727 not zero 1613201
54.26.6 is2.data.Cluster 113:&lt;init&gt; -> Read cluster with 0 words
54.26.7 is2.tag.Lexicon 103:&lt;init&gt; -> Read lexicon with 0 words
54.26.7 is2.tag.Tagger 141:readModel -> Loading data finished.
54.26.55 is2.parser.Parser 188:readModel -> Reading data started
54.26.102 is2.data.Cluster 113:&lt;init&gt; -> Read cluster with 0 words
54.31.336 is2.parser.ParametersFloat 101:read -> read parameters 134217727 not zero 19957525
54.31.336 is2.parser.Parser 201:readModel -> parsing -- li size 134217727
54.31.354 is2.parser.Parser 211:readModel -> Stacking false
54.31.355 is2.parser.Extractor 56:initStat -> mult (d4)
Used parser class is2.parser.Parser
Creation date 2012.11.02 14:33:53
Training data CoNLL2009-ST-English-ALL.txt.crossannotated
Iterations 10 Used sentences 10000000
Cluster null
54.31.361 is2.parser.Parser 240:readModel -> Reading data finnished
54.31.363 is2.parser.Extractor 56:initStat -> mult (d4)
Loading pipeline from models/srl-ACL2016-eng.model
Loading reranker from models/srl-ACL2016-eng.model
Writing corpus to out.txt...
Exception in thread "main" java.lang.Error: Unresolved compilation problems:
PTBTokenizer cannot be resolved to a type
Word cannot be resolved to a type
PTBTokenizer cannot be resolved
Word cannot be resolved to a type
at se.lth.cs.srl.preprocessor.tokenization.StanfordPTBTokenizer.tokenizeplus(StanfordPTBTokenizer.java:35)
at se.lth.cs.srl.preprocessor.Preprocessor.tokenizeplus(Preprocessor.java:37)
at se.lth.cs.srl.CompletePipeline.parse(CompletePipeline.java:73)
at se.lth.cs.srl.CompletePipeline.parseNonSegmentedLineByLine(CompletePipeline.java:165)
at se.lth.cs.srl.CompletePipeline.main(CompletePipeline.java:138)
Hello @microth,
I have tried running PathLSTM, but it seems some classes are missing at runtime. Could you instruct me any further?
Loading pipeline from models\srl-ACL2016-eng.model
java.lang.ClassNotFoundException: uk.ac.ed.inf.srl.features.FeatureGenerator
at java.net.URLClassLoader.findClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Unknown Source)
at java.io.ObjectInputStream.resolveClass(Unknown Source)
at java.io.ObjectInputStream.readNonProxyDesc(Unknown Source)
at java.io.ObjectInputStream.readClassDesc(Unknown Source)
at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
at java.io.ObjectInputStream.readObject0(Unknown Source)
at java.io.ObjectInputStream.readObject(Unknown Source)
at se.lth.cs.srl.pipeline.Pipeline.fromZipFile(Pipeline.java:192)
at se.lth.cs.srl.pipeline.Pipeline.fromZipFile(Pipeline.java:226)
at se.lth.cs.srl.pipeline.Reranker.<init>(Reranker.java:63)
at se.lth.cs.srl.CompletePipeline.getCompletePipeline(CompletePipeline.java:52)
at se.lth.cs.srl.CompletePipeline.main(CompletePipeline.java:122)
Writing corpus to out.txt...
Exception in thread "main" java.lang.NullPointerException
at se.lth.cs.srl.pipeline.Reranker.parse(Reranker.java:96)
at se.lth.cs.srl.SemanticRoleLabeler.parseSentence(SemanticRoleLabeler.java:12)
at se.lth.cs.srl.CompletePipeline.parseX(CompletePipeline.java:93)
at se.lth.cs.srl.CompletePipeline.parse(CompletePipeline.java:73)
at se.lth.cs.srl.CompletePipeline.parseNonSegmentedLineByLine(CompletePipeline.java:165)
at se.lth.cs.srl.CompletePipeline.main(CompletePipeline.java:138)
I used the pre-illinois-built branch because the main branch gives me:
[INFO] 16 errors
[INFO] -------------------------------------------------------------
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 7.809s
[INFO] Finished at: Sat Apr 29 18:18:38 CEST 2017
[INFO] Final Memory: 22M/346M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.6.0:compile (default-compile) on project PathLSTM: Compilation failure: Compilation failure:
[ERROR] /C:/Users/Daniel/Desktop/PathLSTM-master/src/main/java/se/lth/cs/srl/pipeline/LBJavaArgumentClassifier.java:[6,44] package edu.illinois.cs.cogcomp.chunker.main does not exist
[ERROR] /C:/Users/Daniel/Desktop/PathLSTM-master/src/main/java/se/lth/cs/srl/pipeline/LBJavaArgumentClassifier.java:[12,40] package edu.illinois.cs.cogcomp.depparse does not exist
[ERROR] /C:/Users/Daniel/Desktop/PathLSTM-master/src/main/java/se/lth/cs/srl/pipeline/LBJavaArgumentClassifier.java:[16,46] package edu.illinois.cs.cogcomp.nlp.lemmatizer does not exist
[ERROR] /C:/Users/Daniel/Desktop/PathLSTM-master/src/main/java/se/lth/cs/srl/pipeline/LBJavaArgumentClassifier.java:[17,35] package edu.illinois.cs.cogcomp.pos does not exist
[ERROR] /C:/Users/Daniel/Desktop/PathLSTM-master/src/main/java/se/lth/cs/srl/preprocessor/IllinoisPreprocessor.java:[6,44] package edu.illinois.cs.cogcomp.chunker.main does not exist
[ERROR] /C:/Users/Daniel/Desktop/PathLSTM-master/src/main/java/se/lth/cs/srl/preprocessor/IllinoisPreprocessor.java:[11,40] package edu.illinois.cs.cogcomp.depparse does not exist
[ERROR] /C:/Users/Daniel/Desktop/PathLSTM-master/src/main/java/se/lth/cs/srl/preprocessor/IllinoisPreprocessor.java:[12,46] package edu.illinois.cs.cogcomp.nlp.lemmatizer does not exist
[ERROR] /C:/Users/Daniel/Desktop/PathLSTM-master/src/main/java/se/lth/cs/srl/preprocessor/IllinoisPreprocessor.java:[13,35] package edu.illinois.cs.cogcomp.pos does not exist
[ERROR] /C:/Users/Daniel/Desktop/PathLSTM-master/src/main/java/se/lth/cs/srl/pipeline/LBJavaArgumentClassifier.java:[44,37] cannot find symbol
[ERROR] symbol: class POSAnnotator
[ERROR] location: class se.lth.cs.srl.pipeline.LBJavaArgumentClassifier
[ERROR] /C:/Users/Daniel/Desktop/PathLSTM-master/src/main/java/se/lth/cs/srl/pipeline/LBJavaArgumentClassifier.java:[45,39] cannot find symbol
[ERROR] symbol: class IllinoisLemmatizer
[ERROR] location: class se.lth.cs.srl.pipeline.LBJavaArgumentClassifier
[ERROR] /C:/Users/Daniel/Desktop/PathLSTM-master/src/main/java/se/lth/cs/srl/pipeline/LBJavaArgumentClassifier.java:[46,39] cannot find symbol
[ERROR] symbol: class ChunkerAnnotator
[ERROR] location: class se.lth.cs.srl.pipeline.LBJavaArgumentClassifier
[ERROR] /C:/Users/Daniel/Desktop/PathLSTM-master/src/main/java/se/lth/cs/srl/pipeline/LBJavaArgumentClassifier.java:[47,40] cannot find symbol
[ERROR] symbol: class DepAnnotator
[ERROR] location: class se.lth.cs.srl.pipeline.LBJavaArgumentClassifier
[ERROR] /C:/Users/Daniel/Desktop/PathLSTM-master/src/main/java/se/lth/cs/srl/preprocessor/IllinoisPreprocessor.java:[28,30] cannot find symbol
[ERROR] symbol: class DepAnnotator
[ERROR] location: class se.lth.cs.srl.preprocessor.IllinoisPreprocessor
[ERROR] /C:/Users/Daniel/Desktop/PathLSTM-master/src/main/java/se/lth/cs/srl/preprocessor/IllinoisPreprocessor.java:[29,27] cannot find symbol
[ERROR] symbol: class POSAnnotator
[ERROR] location: class se.lth.cs.srl.preprocessor.IllinoisPreprocessor
[ERROR] /C:/Users/Daniel/Desktop/PathLSTM-master/src/main/java/se/lth/cs/srl/preprocessor/IllinoisPreprocessor.java:[30,29] cannot find symbol
[ERROR] symbol: class IllinoisLemmatizer
[ERROR] location: class se.lth.cs.srl.preprocessor.IllinoisPreprocessor
[ERROR] /C:/Users/Daniel/Desktop/PathLSTM-master/src/main/java/se/lth/cs/srl/preprocessor/IllinoisPreprocessor.java:[31,29] cannot find symbol
[ERROR] symbol: class ChunkerAnnotator
[ERROR] location: class se.lth.cs.srl.preprocessor.IllinoisPreprocessor
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
Best regards,
Daniel
Would anything break if I used version 3.7.0 of the Stanford dependency?
Hey again, @microth :)
I'm sorry to keep raising issues for problems this trivial, but it's really hard to find help related to this package. It's just my deficiency in understanding certain subjects that I'm hoping you can help me with.
I was successful in using the complete pipeline on a sentence, and the result was this
I just wanted to know how I could get the FrameNet kind of semantic annotations on the input text, like shown below (taken from the SEMAFOR demo page).
For example, 'born' has a Being_Born semantic frame, and I'd like to have my input text annotated as shown in the second picture. Is it possible? If you don't mind, could you guide me on how I could achieve this? Below are the arguments I passed:
CompletePipelineCMDLineOptions options = new CompletePipelineCMDLineOptions();
String[] arss = {"eng",
    "-lemma", "/Users/vishnumohan/Desktop/LTh/PathLSTM-pre-illinois-built/src/main/java/se/lth/cs/srl/models/CoNLL2009-ST-English-ALL.anna-3.3.lemmatizer.model",
    "-tagger", "/Users/vishnumohan/Desktop/LTh/PathLSTM-pre-illinois-built/src/main/java/se/lth/cs/srl/models/CoNLL2009-ST-English-ALL.anna-3.3.postagger.model",
    "-parser", "/Users/vishnumohan/Desktop/LTh/PathLSTM-pre-illinois-built/src/main/java/se/lth/cs/srl/models/CoNLL2009-ST-English-ALL.anna-3.3.parser.model",
    "-srl", "/Users/vishnumohan/Desktop/LTh/PathLSTM-pre-illinois-built/srl-FN17.model",
    "-framenet", "/Users/vishnumohan/Desktop/LTh/fndata-1.7",
    "-tokenize", "-reranker", "-externalNNs",
    "-test", "/Users/vishnumohan/Desktop/LTh/PathLSTM-pre-illinois-built/src/main/java/se/lth/cs/srl/tesen.txt"};
options.parseCmdLineArgs(arss);
After reading through the docs, I also saw people mentioning an srl-ICCG16-eng.model file. Could you provide me with a link for it?
Best Regards,
Vishnu
Could you please provide a downloadable reference for the other model files referenced in scripts/parse.sh? Specifically, LEMMA_MODEL, POS_MODEL, and PARSER_MODEL.
Hello,
I downloaded all the dependencies and tried mvn compile, but I get the above error.
A simple grep showed that 45 source files import this non-existent package. I tried switching to the master branch, just to be sure, but no luck.
Thank you for your time :)
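The grep mentioned above can be reproduced in one line; the file count tells you how widespread the missing Illinois dependency is (the source-root path is an assumption based on the standard Maven layout):

```shell
# Count the source files importing the missing Illinois packages; each hit
# is a file that cannot compile until the dependency is resolved.
grep -rl 'import edu.illinois.cs.cogcomp' src/main/java | wc -l
```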
I tried running the scripts/parse.sh script using the FrameNet model (srl-ICCG16-eng.model), but I got the error: java.io.InvalidObjectException: enum constant PathEmbeddingacN_FNET_seed3 does not exist in class uk.ac.ed.inf.srl.features.FeatureName. The other model (srl-ACL2016-eng.model) works fine, but the FrameNet model does not. I have SRL_MODEL=models/srl-ICCG16-eng.model in scripts/parse.sh and the FrameNet data in models/fndata-1.5/. I'm not sure where I should specify where the FrameNet data is, though, or whether I need to do that at all. Could the problem be that I didn't add something to the classpath? I haven't added anything because the other model worked fine without it.
Here is the stacktrace:
scripts/parse.sh tests/testParse1In.txt
54.53.188 is2.data.ParametersFloat 121:read -> read parameters 134217727 not zero 296071
54.53.196 is2.data.Cluster 113:&lt;init&gt; -> Read cluster with 0 words
54.53.196 is2.lemmatizer.Lemmatizer 192:readModel -> Loading data finished.
54.53.197 is2.lemmatizer.Lemmatizer 194:readModel -> number of params 134217727
54.53.197 is2.lemmatizer.Lemmatizer 195:readModel -> number of classes 92
54.59.358 is2.data.ParametersFloat 121:read -> read parameters 134217727 not zero 1613201
54.59.359 is2.data.Cluster 113:&lt;init&gt; -> Read cluster with 0 words
54.59.362 is2.tag.Lexicon 103:&lt;init&gt; -> Read lexicon with 0 words
54.59.364 is2.tag.Tagger 141:readModel -> Loading data finished.
54.59.391 is2.parser.Parser 188:readModel -> Reading data started
54.59.431 is2.data.Cluster 113:&lt;init&gt; -> Read cluster with 0 words
55.6.812 is2.parser.ParametersFloat 101:read -> read parameters 134217727 not zero 19957525
55.6.814 is2.parser.Parser 201:readModel -> parsing -- li size 134217727
55.6.826 is2.parser.Parser 211:readModel -> Stacking false
55.6.827 is2.parser.Extractor 56:initStat -> mult (d4)
Used parser class is2.parser.Parser
Creation date 2012.11.02 14:33:53
Training data CoNLL2009-ST-English-ALL.txt.crossannotated
Iterations 10 Used sentences 10000000
Cluster null
55.6.830 is2.parser.Parser 240:readModel -> Reading data finnished
55.6.831 is2.parser.Extractor 56:initStat -> mult (d4)
Loading pipeline from models/srl-ICCG16-eng.model
java.io.InvalidObjectException: enum constant PathEmbeddingacN_FNET_seed3 does not exist in class uk.ac.ed.inf.srl.features.FeatureName
at java.io.ObjectInputStream.readEnum(ObjectInputStream.java:1746)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1348)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
at java.util.HashMap.readObject(HashMap.java:1394)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1900)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
at se.lth.cs.srl.pipeline.Pipeline.fromZipFile(Pipeline.java:192)
at se.lth.cs.srl.pipeline.Pipeline.fromZipFile(Pipeline.java:226)
at se.lth.cs.srl.pipeline.Reranker.&lt;init&gt;(Reranker.java:63)
at se.lth.cs.srl.CompletePipeline.getCompletePipeline(CompletePipeline.java:52)
at se.lth.cs.srl.CompletePipeline.main(CompletePipeline.java:122)
Caused by: java.lang.IllegalArgumentException: No enum constant uk.ac.ed.inf.srl.features.FeatureName.PathEmbeddingacN_FNET_seed3
at java.lang.Enum.valueOf(Enum.java:238)
at java.io.ObjectInputStream.readEnum(ObjectInputStream.java:1743)
... 21 more
Writing corpus to out.txt...
Exception in thread "main" java.lang.NullPointerException
at se.lth.cs.srl.pipeline.Reranker.parse(Reranker.java:96)
at se.lth.cs.srl.SemanticRoleLabeler.parseSentence(SemanticRoleLabeler.java:12)
at se.lth.cs.srl.CompletePipeline.parseX(CompletePipeline.java:93)
at se.lth.cs.srl.CompletePipeline.parse(CompletePipeline.java:73)
at se.lth.cs.srl.CompletePipeline.parseNonSegmentedLineByLine(CompletePipeline.java:165)
at se.lth.cs.srl.CompletePipeline.main(CompletePipeline.java:138)
Hi, Michael.
We are wondering if you can tell us how to find where the last token of an argument is.
For example, take "My room contains a book, a dog and a cat."
On the demo web site, we can see that "a book, a dog and a cat" is the A1 of contain.01.
But from the table below (and out.txt) we cannot tell that "cat" is the last word of A1.
Should I change the code or something?
Thank you very much.
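One way to recover the full span without changing the code: the CoNLL-2009 output marks only the head token of each argument, but for a projective tree the span is the yield of that head's subtree, which can be recomputed from the HEAD column. A rough sketch, assuming column 1 is the token ID, column 9 the head, and that you supply the argument head's ID yourself (here arg=5 for "book" in the example sentence, an assumption about the parse):

```shell
# Print the IDs of all tokens in the subtree rooted at token $arg; for
# "a book, a dog and a cat" headed by "book", that is the whole A1 span.
awk -v arg=5 '
NF { head[$1] = $9; n = $1 }       # record each token head; n = last ID
END {
  for (t = 1; t <= n; t++) {
    a = t
    while (a != 0 && a != arg) a = head[a]   # walk up toward the root
    if (a == arg) print t                    # t lies under the argument head
  }
}' out.txt
```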
In a short test, I tried the sentences below to ascertain the accuracy of PathLSTM's PropBank/NomBank verb/noun senses. For "find out" and "go", the results seem puzzling, and I wonder if something is wrong with my setup, as these verbs should be find.03 and go.02, respectively:
1 The the the DT DT _ _ 2 2 NMOD NMOD _ _ _
2 waitress waitress waitress NN NN _ _ 3 3 SBJ SBJ _ _ A0
3 found found found VBD VBD _ _ 0 0 ROOT ROOT Y find.01 _
4 out out out RP RP _ _ 3 3 PRT PRT _ _ _
5 that that that IN IN _ _ 3 3 OBJ OBJ _ _ A1
6 she she she PRP PRP _ _ 7 7 SBJ SBJ _ _ _
7 was be be VBD VBD _ _ 5 5 SUB SUB _ _ _
8 fat fat fat JJ JJ _ _ 7 7 PRD PRD _ _ _
9 . . . . . _ _ 3 3 P P _ _ _
1 We we we PRP PRP _ _ 2 2 SBJ SBJ _ _ A1
2 are be be VBP VBP _ _ 0 0 ROOT ROOT _ _ _
3 going go go VBG VBG _ _ 2 2 VC VC Y go.01 _
4 on on on IN IN _ _ 3 3 ADV ADV _ _ _
5 vacation vacation vacation NN NN _ _ 4 4 PMOD PMOD _ _ _
6 to to to TO TO _ _ 3 3 DIR DIR _ _ A4
7 Singapore singapore singapore NNP NNP _ _ 6 6 PMOD PMOD _ _ _
8 . . . . . _ _ 2 2 P P _ _ _
I used the following models:
srl-ACL2016-eng
CoNLL2009-ST-English-ALL.anna-3.3.lemmatizer
CoNLL2009-ST-English-ALL.anna-3.3.parser
CoNLL2009-ST-English-ALL.anna-3.3.postagger
stanford-corenlp-3.7.0
Hi,
I'm working on Ubuntu 16.04.
I created the two subdirectories libs/ and models/, where I put the libraries and models.
When I execute:
.../PathLSTM$ ./scripts/parse.sh
the following error appears:
Could not find or load main class se.lth.cs.srl.CompletePipeline
Any idea to fix it?
Best wishes for the new year,
Julien
Hi,
It's a great effort towards the SRL task! I was looking to get dependency path embeddings rather than the end-to-end SRL pipeline. Could you please share, or guide me towards, sample code that illustrates generating dependency path embeddings for some sample text input (maybe digging into the intermediate steps as well)?
It would be great to learn more about dependency path embeddings from you. Looking forward to hearing from you!
Hey, @microth
I've recently started using PathLSTM for an application that requires SRL, but the sheer size of the model file (srl-ACL2016-eng.model, 2.7G) gives me an insufficient-heap-space OutOfMemoryError:
Loading pipeline from
C:\Users\Vyso\Downloads\NLP\SRL\SEMAFOR\absSemafor\LTH\wttv\PathLSTM-pre-
illinois-built\srl-ACL2016-eng.model
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.lang.reflect.Array.newInstance(Array.java:75)
at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1883)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1529)
at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1919)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1529)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2231)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2155)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2231)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2155)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2231)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2155)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2231)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2155)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1919)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1529)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2231)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2155)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2231)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2155)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:422)
Process finished with exit code 1
So naturally, I increased the heap space of both the JVM on my system and my IDE, where I could change the vmoptions, as follows (it was -Xms128m and -Xmx512m by default).
custom IntelliJ IDEA VM options
-Xms2048m
-Xmx4000m
-XX:ReservedCodeCacheSize=240m
-XX:+UseConcMarkSweepGC
-XX:SoftRefLRUPolicyMSPerMB=50
-ea
-Dsun.io.useCanonCaches=false
-Djava.net.preferIPv4Stack=true
-XX:+HeapDumpOnOutOfMemoryError
-XX:-OmitStackTraceInFastThrow
But even after assigning around 4G as max heap space, I get the error. The funny thing is that in my IDE's memory-management toolbar I can see the code using a maximum of just 500m at runtime, so I really don't know why this heap-space error is still being thrown.
Could you tell me whether this is an unusual error with this algorithm, or whether you've seen it before, too?
Also, should I add all the mentioned dependencies just for the parse class to function? Could this error be thrown because of incomplete dependency additions?
Maybe it's just a beginner-level mistake on my side, but I've been trying to get out of this problem for quite a few days now, and I'd really appreciate it if you could instruct me on how to get rid of this error.
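One likely culprit here (an assumption, not a confirmed diagnosis): the IDE's vmoptions file only sizes the JVM that runs the IDE itself, not the JVM that runs your program. The -Xmx for PathLSTM has to go into the run configuration's VM options, or onto the java command line; deserializing a 2.7G model typically needs well more than 4G of heap. Something like the following (paths and heap size are assumptions, tune -Xmx to your machine):

```shell
# JVM flags belong on the command that runs the pipeline itself; the IDE's
# own vmoptions have no effect on a separately launched java process.
java -Xmx10g -cp libs/anna-3.3.jar:target/pathlstm.jar \
  se.lth.cs.srl.CompletePipeline eng \
  -srl models/srl-ACL2016-eng.model -tokenize -test input.txt
```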
Thank You,
Vishnu