Currently, log() displays debugging information only when a Failure object is returned. It would be helpful for log() to additionally display, for Success objects, the text (or a summarized version of it) that a rule processed, for a couple of reasons:
- If an error arises from a rule over-generalizing and accepting too much input, a Success object is returned; this type of error is very hard to locate without seeing the text the Success object processed.
- The strings from Success objects make it easier to see where you are looking in the parse.
These original text strings should not pollute the visual space with too much information, though, which would make log() output hard to read. Thus, the strings should:
- be summarized if excessively long, yet contain enough info to be useful
- be on one line (i.e. remove newlines or 'vertical whitespace')
- be easy to scan, so breaking on whitespace would be helpful when possible
I've written a patch that does this by using regexes to:
- collapse each run of whitespace into a single space
- then summarize the string if it is longer than 49 chars:
  - require a certain number of characters from the beginning and the end
  - and then break on the next possible whitespace.
Although a summarized string may be up to 49 chars long, in practice it is shorter than that because of the breaks on whitespace. Here are some sample summarizations:
"Sticks and stones may break my bones but names will never hurt me."
0:66 Success: "sticks and"..."will never hurt me."
"I'm going to go to this shop to go shopping while she goes shopping at that shop."
0:81 Success: "I'm going to"..."shopping at that shop."
"We really may not be all that hungry since we ate a lot already."
0:64 Success: "we really may"..."ate a lot already."
"I've been studying a parser combinator library for scala because it might be useful for my projects."
0:100 Success: "I've been studying"..."for my projects."
"A newspaper reported that the store is going to plan new studies on the project."
0:80 Success: "a newspaper"..."on the project."
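The summarization steps described above can be sketched roughly as follows. This is a minimal illustration, not the patch itself: the names SummarizeDemo, collapse and summarize are hypothetical, only the 49-char limit comes from the description, and the MinHead and MinTail values are guesses chosen because they reproduce the break points in the samples. The sketch uses a regex only for collapsing whitespace and plain index searches for the break points, which may differ from how the patch does it.

```scala
object SummarizeDemo {
  // Assumed values: only the 49-char limit is stated in the description.
  // MinHead and MinTail are guesses that match the sample break points.
  val MaxLen  = 49
  val MinHead = 10
  val MinTail = 15

  private val WhitespaceRun = "\\s+".r

  // Step 1: put the text on one line by collapsing each whitespace run.
  def collapse(text: String): String = WhitespaceRun.replaceAllIn(text, " ")

  // Step 2: quote short strings whole; otherwise keep a head and a tail.
  def summarize(text: String): String = {
    val flat = collapse(text)
    if (flat.length <= MaxLen) "\"" + flat + "\""
    else {
      // Head: at least MinHead chars, extended to the next space.
      val headEnd = flat.indexOf(' ', MinHead)
      // Tail: at least MinTail chars, extended back to the start of a word.
      val tailStart = flat.lastIndexOf(' ', flat.length - MinTail - 1) + 1
      // Fall back to hard cuts when no usable space is found.
      val (h, t) =
        if (headEnd >= 0 && headEnd < tailStart) (headEnd, tailStart)
        else (MinHead, flat.length - MinTail)
      "\"" + flat.substring(0, h) + "\"...\"" + flat.substring(t) + "\""
    }
  }

  def main(args: Array[String]): Unit =
    println(summarize("Sticks and stones may break my bones but names will never hurt me."))
}
```

Running main prints "Sticks and"..."will never hurt me." for the first sample sentence. The sketch does not change letter case, so any lowercasing seen in the samples would have to happen elsewhere.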
Here is how the Success strings look when debugging a program with log(). The example program is a simple NLP chunker with an input sentence:
"A newspaper reported that the firm plans new studies on the project."
Without Success strings:
+s:0
+clause:0
+np:0
+adjP:2
-adjP:2:Failure(adjP:1:3 / adj:1:3 / ws:1:6 / (CharIn(" \t\n.;:?!").rep(1) | &(",") | End):1:6 ..."newspaper ")
+pp:12
-pp:12:Failure(pp:1:13 / prep:1:13 / StringIn("about", "above", "according to", "across", "after", "against", "around", "at", "before", "behind", "below", "beneath", "beside", "besides", "between", "beyond", "by", "by way of", "down", "during", "except", "for", "from", "in", "in addition to", "in front of", "in place of", "in regard to", "in spite of", "inside", "instead of", "into", "like", "near", "of", "off", "on", "on account of", "out", "out of", "outside", "over", "through", "throughout", "till", "to", "toward", "under", "until", "up", "upon", "with", "without"):1:13 ..."reported t")
-np:0:Success(12)
+vp:12
+vConj:12
-vConj:12:Success(21)
+pp:21
-pp:21:Failure(pp:1:22 / prep:1:22 / StringIn("about", "above", "according to", "across", "after", "against", "around", "at", "before", "behind", "below", "beneath", "beside", "besides", "between", "beyond", "by", "by way of", "down", "during", "except", "for", "from", "in", "in addition to", "in front of", "in place of", "in regard to", "in spite of", "inside", "instead of", "into", "like", "near", "of", "off", "on", "on account of", "out", "out of", "outside", "over", "through", "throughout", "till", "to", "toward", "under", "until", "up", "upon", "with", "without"):1:22 ..."that the f")
+np:21
+adjP:26
-adjP:26:Failure(adjP:1:27 / adj:1:27 / StringIn("big", "small", "fast", "slow", "new", "old", "next", "red", "blue", "green", "orange", "yellow", "white", "black", "grey", "silver", "gold", "good", "bad", "great", "awful", "cool", "awesome", "worthless", "useful", "clever", "smart", "dumb", "stupid", "ridiculous", "fun", "interesting", "boring", "hungry", "thirsty", "firm"):1:27 ..."the firm p")
-np:21:Failure(np:1:22 / (det.? ~ Logged(adjP,adjP,<function1>).? ~ n.rep(1) ~ Logged(pp,pp,<function1>).? | pronoun):1:22 ..."that the f")
-vp:12:Success(21)
-clause:0:Success(21)
+clauseConnector:21
-clauseConnector:21:Success(26)
+clause:26
+np:26
+adjP:30
-adjP:30:Success(35)
+pp:41
-pp:41:Failure(pp:1:42 / prep:1:42 / StringIn("about", "above", "according to", "across", "after", "against", "around", "at", "before", "behind", "below", "beneath", "beside", "besides", "between", "beyond", "by", "by way of", "down", "during", "except", "for", "from", "in", "in addition to", "in front of", "in place of", "in regard to", "in spite of", "inside", "instead of", "into", "like", "near", "of", "off", "on", "on account of", "out", "out of", "outside", "over", "through", "throughout", "till", "to", "toward", "under", "until", "up", "upon", "with", "without"):1:42 ..."new studie")
-np:26:Success(41)
+vp:41
+vConj:41
-vConj:41:Failure(vConj:1:42 / ("to" ~ ws ~ adv.rep ~ infinitive | "going ".? ~ "to" ~ ws ~ adv.rep ~ infinitive ~ presentParticiple.? | modalAuxiliary ~ adv.rep ~ infinitive | (("do" | "did" | "will") ~ (ws ~ adv).? | ("don't" | "didn't" | "won't") ~ ws) ~ infinitive | have ~ adv.rep ~ pastParticiple | be ~ adv.rep ~ presentParticiple | have ~ adv.rep ~ "been " ~ adv.rep ~ presentParticiple | presentTense | pastTense):1:42 ..."new studie")
-vp:41:Failure(vp:1:42 / vConj:1:42 / ("to" ~ ws ~ adv.rep ~ infinitive | "going ".? ~ "to" ~ ws ~ adv.rep ~ infinitive ~ presentParticiple.? | modalAuxiliary ~ adv.rep ~ infinitive | (("do" | "did" | "will") ~ (ws ~ adv).? | ("don't" | "didn't" | "won't") ~ ws) ~ infinitive | have ~ adv.rep ~ pastParticiple | be ~ adv.rep ~ presentParticiple | have ~ adv.rep ~ "been " ~ adv.rep ~ presentParticiple | presentTense | pastTense):1:42 ..."new studie")
+copulaP:41
-copulaP:41:Failure(copulaP:1:42 / be:1:42 / StringIn("am", "are", "is", "was", "were", "will be", "be", "'m", "'s", "'re", "'ll"):1:42 ..."new studie")
-clause:26:Failure(clause:1:27 / (Logged(vp,vp,<function1>) | Logged(copulaP,copulaP,<function1>)):1:42 ..."the firm p")
-s:0:Failure(s:1:1 / End:1:22 ..."a newspape")
With Success strings:
+s:0
+clause:0
+np:0
+adjP:2
-adjP:2:Failure(adjP:1:3 / adj:1:3 / ws:1:6 / (CharIn(" \t\n.;:?!").rep(1) | &(",") | End):1:6 ..."newspaper ")
+pp:12
-pp:12:Failure(pp:1:13 / prep:1:13 / StringIn("about", "above", "according to", "across", "after", "against", "around", "at", "before", "behind", "below", "beneath", "beside", "besides", "between", "beyond", "by", "by way of", "down", "during", "except", "for", "from", "in", "in addition to", "in front of", "in place of", "in regard to", "in spite of", "inside", "instead of", "into", "like", "near", "of", "off", "on", "on account of", "out", "out of", "outside", "over", "through", "throughout", "till", "to", "toward", "under", "until", "up", "upon", "with", "without"):1:13 ..."reported t")
-np:0:12 Success: "a newspaper "
+vp:12
+vConj:12
-vConj:12:21 Success: "reported "
+pp:21
-pp:21:Failure(pp:1:22 / prep:1:22 / StringIn("about", "above", "according to", "across", "after", "against", "around", "at", "before", "behind", "below", "beneath", "beside", "besides", "between", "beyond", "by", "by way of", "down", "during", "except", "for", "from", "in", "in addition to", "in front of", "in place of", "in regard to", "in spite of", "inside", "instead of", "into", "like", "near", "of", "off", "on", "on account of", "out", "out of", "outside", "over", "through", "throughout", "till", "to", "toward", "under", "until", "up", "upon", "with", "without"):1:22 ..."that the f")
+np:21
+adjP:26
-adjP:26:Failure(adjP:1:27 / adj:1:27 / StringIn("big", "small", "fast", "slow", "new", "old", "next", "red", "blue", "green", "orange", "yellow", "white", "black", "grey", "silver", "gold", "good", "bad", "great", "awful", "cool", "awesome", "worthless", "useful", "clever", "smart", "dumb", "stupid", "ridiculous", "fun", "interesting", "boring", "hungry", "thirsty", "firm"):1:27 ..."the firm p")
-np:21:Failure(np:1:22 / (det.? ~ Logged(adjP,adjP,<function1>).? ~ n.rep(1) ~ Logged(pp,pp,<function1>).? | pronoun):1:22 ..."that the f")
-vp:12:21 Success: "reported "
-clause:0:21 Success: "a newspaper reported "
+clauseConnector:21
-clauseConnector:21:26 Success: "that "
+clause:26
+np:26
+adjP:30
-adjP:30:35 Success: "firm "
+pp:41
-pp:41:Failure(pp:1:42 / prep:1:42 / StringIn("about", "above", "according to", "across", "after", "against", "around", "at", "before", "behind", "below", "beneath", "beside", "besides", "between", "beyond", "by", "by way of", "down", "during", "except", "for", "from", "in", "in addition to", "in front of", "in place of", "in regard to", "in spite of", "inside", "instead of", "into", "like", "near", "of", "off", "on", "on account of", "out", "out of", "outside", "over", "through", "throughout", "till", "to", "toward", "under", "until", "up", "upon", "with", "without"):1:42 ..."new studie")
-np:26:41 Success: "the firm plans "
+vp:41
+vConj:41
-vConj:41:Failure(vConj:1:42 / ("to" ~ ws ~ adv.rep ~ infinitive | "going ".? ~ "to" ~ ws ~ adv.rep ~ infinitive ~ presentParticiple.? | modalAuxiliary ~ adv.rep ~ infinitive | (("do" | "did" | "will") ~ (ws ~ adv).? | ("don't" | "didn't" | "won't") ~ ws) ~ infinitive | have ~ adv.rep ~ pastParticiple | be ~ adv.rep ~ presentParticiple | have ~ adv.rep ~ "been " ~ adv.rep ~ presentParticiple | presentTense | pastTense):1:42 ..."new studie")
-vp:41:Failure(vp:1:42 / vConj:1:42 / ("to" ~ ws ~ adv.rep ~ infinitive | "going ".? ~ "to" ~ ws ~ adv.rep ~ infinitive ~ presentParticiple.? | modalAuxiliary ~ adv.rep ~ infinitive | (("do" | "did" | "will") ~ (ws ~ adv).? | ("don't" | "didn't" | "won't") ~ ws) ~ infinitive | have ~ adv.rep ~ pastParticiple | be ~ adv.rep ~ presentParticiple | have ~ adv.rep ~ "been " ~ adv.rep ~ presentParticiple | presentTense | pastTense):1:42 ..."new studie")
+copulaP:41
-copulaP:41:Failure(copulaP:1:42 / be:1:42 / StringIn("am", "are", "is", "was", "were", "will be", "be", "'m", "'s", "'re", "'ll"):1:42 ..."new studie")
-clause:26:Failure(clause:1:27 / (Logged(vp,vp,<function1>) | Logged(copulaP,copulaP,<function1>)):1:42 ..."the firm p")
-s:0:Failure(s:1:1 / End:1:22 ..."a newspape")
In the second version, it is much easier to follow the parse and identify the problem, which is not a Failure object but a Success object: -np:26:41 Success: "the firm plans " (i.e. the parser took "the firm plans" as an NP, "plans that are firm", rather than NP "the firm" + V "plans").
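For reference, the proposed Success line can be derived from nothing more than the rule name, the input, and the start and end indices that log() already prints. The sketch below is only an illustration of that relationship: successLine is a hypothetical helper, not a function of the library, and it omits the summarization of long strings. The input is written in lowercase because that is how it appears in the log output above.

```scala
object SuccessLineDemo {
  // Hypothetical helper (not part of the library): renders the proposed
  // line for a rule `name` that consumed input(start until end).
  def successLine(name: String, input: String, start: Int, end: Int): String = {
    // Keep the consumed text on one line by collapsing whitespace runs.
    val consumed = input.substring(start, end).replaceAll("\\s+", " ")
    "-" + name + ":" + start + ":" + end + " Success: \"" + consumed + "\""
  }

  def main(args: Array[String]): Unit = {
    val input = "a newspaper reported that the firm plans new studies on the project."
    println(successLine("np", input, 26, 41))
  }
}
```

For the indices 26 and 41 this prints -np:26:41 Success: "the firm plans ", the line that exposes the over-eager NP rule.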