Coder Social home page Coder Social logo

Comments (8)

wavebeem avatar wavebeem commented on August 19, 2024 1

Thanks for the comment @anko, always nice to see you in these Parsimmon threads :)

@tstodter Thanks for the kind words! I would definitely suggest using lookahead combined with then as @anko suggested. Also, you have a missing detail in your "all lines are 94 characters" parser. You need to add .then(P.eof) to test that you hit the end of the file. .many() matches ZERO or more times, so if this parser fails to parse, it will just parse zero times and succeed.

I think I see what you're mentioning about the result.furthest = -1 thing... I'll do some investigation on this and see what I find out.

I made an example here https://runkit.com/wavebeem/parsimmon-issues-296/5.0.0 and I was surprised by the 2nd output... I think you may be correct and this might be a bug.

from parsimmon.

anko avatar anko commented on August 19, 2024

(I'm not the maintainer, just some person who digs combinatory parsing libraries.)

I had hoped to implement this as const both = (parser1, parser2) => P.lookahead(parser1).chain(() => parser2) or just P.lookahead(parser1).then(parser2), […]

Trying a minimal example:

const P = require('parsimmon')
const both = (parser1, parser2) => P.lookahead(parser1).then(parser2)

const a = P.regex(/[aA]/).desc('a or A')
const caps = P.regex(/[A-Z]/).desc('capital letter A-Z')

const aCaps = both(a, caps)

console.log(aCaps.parse('A')) // both should succeed
console.log(aCaps.parse('b')) // a should fail
console.log(aCaps.parse('a')) // caps should fail

Output:

{ status: true, value: 'A' }
{
  status: false,
  index: { offset: 0, line: 1, column: 1 },
  expected: [ 'a or A' ]
}
{
  status: false,
  index: { offset: 0, line: 1, column: 1 },
  expected: [ 'capital letter A-Z' ]
}

In each failure, the expected array corresponds to the failing parser. This seems to not match your observation though:

If parser1 succeeds and parser2 fails, parser produced by both will fail, but the final result will have the expected array of the first parser and it loses the much-more-informative error of the second parser.

Does my example match what you're trying to do in principle? If so, I think there must be something else going on.

from parsimmon.

tstodter avatar tstodter commented on August 19, 2024

Hey @anko, thanks for thinking on this. The following is closer to what I'm actually working with, and does seem to reproduce what I'm seeing:

const both = (parser1, parser2) => P.lookahead(parser1).then(parser2)

const par = both(
  P.regex(/[a-z0-9]/).times(2).skip(P.string(',')).many().skip(P.eof),
  P.seq(
    P.letters.skip(P.string(',')),
    P.digits.skip(P.string(',')),
    P.letters.skip(P.string(',')),
    P.eof
  )
);

console.log(par.parse('ab,12,cde,'));  // Should fail only the first parser in both()
console.log(par.parse('ab,12,c2,'));  // Should fail only the second parser in both()

That spits out

{
  status: false,
  index: { offset: 8, line: 1, column: 9 },
  expected: [ "','" ]
}
{
  status: false,
  index: { offset: 9, line: 1, column: 10 },
  expected: [ '/[a-z0-9]/' ]
}

If I just run that second parser (starting with P.seq) on 'ab,12,c2,', the result is

{
  status: false,
  index: { offset: 7, line: 1, column: 8 },
  expected: [ "','" ]
}

which is what I would expect. The expected: [ '/[a-z0-9]/' ] above is surprising to me, since that string should be passing the first parser but that expected is clearly coming from the first parser. Kind of shooting in the dark, but could some backtracking after the failure of parser2 be producing the different results?

from parsimmon.

wavebeem avatar wavebeem commented on August 19, 2024

Hmm, furthest should only be -1 when your parser succeeds. So we shouldn't assign .furthest = -1 inside lookahead. I'll get back to y'all when I learn more.

from parsimmon.

wavebeem avatar wavebeem commented on August 19, 2024

OK so I made the smallest possible (I think) repro of this bug: https://runkit.com/wavebeem/parsimmon-issues-296/6.0.0

I'm not sure if there's any way to fix this bug without breaking... well probably everybody's parsers. The issue seems to be inside of mergeReplies and how it handles the expected array.

from parsimmon.

tstodter avatar tstodter commented on August 19, 2024

It does seem like a fix wouldn't change the final success/failure status, though it would change any error messages, so perhaps that's a small favor. That .then implementation of both does work just fine for me if both parsers succeed, it's just the lost failure information.

Do you think it's correct to write a version of lookahead into my project that sets result.furthest = -1; for successful results? That -1 does give me expected behavior from mergeReplies (since if (result.furthest > last.furthest) passes).

from parsimmon.

wavebeem avatar wavebeem commented on August 19, 2024

sure, i mean you could totally copy the function and modify it for use in your own code:

function lookahead(x) {
  if (isParser(x)) {
    return Parsimmon(function(input, i) {
      var result = x._(input, i);
      result.index = i;
      result.value = "";
      result.furthest = -1; // hack
      return result;
    });
  } else if (typeof x === "string") {
    return lookahead(string(x));
  } else if (x instanceof RegExp) {
    return lookahead(regexp(x));
  }
  throw new Error("not a string, regexp, or parser: " + x);
}

You could also use new Parsimmon(fn) to make a custom parser that uses another parser inside it.

https://runkit.com/wavebeem/parsimmon-issues-296-alternative/2.0.0

from parsimmon.

wavebeem avatar wavebeem commented on August 19, 2024

and yeah, "fixing" this should only change the error messages, but i found lots of tests that returned empty (!!) "expected" arrays for parse failures, which is clearly not a good thing

from parsimmon.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.