

Parsing

Maintainers: jamesdbrock, robertdp, chtenb

A monadic parser combinator library based on Haskell’s Parsec.

Installation

Install parsing with Spago:

$ spago install parsing

Quick start

Here is a basic tutorial introduction to monadic parsing with this package.

Parsers

A parser turns a string into a data structure. Parsers in this library have the type Parser s a, where s is the type of the input string, and a is the type of the data which the parser will produce on success. Parser s is a monad. It’s defined in the module Parsing.

Monads can be used to provide context for a computation, and that’s how we use them in monadic parsing. The context provided by the Parser s monad is the parser’s current location in the input string. Parsing starts at the beginning of the input string.

Parsing requires two more capabilities: alternative and failure.

We need alternative to be able to choose what kind of thing we’re parsing depending on the input which we encounter. This is provided by the <|> “alt” operator of the Alt typeclass instance of the Parser s monad. The expression p_left <|> p_right will first try the p_left parser and if that fails and consumes no input then it will try the p_right parser.

We need failure in case the input stream is not parseable. This is provided by the fail function, which calls the throwError function of the MonadThrow typeclass instance of the Parser s monad.
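As a small illustration of how alternative and failure combine (the parser `yesNo` is a sketch invented for this example, not part of the library):

```purescript
module Example where

import Prelude
import Control.Alt ((<|>))
import Parsing (Parser, fail)
import Parsing.String (string)

-- Accept "yes" or "no"; report a custom error for anything else.
yesNo :: Parser String Boolean
yesNo =
  (string "yes" *> pure true)
    <|> (string "no" *> pure false)
    <|> fail "expected yes or no"
```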

To run a parser, call the function runParser :: s -> Parser s a -> Either ParseError a in the Parsing module, and supply it with an input string and a parser. If the parse succeeds then the result is Right a and if the parse fails then the result is Left ParseError.

Primitive parsers

Each type of input string needs primitive parsers. Primitive parsers for input string type String are in the Parsing.String module. For example, the primitive char :: Char -> Parser String Char parser will exactly match one literal character and then advance by one position in the input string.

We can use these primitive parsers to write other String parsers.

Writing a parser

Here is a parser ayebee :: Parser String Boolean which will accept only two input strings: "ab" or "aB". It will return true if the b character is uppercase. It will return false if the b character is lowercase. It will fail with a ParseError if the input string is anything else.

ayebee :: Parser String Boolean
ayebee = do
  _ <- char 'a'
  b <- char 'b' <|> char 'B'
  pure (b == 'B')

We can run the parser ayebee like so

runParser "aB" ayebee

and then the parser will succeed and return Right true.

More parsers

There are other String parsers in the module Parsing.String.Basic, for example the parser letter :: Parser String Char which will accept any single alphabetic letter.

Parser combinators

Parser combinators are in this package in the module Parsing.Combinators.

A parser combinator is a function which takes a parser as an argument and returns a new parser. The many combinator, for example, will repeat a parser as many times as it can. So the parser many letter will have type Parser String (Array Char).

Running the parser

runParser "aBabaB" (many ayebee)

will return Right [true, false, true].

Stack-safety

Starting with v9.0.0, all parsers and combinators in this package are always stack-safe.

Recursion

For the most part, we can just write recursive parsers (parsers defined in terms of themselves) and they will work as we expect.

In some cases like this:

aye :: Parser String Char
aye = char 'a' *> aye

we might get a compile-time CycleInDeclaration error which looks like this:

  The value of aye is undefined here, so this reference is not allowed.


See https://github.com/purescript/documentation/blob/master/errors/CycleInDeclaration.md for more information,
or to contribute content related to this error.

This is happening because we tried to call aye recursively “at a point where such a reference would be unavailable because of strict evaluation.”

The best way to solve this is to stick a Control.Lazy.defer in front of the parser to break the cycle.

aye :: Parser String Char
aye = defer \_ -> char 'a' *> aye
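The same pattern scales to deeper recursion. Here is a sketch (the `parens` parser is illustrative, not from this package's API) of a self-referential parser for balanced nesting:

```purescript
import Prelude
import Control.Alt ((<|>))
import Control.Lazy (defer)
import Parsing (Parser)
import Parsing.String (char)

-- Match balanced nesting such as "", "()" or "(())".
-- defer breaks the cycle in the self-reference, as described above.
parens :: Parser String Unit
parens = defer \_ ->
  (char '(' *> parens <* char ')') <|> pure unit
```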

Resources

There are lots of other great monadic parsing tutorials on the internet.

Related Packages

Documentation

parsing documentation is stored in a few places:

  1. Module documentation is published on Pursuit.
  2. Written documentation is kept in the docs directory.
  3. Usage examples can be found in the test suite.

If you get stuck, there are several ways to get help:

Contributing

You can contribute to parsing in several ways:

  1. If you encounter a problem or have a question, please open an issue. We'll do our best to work with you to resolve or answer it.

  2. If you would like to contribute code, tests, or documentation, please read the contributor guide. It's a short, helpful introduction to contributing to this library, including development instructions.

  3. If you have written a library, tutorial, guide, or other resource based on this package, please share it on the PureScript Discourse! Writing libraries and learning resources are a great way to help this library succeed.

purescript-parsing's People

Contributors

bentongxyz, cdepillabout, cryogenian, dependabot[bot], dretch, epost, felixschl, fsoikin, garyb, jamesdbrock, jdegoes, joneshf, jordanmartinez, justinwoo, kl0tl, matoruru, maxdeviant, maybejustjames, mlang, mstream, natefaubion, paf31, rndnoise, robertdp, ryanartecona, safareli, th-awake, thomashoneyman, tmcgilchrist, triggernz


purescript-parsing's Issues

Reconsider alternative instance for Transformer typeclasses

When I first used the MonadState instance for ParserT I had ParserT s (State s') a and I was expecting it to be defined as in most transformers:

(MonadState s m) => MonadState s (ParserT s' m)

but instead it's defined as:

(Monad m) => MonadState (ParseState s) (ParserT s m)
-- `MonadThrow` and `MonadError` are defined in the same fashion.

So my question is: why is it like that? I think if we had the usual Monad* definitions then ParserT would be more useful in transformer stacks (we could expose functions from these typeclasses under different names, so that users could still manipulate ParserT's internal structure).

manyTill_ combinator

We should have the super-useful manyTill_ combinator.

Combinators.manyTill_

manyTill_ :: forall e m a s. Monad m => ParserT s m a -> ParserT s m e -> ParserT s m (Tuple (List a) e)

Also

someTill_ :: forall e m a s. Monad m => ParserT s m a -> ParserT s m e -> ParserT s m (Tuple (List a) e)
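For illustration, a hypothetical use of the proposed combinator (the `lineComment` parser and its helpers are made-up names, assuming `manyTill_` exists as specified above):

```purescript
-- Sketch: collect the characters of a line comment, keeping both
-- the body and the terminator character that ended it.
lineComment :: Parser String (Tuple (List Char) Char)
lineComment = string "--" *> manyTill_ anyChar (char '\n')
```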

Add `eof` for Token parsers?

Something like this:

eof :: forall a. Parser (List a) Unit
eof =
  get >>= \(input :: List a) ->
    unless (null input) (fail "expected eof")

I guess it would go into `Text.Parsing.Parser.Token`?

By the way, is it right that we need that type annotation on input? Without it, type unification fails.

Argument order in sepBy is ambiguous

Here is the type

sepBy :: forall m s a sep. Monad m => ParserT s m a -> ParserT s m sep -> ParserT s m (List a)

Usually when parsing, the separator parser produces the same type of value as the value parser, i.e. the type variables a and sep end up being the same type, which means a user can write incorrect code, for example:
sepBy separatorParser valueParser vs sepBy valueParser separatorParser

The user needs to look at the type variables to understand the order in which the arguments should be passed. But if we change the type to:

sepBy :: forall m s a sep. Monad m => ParserT s m a -> ParserT s m Unit -> ParserT s m (List a)

then the incorrect code above will not compile. As a result it would be far less likely to mix up the argument order (unless you want an Array Unit as the result).

The downside would be the need for an extra $> unit or *> pure unit at the end of the separator parser.
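Under the proposed signature, callers would discard the separator's result explicitly, e.g. with `void` — a sketch with made-up parser names (`fields`, `field`):

```purescript
-- The separator must now have result type Unit, so swapping the
-- two arguments no longer typechecks.
fields :: Parser String (List String)
fields = field `sepBy` void (char ',')
```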

Don't name files *.purs.hs?

Everyone who uses anything that depends on this has to add more rules to their Gruntfile to handle it and it's not an obvious solution.

Can we have more typesafety w.r.t. many-like combinators and infinite loops?

Is your change request related to a problem? Please describe.
many p will get stuck in a loop if p possibly doesn't consume any input but still succeeds.

One could argue that the parser is bogus in that case. However, it would be nice to have some compile time safety guarantees for this, because a hanging application isn't fun to deal with, even if it occurs while developing.

Some solution directions that cross my mind

  • Utilizing the typesystem in some way that makes it impossible to pass a parser that possibly consumes no input but still succeeds.
  • Changing the semantics of many such that it stops (or fails?) whenever the position of the parser state hasn't moved after applying p.

Q: thoughts on takeSome and takeMany

Recently I used List.Lazy and my expectation was that the parser take 2 (many (char '0')) would try to match the character 0 at most 2 times. But it actually tries to match 0 until it fails. So if the input to the parser is 0000000 it will consume all of the 0s and the result will be a list of two 0s.

To solve this issue I have implemented takeSome and takeMany for normal List:

-- | Attempt a computation `n` times, requiring at least one success.
-- |
-- | The `Lazy` constraint is used to generate the result lazily, to ensure
-- | termination.
takeSome :: forall f a. Alternative f => Lazy (f (List a)) => Int -> f a -> f (List a)
takeSome 0 _ = pure Nil
takeSome n v = Cons <$> v <*> defer (\_ -> takeMany (n - 1) v)

-- | Attempt a computation `n` times, returning as many successful results
-- | as possible (possibly zero).
-- |
-- | The `Lazy` constraint is used to generate the result lazily, to ensure
-- | termination.
takeMany :: forall f a. Alternative f => Lazy (f (List a)) => Int -> f a -> f (List a)
takeMany 0 _ = pure Nil
takeMany n v = takeSome n v <|> pure Nil
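With these definitions, the earlier example would behave as hoped — a sketch (assuming ParserT's Alternative and Lazy instances satisfy the constraints):

```purescript
-- At most two '0' characters are attempted, instead of consuming
-- the whole run of zeros.
twoZeros :: Parser String (List Char)
twoZeros = takeMany 2 (char '0')
```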

If Infinity was member of Int then:

some = takeSome Infinity
many = takeMany Infinity

This issue is not strictly related to Parser but I think it could be useful for parsers.
What do you think about defining these functions for List/Array/...?

We could have an internal definition of take{Some,Many} which takes a Number (to allow use of Infinity) and define some, many, takeSome and takeMany in terms of it.

takeSome = toNumber >>> takeSome_
some = takeSome_ Infinity
takeMany = toNumber >>> takeMany_
many = takeMany_ Infinity

option description

Is this description correct?

-- | Provide a default result in the case where a parser fails without consuming input.
option :: forall m s a. Monad m => a -> ParserT s m a -> ParserT s m a
option a p = p <|> pure a

The way I read the Alt instance,

instance altParserT :: Monad m => Alt (ParserT s m) where
  alt p1 p2 = (ParserT <<< ExceptT <<< StateT) \(s@(ParseState i p _)) -> do
    Tuple e s'@(ParseState _ _ c') <- runStateT (runExceptT (unwrap p1)) (ParseState i p false)
    case e of
      Left _
        | not c' -> runStateT (runExceptT (unwrap p2)) s
      _ -> pure (Tuple e s')

I think this description should be

-- | Provide a default result in the case where a parser fails or consumes no input.

I don't see anything in the test suite which clarifies this. Need to test.

CodePoints uncons? Deprecate drop?

I just noticed that instance stringStringLike uses CodeUnits for uncons.

instance stringLikeString :: StringLike String where
  uncons = SCU.uncons
  drop = S.drop

Doesn't that mean that anyChar will be wrong for astral characters?

anyChar = do
  input <- gets \(ParseState input _ _) -> input
  case uncons input of

Also, the drop member of StringLike is now unused?

parsing failure context label

It would be nice if we could get some context with parse failures, like Megaparsec label

In ParserT, fail is implemented as throwError

-- | Fail with a message.
fail :: forall m s a. Monad m => String -> ParserT s m a
fail message = failWithPosition message =<< position

-- | Fail with a message and a position.
failWithPosition :: forall m s a. Monad m => String -> Position -> ParserT s m a
failWithPosition message pos = throwError (ParseError message pos)

This suggests a cheap implementation strategy for label, using catchError :: forall a. m a -> (e -> m a) -> m a.

-- | If parsing fails inside this context, then prepend the String to the error String.
label :: forall m s a. Monad m => String -> ParserT s m a -> ParserT s m a
label preerror p = catchError p go
  where
  go (ParseError err pos) = failWithPosition (preerror <> err) pos

Actually come to think of it, anyone can use this label function without library support. Maybe I'll just do that.
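A hypothetical use of that label function (the `number` and `digits` parsers are assumed names, not library functions):

```purescript
-- Prepend context to whatever error `digits` produces.
number :: Parser String String
number = label "while reading a number: " digits
```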

I suppose if we wanted to make this more official and support it in the library, then we could change the error type to ParseError (NonEmptyList String) Position and cons the errors up as we fail out of the parser.

Export `digit`

It's referred to in the documentation, but as far as I can tell, it is only actually defined in Main (and therefore can't be imported by users of this library).

Could it go in Text.Parsing.Parsers.String?

Combine purescript-string-parsers and purescript-parsing

Not totally sure, but I think we can somehow generalize the current encoding of the parser such that the parser from ps-string-parsers would be an instance of it.

What I see in ps-string-parsers is that it uses a different Position type which is basically a cursor into the input string. But what if we want the same performance benefits (of not allocating new states) when we have tokens as an array, or as some structure where moving the "cursor" (position) is cheap?

We could do a similar thing to Stream here to make that possible, but maybe we can combine cursor-based streams and non-cursor-based streams and have one parser lib.

related #62 #58

lookAhead consume on failure?

-- | Parse a phrase, without modifying the consumed state or stream position.
lookAhead :: forall s a m. Monad m => ParserT s m a -> ParserT s m a
lookAhead p = (ParserT <<< ExceptT <<< StateT) \s -> do
  Tuple e _ <- runStateT (runExceptT (unwrap p)) s
  pure (Tuple e s)

This seems different from the way that Parsec and Megaparsec work? They consume input in case of failure.

https://hackage.haskell.org/package/megaparsec-9.2.0/docs/Text-Megaparsec.html#v:lookAhead

https://hackage.haskell.org/package/parsec-3.1.15.0/docs/Text-Parsec.html#v:lookAhead

See also purescript-contrib/purescript-string-parsers#73

`sepEndBy` behaves incorrectly (?)

elemA = PS.string "A"
elemB = PS.string "B"
sep = PS.string ";"
parse = flip P.runParser do Tuple <$> PC.sepEndBy elemA sep <*> elemB

This now fails:

parse "A;B"

Prior to v6 it succeeded. Copying the old implementation of sepEndBy locally and implementing parse with that restores the original behaviour.

I'd say this behaviour is wrong too, it's not just that it's different now - as far as I can tell, if there's a trailing separator, and then any other input follows, the combinator fails now.

rest primitive String parser

I needed this rest parser today so I wrote something like it. It's pretty common and I think we should include it in the primitive parsers for Text.Parsing.Parser.String.

-- | Parse and return the rest of the input. Always succeeds.
rest :: forall m. Monad m => ParserT String m String
rest = do
  ParseState input position _ <- get
  set $ ParseState "" (updatePosString position input) true
  pure input
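For illustration, a sketch of how rest might be used (the `afterColon` parser is a made-up name; `manyTill` is the existing combinator):

```purescript
-- Consume everything up to and including the first ':' and
-- return the remainder of the input verbatim.
afterColon :: Parser String String
afterColon = manyTill anyChar (char ':') *> rest
```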

Except we can't write it for StringLike. Are we sure that the StringLike typeclass is a good idea?

rest :: forall m s. StringLike s => Monad m => ParserT s m s

Notes on performance

This is what the benchmarks currently look like:

Text.Parsing.StringParser.CodeUnits

StringParser.runParser parse23Units
mean   = 10.10 ms
stddev = 1.13 ms
min    = 9.46 ms
max    = 24.07 ms

Text.Parsing.Parser.String

runParser parse23
mean   = 44.20 ms
stddev = 6.38 ms
min    = 42.25 ms
max    = 113.16 ms

Data.String.Regex

Regex.match pattern23
mean   = 728.23 μs
stddev = 339.32 μs
min    = 613.72 μs
max    = 2.97 ms

I would like to reduce that 4× slowness between Parser.String and StringParser.CodeUnits .

The difference could be due to:

  1. CodePoint rather than Char. Everything goes through the anyCodePoint parser since #119 , but I benchmarked it at the time and it didn't make any difference.
  2. String tail state. Every time the parser advances by one character, we uncons the input string and save the tail as the new state. I tried changing that to only keeping a codeunit index into the input string on this branch and it didn't make any difference. https://github.com/jamesdbrock/purescript-parsing/tree/cursor
  3. Parsing.Parser.String input position tracking with Pos { line :: Int, column :: Int}. I tried changing that to Pos Int on this branch and it didn't make any difference. https://github.com/jamesdbrock/purescript-parsing/tree/cursor
  4. Monad transformers. When I look at the benchmark profiling, it looks like most of the time is spent in Control.Monad.State.Trans.bind and Text.Parsing.Parser.Combinators.tryRethrow. So this might be the entire problem, but improving this won't be easy.

Return NonEmpty types for combinators requiring at least one match

It would be nice to not have to check for Nil from combinators that require at least one match.

The fact that they require at least one match could be encoded in the type as either NonEmptyList a or NonEmpty List a. (not sure which would be preferable)


Environment

  • PureScript 0.12.5
  • Pulp 12.4.2
  • purescript-parsing 5.0.3

Current behavior

These functions in Text.Parsing.Parser.Combinators currently produce a (possibly empty) List:

-- | Parse phrases delimited by a separator, requiring at least one match.
sepBy1 :: forall m s a sep. Monad m => ParserT s m a -> ParserT s m sep -> ParserT s m (List a)

-- | Parse phrases delimited and optionally terminated by a separator, requiring at least one match.
sepEndBy1 :: forall m s a sep. Monad m => ParserT s m a -> ParserT s m sep -> ParserT s m (List a)

-- | Parse phrases delimited and terminated by a separator, requiring at least one match.
endBy1 :: forall m s a sep. Monad m => ParserT s m a -> ParserT s m sep -> ParserT s m (List a)

As a workaround, I defined a modified/alternate version of sepBy1 locally:

-- | Parse phrases delimited by a separator, requiring at least one match.
sepBy1Nel :: forall m s a sep. Monad m => ParserT s m a -> ParserT s m sep -> ParserT s m (NonEmptyList a)
sepBy1Nel p sep = do
  a <- p
  as <- List.many $ sep *> p
  pure $ NonEmptyList (a :| as)

I can then use NonEmpty.unsnoc and not worry about how to handle the impossible Nothing result from List.unsnoc
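For example (with a hypothetical `ident` parser):

```purescript
-- Comma-separated identifiers, with non-emptiness in the type.
idents :: Parser String (NonEmptyList String)
idents = ident `sepBy1Nel` char ','
```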

Here's a live example: https://github.com/ccap/purescript-ccap-codegen/blob/2c09022f7f37a5494bf58be8f28354d7071c774a/src/Ccap/Codegen/Parser.purs#L89

(admittedly, we're not using NonEmpty as pervasively as we could...)

Expected behavior

When a list is guaranteed to be non-empty, indicate it in the type.

I would be happy to open a PR if it would be welcome. If so, do you have a preference between changing the types of the existing combinators or adding new ones (to maintain backwards compatibility)?

Rounding errors when parsing floats

I was running some generative tests on my code and I ran into an issue with high precision floats. For example:

> x = 0.7531531167929774
> runParser (show x) testTokenParser.float
(Right 0.7531531167929775)

Note the final 5 instead of 4. The problem seems to be in the code that parses the decimal part of a number. Indeed, the following equivalent code gives the same erroneous result:

> digits = [7,5,3,1,5,3,1,1,6,7,9,2,9,7,7,4]
> foldr (\d f -> (f + toNumber d) / 10.0) 0.0 digits
0.7531531167929775

I'm not sure about the best way to modify fraction so as to avoid rounding errors. Maybe using the native parseFloat?
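One possible direction, sketched here as an untested assumption rather than a fix, is to accumulate the digits into a single value and divide once, so rounding only happens at the final division instead of at every step:

```purescript
-- Sketch: one division at the end instead of one per digit.
-- Assumes `pow` from Data.Number (or Math.pow in older package sets)
-- and `foldl`/`length` from Data.Foldable / Data.Array.
fractionValue :: Array Int -> Number
fractionValue ds =
  foldl (\acc d -> acc * 10.0 + toNumber d) 0.0 ds
    / pow 10.0 (toNumber (Array.length ds))
```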

TypeError: 'undefined' is not a function (evaluating 'unParserT(p)(s)')

Apologises for the vague error details, but I'm having trouble digging much deeper than this. The error seems to be arising somewhere in Text.Parsing.Parser

The error message in Safari is this

undefined is not a function (evaluating 'unParserT(p)(s)')

The error message in chrome is this

TypeError: unParserT(...) is not a function

The stack trace seemed to be pointing here, but the only call of unParserT with the parameters p & s mentioned in the Safari error message seems to be here, here, and here.

I'd imagine it might help seeing the code where the error originated from, so in my own code it seems to be appearing somewhere in this function here, which looks like this (see label, I included the surrounding code for context).

  --
  -- internal link             [[hello]]
  -- internal link with text   [[hello|Greetings]]
  -- external link             [[[hello]]]
  -- external link with text   [[[hello|Greetings]]]
  --
  linkAtom :: WikiTextParser WikiAtom
  linkAtom = textLink Internal
         <|> textLink External where

    textLink target = try (openingDelimiter linkDelimiter) *> do
      destination <- word <?> "link destination"
      linkLabel   <- label <?> "text label"
      pure (HyperTextAtom target destination linkLabel) where

      linkDelimiter :: Delimiter
      linkDelimiter = targetToDelimiter target

      --
      -- The problematic function
      --
      label :: WikiTextParser (Maybe.Maybe [WikiAtom])
      label = optionMaybe (try (pipe *> body)) <* closer where
        closer = closingDelimiter linkDelimiter
        body = try (skipSpace (anyText `manyTill` (lookAhead closer)))
        pipe = try (skipSpace (nextType PipeType))

The type of the Parser I'm using is just

type WikiTextParser a = Parser [WikiToken] a

This code is being triggered by this test, which isn't in source code at the moment but it's in this module if that's helpful

it "internal link with label \"[[hello|world]]\"" do
  doc <- wikiText [
    OpeningDelimiter DeLink, Word "hello",
    Pipe, Word "greetings",
    ClosingDelimiter DeLink, EndOfInput]
  doc @?= [Paragraph [
    HyperTextAtom Internal "hello" (Maybe.Just [WordAtom "world"])
  ]]

What it's meant to be doing is parsing tokens for [[hello|world]] (which is a wikitext link, with "display text"), it seems to be choking on the |world part, as it seems to parse [[hello]] (a wikitext link without "display text") just fine (according to the tests).

Not sure if the error is on the library side or my own, I'm just using the standard parser type, without defining any unsafe code in that library. But if it is an issue other people are having hopefully there's enough information here to show where the issue is.

Implementation of `string` is not ideal

string :: forall s m. StringLike s => Monad m => String -> ParserT s m String
string str = do
  input <- gets \(ParseState input _ _) -> input
  case indexOf (wrap str) input of
    Just 0 -> do
      modify_ \(ParseState _ position _) ->
        ParseState (drop (length str) input)
          (updatePosString position str)
          true
      pure str
    _ -> fail ("Expected " <> show str)

This will perform a linear scan to find a match anywhere, before checking if that match is at index 0. It should probably just take whatever the length of the desired token is and check for eq.

Maximum call stack size exceeded

Hi,

I was wondering if anyone has an idea why large strings causes stack limit exceptions. While working on my bbcode parser which uses purescript-parsing, I ran into the issue that big forum posts cause exceptions. The strings in these forum posts are not continuous, such as in these tests I wrote. So it could be an entirely different issue in those specific cases. However, here is some code that causes stack limit exceptions via large strings:

https://github.com/adarqui/cardiac-arrest/blob/master/experiments/parsing/big-strings/purescript/purescript-parsing

more specifically:
https://github.com/adarqui/cardiac-arrest/blob/master/experiments/parsing/big-strings/purescript/purescript-parsing/test/Main.purs

Here is the example output from those tests:

* Running tests...
→ Suite: test big strings
  → Testing string of size: 1024
→ Testing string of size: 2048
→ Testing string of size: 3072
☠ Failed: simple strings because Maximum call stack size exceeded
→ Testing string of size: 1024
→ Testing string of size: 2048
→ Testing string of size: 3072

1 test failed:

In "test big strings":
  In "simple strings":
    Maximum call stack size exceeded

→ Testing string of size: 1024
→ Testing string of size: 2048
→ Testing string of size: 3072

So somewhere around ~3072 bytes, purescript-parsing throws a stack limit exception.

Thanks!

Library needs a quick start in the README

Is your change request related to a problem? Please describe.
As described in the documentation section of the Library Guidelines, Contributors libraries are expected to have in their README a short summary of the library's purpose, installation instructions, a quick start with a minimal usage example, and links to the documentation and contributing guide.

This library currently doesn't have a completed quick start in the README.

Describe the solution you'd like
The library needs a quick start section after the installation instructions. machines is a good example of a library with a quick start.

Additional context
See the Governance repository for more information about requirements in the Contributors organization.

`try` is broken

https://github.com/purescript-contrib/purescript-parsing/blob/master/src/Text/Parsing/Parser/Combinators.purs#L71-L75 According to this, when a parser fails, it is not restoring the initial input and position, but instead always propagates it from the inner parser.

I think this is largely masked for most String parsers because satisfy uses try https://github.com/purescript-contrib/purescript-parsing/blob/master/src/Text/Parsing/Parser/String.purs#L65 when it shouldn't. If backtracking is opt-in, none of the combinators should be using it, right?

If you look at the very first test https://github.com/purescript-contrib/purescript-parsing/blob/master/test/Main.purs#L417-L420 the expected result is wrong. many can match 0 items, so this example should always pass with Nil with o left in the input.

match combinator

We should have the super-useful match combinator.

To write this combinator for Text.Parsing.Parser.String, I think we will need some more members of StringLike.

class StringLike s where
  drop :: Int -> s -> s
  indexOf :: Pattern -> s -> Maybe Int
  null :: s -> Boolean
  uncons :: s -> Maybe { head :: Char, tail :: s }

Maybe if we have

length :: s -> Int
length = Data.String.length

take :: Int -> s -> s
take = Data.String.take

Then we can write

match :: forall m s a.  ParserT s m a -> ParserT s m (Tuple a s)
match p = do
  ParseState input1 _ _ <- get
  x <- p
  ParseState input2 _ _ <- get
  pure $ Tuple x $ take (length input1 - length input2) input1
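A hypothetical use, assuming some `number` parser producing an Int:

```purescript
-- Return the parsed value together with the exact slice of input
-- that the parser consumed.
numberWithSource :: Parser String (Tuple Int String)
numberWithSource = match number
```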

Generalize StringLike?

Currently StringLike is implemented by String, and it looks like it was intended to only allow structures which contain some Chars.

But I think it could easily be changed to support arbitrary containers of arbitrary data.
For example, one might define some sum type data Color = Red | Blue | Orange ... and want to parse a List Color into some other structure, or maybe even into itself, and just validate that the colors in the List are in the correct order or whatever.

But StringLike is limiting this kind of use.
What I think is that the class could be changed to:
What i think is that the class could be changed to:

class StringLike f where
  drop :: forall a. Int -> f a -> f a
  indexOf :: forall a. f a -> f a -> Maybe Int
  null :: forall a. f a -> Boolean
  uncons :: forall a. f a -> Maybe { head :: a, tail :: f a }

This way one could implement StringLike for List/Array ... and for example anyChar will change to

anyChar :: forall f s m. StringLike f => Monad m => ParserT (f s) m s

add oneOfAs

oneOfAs :: forall s m a. (StringLike s, Monad m) => Array (Tuple Char a) -> ParserT s m a

(I haven't seen such a function, so I came up with this name; it could be changed.)

This will allow to change something like this:

digit ∷ ∀ m. Monad m ⇒ P.ParserT String m Int
digit = do
  char ← PS.oneOf ['0','1','2','3','4','5','6','7','8','9']
  case char of
    '0' → pure 0
    '1' → pure 1
    '2' → pure 2
    '3' → pure 3
    '4' → pure 4
    '5' → pure 5
    '6' → pure 6
    '7' → pure 7
    '8' → pure 8
    '9' → pure 9
    _ → P.fail "Incorrect digit, impossible situation"

to

digit ∷ ∀ m. Monad m ⇒ P.ParserT String m Int
digit = PS.oneOfAs $
  [ Tuple '0' 0
  , Tuple '1' 1
  , Tuple '2' 2
  , Tuple '3' 3
  , Tuple '4' 4
  , Tuple '5' 5
  , Tuple '6' 6
  , Tuple '7' 7
  , Tuple '8' 8
  , Tuple '9' 9]
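One possible implementation sketch (untested; assumes `choice` from the combinators module and `char` from the String module):

```purescript
-- Try each (character, result) pair in order; the first character
-- that matches yields its associated value.
oneOfAs :: forall s m a. StringLike s => Monad m => Array (Tuple Char a) -> ParserT s m a
oneOfAs table = choice (map (\(Tuple c a) -> char c $> a) table)
```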

`satisfy` and `when` return incorrect error locations

Both of these consume a token, and then throw an error when the predicate fails. This means that the error location will always point to the next token, and not the token to which we apply the predicate. Perhaps try needs to reannotate the error location?

Documentation directory needs a tutorial

Is your change request related to a problem? Please describe.
As described in the documentation section of the Library Guidelines, Contributors libraries are expected to have some documentation in the docs directory -- specifically, at least a short tutorial that expands on the quick start in the README.

This library currently doesn't have comprehensive documentation in the docs directory.

Describe the solution you'd like
At least a short tutorial needs to be added to the docs directory, or other documentation as described in this Divio article.

The argonaut-codecs docs directory has a good example of expanded documentation for a Contributor library. But it would even be useful to add something considerably smaller and shorter to this library.

Additional context
See the Governance repository for more information about requirements in the Contributors organization.

empty string parser consumed?

This applies to the previous implementation as well, but if passed the empty string won't this mark the input as consumed without actually consuming anything?

This is not feedback, just a discussion point.

Originally posted by @robertdp in #119 (comment)

lazy errors

We should probably change all error messages to be lazy like this with defer.

I chose to make error messages deferred to avoid the cost of generating the error messages until they're needed. Feels like a small thing, but the waste just bothers me too much.

Originally posted by @fsoikin in #127 (comment)

Problem parsing escaped characters.

I'm trying to parse some tab-separated integers, and I think I've run into a bug. To give a simplified example, if I run these three expressions:

import Text.Parsing.Parser.Language (haskell)

x = runParser "123-456" (tuple3 <$> haskell.integer <*> char '-' <*> haskell.integer)

y = runParser "123\t456" (tuple3 <$> haskell.integer <*> char '\t' <*> haskell.integer)

z = runParser "123\n456" (tuple3 <$> haskell.integer <*> char '\n' <*> haskell.integer)

The first one succeeds, but the second two both fail.

Am I misunderstanding something about how escape-characters are parsed, or have I found a bug?

(For comparison, similar code run against purescript-string-parsers works for me just fine...)
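For what it's worth, a plausible (unconfirmed) explanation: haskell.integer is a lexeme parser, and lexeme parsers skip trailing whitespace, which includes '\t' and '\n', so the separator is already gone by the time char runs. A sketch of a non-lexeme digit parser that sidesteps this:

```purescript
import Data.Array (some)
import Data.String.CodeUnits (fromCharArray)
import Text.Parsing.Parser (Parser)
import Text.Parsing.Parser.Token (digit)

-- Parse a run of digits without skipping the whitespace after them,
-- so a following `char '\t'` still sees the tab:
digits :: Parser String String
digits = fromCharArray <$> some digit
```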

getting `Expected Expected` in errors

If you run:
> runParser "0" (sequence [P.char '0', P.char '0'])

You will get:
(Left (ParseError "Expected Expected '0'" (Position { line: 1, column: 2 })))

I didn't investigate this hard enough, so I don't know the reason yet.

Float parser does not parse negative numbers

Describe the bug
Token parser created by (makeTokenParser emptyDef).float does not parse negative numbers.

To Reproduce
In repl, import Text.Parsing.Parser, Text.Parsing.Parser.Language and Text.Parsing.Parser.Token.
Evaluate runParser "(-6.0)" (makeTokenParser emptyDef).float (parentheses don't matter).

Expected behavior
Right (-6.0)

Actual behavior
(Left (ParseError "Expected float" (Position { line: 1, column: 1 })))

Additional context
I think the parser should be able to parse negative numbers; the integer parser already does. Perhaps add a sign-aware parser?

Implement `Compactable` and `Filterable` for `ParserT`?

Yesterday I noticed that a Filterable instance could be useful for ParserT.

Let's say you want to sometimes discard a parse based on some condition or predicate. For example, suppose you have parseCSV :: Parser String (List (List String)), which parses CSV input into a list of lines (each line being a list of strings, one per value between commas or newlines). The first row of a CSV is usually just the column names, so say I want to remove that line using tail :: forall a. List a -> Maybe (List a). I could then use filterMap :: forall a b f. Filterable f => (a -> Maybe b) -> f a -> f b to remove it like this: filterMap tail parseCSV. The result would be a parser that ignores the first line of the CSV and only yields the rest. It would also fail if the CSV input has only one line. Additionally, a custom error message can be provided like this (I think): filterMap tail parseCSV <?> "CSV does not contain any data, only column names!"

The implementation would be similar to the Filterable instance for Maybe and Either, in the sense that it's only got one item to filter (as opposed to a list).

I could write up a PR for this, just wanted to hear some opinions first.
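To make the proposal concrete, a sketch of what the filterMap part could look like as a standalone function (the error message and the standalone name are assumptions; the other Filterable members are omitted):

```purescript
import Data.Maybe (Maybe(..))
import Text.Parsing.Parser (ParserT, fail)

-- Keep the result when `f` returns Just, otherwise fail the parse:
filterMapParser
  :: forall s m a b
   . Monad m
  => (a -> Maybe b)
  -> ParserT s m a
  -> ParserT s m b
filterMapParser f p = p >>= \a -> case f a of
  Just b -> pure b
  Nothing -> fail "filtered out"
```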

Natural transformation is too strong a constraint for hoistParserT?

If you remove the type declaration from hoistParserT, the compiler will infer this type:

hoistParserT :: ∀ b n s a m.
  (   m (Tuple (Either ParseError a) (ParseState s))
   -> n (Tuple (Either ParseError b) (ParseState s))
  )
   -> ParserT s m a -> ParserT s n b

This is much more useful than (m ~> n) -> ParserT s m a -> ParserT s n a.
For example, if you want a function like this:
todo :: ∀ s m x. x -> ParserT s (State x) Unit -> ParserT s m x
you won't find anything that fits except hoistParserT, but you can't push the x out of State. If the type were weakened, it would be possible to write this:

todo :: ∀ s m x. x -> ParserT s (State x) Unit -> ParserT s m x
todo i p = hoistParserT unState p
  where
    unState :: ∀ z y m. Applicative m => State x (Tuple (Either y Unit) z) -> m (Tuple (Either y x) z)
    unState s = case runState s i of
      Tuple (Tuple e state) res -> pure (Tuple (e $> res) state)

So I'm proposing weakening the type signature, as otherwise one would need to copy hoistParserT and change the signature, which I think is not desirable.


If the reason for using a natural transformation is to not expose the internal structure of ParserT, then we can massage it a bit:

hoistParserT :: ∀ b n s a m.
  ( ∀ e s'. m (Tuple (Either e a) s')
   -> n (Tuple (Either e b) s')
  )
   -> ParserT s m a -> ParserT s n b

or something like this:

-- name is arbitrary
newtype Internal s a = Internal (Tuple (Either ParseError a) (ParseState s))
-- derive a Functor instance

hoistParserT :: ∀ s m a n b. (m (Internal s a) -> n (Internal s b)) -> ParserT s m a -> ParserT s n b
-- or don't expose Internal at all and change the type to:

hoistParserT :: ∀ s m a n b f. Functor f => (m (f a) -> n (f b)) -> ParserT s m a -> ParserT s n b
hoistParserT f (ParserT m) = ParserT (mapExceptT (mapStateT g) m)
  where
  g = (map Internal) >>> f >>> (map unInternal)

unInternal (Internal x) = x

-- this way the `todo` implementation will look like this
todo :: ∀ s m x. x -> ParserT s (State x) Unit -> ParserT s m x
todo i p = hoistParserT unState p
  where
    unState :: ∀ f m. Functor f => Applicative m => State x (f Unit) -> m (f x)
    unState s = case runState s i of
      -- I like this approach, as this line is much simpler than in the previous implementation
      Tuple f res -> pure (f $> res)

If hoist is not suitable for this new type, then maybe call it something else? For example, mapParserT, and change hoistParserT:

+mapParserT f (ParserT m) = ParserT (mapExceptT (mapStateT f) m)
hoistParserT :: (m ~> n) -> ParserT s m ~> ParserT s n
-hoistParserT f (ParserT m) = ParserT (mapExceptT (mapStateT f) m)
+hoistParserT = mapParserT

v7 proposals

Here are some things I would like to see in v7 of this package.

The target design space for this package should be similar to MegaParsec: intended for users who prefer correctness and feature-completeness to speed. Anyone who wants speed in a V8 runtime environment will use the built-in Regex.

Text.Parsing.Parser

-- | Contains the remaining input and current position.
data ParseState s = ParseState s Position Boolean

Change the definition of ParseState so that we can have cursor-based state in parsers, and so that line-column state is optional.

Tracking the newline-based line and column position is an important feature, but it’s expensive and rarely used. I would like to make it optional.

  1. I'd like to switch to a cursor-based state for String parsers, instead of a state which tracks “the remaining input”.

Do we need the Boolean “consumed flag” in the ParseState? As far as I can tell it is set but never tested; does anything actually care what its value is?

  1. Make the Position zero-based. #94

The proposed definition, with position tracking moved into a user-chosen state parameter:

data ParseState s state = ParseState s state

Text.Parsing.Parser.Combinators

  1. Add combinators manyTill, many1Till_ #108

Text.Parsing.Parser.String

  1. UTF-16 correctness. We should always handle UTF-16 surrogate pairs correctly, and that means always treating tokens as CodePoint instead of CodeUnit. #109
  2. Delete the StringLike typeclass. Has anyone ever created an instance of this class for a type other than String?
  3. Add combinator match #107

Text.Parsing.Parser.DataView

  1. Add DataView parsing to this package? rowtype-yoga/purescript-parsing-dataview#10

Module names

  1. Remove the Text. prefix from all module names.

Text.Parsing.Parser.ArrayBuffer

I want to parse the raw Uint8 bytes of a JavaScript ArrayBuffer. You know, like the ByteString instance for Megaparsec Streams, or Data.Attoparsec.ByteString.

Is there a good way to do this in PureScript? I can't find one. If it doesn't exist, maybe we could add a new module to purescript-parsing called Text.Parsing.Parser.ArrayBuffer? It seems like purescript-parsing was designed to be extended in this way.

This module would be a peer of Text.Parsing.Parser.String and Text.Parsing.Parser.Token, and similarly constructed.

Or maybe this would be better as a separate library, since it would add some new dependencies:

https://pursuit.purescript.org/packages/purescript-arraybuffer
https://pursuit.purescript.org/packages/purescript-arraybuffer-types

Thoughts?
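To make the idea concrete, a type-level sketch of what such a module's primitives might look like, mirroring the String primitives (every name and signature below is hypothetical, including the MonadEffect constraint for DataView reads):

```purescript
module Text.Parsing.Parser.ArrayBuffer where

-- Match any single byte, analogous to `anyChar` for String input:
anyUint8 :: forall m. MonadEffect m => ParserT DataView m UInt

-- Match one exact byte, analogous to `char`:
uint8 :: forall m. MonadEffect m => UInt -> ParserT DataView m UInt
```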
