Coder Social home page Coder Social logo

grammar-ruby's Introduction

Grammar

This is an attempt to make a cross-parser grammar specification for parsers written in Ruby. The result is a very opiniated gem that is cranky about the sad state of grammar languages. Grammar doesn't have a grammar file format, or even care what grammar file you use, so long as it can be translated into Grammar objects. The Grammar object hierarchy attempts to support all of the features supported by all of the major parsers. If you create a grammar using features that aren't supported by your favorite parser, then it's up to that parser to complain about the grammar you tried to feed it.

Grammar doesn't attempt to mitigate left-recursive alternations; a parser is expected to either handle the grammar properly, or to report an appropriate error for any grammar element that it can't handle.

Grammars are Modules

Grammars are Ruby Modules, and the grammar rules are constants defined in the grammar module.

grammar :MyGrammar do
    MyRule = ...
end

or, equivalently...

module MyGrammar
    using Grammar::DSL
    MyRule = ...
end

Matches are instances

In the land of Grammar, grammar elements (sometimes called 'parslets') are subclasses of the primary Grammar classes (ie. Alternation, Concatenation, etc). Matches resulting from parsing input that matches the grammar are instances of the subclasses. For example:

foo_bar = Grammar::Alternation.with('foo', 'bar')	# => Class:0xXXXXX
...
parser.parse('foo')					# => #<Grammar::Alternation @match='foo'>

Grammar Elements

A grammar is a tree of grammar elements that closely mirror the typical BNF, PEG, etc grammars.

Reference

A Reference element can be used to refer to a grammar element that will be defined later in the module, but needs to be used before it can be defined. Typically references are used in recursive grammars, and are therefore very similar to the Recursion element. However, not all elements in a recursive chain are actually recursive, and sometimes it's simply more convenient to forward-declare an element.

Latch

A Latch is a type of Reference that closely resembles the "back references" of regular expressions. When a Latch first matches its pattern, it saves (i.e. latches) the matching input. Subsequent uses of the Latch must then match the latched input.

Note that Latches are treated as a sort of context variable in that they are attached to a particular parsing context. When a parser enters a context it resets all of the Latches in that context.

QuotedString = concatenation do
    latch(:quote) { /['"]/ }    # Create a Latch and attach it to the QuotedString context

    element quote   # First use. This will match a single- or double-quote
    element /\w*/   # Match the string
    element quote   # Second use. Matches whatever the first use matched.
end

Whitespace

Typically, grammar languages implicitly ignore whitespace. In those cases, whitespace is defined according to some internal pattern.

In Grammar, whitespace isn't automatically ignored. However, you can explicitly specify a pattern to be ignored.

module MyGRammar
    using Grammar::DSL

    ignore /\s*/        # Ignore whitespace
    ...
end

Recursion

Most uses of recursion in typical parser grammars are actually a hack for implementing repetition. In Grammar, repetition is represented explicitly using the Repetition class, which eliminates most of the common use cases. However, that still leaves a few cases to be handled. The Parenthesis Language being a notable example.

When using the DSL to specify a grammar module, the relevant methods will attempt to detect common uses of recursion and handle them appropriately.

Named Recursion vs Anonymous Recursion

Grammar provides two syntaxes for creating a recursive element: named and anonymous. Named recursion creates a named constant in the enclosing lexical scope and assigns the resulting recursion proxy object to it. Anonymous recursion passes the proxy object to the block as the first block parameter.

# Named recursion
alternation :Rule0 do
    # Reference the recursion using the Rule0 constant
    ...
end

# Anonymous recursion
alternation do |rule0|
    # Reference the recursion using the rule0 block parameter
    ...
end

Alternation

All recursive alternations are effecively left-recursive from the perspective of a top-down parser, and may be considered left-recursive by bottom-up parsers (depending on the specific implementation). And we all know how that turns out. Fortunately, Grammar knows that you really wanted a repetition with at least one repeat, and will give you what you wanted instead of what you asked for.

Rule0 = Grammar::Alternation.with('abc', 'def', Rule0)	# => NameError('You are a bad person')

If you use the DSL form, you'll get something more reasonable (ie. a repeating alternation).

grammar :MyGrammar do
    alternation :Rule0 { elements 'abc', 'def', Rule0}	#=> Grammar::Repetition.at_least(1, Grammar::Alternation.with('abc', 'def'))
end

But, you could go easy on yourself and just do it the right way...

Rule0 = Grammar::Alternation.with('abc', 'def').at_least(1)

See how easy that is?

Left Recursion

If your grammar does this, you either meant to use repetition, or you should be ashamed of yourself.

Rule0 = Grammar::Concatenation.with(Rule0, ')')		# => NameError('Again?')

If you use the DSL it will try to save you from yourself:

grammar :MyGrammar do
    concatenation :Rule0 { elements Rule0, ')' }	# => Rule0 = Grammar::Repetition.any(')')
end

But you should really just use repetition the way it was meant to be used instead of trying to fake it with recursion.

Rule0 = Grammar::Repetition.any(')')

Right Recursion

Again, if you're doing this you should be using repetition.

Rule0 = Grammar::Concatenation.with('(', Rule0)		# => NameError('Seriously?')

And once again, the Grammar DSL will clean up your mess:

grammar :MyGrammar do
    concatenation :Rule0 { elements '(', Rule0 }	# => Rule0 = Grammar::Repetition.one_or_more('(')
end

But seriously. Just stop.

Center Recursion

The only element in Grammar that can support center recursion is Concatenation, and the only way to do it is with the DSL (but you already knew that).

grammar :MyGrammar do
    concatenation :Rule0 do
    	elements '(', Rule0, ')'
    end
end

The Grammar DSL will handle this by creating a Recursion proxy object instead of the Concatenation that you thought you wanted. It's then up to any parser employing the grammar to map the Recursion proxy to whatever mechanism said parser uses for handling center-recursion. If the parser can't handle center-recursion, then it's expected to report an appropriate error.

There's also the case of the "odd number of x's"

Rule0 = Grammar::Alternation.with(Grammar::Concatenation.with('x', Rule0, 'x'), 'x')

which is normally meant to repesent a repetition that's constrained to an odd number of repeats greater than 2. This can be represented in Grammar with

Rule0 = Repetition.with('x', minimum:3, &:odd?)

Outer Recursion

Many grammar languages use some very odd tricks for doing various things, and this is one that tends to be used to represent an odd number of some item.

grammar :MyGrammar do
    concatenation :Rule0 do
        elements Rule0, ',', Rule0
    end
end

Someday, Grammar will rewrite this into something a bit less insane, but for now it just gives you an outer-recursive concatenation, as requested.

Nested Outer Recursion

This particular use of recursion only happens when a recursive Concatenation is nested inside of an Alternation that contains elements other than the recursion-causing Concatenation. Typically, grammar writers use this trick to represent a separated list of items.

Grammar will try to help you out by replacing the outer-recursion with a repeating right-recursion, thereby eliminating the left-recursion.

# This is bad
alternation :Rule0 do
    element 'abc'
    element 'def'
    element concatenation { elements Rule0, ',', Rule0 }
end

# This is what it becomes
Rule0 = Grammar::Concatenation.with(
            Grammar::Alternation.with('abc', 'def'),
            Grammar::Concatenation.with(',', Rule0).any
        )

Installation

Add this line to your application's Gemfile:

gem 'grammar'

And then execute:

$ bundle

Or install it yourself as:

$ gem install grammar

Development

After checking out the repo, run bin/setup to install dependencies. Then, run rake spec to run the tests. You can also run bin/console for an interactive prompt that will allow you to experiment.

To install this gem onto your local machine, run bundle exec rake install. To release a new version, update the version number in version.rb, and then run bundle exec rake release, which will create a git tag for the version, push git commits and tags, and push the .gem file to rubygems.org.

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/bfoz/grammar.

grammar-ruby's People

Contributors

bfoz avatar

Watchers

 avatar

grammar-ruby's Issues

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.