This project forked from ldthomas/apg-js2-exp

A pattern-matching engine similar to RegExp but uses an ABNF pattern syntax and APG parsers.

apg-js2-exp's Introduction

apg-exp - APG Expressions

apg-exp is a regex-like pattern-matching engine that uses a superset of the ABNF syntax for the pattern definitions and APG to create and apply the pattern-matching parser.

Tutorial: Don't miss the tutorial on sitepoint.com. It walks you through apg-exp from the basics to some fairly sophisticated pattern matching of nested, paired parentheses and other brackets (something you can't do with RegExp). It's all laid out in nine hands-on CodePen examples.

Complete User's Guide: A complete user's guide can be found at ./guide/index.html or on the APG website.

v2.1.0 release notes: There are no functional changes in version 2.1.0. It now depends on the new apg API, apg-api, instead of on apg. This removes all dependency on the node.js file system module "fs", which is incompatible with some development frameworks.

apg-exp: By way of introduction, the Wikipedia article on regex is a good start, and Jeffrey Friedl's book, Mastering Regular Expressions, is much better and more complete. This introduction just lists the features, says a little about motivation and tries to point out some possible advantages of apg-exp.

Features:

  1. The pattern syntax is a superset of ABNF (SABNF). ABNF is standardized and is used to describe most Internet technical specifications.
  2. APG provides error checking and analysis for easy development of an accurate syntax for the desired pattern.
  3. Pattern syntax may be input as SABNF text or as an instantiated, APG parser object.
  4. Gives the user complete control over the pattern's character codes and their interpretation.
  5. Easy access to the full UTF-32 range of Unicode is provided naturally through the integer arrays that make up the character-coded strings and phrases.
  6. Results provide named access to all matched sub-phrases and the indexes where they were found, not just the last match.
  7. Results can be returned as JavaScript strings or raw integer arrays of character codes.
  8. Global and "sticky" flags operate nearly identically to the same-named JavaScript RegExp flags.
  9. Recursive patterns are natural to the SABNF syntax for easy pair matching of opening and closing parentheses, brackets, HTML tags, etc.
  10. Fully implemented lookaround – positive and negative forms of both look-ahead and infinite-length look-behind.
  11. Back referencing – two modes, universal and parent. See the definitions in the SABNF documentation. For example, parent mode used with recursion can match not only the opening and closing tags of HTML but also the tag names in them. (See the back reference example.)
  12. Word and line boundaries are not pre-defined. By making them user-defined they are very flexible but nonetheless very easy to define and use. The user does not have to rely on or guess about what the engine considers a boundary to be.
  13. Character classes such as \w, \s and . are not pre-defined, providing greater flexibility and certainty to the meaning of any needed character classes.
  14. The syntax allows APG's User-Defined Terminals (UDTs) – write your own code for special phrase matching requirements. They make the phrase matching power of apg-exp essentially Turing complete.
  15. Provides the user with access to the Abstract Syntax Tree (AST) of the pattern match. The AST can be used for complex translations of the matched phrase. (See the dangling-else example.)
  16. Provides the user with access to APG's trace object which gives a complete, step-by-step picture of the parser's matching process for debugging purposes.
  17. A very flexible replacement function for replacing patterns in strings.
  18. A split function for using patterns to split strings.
  19. A test function for a quick yes/no answer.
  20. Tree depth and parser step controls to limit or "put the brakes on" an exponential or "catastrophic backtracking" syntax.
  21. Numerous display functions for a quick view of the results as text or HTML tables.
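
To make feature 9 concrete, here is a minimal sketch in plain JavaScript of what a recursive rule buys you. It does not use apg-exp or its API. The grammar in the comment is a hypothetical one written for this illustration, and the function names (`matchText`, `matchP`, `findBalanced`) are assumptions. Each function hand-codes one rule of that grammar, so `matchP` calling itself corresponds to the rule `P` referring to itself.

```javascript
// Hypothetical grammar, in ABNF notation, that the code below mirrors:
//   P    = "(" *( P / text ) ")"
//   text = 1*( %d32-39 / %d42-126 )   ; printable ASCII except "(" and ")"

// Try to match rule `text` at `pos`; return the index after the match, or -1.
function matchText(chars, pos) {
  let i = pos;
  while (i < chars.length) {
    const c = chars[i];
    const ok = (c >= 32 && c <= 39) || (c >= 42 && c <= 126);
    if (!ok) break;
    i += 1;
  }
  return i > pos ? i : -1; // 1* means at least one character is required
}

// Try to match rule `P` at `pos`; return the index after the match, or -1.
function matchP(chars, pos) {
  if (chars[pos] !== 40) return -1; // must open with "("
  let i = pos + 1;
  for (;;) {
    let next = matchP(chars, i); // first alternative: a nested P (recursion)
    if (next === -1) next = matchText(chars, i); // second alternative: text
    if (next === -1) break; // zero or more repetitions, so stopping is fine
    i = next;
  }
  return chars[i] === 41 ? i + 1 : -1; // must close with ")"
}

// Scan `str` for the first balanced, parenthesized phrase.
// `index` counts code points, which equals the string index for ASCII input.
function findBalanced(str) {
  const chars = Array.from(str, (ch) => ch.codePointAt(0));
  for (let start = 0; start < chars.length; start += 1) {
    const end = matchP(chars, start);
    if (end !== -1) {
      const phrase = String.fromCodePoint(...chars.slice(start, end));
      return { index: start, phrase };
    }
  }
  return null;
}

console.log(findBalanced("f(a(b)c) + (d")); // { index: 1, phrase: '(a(b)c)' }
console.log(findBalanced("((")); // null
```

With apg-exp you would write only the grammar and let APG generate the parser. The hand-written functions here just show why a self-referencing rule handles any nesting depth.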

Introduction:
The motivation was originally twofold.

  1. I wanted to replace the pattern syntax with ABNF, which, to me at least, is much easier to read, write and debug than the conventional regex syntax.
  2. I felt (mistakenly) that a recursive-descent parser like APG would prove to be a much more powerful pattern matcher than regular expressions.

Almost every programmer has needed regexes at some point, more likely at many points, and it doesn't take much reading of the Internet forums to see that many others, like me, find the regex syntax quite cryptic. Additionally, because regexes have such a long, rich history with many versions from many (excellent) developers, the syntax varies as you move from system to system and language to language. By contrast, ABNF is standardized (although my non-standard superset additions are starting to pile up). Whether the ABNF syntax is preferable to conventional regex syntax will always be a matter of personal preference. But for me, and possibly others, ABNF offers a more transparent syntax to work with.

At the outset I naively thought that the regular expressions of regexes were just that: the Chomsky hierarchy variety. Therefore, I thought that using an APG parser for the pattern matching would add a great deal of parsing power to the problem. I soon discovered that not only were regexes not real "regular expressions", they were powerful, recursive-descent parsers, loaded with features that went well beyond those of APG. I had to play a little catch-up to add look-behind, back referencing and anchors. That being done, however, I think there is still a case for claiming some added power. I'm not a regex expert and I won't be making any big claims here, but there are a couple of points I will mention. I think the way that apg-exp gives the user nearly full control over the input, output and interpretation of the character codes goes a long way toward addressing a number of the cautions mentioned in Jeffrey Friedl's book, for example on pages 92 and 106. I also think it addresses a number of the things Larry Wall finds wrong with the regex culture in his Apocalypse 5 page. Examples include back referencing, support for named capture, nested patterns (recursive rules) and capture of all matches to a sub-phrase.
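
The point about character codes can be illustrated in plain JavaScript. The sketch below is not apg-exp code, and the helper names `toCodes` and `fromCodes` are hypothetical. It only shows how a JavaScript string, which is indexed by UTF-16 code units, differs from an integer array with one entry per Unicode code point, the kind of representation feature 5 describes. How apg-exp itself performs this conversion is not shown here.

```javascript
// Convert a string to an array of Unicode code points (one integer per character).
// Array.from iterates a string by code point, not by UTF-16 code unit.
function toCodes(str) {
  return Array.from(str, (ch) => ch.codePointAt(0));
}

// Convert an array of code points back to a JavaScript string.
function fromCodes(codes) {
  return String.fromCodePoint(...codes);
}

const s = "a😀b";
const codes = toCodes(s);

console.log(s.length); // 4, because the emoji occupies two UTF-16 code units
console.log(codes.length); // 3, one integer per character
console.log(codes[1] === 0x1f600); // true, the full code point is preserved
console.log(fromCodes(codes) === s); // true, the round trip is lossless
```

Working on the integer array means a pattern can refer to a character such as U+1F600 by its numeric value without dealing with surrogate pairs.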

But the best thing to do, probably, is to head over to the examples and take a look. See and compare for yourself. I would suggest starting with the flags, display and rules examples to get your bearings and go from there.

Installation:
GitHub: In your project directory,

git clone https://github.com/ldthomas/apg-js2-exp.git apgexp
npm install ./apgexp --save

npm: In your project directory,

npm install apg-exp --save

web page:

git clone https://github.com/ldthomas/apg-js2-exp.git apgexp

Then, in the header of your web page include,

<link rel="stylesheet" href="./apgexp/apgexp.css">
<script src="./apgexp/apgexp.js" charset="utf-8"></script>

or,

<link rel="stylesheet" href="./apgexp/apgexp-min.css">
<script src="./apgexp/apgexp-min.js" charset="utf-8"></script>

(Note that some apg-exp output is in HTML format, and apgexp.css is needed to style it properly. apgexp.css is simply a copy of apglib.css.)

Now access apg-exp as,

<script>
var exp = new ApgExp(pattern);
</script>

See, specifically, the email example.

Examples:
See apg-js2-examples/apg-exp for many more examples of using apg-exp.

Documentation:
The full documentation is in the code in docco format. To generate the documentation, from the package directory:

npm install -g docco
./docco-gen

View docs/index.html in any web browser to get started, or view it on the APG website.

Copyright:
Copyright © 2017 Lowell D. Thomas, all rights reserved

License:
Released under the BSD-3-Clause license.
