highlight-tree-sitter's Introduction

Highlight Tree Sitter

A low-level API layer for node-tree-sitter to:

  • generate HTML snippets of syntax-highlighted source code, using Atom's scope mappings format
  • pretty-print partial/full s-expressions from tree-sitter syntax trees for readability/learning
  • produce your own tree-sitter artifacts by renaming/flattening tree-sitter nodes
    • e.g. terminal color output, or turning recognized symbols into links; whatever you want 🙂

Background:

Tree-sitter is built for realtime parsing, so why use it for static text? It is also built for syntax highlighting, and it promises better error tolerance than other solutions. If better grammars are developed for it, we should be able to use them to highlight text outside of editors too.

Run the demo

Run demo.js to see the following example for highlighting JavaScript:

npm install
node demo.js

Learn!

The walkthrough below shows what the demo above does.

Suppose we have the following JavaScript code we want to highlight:

function foo() {
  return 1;
}

Partial Tree: Parsing the code with tree-sitter and passing the resulting tree to partialSexp creates the partial tree seen below. It contains no actual source text and displays only what tree-sitter calls named nodes, giving you an overview of the syntax tree.

NOTE: This s-expression format is what tree-sitter uses in its own test cases. We provide a facility to represent it as arrays and to print it with proper formatting using printSexp, which is used in these examples.

(program
  (function
    (identifier)
    (formal_parameters) 
    (statement_block (return_statement (number)))))
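
The array representation mentioned in the note can be pictured with a small sketch. The exact array shape this library produces is not shown in this README, so the layout below (node name first, children after) is an assumption, and toSexpString is a hypothetical one-line printer, not the library's printSexp:

```javascript
// Assumed shape (not confirmed by the source): each node is an array whose
// first element is the node name and whose remaining elements are children.
const partialTree =
  ["program",
    ["function",
      ["identifier"],
      ["formal_parameters"],
      ["statement_block", ["return_statement", ["number"]]]]];

// Hypothetical helper: render a nested array as a one-line s-expression.
// Plain strings are treated as source text and printed in quotes.
function toSexpString(node) {
  if (typeof node === "string") return JSON.stringify(node);
  const [name, ...children] = node;
  return "(" + [name, ...children.map(toSexpString)].join(" ") + ")";
}

console.log(toSexpString(partialTree));
```

Unlike printSexp, this sketch does no line breaking or indentation; it only shows how nested arrays correspond to the parenthesized form.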

Full Tree: Passing the source text and the tree to fullSexp instead creates a full tree that includes source text and whitespace. A root node _root captures outer whitespace, and anonymous nodes _anon capture what tree-sitter calls unnamed nodes.

(_root
  "\n"
  (program
    (function
      (_anon "function")
      " "
      (identifier "foo")
      (formal_parameters (_anon "(") (_anon ")"))
      " "
      (statement_block
        (_anon "{")
        "\n  "
        (return_statement (_anon "return") " " (number "1") (_anon ";"))
        "\n" 
        (_anon "}")))))
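
Because the full tree keeps every piece of text and whitespace, the original source can be read back out of it. The sketch below illustrates this under the assumption that the tree is a nested array of the form [nodeName, ...children] with plain strings for text; sourceText is a hypothetical helper, not part of the library's API:

```javascript
// The full tree from the walkthrough, written as nested arrays.
// Assumed shape: [nodeName, ...children], with plain strings for text.
const fullTree =
  ["_root",
    "\n",
    ["program",
      ["function",
        ["_anon", "function"],
        " ",
        ["identifier", "foo"],
        ["formal_parameters", ["_anon", "("], ["_anon", ")"]],
        " ",
        ["statement_block",
          ["_anon", "{"],
          "\n  ",
          ["return_statement", ["_anon", "return"], " ", ["number", "1"], ["_anon", ";"]],
          "\n",
          ["_anon", "}"]]]]];

// Hypothetical helper: concatenate every text leaf in document order.
function sourceText(node) {
  if (typeof node === "string") return node;
  return node.slice(1).map(sourceText).join(""); // skip the node name
}

console.log(JSON.stringify(sourceText(fullTree)));
```

The leading "\n" held by _root is part of the recovered text, which is why the root node exists at all.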

Annotated Tree: Passing the full tree to highlightSexp with Atom's JavaScript grammar scopes (see scope mappings) produces the tree below. Each syntax node that matches a scope mapping is annotated with the corresponding class names:

(_root
  "\n"
  (program.source.js
    (function
      (_anon.storage.type "function")
      " "
      (identifier.entity.name.function "foo")
      (formal_parameters
        (_anon.punctuation.definition.parameters.begin.bracket.round "(")
        (_anon.punctuation.definition.parameters.end.bracket.round ")"))
      " "
      (statement_block
        (_anon.punctuation.definition.function.body.begin.bracket.curly
          "{")
        "\n  "
        (return_statement
          (_anon.keyword.control "return")
          " "
          (number.constant.numeric "1")
          (_anon ";"))
        "\n"
        (_anon.punctuation.definition.function.body.end.bracket.curly
          "}")))))
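
The annotation step can be sketched roughly as follows. This is not the library's implementation: the scope table is a hypothetical, heavily simplified subset of what an Atom grammar contains, only three selector forms are handled (a node name, a quoted string for an unnamed node's text, and "parent > child"), and the nested-array tree shape is an assumption:

```javascript
// Hypothetical, simplified scope table (selector -> scope name).
const sampleScopes = new Map([
  ["program", "source.js"],
  ["function > identifier", "entity.name.function"],
  ['"function"', "storage.type"],
  ['"return"', "keyword.control"],
  ["number", "constant.numeric"],
]);

// Append the matching scope to each node name, e.g. number -> number.constant.numeric.
function annotate(node, scopes, parentName = null) {
  if (typeof node === "string") return node; // text is left untouched
  const [name, ...children] = node;
  // Unnamed nodes are looked up by their quoted text, e.g. "return".
  const key = name === "_anon" ? JSON.stringify(children[0]) : name;
  // A "parent > child" selector takes precedence over a bare selector.
  const scope = scopes.get(`${parentName} > ${key}`) || scopes.get(key);
  const newName = scope ? `${name}.${scope}` : name;
  return [newName, ...children.map((child) => annotate(child, scopes, name))];
}

const smallTree =
  ["_root", "\n",
    ["program",
      ["function",
        ["_anon", "function"], " ", ["identifier", "foo"],
        ["statement_block",
          ["return_statement", ["_anon", "return"], " ", ["number", "1"], ["_anon", ";"]]]]]];

console.log(JSON.stringify(annotate(smallTree, sampleScopes)));
```

Nodes with no matching selector, such as the ";" token here, keep their plain name, which is what lets the next step tell highlighted and unhighlighted nodes apart.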

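Highlight Tree: Since we do not need any unannotated syntax nodes, we create a new tree with only the highlighted nodes, flattening all others into their parent. The text of flattened nodes is kept, so adjacent pieces merge (for example ";" and the following newline become ";\n"):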

(_root
  "\n"
  (program.source.js
    (_anon.storage.type "function")
    " "
    (identifier.entity.name.function "foo")
    (_anon.punctuation.definition.parameters.begin.bracket.round "(")
    (_anon.punctuation.definition.parameters.end.bracket.round ")")
    " "
    (_anon.punctuation.definition.function.body.begin.bracket.curly "{")
    "\n  "
    (_anon.keyword.control "return")
    " "
    (number.constant.numeric "1")
    ";\n"
    (_anon.punctuation.definition.function.body.end.bracket.curly "}")))
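
A minimal sketch of this flattening, assuming the nested-array shape [nodeName, ...children] and assuming that a node counts as highlighted when its name contains a dot. The function names are hypothetical and the library may do this differently:

```javascript
// A node is treated as highlighted if its name carries a scope suffix.
const isHighlighted = (node) => node[0].includes(".");

// Flatten a list of children: keep highlighted nodes, replace every other
// node by its own (flattened) children, then merge neighbouring strings.
function flattenChildren(children) {
  const out = [];
  for (const child of children) {
    if (typeof child === "string") {
      out.push(child);
    } else if (isHighlighted(child)) {
      out.push([child[0], ...flattenChildren(child.slice(1))]);
    } else {
      out.push(...flattenChildren(child.slice(1)));
    }
  }
  return out.reduce((acc, item) => {
    const last = acc[acc.length - 1];
    if (typeof item === "string" && typeof last === "string") {
      acc[acc.length - 1] = last + item; // e.g. ";" + "\n" -> ";\n"
    } else {
      acc.push(item);
    }
    return acc;
  }, []);
}

// The root node is always kept, even though it has no scope.
function toHighlightTree(root) {
  return [root[0], ...flattenChildren(root.slice(1))];
}

const annotatedTree =
  ["_root", "\n",
    ["program.source.js",
      ["function",
        ["_anon.storage.type", "function"], " ", ["identifier.entity.name.function", "foo"],
        ["statement_block",
          ["_anon", "{"], "\n  ",
          ["return_statement", ["_anon.keyword.control", "return"], " ", ["number.constant.numeric", "1"], ["_anon", ";"]],
          "\n", ["_anon", "}"]]]]];

console.log(JSON.stringify(toHighlightTree(annotatedTree)));
```

In this reduced example the braces are left unannotated on purpose, so they end up merged into the surrounding text just like the semicolon.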

HTML output: We can then map the highlight tree s-expressions directly to HTML span tags, as shown below:

<span class="source js"><span class="storage type">function</span> <span class="entity name function">foo</span><span class="punctuation definition parameters begin bracket round">(</span><span class="punctuation definition parameters end bracket round">)</span> <span class="punctuation definition function body begin bracket curly">{</span>
  <span class="keyword control">return</span> <span class="constant numeric">1</span>;
<span class="punctuation definition function body end bracket curly">}</span></span>
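
The mapping from highlight tree to HTML can be sketched as below. It assumes the nested-array shape [nodeName, ...children], treats everything after the first dot in a node name as the scope, and emits no span for nodes without a scope (such as _root). toHtml and escapeHtml are hypothetical helpers, not the library's own functions:

```javascript
// Escape characters that are special in HTML text and attribute values.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Turn "identifier.entity.name.function" into <span class="entity name function">.
function toHtml(node) {
  if (typeof node === "string") return escapeHtml(node);
  const [name, ...children] = node;
  const inner = children.map(toHtml).join("");
  const dot = name.indexOf(".");
  if (dot === -1) return inner; // no scope: emit the contents only
  const classes = name.slice(dot + 1).split(".").join(" ");
  return `<span class="${classes}">${inner}</span>`;
}

const highlightTree =
  ["_root",
    ["program.source.js",
      ["_anon.keyword.control", "return"], " ", ["number.constant.numeric", "1"], ";"]];

console.log(toHtml(highlightTree));
```

Splitting the scope on dots gives one CSS class per scope segment, which matches the class lists in the output above.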

API

For the following signatures, tree is the output of the tree-sitter parser run on text, and sexp is a nested array of strings and arrays (an s-expression).

  • partialSexp(tree) => sexp - create partial s-expression from tree (no text or unnamed nodes)
  • fullSexp(text, tree) => sexp - create full s-expression from source text and tree
  • printSexp(sexp) => str - pretty-print an s-expression

Highlighting:

  • highlightSexpFromScopes(sexp, scopes) => { html, sexp } - highlight using Atom scope mappings

Dev

The s-expression pretty-printer is compiled ClojureScript code. To rebuild:

npm run build

Contributors

shaunlebron