akeep / nanopass-framework Goto Github PK

The new nanopass framework; an embedded DSL for writing compilers in Scheme

License: MIT License

Scheme 100.00%

nanopass-framework's Introduction

Nanopass Has Moved

The Nanopass framework for Scheme has moved to its own github group. This one will no longer get updates. You can find the new repo here:

https://github.com/nanopass/nanopass-framework-scheme

There is also a port of this project to Racket, which you can find at:

https://github.com/nanopass/nanopass-framework-racket

Additional information can be found at the nanopass website:

http://nanopass.org

Nanopass Compiler Library

This repositiory contains an R6RS version of the Nanopass Compiler Infrastructure described in [1, 2, 3, 4], along with the beginnings of a test compiler for the library and the rough start to a users guide. The nanopass framework currently supports Chez Scheme, Vicare Scheme, and Ikarus Scheme.

Files

ReadMe.md               -- this readme file
Acknowledgements        -- thanks to those who have supported the work
Copyright               -- copyright information
TODO                    -- the head of the infinite todo list
LOG                     -- change log for the nanopass framework
test-all.ss             -- is a simple wrapper for importing the compiler and 
                           performing a testing run of all of the tests.
nanopass.ss             -- the main interface to the nanopass compiler library
nanopass.chezscheme.sls -- the nanopass compiler library as a Chez Scheme library group
nanopass/               -- contains the parts that nanopass.ss aggregates
tests/                  -- contains a testing compiler along with tests for that
                           compiler and a driver for running the tests
doc/                    -- contains a user guide and developer guide along with a
                           makefile for generating their pdfs with pdflatex
lib/                    -- pre-compiled binaries for use with Petite Chez Scheme
bin/                    -- scripts for managing the pre-compiled binaries

For more information on using the pre-compile binaries, see the README.md file in the lib directory.

References

[1] A. Keep and R. K. Dybvig. A Nanopass Compiler for Commercial Compiler Development. In ICFP ’13: Proceedings of the 18th ACM SIGPLAN International Conference on Functional Programming, New York, NY, USA, 2013. ACM.

[2] A. Keep. A Nanopass Framework for Commercial Compiler Development. Doctoral dissertation, Indiana University, Bloomington, Indiana, USA, Feb. 2013.

[3] D. Sarkar. Nanopass Compiler Infrastructure. Doctoral dissertation, Indiana University, Bloomington, Indiana, USA, 2008.

[4] D. Sarkar, O. Waddell, and R. K. Dybvig. A nanopass infrastructure for compiler education. In ICFP ’04: Proceedings of the ninth ACM SIGPLAN International Conference on Functional Programming, pages 201–212, New York, NY, USA, 2004. ACM.

nanopass-framework's People

Contributors

Stargazers

Watchers

Forkers

tufty eholk zhengguan skarayan cgswords rrnewton samth localchart dreamlinuxer yousetme godemo lv88h kstephens haowu3 soegaard caijohn peter-wu fangyuchen86 lsgxeva akkineniramesh cuihz zpzgone yarec jehandadkhan yuhangwang phorkyou yodamaster timxzl patternmap aa10000 jellyr haskellstudio cano159 holm-xie ikingzo1 clojure8 hack-yao zxymika hmrg-grmh rs-store rs-store-personal guagua777

nanopass-framework's Issues

Nanopass uses an excessive amount of memory

I'm seeing the Harlan compiler regularly use about 3GB of memory to do all the Nanopass macro expansions. This is kind of a lot, and it also means we have to use the 64-bit version of Petite or Chez to compile programs. Is there a way to significantly reduce this?

We have a way to extend languages, but not a way to extent passes

We would like to have a way to create a "template" language with a "template" pass, and then extend both the language and the pass to create a concrete pass.

We would like to be able to extend both the existing transformers and the list of definitions, expressed using something like the - and + syntax that we extend passes with.

The idea is to have something like:

(define-pass foo-template : L-template (x) -> L-template ()
  (definitions
     (define f1 (lambda (x) ---))
     (define f2 (lambda (y) ---))
    ---)
  (A : A (x) -> A ()
    [(a) `(a)]
    [(b ,x ,y ,z) `(b ,x ,y ,z)]
    ---)
  (B : B (x) -> B ()
    [(qux ,x) `(qux ,x)]
    ---)
  body)

and modify it as (syntax subject to change):

(define foo : (L (extends L-template)) (x) -> (L (extends L-template)) ()
  (extends foo-template)
  (definitions
    (- (define f2 ...))
    (+ (define f3 (lambda (z) ---)))
  (A : A (x) -> A ()
    (- [(b ,x ,y ,z) ...])
    (+ [(c ,w) `(c ,w)]
        [(d ,n) `(d ,n)]))
  (+ (C : C (x) -> C ()
        ---)
  (- body)
  body)

define-pass should work with a language with no nonterminals.

For testing purposes it is useful to be able to define a language as:

(define-language L)

and a pass as:

(define-pass test-pass : L (x) -> L ())

Right now this produces:

Exception in define-pass: unrecognized non-terminal #f

Mutually recursive nonterminals are unsupported and unchecked

When a language contains mutually recursive language forms, it loops indefinitely, rather than either a) reaching a fixed point or b) raising an error.

For example the trivial language L:

(define-language L (Command (c) e) (Expression (e) c))

never finishes expansion.

My inclination is to raise an error, because I'm not sure how mutually recursive nonterminals should best be handled, particularly in the auto-generation case where we would potentially build mutually recursive procedures in generated passes that loop indefinitely, though hopefully it would work out.

Part of my inclination is also that I'm not sure when it would be useful, though it is possible that we could support it if it was truly useful.

Need a way to define passes that do not auto-generate transformers.

In the current version of the nanopass framework, passes with always auto-generate missing transformers and assume that that is what was intended by the programmer, even if the programmer simply missed a case, or misspecified a cata-morphism.

It should be possible to disable this when the programmer wishes to see instead receive an error for a missing transformer.

(requested, most recently, by Tim Zakian--but also by Kent Dybvig, Eric Holk, etc.)

typo: "detatils" should be "details"

In most of files' header "See the accompanying file Copyright for detatils", "detatils" should be "details".

It appears that the nanopass framework will discard an extra argument in a cata if there is a transformer that matches

Found an example where we had a pass like this:

(define-pass ---
  (Foo : Foo (x) -> Foo ()
     ---)
  (Bar : Bar (x y) -> Bar ()
    [(form ,[foo y -> foo]) `(form ,foo)])

And it was matching. It may also require extra return values, since the example pass had that form.

Failure with dot syntax parsing in passes

Consider the following language, which parses fine

(define-language Lx
  (terminals
   (symbol (x)))
  (Expr (e)
   x
   (lambda (x* ... . x) e)
   (define (x x* ... . x1) e)
   (define x e)))

a totally automated (effectively null) pass works:

(define-pass Px0 : Lx (ir) -> Lx ()
  (Expr : Expr (ir) -> Expr()))

And, using echo-define-pass, expands to

(define Px0
  (lambda (ir)
    (define who 'Px0)
    (define-nanopass-record)
    (define Expr
      (lambda (ir)
        (let ([#{g104 o3tk3lwitutvs130etsr6d-140} ir])
          (let-syntax ([quasiquote '#<procedure>]
                       [in-context '#<procedure>])
            (begin
              (let ()
                (cond
                  [(symbol? #{g104 o3tk3lwitutvs130etsr6d-140})
                   #{g104 o3tk3lwitutvs130etsr6d-140}]
                  [else
                   (let ([tag (nanopass-record-tag
                                #{g104 o3tk3lwitutvs130etsr6d-140})])
                     (cond
                       [(eqv? tag 1)
                        (make-Lx:lambda:Expr.1175 'Px0
                          (Lx:lambda:Expr.1175-x*
                            #{g104 o3tk3lwitutvs130etsr6d-140})
                          (Lx:lambda:Expr.1175-x
                            #{g104 o3tk3lwitutvs130etsr6d-140})
                          (Expr
                            (Lx:lambda:Expr.1175-e
                              #{g104 o3tk3lwitutvs130etsr6d-140}))
                          "x*" "x" "e")]
                       [(eqv? tag 2)
                        (make-Lx:define:Expr.1176 'Px0
                          (Lx:define:Expr.1176-x
                            #{g104 o3tk3lwitutvs130etsr6d-140})
                          (Lx:define:Expr.1176-x*
                            #{g104 o3tk3lwitutvs130etsr6d-140})
                          (Lx:define:Expr.1176-x1
                            #{g104 o3tk3lwitutvs130etsr6d-140})
                          (Expr
                            (Lx:define:Expr.1176-e
                              #{g104 o3tk3lwitutvs130etsr6d-140}))
                          "x" "x*" "x1" "e")]
                       [(eqv? tag 3)
                        (make-Lx:define:Expr.1177 'Px0
                          (Lx:define:Expr.1177-x
                            #{g104 o3tk3lwitutvs130etsr6d-140})
                          (Expr
                            (Lx:define:Expr.1177-e
                              #{g104 o3tk3lwitutvs130etsr6d-140}))
                          "x" "e")]
                       [else
                        (error 'Px0
                          "unexpected Expr"
                          #{g104 o3tk3lwitutvs130etsr6d-140})]))])))))))
    (let ([x (Expr ir)])
      (unless ((lambda (x) (or (Lx:Expr.1174? x) (symbol? x))) x)
        (error 'Px0 (format "expected ~s but got ~s" 'Expr x)))
      x)))

All well and good. However, if I wish to transform (define (x x* ... . x1) e) into (define x (lambda ....)) form, I need to manually specify the transformation. The following pass would appear to be correct:

(define-pass Px1 : Lx (ir) -> Lx ()
  (Expr : Expr (ir) -> Expr()
        [(define (,x ,x* ... . ,x1) ,[e])
         `(define ,x (lambda (,x* ... . ,x1) ,[e]))]))

However, I get the following error :

Exception in meta-parse-term: invalid pattern or template ((unquote x) (unquote x*) ... unquote x1)

The converse transformation, for example the following (contrived) pass, works fine

(define-pass Px2 : Lx (ir) -> Lx ()
  (Expr : Expr (ir) -> Expr ()
    [(define ,x ,[e])
     `(lambda (,x . test) ,e)]))

(unparse-Lx (Px2 (parse-Lx '(define foo bar))))
(lambda (foo . test) bar)

I can see why it's happening, but I can't see an easy way around it without adding syntax to the specification of match patterns.

Issue with booleans

I may be misunderstanding something here, but I'm having trouble with languages that incorporate boolean values. A totally barebones example follows:

(define-language test-l
(terminals (boolean (b)))
(Expr (e) b))
(define-parser parse-test-l test-l)
(parse-test-l '#t)

t

(parse-test-l '#f)
Exception in parse-test-l: invalid syntax #f
Type (debug) to enter the debugger.

It seems to me that the issue comes from make-term-clause returning just the s-exp on success, and that s-exp being used to indicate success within make-parse-proc (lines 46 and 96 of parser.ss, unless my macro-fu has left me totally)

Regards

Simon

Documentation still specifies parse-L*

The current user-guide PDF still says that DEFINE-LANGUAGE should define a PARSE-lang function.

Pointer to new repository?

I might suggest a pointer to the new repository, the way there is for the Racket Nanopass Compiler Library?

Extended languages should inherit the start nonterminal by default

I very rarely change the start symbol for a language, and instead change some of the non-terminals that occur more towards the leaves. This means I have to explicitly give the start symbol for extended languages, but it would be nice if it would just use the start symbol from before unless I specifically change it.

Why no top-level cata if the full "in -> out" cata syntax is used?

We were playing around with a tree example, where we stuck in extra non-terminals for the heck of it:

(define-language LTree2
  (entry Tree)
  (Tree (t) n l)
  (Node (n) (node t0 t1))
  (Leaf (l) (leaf i))
  (terminals (number (i))))

We wanted to be able to handle the Tree non-terminal with patterns like this:

[,[n -> h] h]
[,[l -> h] h]

These currently result in a "no top-level cata" error. But, would it be possible to permit this? The "n" and "l" identify the target for the cata right? Pattern matching on ,n and ,l without catas works fine.