sampsyo / bril Goto Github PK

an educational compiler intermediate representation

Home Page: https://capra.cs.cornell.edu/bril/

License: MIT License

TypeScript 9.48% Python 8.03% Makefile 1.21% Rust 38.91% OCaml 3.67% Standard ML 0.03% Shell 0.28% Swift 2.62% C 18.64% Vim Script 0.62% Emacs Lisp 0.15% TeX 1.51% Awk 0.04% C++ 14.81%

bril compiler programming-language

bril's People

Contributors

Stargazers

Watchers

Forkers

avanhatt checkmate50 wbthomason dz333 0x486f626f sgpthomas blue9 sameerlal anastos dan12 sa2257 seanlatias q-hwang tissue3 orichardson liuhenry4428 chrisroman rolph-recto bhargee pbb59 epeguero neiladit xu3kev hackedy g-c-c cookieli yangdanny97 jasonews pat-lafon goki0607 calsign sibylau evanbenadler mdmoeller priyasrikumar kq-li socrateswong ankushrayabhari hj424 stp59 eclecticgriffin amsharifian lijunchen tedbauer nwtnni terrelln jatinx yati-sagade alpha-convert joshpoll cgyurgyik bikz05 pranavgour rethab zhuguangxiang syvb femtomc zeta1999 wzrumich bkrmendy passaro rsenapps jacksonfellows fyquah dmitrysoshnikov intensifier saurabhjha ruyk miguelraz sebastianfish 2bjake allenanswerzq leandron lanza ravil-mobile isgasho benjaminshinar doppioandante 0xd3xt3r zihaoli-cn alaiasolkobreslin chhzh123 zzzdavid andrewb1999 tonyjie michaelmaitland gsvic 5hubh4m atucker charles-rs barabanshek ayakayorihiro orkosinha jonathandltran andreyyao yy665 susan-garry frankfanslc liusy58 zbcszr

bril's Issues

Implementing the speculative execution extension for brilirs

The current feature difference between the reference interpreter, brili, and its Rust counter-part brilirs is that brilirs does not implement the speculative execution extension. The implementation semantics can probably be borrowed from brili and converted to rust. However, brilir uses iterators along basic block sequences instead of random access of the entire function which is more performant but may be an issue when trying to commit, and then return to an instruction part of the way through a code block. Use of the speculation extension is rare, so it would be nice to do this with a limited performance penalty on the longer running benchmarks.

Sieve appears to have an invalid ptr type!

Hi,

It looks like on line 27 of sieve.bril there is a reference to a ptr<book>... not sure that this is valid Bril!

Separate TypeScript tools from frontend

It would be nice if bril-ts, including the language definition and builder, could be used independently of the TypeScript compiler and possibly also the interpreter. This would help facilitate independent projects written in TS that want to depend on our utilities, and it would set a precedent for other similar independent language definitions/tools in other languages.

LVN should not crash on divide-by-zero

The current implementation of constant folding via local value numbering in lvn.py crashes when the program divides by zero. Instead of crashing, the compiler should bail on folding constants when there are dynamic errors.

Minimal example:

void main() {
entry:
  zero : int = const 0;
  one : int = const 1;
  baddiv : int = div one zero;
  print baddiv;
}

Stack trace:

Traceback (most recent call last):
  File "examples/lvn.py", line 243, in <module>
    lvn(bril, '-p' in sys.argv, '-c' in sys.argv, '-f' in sys.argv)
  File "examples/lvn.py", line 236, in lvn
    fold=_fold if fold else lambda n2c, v: None,
  File "examples/lvn.py", line 160, in lvn_block
    const = fold(num2const, val)
  File "examples/lvn.py", line 211, in _fold
    return FOLDABLE_OPS[value.op](*const_args)
  File "examples/lvn.py", line 197, in <lambda>
    'div': lambda a, b: a // b,
ZeroDivisionError: integer division or modulo by zero

Semantics of calling print on a pointer?

Consider:

@main {
  five: int = const 5;
  x: ptr<int> = alloc five;
  print x;
  free x;
}

Currently, brili outputs [object Object] and brilirs outputs Pointer { base: 0, offset: 0 }, essentially just spitting out their internal representations. Given that https://www.cs.cornell.edu/courses/cs6120/2019fa/blog/manually-managed-memory/ says that "the representation of data in memory is abstract", this seems like not great behavior.

I was wondering if this behavior should be further specified? One could of course just keep it implementation defined, it could be "bad program" for which a nice interpreter would return an error, or it could print out the contents of the array either just at that index or from there untill the end(Which a couple benchmarks have written implementations for so this is desirable behavior but requires a compliant implementation to know the bounds of the array).

No "Operations" section in "Syntax" page

There is the following text in https://capra.cs.cornell.edu/bril/lang/syntax.html#instruction section:

op, a string: the opcode that determines what the instruction does. (See the Operations section, below.)

But there is no "operations" section so the link leads to nowhere. And therefore there is no place where all operations are described.

Define an "official" extension mechanism

Bril is supposed to be extensible, but that extensibility is currently pretty ad hoc: you just add whatever you want to the language. It would be nice to codify some standard way to package up extensions to Bril. There wouldn't need to be a machine-readable description of what an extension consists of, but maybe there should be a way in a Bril program to describe the extensions that the program relies on.

This would require that extensions get assigned (string) names. We would probably want to organize the documentation around a minimal "base" language and individual extensions.

Add type checker to GitHub Actions

Most of the tests we run in CI rely on the reference interpreter, which is (by design) extremely permissive. Plenty of type errors are forgiven, for example. As @Pat-Lafon has valiantly pointed out lately, however (e.g., #154 (comment) and #152 (comment)), this makes it easy for hand-written code with type errors to sneak into the benchmarks and other tests.

It would be awesome to set up automated checking in CI for all the Bril programs we have in tree. We could do this with brilirs's own --check mode or the type inference tool.

Bonus points for pointing out the line where the error occurs and adding a check annotation on the offending line in a given PR… but that can come in a subsequent phase!

Show: Bril Playground

I've created a web playground for Bril to visualize the CFG and dominators. And it embeds the interpreter. I am also open to suggestions on what can be added/changed.

https://agentcooper.github.io/bril-playground/

Specify divide-by-zero semantics

The Bril docs don't say anything about what divide-by-zero should do, either in the core instructions or in the floating point extension. Options include:

Undefined behavior.
Evaluate to zero (or NaN in the FP extension).
Must crash with an exception.

Crashing with an exception is in some ways the most reasonable thing to do (and models many real languages), but it has two drawbacks:

Exceptions are otherwise not a thing in Bril, so this would be a new concept.
It makes basic manipulations more complex because you have to worry about potential side effects for div and fdiv instructions.

Of course, you can see the latter disadvantage as an advantage, pedagogically speaking… since it introduces an extra nuance, especially for speculative optimizations, that otherwise wouldn't exist in the language.

Start a benchmark suite

Let's assemble a ready-made Bril benchmark suite for measuring the impact of performance. Some places to start:

The benchmarks from this GVN project
Also this one on loop optimization
bril-benchmark
Port some simple Embench benchmarks for realism

DOM algorithm fails if there's a unreachable basic block

bril/examples/dom.py

Line 52 in 57877b1

nodes = list(reversed(postorder(succ, entry))) # Reverse postorder.

Take this program for example (modified from interp test):

@print(n: int): int {
}
@main: int {
  v: int = const 4;
  jmp .somewhere;
  v: int = const 2;
.somewhere:
  tmp: int = call @print v;
  ret v;
}

Applying to_ssa creates this wrong output

@print(n: int): int {
}
@main: int {
.b1:
  v.0: int = const 4;
  jmp .somewhere;
.b2:
  v.1: int = const 2;
  jmp .somewhere;
.somewhere:
  tmp.0: int = phi __undefined tmp.1 .b1 .b2;
  tmp.1: int = call @print v.0;
  ret v.0;
}

Note the garbage phi instruction tmp.0. This happens because dominators are computed as follows in dom.py (which is then used by to_ssa.py)

{'b1': {'b1'}, 'b2': {'somewhere', 'b1'}, 'somewhere': {'somewhere', 'b1'}}

This is obviously wrong. This happens because get_dom algorithm fails when there's a node/basic block that doesn't have any predecessor that is not reachable from entry.

[Interpreter] Store instruction doesn't work with float pointer

Description

store instruction raises error when writing to a float pointer.

Example

@main() {
  one: int = const 1;
  seed: float = const 109658.0;
  rng: ptr<float> = alloc one;
  store rng seed;
}

Error message

../bril/bril-ts/build/brili.js:789
process.on('unhandledRejection', e => { throw e; });
                                        ^
store argument 1 must be a Pointer

LVN missed opportunity for eq?

As far as I understand, if two values have the same value number, they are guaranteed to be the same at runtime. As a result, for operator eq, if both operands have the same value number (even if they are not constants) we could fold the operator to a constant true.

Fix the Rust interpreter for the new syntax

A "fun" task for any Rustaceans out there: #49 introduced a breaking syntax change, and the brilirs interpreter needs updating to match. In particular, the main necessary change is looking for labels in an instruction's labels key rather than in args.

Modification of add_terminator in cfg.py

Consider a typescript program looks like this:

let value = 8;
let result = 1;
for (let i = value; i > 0; i = i - 1) {
  if (i>5){
    result = result * i;
  }
}

The if statement of this program will be translated to bril like this

  br v10 then.7 else.7;
then.7:
  v11: int = id result;
  v12: int = id i;
  v13: int = mul v11 v12;
  result: int = id v13;
  jmp endif.7;
else.7: 
endif.7:
  v14: int = id i;
...

The translation is correct. But block else.7 is empty.
However, because in our own project we would like to take use of add_terminators in cfg.py, we find block[-1]['op'] not in TERMINATORS does not work since block length of else.7 is empty and there is no block[-1].
It would be better to modify the function to

def add_terminators(blocks):
    for i, block in enumerate(blocks.values()):
        if len(block) == 0:
            dest = list(blocks.keys())[i + 1]
            block.append({'op': 'jmp', 'args': [dest]})
        elif block[-1]['op'] not in TERMINATORS:
        ...

ts2bril error: boolean constant

ts2bril throws an error when it encounters a boolean constant instruction.

For example, the line var x : boolean = true; makes it throw the error: unimplemented type boolean. This occurs with or without the type annotation in the code.

Specify how `print` works for `float` values

Should other parts of floating point be specified as well, specifically how a compliant interpreter/runtime outputs Nan/Infinity?

Originally posted by @Pat-Lafon in #214 (comment)

Also, we should specify that printing "normal" floats should always use 17 decimal digits, as discussed in #214 (comment).

[SSA] Buggy SSA output with `to_ssa.py` and a modified `if-ssa.bril`

This input, which is exactly if-ssa.bril except the .exit label is renamed to .xit causes buggy SSA to be generated by to_ssa.py:

@main(cond: bool) {
.entry:
    a.1: int = const 47;
    br cond .left .right;
.left:
    a.2: int = add a.1 a.1;
    jmp .xit;
.right:
    a.3: int = mul a.1 a.1;
    jmp .xit;
.xit:
    a.4: int = phi .left a.2 .right a.3;
    print a.4;
}

The generated SSA is:

@main(cond: bool) {
.entry:
  a.1.0: int = const 47;
  br cond .left .right;
.left:
  a.2.0: int = add a.1.0 a.1.0;
  jmp .xit;
.right:
  a.3.0: int = mul a.1.0 a.1.0;
  jmp .xit;
.xit:
  a.2.1: int = phi a.2.0 a.2.0 .left .right;
  a.4.0: int = phi a.2.1 a.3.1 .left .right;
  print a.4.0;
  ret;
}

The buggy line is a.4.0: int = phi a.2.1 a.3.1 .left .right;. When called with cond = false the variable a.3.1 is not defined, causing a.4.0 to be undefined.

This is caused by the function prune_phis():

bril/examples/to_ssa.py

Lines 91 to 126 in d65455d

    
           def prune_phis(pred, phi_args, phi_dests): 
        
               """Prune possibly-undefined phi-nodes. 
        
               Ordinary phi insertion will create phi-nodes that are "partially 
        
               undefined" because they represent a convergence of paths where the 
        
               variable is defined along some but not all paths. These phi-nodes 
        
               are useless because it is illegal to read from the result. And they 
        
               can confuse the out-of-SSA pass because it creates nonsensical 
        
               copies. This algorithm iteratively eliminates such phi-nodes, 
        
               propagating through to eliminate consumer phi-nodes until 
        
               convergence. 
        
               """ 
        
               # We build up a set of new names (phi destinations) to prune, and we 
        
               # iterate until this set stops growing. 
        
               old_prune_len = -1 
        
               prune = set() 
        
               while len(prune) != old_prune_len: 
        
                   old_prune_len = len(prune) 
        
                   # Look at each phi. 
        
                   for block, args in phi_args.items(): 
        
                       dests = phi_dests[block] 
        
                       for v, v_args in args.items(): 
        
                           # How many non-pruned arguments does this phi have? 
        
                           live_args = [a for _, a in v_args if a not in prune] 
        
                           if len(live_args) < len(pred[block]): 
        
                               # Prune phis with insufficient arguments. 
        
                               prune.add(dests[v]) 
        
               # Actually delete all phis with pruned destinations. 
        
               for block, args in phi_args.items(): 
        
                   dests = phi_dests[block] 
        
                   to_del = {v for v, d in dests.items() if d in prune} 
        
                   for v in to_del: 
        
                       del args[v] 
        
                       del dests[v]

This bug is triggered with the new input, and not triggered by if-ssa.bril because of the sorting done on this line:

bril/examples/to_ssa.py

Line 79 in d65455d

for b in sorted(domtree[block]):

I believe that this optimization is not sound, even in the case when the input does not have phi instructions. Imagine the input:

@main() {
    cond: bool = const true;
    br cond .true .false;
.true:
    a: int = const 0;
    jmp .zexit;
.false:
    b: int = const 1;
    jmp .zexit;
.zexit:
    print a;
}

Because cond is always true, a is always defined. But, to_ssa.py generates incorrect code similar to before:

@main {
.b1:
  cond.0: bool = const true;
  br cond.0 .true .false;
.true:
  a.0: int = const 0;
  jmp .zexit;
.false:
  b.0: int = const 1;
  jmp .zexit;
.zexit:
  b.1: int = phi b.0 b.0 .false .true;
  print a.1;
  ret;
}

You could also argue that variables MUST be defined on all paths for the bril program to be well-formed. But, I don't think that is a reasonable requirement. And instead it should instead be undefined behavior to read from an undefined variable. That leaves the prune_phis optimization as it is currently implemented unsound.

Instead, prune_phis() should probably insert undefined variables like __undefined, or read from the output variable.

Fix example LVN pass's copy propagation on block inputs

In the examples, lvn.py -p currently fails for this example:

@main {
  a: int = const 42;
.lbl:
  b: int = id a;
  a: int = const 5;
  print b;
}

The optimization incorrectly changes the last line to print a.

The problem is the interaction between copy (id) propagation and basic block inputs (i.e., the fact that a is read before it is written in the second block). One annoying but cleanish solution would be to copy a to a fresh variable right as the block begins because it is read before it is written. Another, much less clean solution would be to detect the first overwrite of a and do the renaming at that point, or to somehow avoid doing copy propagation for the id because the value will be clobbered later.

Unnecessary phi instructions insert by to_ssa

Here's an example bril for interp test add-overflow

@pow(base: int, exp: int): int {
  out: int = const 1;
  one: int = const 1;
.loop:
  end: bool = lt exp one;
  br end .ret .body;
.body:
  out: int = mul out base;
  exp: int = sub exp one;
  jmp .loop;
.ret:
  ret out;
}

Converting this to ssa using to_ssa.py gives following output

@pow(base: int, exp: int): int {
.b1:
  out.0: int = const 1;
  one.0: int = const 1;
  jmp .loop;
.loop:
  out.1: int = phi out.0 out.2 .b1 .body;
  exp.0: int = phi exp exp.1 .b1 .body;
  end.0: bool = phi __undefined end.1 .b1 .body;
  end.1: bool = lt exp.0 one.0;
  br end.1 .ret .body;
.body:
  out.2: int = mul out.1 base;
  exp.1: int = sub exp.0 one.0;
  jmp .loop;
.ret:
  ret out.1;
}

Note how end.0: bool = phi __undefined end.1 .b1 .body; is inserted and it's not necessary to have done that. In fact it's not used anywhere and tdce would remove it.

I will try to figure out where this issue is coming from.

Bril v2.0 syntax refactor: separate labels and functions from arguments

Processing the IR is significantly more annoying because of the syntactic simplicity in Bril 1.0: in jmp foo, the foo argument, which is a label, looks exactly the same as bar and baz in add bar baz. Special-case handling is required to separate actual variable references from labels and function names.

We should separate these in the syntax. This change will break plenty of stuff but is probably worth it.

[LVN] Missed constant fold

I was checking my implementation against the examples, and found this:
bril2json < overwritten-variable.bril | python3 examples/lvn.py -p -c -f | bril2txt

@main {
  v1: int = const 4;
  v2: int = const 0;
  mul1: int = mul v1 v2;
  v2: int = const 3;
}

Lowers to:

@main {
  v1: int = const 4;
  lvn.2: int = const 0;
  mul1: int = mul v1 lvn.2; // <-- This should be folded.
  v2: int = const 3;
}

Expected output:

@main {
  v1: int = const 4;
  lvn.2: int = const 0;
  mul1: int = const 0;
  v2: int = const 3;
}

Division outputs float rather than integer.

main {
v0: int = const 9;
v1: int = const -20;
res: int = div v0 v1;
print res;
}
This code produces -0.45 rather than 0 as expected.

[DF] defined analysis producing wrong output.

Example

# bril2json < p.bril | python3 ../../df.py defined

@main(awesome_integer: int) {
.entry:
  print awesome_integer;
}

Actual:

entry:
  in:  ∅ 
  out: ∅

Expected

awesome_integer should be defined in .entry.

entry:
  in:  awesome_integer
  out: awesome_integer

I think the issue here is two-fold.

There's no simple way to pass in arguments to the analyses. Here, we pass in an analysis with a defaulted (empty) data structure. However, for an analysis such as reaching definitions or defined analysis, we want to pass in the function arguments to the first block.

bril/examples/df.py

Lines 82 to 92 in dbbd8e8

    
           def run_df(bril, analysis): 
        
               for func in bril['functions']: 
        
                   # Form the CFG. 
        
                   blocks = cfg.block_map(form_blocks(func['instrs'])) 
        
                   cfg.add_terminators(blocks) 
        
                   in_, out = df_worklist(blocks, analysis) 
        
                   for block in blocks: 
        
                       print('{}:'.format(block)) 
        
                       print('  in: ', fmt(in_[block])) 
        
                       print('  out:', fmt(out[block]))

Even if we do pass in the correct function arguments,

bril/examples/df.py

Lines 48 to 49 in dbbd8e8

inval = analysis.merge(out[n] for n in in_edges[node])

in_[node] = inval

inval = analysis.merge(out[n] for n in in_edges[node]) # in_edges[.entry] == []

So, even though in_[.entry] is initialized correctly, it is over-written afterward.

As usual, let me know if I'm off-target.

Extend floating point extension with isnan, isnf, and integer conversions

I am also interested in the following:

If there should be 2 additional operators to the floating point extension, isnan and isinf, which would match similar behavior as seen in Python. Though I guess that this could be implemented as a small function calls in a Bril program and then used via the import extension in the niche cases that would want them.

If Bril should allow conversion between integers and floating point values with an operator like toint/tofloat.

Originally posted by @Pat-Lafon in #214 (comment)

brili: Problem when deserializing large integer constants from JSON

To reproduce:

❯ cat test/overflow.bril
@main() {
  x: int = const 9223372036854775807;
  one: int = const 1;
  three: int = const 3;
  y: int = add x one;
  print y;
  y: int = mul x three;
  print y;
}
❯
❯ bril2json < test/overflow.bril | brili
-9223372036854775807
-9223372036854775808
❯ bril2json < test/overflow.bril | brilirs
-9223372036854775808
9223372036854775805
❯

The Bril documentation on integers writes:

int: 64-bit, two’s complement, signed integers.

I wonder which one is correct?

Bril Type Error questions

2 benchmarks (perfect.bril and digital-root.bril) have ret instructions at the end of their main functions that return values despite them not having a return type. I guess this is a type error?

A more general question. From the type system perspective, if a function both returns a value and has a side effect, does it need to be called as a Value operation (because it returns something) or can it be called as an Effect operation (and just ignore the return value)?

Not takes 1 argument

bril/bril-ts/brili.ts

Line 16 in 421042a

not: 2,

I believe this should say that not takes 1 argument instead of 2.

[Documentation] Missing installation instructions for ts2bril, brilck

Please mentione in doc that user must install ts2bril in order to run tests:

$ deno install ts2bril.ts --allow-all
$ deno install brilck.ts --allow-all

And probably fix existing instruction for brili:

$ deno install brili.ts --allow-all

Consider replacing --allow-all with more granular permissionns.

The Brilirs memory extension implementation is very slow

As was shown in #189 (comment). The current memory extension implementation for brilirs is surprisingly slow.

Currently, there are no good benchmarks that are memory intensive enough which makes it difficult to know what to optimize for. The closest program seems to be test/mem/alloc_large.bril which only allocates and frees the same sized chunk of memory in a loop.

The current implementation of Heap which supports the memory extension in brilirs is ported from brili and could be greatly improved.

A couple of possible pain points include.

The Heap is implemented as a map from the "base" of the pointer to a Vec<Value>. This is nice in that it grows and shrinks with the amount of memory in use at any given. However, the use of maps like HashMap<usize, Vec<Value>> has historically been a performance hit in brilirs due to needing to hash the key and perform the lookup.
- Ideally we would compress the Heap into either a Vec<Vec<Value>> or aVec<Value> with the "base" of the pointer just indexing into the start of it's allocation in the array. In either case we will want the ability to reuse indexes as allocations are freed up.
- The implementation using Vec<Vec<Value>> is probably slower than the alternative because it will require two array access to read/write to a value. However, it's probably easier to implement.
- Vec<Value> is more along the lines of implementing something like malloc which gives all of the performance benefits / fragmentation complexities involved.
  - Unlike malloc however, one could implement a form of garbage collection and reassign the indexes of pointers since this implementation detail is not exposed to the user. It's not clear to me how this would affect performance.
- Regardless of the implementation, the Heap needs to still enforce errors on buffer over/underflows, use after frees, use of uninitialized memory, and memory leaks.
The current implementation allocates a new Vec<Value> on every call to alloc. Ideally, brilirs should be able to reuse previously freed allocations. This would especially target alloc_large.bril and potentially provide a significant improvement over brili.
- Two possible solutions for this are either the previously mentioned Heap as just one large Vec<Value> or using something like a slab allocator and delegating this reuse to an external library.
Unlike with stack allocation where it is statically known how many variables are in scope, memory allocation is dynamic which limits the interpreters ability to pre-allocate capacity for the Heap to use during the course of interpretation.
- brilirs already employs some static analysis for type checking and it would be interesting if it could use a data flow analysis to get a very rough estimate on whether it should start out with a small or large Heap size. Some programs don't use the memory extension or use it sparingly which is the default case. On the other hand, we also want to optimize for programs which make of intense use of memory.

[Incorrect] Domination frontier including self in frontier.

Hi! Checking my results against the examples again.

From the lecture notes,
We say A's domination frontier contains B if

A does not strictly dominate* B
A dominates a predecessor of B.

*I think this was just dominates rather than strictly dominates in the video lecture. However it is defined correctly in the Gist

bril2json < class-example.bril | python3 examples/dom.py front

# Example used for Global Analysis in L5.
@main {
.entry:
  x: int = const 0;
  i: int = const 0;
  one: int = const 1;
  jmp .loop;

.loop:
  max: int = const 10;
  cond: bool = lt i max;
  br cond .body .exit;

.body:
  mid: int = const 5;
  cond: bool = lt i mid;
  br cond .then .endif;

.then:
  x: int = add x one;
  jmp .endif;

.endif:
  factor: int = const 2;
  x: int = mul x factor;
  i: int = add i one;
  jmp .loop;

.exit:
  print x;
  ret;
}

Results in:

# Frontiers
{
  "body": [
    "loop"
  ],
  "endif": [
    "loop"
  ],
  "entry": [],
  "exit": [],
  "loop": [
    "loop" # ???
  ],
  "then": [
    "endif"
  ]
}

Does B = loop meet the frontier criteria for A = loop?

loop does not strictly dominate loop <-- does hold since, loop == loop.
loop dominates an immediate predecessor of loop. <-- does not hold. The only predecessor to loop is entry, which is not dominated.

Let me know if something looks off with my explanation.

[Typescript Compiler] Add for loops to typescript compiler

The typescript compiler doesn't seem to handle for loops right now.

It seems like one way to handle it would be to:

Make a new loop name
Set up the iterator variable assignment once
Set up the iterator variable increment in the body of the loop
Set up the end condition computation, and jump back after

One issue I guess with adding for loops is that it's not super obvious which loops users should expect to work or not, and so it might create a lot of implied work (to handle other loops), but maybe a decent way of handling it would just be to see if the user seems to be using some type of loop, and have the error message explain which type of loop they can do.

Reference interpreter errors could be more useful

A cool thing to add to the brili reference interpreter would be more informative messages about error locations. We can't give source locations, because the interpreter runs on ASTs instead of source code, but we could indicate which instruction index inside which function produced the error.

Rust code doesn't conform to commonly accepted `unsafe` semantics

Hi,

So I'm not very familiar with the project, but I've done a fair bit of Rust and one of my friends is currently taking 6120, so I looked a little bit through the code. It seems some of the Brilrs code isn't using the commonly accepted Rust definitions of unsafe, which means that one could accidentally run into undefined behavior. Now these issues aren't actually being realized at the moment, but it might be good to conform to Rust standards to ensure correctness in the future.

The commonly accepted Rust standard is to say that

A function should be marked as unsafe if it's possible to cause undefined behavior by calling it from safe code.

If one were to do (in entirely safe code)

let bbprog: BBProgram = prog.try_into()?;
interp::execute_main(&bbprog, out, &input_args, profiling)?;

one could actually cause undefined behavior if the program passed into execute_main is not a valid program. This is because (if I'm reading correctly) execute_main assumes the program has been type-checked

bril/brilirs/src/interp.rs

Lines 184 to 218 in 30c836c

    
           impl From<&Value> for i64 { 
        
             #[inline(always)] 
        
             fn from(value: &Value) -> Self { 
        
               if let Value::Int(i) = value { 
        
                 *i 
        
               } else { 
        
                 // This is safe because we type check the program beforehand 
        
                 unsafe { unreachable_unchecked() } 
        
               } 
        
             } 
        
           } 
        
           impl From<&Value> for bool { 
        
             #[inline(always)] 
        
             fn from(value: &Value) -> Self { 
        
               if let Value::Bool(b) = value { 
        
                 *b 
        
               } else { 
        
                 // This is safe because we type check the program beforehand 
        
                 unsafe { unreachable_unchecked() } 
        
               } 
        
             } 
        
           } 
        
           impl From<&Value> for f64 { 
        
             #[inline(always)] 
        
             fn from(value: &Value) -> Self { 
        
               if let Value::Float(f) = value { 
        
                 *f 
        
               } else { 
        
                 // This is safe because we type check the program beforehand 
        
                 unsafe { unreachable_unchecked() } 
        
               } 
        
             } 
        
           }

which is not enforced by the Rust compiler. Thus one can cause undefined behavior in entirely safe code, which is against Rust conventions. As I said, this is of course not a real problem and it doesn't currently ever materialize itself, but it might be nice to write better Rust.

There's a couple possible solutions to this, if it is worth solving:

Have type_check return some sort of TypedProgram, and execute_main can only take in a TypedProgram, guaranteeing that the program has been type-checked. This solution makes the most sense to me.
Mark execute_main as unsafe. This is probably the least work but is very sad.
An unreasonable solution: rework nodes/type-checking so that type-checking returns a new tree that doesn't even have the nodes that should be unreachable. This is very Haskell-y and a huge pain to do.

[Interpreter] Float argument support for main

The current interpreter for Bril does not support float arguments to main.
Here's a simple piece of code:

@main (x1 : int, x2: float) {
      print x1;
      print x2;
}

The output of this is as follows:

$ bril2json < floats.bril | brili 2 3.0
2
error: undefined variable x2

x1 got printed successfully because it's an int, but it seems like the argument x2 results in an error because it's a float.

Remove unnecessary line (I think)

Does this line do anything?

bril/bril-ts/ts2bril.ts

Line 184 in 104bc00

for_.incrementor

Reorganize interpreter tests based on extensions

The interpreter tests are not currently broken down into directories based on the Bril extensions they use. (Except for the mem directory, which dates back to the original Fall 2019 project that implemented it.) We should do that—each extension should get its own directory. That will make it easer to differentially test against other interpreter implementations that don't support all the extensions, such as the Rust interpreter, as discussed in #104 (comment).

brilirs fails on seemingly well-formed program

As part of my dataflow analysis lesson task, I came up with some funky Bril programs to stress the worklist algorithm. The following program (which is easier to understand when the order of the basic blocks is reversed) fails with brilirs -p false false but passes brili -p false false and brilck. It also seems to obey the list of rules given in the Bril documentation.

@main(b0: bool, b1: bool) {
    jmp .start;
  .end:
    print x_0_2;
    print x_1_2;
    ret;
  .l_1_3:
    jmp .end;
  .l_1_2:
    x_1_2 : int = const 0;
    jmp .l_1_3;
  .l_1_1:
    x_1_1 : int = const 1;
    jmp .l_1_3;
  .l_0_3:
    br b1 .l_1_1 .l_1_2;
  .l_0_2:
    x_0_2 : int = const 2;
    jmp .l_0_3;
  .l_0_1:
    x_0_1 : int = const 3;
    jmp .l_0_3;
  .start:
    br b0 .l_0_1 .l_0_2;
}

For reference, here is the Python code I used to generate the program. brilirs has been super useful thus far, but I'm not super familiar with Rust so it'd be nice if a Rustacean could help us out here :)

brilirs: Support function calls

#98 brought the fast Rust interpreter up to date. It currently supports all of the core language except function calls, i.e., the call instruction. We should add that someday!

	def prune_phis(pred, phi_args, phi_dests):
	"""Prune possibly-undefined phi-nodes.

	Ordinary phi insertion will create phi-nodes that are "partially
	undefined" because they represent a convergence of paths where the
	variable is defined along some but not all paths. These phi-nodes
	are useless because it is illegal to read from the result. And they
	can confuse the out-of-SSA pass because it creates nonsensical
	copies. This algorithm iteratively eliminates such phi-nodes,
	propagating through to eliminate consumer phi-nodes until
	convergence.
	"""
	# We build up a set of new names (phi destinations) to prune, and we
	# iterate until this set stops growing.
	old_prune_len = -1
	prune = set()
	while len(prune) != old_prune_len:
	old_prune_len = len(prune)

	# Look at each phi.
	for block, args in phi_args.items():
	dests = phi_dests[block]
	for v, v_args in args.items():
	# How many non-pruned arguments does this phi have?
	live_args = [a for _, a in v_args if a not in prune]
	if len(live_args) < len(pred[block]):
	# Prune phis with insufficient arguments.
	prune.add(dests[v])

	# Actually delete all phis with pruned destinations.
	for block, args in phi_args.items():
	dests = phi_dests[block]
	to_del = {v for v, d in dests.items() if d in prune}
	for v in to_del:
	del args[v]
	del dests[v]

	def run_df(bril, analysis):
	for func in bril['functions']:
	# Form the CFG.
	blocks = cfg.block_map(form_blocks(func['instrs']))
	cfg.add_terminators(blocks)

	in_, out = df_worklist(blocks, analysis)
	for block in blocks:
	print('{}:'.format(block))
	print(' in: ', fmt(in_[block]))
	print(' out:', fmt(out[block]))

	inval = analysis.merge(out[n] for n in in_edges[node])
	in_[node] = inval

	impl From<&Value> for i64 {
	#[inline(always)]
	fn from(value: &Value) -> Self {
	if let Value::Int(i) = value {
	*i
	} else {
	// This is safe because we type check the program beforehand
	unsafe { unreachable_unchecked() }
	}
	}
	}

	impl From<&Value> for bool {
	#[inline(always)]
	fn from(value: &Value) -> Self {
	if let Value::Bool(b) = value {
	*b
	} else {
	// This is safe because we type check the program beforehand
	unsafe { unreachable_unchecked() }
	}
	}
	}

	impl From<&Value> for f64 {
	#[inline(always)]
	fn from(value: &Value) -> Self {
	if let Value::Float(f) = value {
	*f
	} else {
	// This is safe because we type check the program beforehand
	unsafe { unreachable_unchecked() }
	}
	}
	}

sampsyo / bril Goto Github PK

bril's People

Contributors

Stargazers

Watchers

Forkers

bril's Issues

Description

Example

Error message

Example

Actual:

Expected

Recommend Projects

Recommend Topics

Recommend Org