Comments (6)
Aha, that's interesting! You're quite right, of course, that the latter example, should not prune the phi
for a
.
Anyway, dealing with undefined variables is extremely tricky here! I'd be interested in any straightforward fixes that don't make things any more complicated (it's fine if they generate more useless phi
s). I'm curious what you're thinking with the two solutions… how would you tell when we should be reading from __undefined
or the output variable or whatever?
Maybe there's a way to fix the pruning too. Not sure.
from bril.
I’ll put up a PR that fixes the bug, so to_ssa.py generates correct code. Then I’ll think about a different pruning approach.
from bril.
One thing that pops out now is that the resulting code after from_ssa
is now littered with x: <type> = id __undefined
. Given that x
is undefined at that point, we should be able to remove it.
To demonstrate on the second example above,
@main {
.b1:
cond.0: bool = const true;
br cond.0 .true .false;
.true:
a.0: int = const 0;
b.1: int = id b.0; # ???
a.1: int = id a.0;
jmp .zexit;
.false:
b.0: int = const 1;
b.1: int = id b.0;
a.1: int = id __undefined; # Remove this instruction, since it should be unreachable.
jmp .zexit;
.zexit:
print a.1;
ret;
}
Maybe there's a reason why this is not a good idea?
Edit: It also looks like there is a bug, since we have the id
of a variable that hasn't been defined yet. It originates from the SSA form:
b.1: int = phi b.0 b.0 .false .true; # Should be phi b.0 __undefined .false .true
I think this is along the lines of what you're trying to fix?
Edit2: To build on this a little more, We don't want phi b.0 b.0 .false .true
since b
is undefined in the .true
branch. Rather, we only want to add the variable name if b
's definition reaches the given branch. Otherwise, we know the value is undefined at that point.
The fix I propose is on the following line:
Line 75 in dbbd8e8
We append the following condition:
if stack[p] and any(b in out_[s][p] for b in preds_of_block + [block]) # The definition of p reaches the successor for this block.
When we pipe this through from_ssa
, it gives us the desired behavior:
@main {
.entry:
cond.0: bool = const true;
br cond.0 .true .false;
.true:
a.0: int = const 0;
b.1: int = id __undefined; # Good!
a.1: int = id a.0;
jmp .zexit;
.false:
b.0: int = const 1;
b.1: int = id b.0;
a.1: int = id __undefined;
jmp .zexit;
.zexit:
print a.1;
ret;
}
I need to do a bit more thinking / some tests, since I'm sure there's some bits I'm missing, such as definitions being in predecessors, but not the immediate block.
from bril.
I think you're on the right track, although I don't know 100% that this will fix the problem. (Maybe it will?) In short, just removing the copies from __undefined
themselves won't work because it's necessary but not sufficient: i.e., partial transitive copies from undefined values are still a problem. Testing broadly will be important…
from bril.
I know this issue is old, but I want to write up a summary of where things stand on this sad state of affairs. Basically, writing an SSA conversion in a way that deals elegantly with undefined values turns out to be pretty frustrating. To re-illustrate the basic problem, consider this simple program (called if-const.bril
in our tests):
@main() {
cond: bool = const true;
br cond .true .false;
.true:
a: int = const 0;
jmp .exit;
.false:
b: int = const 1;
jmp .exit;
.exit:
print a;
}
The thing to notice about this program is that it is only safe to run because the branch happens to always go to the .true
side. So a
is always defined in print a
, even though it wouldn't be if the branch were to go the other way (which, again, is impossible).
If we follow our SSA conversion algorithm to the letter, we'd end up first renaming the assignment to a
thus:
a.0: int = const 0;
And generating a phi-node that must look something like this in .exit
:
a.1: int = phi .true a.0 .false ???;
But what should the value of a.1
be if we traverse the edge from the .false
block? The original variable a
was undefined, so we need to simulate that!
Our current "solution" is to just replace that ???
with a reference to an actually undefined variable. So we hallucinate the name __undefined
and generate a phi-node that looks like this:
a.1: int = phi .true a.0 .false __undefined;
…the idea being that this edge will never actually be traversed, so it should be fine to refer to a variable that doesn't actually exist here. Easy peasy, right?
Not so fast—this works until we try to generate the phi-node for b
in .exit
:
b.1: int = phi .true __undefined .false b.0;
Oh no! The CFG edge from .true
will actually be traversed… so this SSA-ified program will end up referring to an undefined variable and crashing, even though the original program was just fine.
To make this bad scenario more concrete (because running programs directly in SSA form is kinda weird), consider converting this program back out of SSA form, which means inserting assignments into the predecessor blocks. Namely, here, we'll put this into the end of the .true
block (remember, that's a block that actually runs):
b.1: int = id __undefined;
The interpreter will, understandably, not like this line.
Again, we have a clever "solution": let's just run dead code elimination! This b.1
variable can't possibly be used. One way to put it is that it is guaranteed that this version of the original b
variable was never used downstream of the .true
block because, if it were, that original program would have been illegal by virtue of reading an undefined variable itself. So, we can just run our trivial dead code elimination pass to get rid of this pesky undefined variable! And that mostly works.
The only problem is that trivial dead code elimination is trivial. It can't identify more sophisticated situations where these needless generated assignments are inserted. The most dastardly case is where a cycle exists between two generated SSA variables, both of which are dead, but neither of which looks trivially dead in isolation. This comes up if you write a loop that assigns to a variable (that isn't assigned outside of the loop), like in this code from loop-branch.bril
:
@loop(infinite: bool, print: bool) {
.entry:
.loop.header:
br infinite .loop.body .loop.end;
.loop.body:
br print .loop.print .loop.next;
.loop.print:
v: int = call @func;
print v;
.loop.next:
jmp .loop.header;
.loop.end:
}
In this code, v
needs a phi-node at the top of the loop (the .loop.header
block). The value for v
there needs to come either from the loop itself or from the function entry, in which case it's undefined. And .loop.next
needs a phi-node of its own to pick between the call
and the previous iteration of the loop, i.e., the phi-node at the top of the loop. Here's our current version of the SSA-ified code:
@loop(infinite: bool, print: bool) {
.entry:
jmp .loop.header;
.loop.header:
v.0: int = phi __undefined v.1 .entry .loop.next;
br infinite .loop.body .loop.end;
.loop.body:
br print .loop.print .loop.next;
.loop.print:
v.2: int = call @func;
print v.2;
jmp .loop.next;
.loop.next:
v.1: int = phi v.0 v.2 .loop.body .loop.print;
jmp .loop.header;
.loop.end:
ret;
}
The problem here is that, if you insert all the copies to reflect these phi-nodes, the __undefined
copy will not be dead. v.0
is used later in v.1
, and then v.1
is used in v.0
. We have a dead cycle: all the code is dead, but it's "hard" to eliminate the code because it would require detecting cycles (rather than the simple "never used anywhere" check we do for trivial DCE). Annoying!
In #119, @terrelln had the bright idea to just define all these variables. SSA conversion would just insert actual variables like __undefined.int
at the top of the function to give zero/null values that could be used by all these undefined cases. That way, we wouldn't be doing the weird hack where we refer to literally undefined variables ever, so code would always be safe from crashing after SSA conversion.
I like this solution well enough, but it has a couple of drawbacks:
- It requires a null value to exist for every type! As new types are added, the SSA code has to know how to generate null values for every type. This seems sort of inconvenient to maintain in the face of new extensions. In the memory extension in particular, it implies that we needed to add null pointers everywhere (the "billion dollar mistake").
- It seems vaguely unfortunate that we need to generate all this code for null values that is guaranteed to be dead. It seems like a lot of extra effort to create values that are certain never to be read by well-formed programs. For the sake of clarity of the conversion, it would be super great if we didn't have to intentionally generate all this dead code—if we could guarantee instead that anything like this would be removable.
So there it currently stands. I would love to find a solution that avoids the above problems while not adding too much complexity to the rest of Bril or to the SSA conversion algorithm itself. (The example is structured this way to closely mirror the idealized pseudocode; crapping it up too much just to deal with this weird edge case of undefined values would be a shame.)
Something in particular that I like about our current (flawed) approach is that it is possible to generate working SSA code with the "naive" SSA conversion algorithm; you don't need to do some sort of fancy "minimal SSA" conversion just to get correct code. It would be awesome to preserve that property in the face of undefined values.
One option I'm considering is redefining the phi
instruction to explicitly support cases where the value is undefined. Then, it may be possible to avoid generating copies for undefined cases altogether. This may not work because of the need to transitively eliminate undefined copies, but maybe it will make the undefined values easier to clean up using trivial DCE.
from bril.
Just one small addendum on the history here: our previous "solution" (before the whole __undefined
business, which was introduced in #111) was to just allow missing edges in phi-nodes. So in the first example in my previous comment, we would not generate a phi-node like this:
a.1: int = phi .true a.0 .false __undefined;
But rather like this:
a.1: int = phi .true a.0;
That is, we could omit the .false
branch entirely. Our SSA-aware interpreter would just crash if it ever executed a phi
that failed to handle the relevant just-traversed CFG edge. That crash is fine because traversing that edge would leave a
undefined in the original program and crash anyway.
The problem with that approach was again "transitive deadness." When translating back out of SSA form, the (lone) copy that implements this phi-node for a.1
is not a problem. But if there were a later phi-node that uses it, perhaps involved in a cycle like our loop example above, then we may end up inserting copies of variables that don't exist.
from bril.
Related Issues (20)
- [Incorrect] Domination frontier including self in frontier. HOT 5
- Reorganize interpreter tests based on extensions HOT 2
- [DF] defined analysis producing wrong output. HOT 1
- Implementing the speculative execution extension for brilirs HOT 1
- Specify divide-by-zero semantics HOT 6
- [Interpreter] Store instruction doesn't work with float pointer HOT 3
- Add type checker to GitHub Actions
- [Interpreter] Float argument support for main HOT 3
- [Typescript Compiler] Add for loops to typescript compiler HOT 1
- Rust code doesn't conform to commonly accepted `unsafe` semantics HOT 2
- The Brilirs memory extension implementation is very slow HOT 4
- No "Operations" section in "Syntax" page HOT 3
- Specify how `print` works for `float` values
- Extend floating point extension with isnan, isnf, and integer conversions
- Semantics of calling print on a pointer? HOT 5
- Show: Bril Playground HOT 3
- [Documentation] Missing installation instructions for ts2bril, brilck HOT 4
- brilirs fails on seemingly well-formed program HOT 2
- brili: Problem when deserializing large integer constants from JSON HOT 7
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from bril.