This page is built on Hakyll. Building and running a local server:
cabal run site clean && cabal run site watch
Companion code for paper "Imperative Functional Programs that Explain their Work", Wilmer Ricciotti, Jan Stolarek, Roly Perera and James Cheney, ICFP 2017, Oxford, UK
Home Page: http://dl.acm.org/citation.cfm?id=3110258
License: GNU General Public License v3.0
Currently, resugaring and pretty-printing are combined into one pass. It would be better to have a resugaring pass that restores the original surface syntax, and then pretty-print that syntax.
Add InferTy to the type language to annotate holes. At the moment type checking is merged into one pass with desugaring and is a bit ad hoc. It might be worth implementing a separate type-checking pass that would run before desugaring. The primary motivation for this is the presence of exceptions, which can inhabit every type. At the moment this is implemented as a very inelegant special case: internally there is a separate type of exceptions, ExnTy, and the fact that it inhabits every type is achieved by overriding == and some helper functions, e.g.:
instance Eq Type where
  ExnTy == _     = True
  _     == ExnTy = True
  -- standard equations follow (...)

-- | Is function type?
isFunTy :: Type -> Bool
isFunTy (FunTy _ _) = True
isFunTy ExnTy       = True
isFunTy _           = False
This does not scale in any way. A proper solution would be type inference with unification. Then raise "foo" would have polymorphic type a, and unification would find an instantiation of the type variables.
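As a rough illustration of the unification idea, here is a minimal, entirely hypothetical sketch: the Ty and Subst types are stand-ins (not the project's Type), and the occurs check is omitted for brevity. The point is that raise would receive a fresh TVar, and unify would instantiate it at the use site instead of a magic ExnTy that equals everything.

```haskell
import qualified Data.Map as Map

-- Stand-in type language with unification variables
data Ty = TVar Int | TInt | TFun Ty Ty deriving (Eq, Show)

type Subst = Map.Map Int Ty

apply :: Subst -> Ty -> Ty
apply s (TVar a)   = Map.findWithDefault (TVar a) a s
apply s (TFun a b) = TFun (apply s a) (apply s b)
apply _ t          = t

-- Simplified unifier: a variable unifies with anything (no occurs check)
unify :: Ty -> Ty -> Maybe Subst
unify (TVar a) t = Just (Map.singleton a t)
unify t (TVar a) = Just (Map.singleton a t)
unify TInt TInt  = Just Map.empty
unify (TFun a b) (TFun c d) = do
  s1 <- unify a c
  s2 <- unify (apply s1 b) (apply s1 d)
  return (Map.union s2 (Map.map (apply s2) s1))
unify _ _ = Nothing
```

Here the type of raise "foo" would be TVar n for a fresh n, and unifying it against the context (say, TInt in 1 + raise "foo") produces the instantiation, with no need for the Eq override above.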
Another advantage of having type inference would be allowing exceptions to be raised inside a case scrutinee. At the moment this is not allowed because of the way we desugar let expressions, i.e. we assume that we know the exact type of the scrutinee and desugar the branches based on that.

let t6 = trace ( 1 + (raise "foo") ) in
bwdSlice (t6, raise "foo")

slices to 1 + (raise "foo"), but I think it should be _ + (raise "foo").
The proportion test fails on the js-really-separate-asts branch. Reason: after separating Exp and Trace into separate syntax trees, it is no longer possible to store values returned by pslice inside VTrace. As a dirty ad-hoc workaround I added a VExp constructor to the Value data type, but otherwise have not implemented any kind of support for it. This shows up in the aforementioned test.
If I write

fst (raise "foo") + 2

this parses as fst ((raise "foo") + 2) rather than (fst (raise "foo")) + 2. I've run into similar issues several times before. The problem is that operators bind tighter than application, but it should be the other way around. Perhaps this can be solved by modifying the parser. If not, then we might need a separate pass to restore correct operator precedence (I think GHC does something like that).
In the process of cleaning up the code I removed some features. It would be nice to catalogue those in some document stored in the repo.
Things to delete:
This might be useful for displaying the results of slicing. It might be worth having a type class for things that can be resugared:

class Resugarable a where
  resugar :: a -> DesugarM RExp

See also #24.
In the slicing code there is a ton of boilerplate arising from explicitly passing the store and the environment rho. This could be made much shorter by using a monad.
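A minimal sketch of the monadic approach, assuming a State monad over both pieces of threaded data. The Env and Store types here are simplified stand-ins, and SliceM, lookupVar and writeStore are hypothetical names, not the real Slice module API.

```haskell
import Control.Monad.State
import qualified Data.Map as Map

type Env   = Map.Map String Int   -- stand-in for rho
type Store = Map.Map Int Int      -- stand-in for the reference store

data SliceState = SliceState { env :: Env, store :: Store }

-- One monad threads both, so functions no longer take/return them explicitly
type SliceM a = State SliceState a

-- Look up a variable without passing rho by hand
lookupVar :: String -> SliceM (Maybe Int)
lookupVar x = gets (Map.lookup x . env)

-- Update a store cell without threading the store by hand
writeStore :: Int -> Int -> SliceM ()
writeStore l v = modify $ \s -> s { store = Map.insert l v (store s) }
```

Each slicing equation then reads as a sequence of monadic actions rather than a chain of (store, rho) tuples.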
Currently the Trace data type contains:

| TIfThen Trace Exp Exp Trace -- ^ Take "then" branch of if
| TIfElse Trace Exp Exp Trace -- ^ Take "else" branch of if

The Exp arguments don't seem to be used in any way.
Right now, when throwing an error we don't pretty-print the offending expression but simply show its syntax tree, e.g.:
desugarM (A.Deref e) =
    do (e', ty) <- desugarM e
       unless (isRefTy ty) $
           desugarError ("Dereferenced expression (" ++ show e ++
                         ") does not have a reference type")
       return (EDeref e', fstTy ty)
We could do better by resugaring and pretty-printing expressions in error messages, but that is a bit tricky. At the moment we can go from values and core expressions to resugared syntax. The resugaring function, however, is partial, so if the expression is ill-formed and does not type-check, then it will most likely also cause a resugaring error. I believe this can be solved easily by adding a desugaring from surface syntax into resugared syntax; it should then be possible to write a total resugaring function that would be safe to use when throwing an error. Oh, and we also throw errors during evaluation. But expressions that are evaluated should be well-typed and thus, hopefully, safe to resugar (that's wishful thinking; I haven't checked!).
It might also be worthwhile to:

- Define a type class for data types that can be resugared:

  class Resugarable a where
    resugar :: a -> DesugarM RExp

- Create a function pp :: Pretty a => a -> DesugarM String - explicit calls to show are a bit tedious.
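A small sketch of how the pp helper could sit on top of such a class. Everything here is a stand-in: DesugarM is modelled as Either String, RExp has a single constructor, and prettyR substitutes for the real pretty-printer.

```haskell
type DesugarM = Either String     -- stand-in for the real desugaring monad

data RExp = RVar String           -- stand-in resugared syntax

class Resugarable a where
  resugar :: a -> DesugarM RExp

prettyR :: RExp -> String
prettyR (RVar x) = x

-- The suggested helper: resugar and pretty-print in one call,
-- replacing scattered explicit uses of show
pp :: Resugarable a => a -> DesugarM String
pp = fmap prettyR . resugar

-- Example instance for a stand-in core expression
newtype Exp = EVar String

instance Resugarable Exp where
  resugar (EVar x) = Right (RVar x)
```

Error sites would then call pp e instead of show e, falling back to show only when resugaring fails.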
Comments in REPL scripts don't parse unless they follow a legal expression. This requires some hackery to fix. At the moment loadFileToRepl has a hack to filter out empty lines. Perhaps it would make sense to extend that fix to also drop lines that contain only comments. That feels wrong, though, because it means duplicating parser logic outside of the parser. An alternative would be to fix the REPL parser to accept comment-only lines and empty lines as well. The tricky part is that we currently assume a successfully parsed line contains either a data declaration or an expression. I tried returning an empty data context when parsing a comment; in theory this should work, but in practice it didn't, and I don't have time to investigate further now.
I want to be able to load a file when starting a REPL. In theory haskeline has some kind of support for doing this but it didn't work out-of-the-box when I initially hacked the REPL. Need to investigate - this would be a very useful feature.
The new imperative constructs (references, exceptions, loops and arrays) are completely ignored in the Visualize module. The code needs updating carefully, because the pattern exhaustiveness checker is of no use when we're matching elements in pairs.
The current implementation of backward slicing has quadratic complexity. This is caused by the fact that for every node we compute the set of store labels that it writes to:

bwdSliceM :: Outcome -> Trace -> SliceM (Env Value, Exp, Trace)
bwdSliceM outcome trace = do
    allStoreHoles <- allStoreHolesM (storeWrites trace)  -- <------- HERE
    case (outcome, trace) of

This is redundant: we could simply cache, at each trace node, the store labels that the trace writes to. So the idea is to annotate the trace with labels. This can be local to the Slice module; there is no need to expose it anywhere else.
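A sketch of the caching idea: annotate every trace node with the set of store labels it writes, computed once in a bottom-up pass, so the slicer can read it in O(1) instead of recomputing storeWrites at every node. The Trace constructors below are simplified stand-ins, and ATrace/annotate are hypothetical names.

```haskell
import qualified Data.Set as Set

type StoreLabel = Int

-- Simplified stand-in for the real trace type
data Trace = TAssign StoreLabel Trace Trace | TSeq Trace Trace | THole

-- Annotated trace: each node carries its cached write set
data ATrace  = ATrace (Set.Set StoreLabel) ATrace'
data ATrace' = ATAssign StoreLabel ATrace ATrace | ATSeq ATrace ATrace | ATHole

writesOf :: ATrace -> Set.Set StoreLabel
writesOf (ATrace ws _) = ws

-- One bottom-up pass computes all the annotations
annotate :: Trace -> ATrace
annotate THole = ATrace Set.empty ATHole
annotate (TSeq t1 t2) =
  let a1 = annotate t1; a2 = annotate t2
  in ATrace (writesOf a1 `Set.union` writesOf a2) (ATSeq a1 a2)
annotate (TAssign l t1 t2) =
  let a1 = annotate t1; a2 = annotate t2
  in ATrace (Set.insert l (writesOf a1 `Set.union` writesOf a2)) (ATAssign l a1 a2)
```

bwdSliceM would then pattern match on ATrace and consult writesOf instead of calling storeWrites, turning the quadratic recomputation into a linear preprocessing step.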
Here's a plot showing how the runtime of the sum-slice-size example increases as the input list becomes larger.
During backward slicing it is possible that VStar values are introduced into the returned environment. There's an extract function for dealing with this. I have not yet run into a situation where this would manifest, i.e. where stars returned in the environment would cause an exception. I think this would only show up as a resugaring exception. For now I'm leaving this unfixed until someone actually writes a program that hits the problem.
Consider these two programs:
let f = fun f (n : int) : int => n + 1 in
f
let f = fun f (n : int) : int => n + 1 in
let f = fun f (n : int) : int => n + 1 in
f
I expect both to produce the same output:
val it = fun f (n) => n + 1 : (int -> int)
However, the second program produces:
val it = f.fun f (n) => n + 1 in
fun f (n) => n + 1 : (int -> int)
I originally encountered this bug when I implemented the REPL, and the REPL has a great deal of code to work around the problem - see Notes [Handling let bindings] and [Backup REPL State] in the Language.Slicer.Repl module. But the problem seems to be buried in the desugaring or evaluation of let-bindings and should be fixed there. This would also mean simpler REPL code, yay!
We're not implementing forward slicing before the deadline.
No need to run it; just build it to make sure it is compatible with the library API.
While working on performance measurements for the paper I noticed an enormous jump in performance between commits 06ca72d (introduction of a benchmark suite, Feb 8, 2017) and d74087b (HEAD of the references branch, Feb 23, 2017). By "enormous" I mean many of the tests became several times faster, with some being several hundred times faster (sic!). In an abuse of git bisect I have identified the responsible commit to be 4c25ab8:
commit 4c25ab8cf91b63f497e6f2ff40649710fada9ea2
Author: Jan Stolarek <[email protected]>
Date: Wed Feb 15 10:38:06 2017 +0000
Create Resugarable type class
What this commit does is that instead of having separate functions to resugar expressions and values, it introduces a Resugarable type class, and the existing functions become part of the instance definitions of that class. This was not expected to affect performance in any way, and yet it turns out to have a profound impact. The tests with the most visible speed-ups are: copy-list, curried-componentwise-sum, curried-pointwise-sum, curried-pointwise-sum-two, exceptions (300x faster!), foo (100x faster), map-increment, mergesort, merge, operators, refs (250x faster), reverse-eval, reverse-trace-profile2, reverse-trace-profile, reverse-trace, simple-closure, sort, T13 (670x faster - WTF!?), T2 (320x faster), uncurried-componentwise-sum (170x faster). I find these speed-ups absolutely unbelievable, so a good first step would be for someone to reproduce these results. Run:
$ git checkout 4c25ab8cf91b63f497e6f2ff40649710fada9ea2
$ cabal clean && cabal configure --enable-benchmarks && cabal build --ghc-options="-O2" && cabal bench 1>bench-report-4c25ab8cf91b63f497e6f2ff40649710fada9ea2-org.txt
$ cat bench-report-4c25ab8cf91b63f497e6f2ff40649710fada9ea2-org.txt | grep -v "unProfile" > bench-report-4c25ab8cf91b63f497e6f2ff40649710fada9ea2.txt
$ git checkout de0ee392629a40f26837f40899e3e5bd9a24423e
$ cabal clean && cabal configure --enable-benchmarks && cabal build --ghc-options="-O2" && cabal bench 1>bench-report-de0ee392629a40f26837f40899e3e5bd9a24423e-org.txt
$ cat bench-report-de0ee392629a40f26837f40899e3e5bd9a24423e-org.txt | grep -v "unProfile" > bench-report-de0ee392629a40f26837f40899e3e5bd9a24423e.txt
This is not related to our research work in any way, but investigating the exact cause behind this change will definitely yield valuable knowledge about Haskell. Is this exposing a GHC bug? Is this a problem in our benchmarking code?
Here's a bisect log for the mergesort test:
d74087b302527e5db6364a2e9c155d16a93e52e0 Thu Feb 23 20:09:58 2017
benchmarking Slicer/examples/mergesort.tml
time 248.0 μs (247.8 μs .. 248.2 μs)
1.000 R² (1.000 R² .. 1.000 R²)
mean 248.0 μs (247.9 μs .. 248.1 μs)
std dev 389.4 ns (294.3 ns .. 531.4 ns)
50b4a6403c0722ee7cc0d91e81360c956eeb8299 Tue Feb 21 16:46:47 2017
benchmarking Slicer/examples/mergesort.tml
time 216.3 μs (215.9 μs .. 216.8 μs)
1.000 R² (1.000 R² .. 1.000 R²)
mean 216.7 μs (216.1 μs .. 218.4 μs)
std dev 2.877 μs (1.021 μs .. 5.737 μs)
d2d62e627a555a235b65c682e20866b878193ca7 Wed Feb 15 17:49:24 2017
benchmarking Slicer/examples/mergesort.tml
time 204.6 μs (204.2 μs .. 205.0 μs)
1.000 R² (1.000 R² .. 1.000 R²)
mean 204.9 μs (204.7 μs .. 205.4 μs)
std dev 962.2 ns (477.7 ns .. 1.826 μs)
0b7d5d066fd31da36e14960809ab00cc7c3412b1 Wed Feb 15 12:35:45 2017
benchmarking Slicer/examples/mergesort.tml
time 211.6 μs (211.5 μs .. 211.8 μs)
1.000 R² (1.000 R² .. 1.000 R²)
mean 211.6 μs (211.5 μs .. 211.7 μs)
std dev 399.1 ns (277.9 ns .. 633.7 ns)
fcc42d1e6a2638a458636f067810f486d9102f14 Wed Feb 15 11:15:32 2017
benchmarking Slicer/examples/mergesort.tml
time 208.2 μs (208.0 μs .. 208.6 μs)
1.000 R² (1.000 R² .. 1.000 R²)
mean 208.1 μs (207.8 μs .. 208.8 μs)
std dev 1.503 μs (674.9 ns .. 2.496 μs)
4c25ab8cf91b63f497e6f2ff40649710fada9ea2 Wed Feb 15 10:38:06 2017
benchmarking Slicer/examples/mergesort.tml
time 226.5 μs (226.1 μs .. 226.9 μs)
1.000 R² (1.000 R² .. 1.000 R²)
mean 227.2 μs (227.0 μs .. 227.6 μs)
std dev 1.042 μs (764.2 ns .. 1.460 μs)
de0ee392629a40f26837f40899e3e5bd9a24423e Tue Feb 14 12:57:38 2017
benchmarking Slicer/examples/mergesort.tml
time 6.376 ms (6.320 ms .. 6.425 ms)
0.999 R² (0.999 R² .. 1.000 R²)
mean 6.596 ms (6.509 ms .. 6.765 ms)
std dev 356.0 μs (193.0 μs .. 597.6 μs)
variance introduced by outliers: 28% (moderately inflated)
06ca72df2af054fcd09404868c45970dc9163c59 Wed Feb 8 13:39:23 2017
benchmarking Slicer/examples/mergesort.tml
time 6.245 ms (6.186 ms .. 6.303 ms)
0.999 R² (0.999 R² .. 1.000 R²)
mean 6.454 ms (6.377 ms .. 6.624 ms)
std dev 344.7 μs (169.1 μs .. 636.9 μs)
variance introduced by outliers: 28% (moderately inflated)
Attaching full benchmark logs.
bench-report-4c25ab8cf91b63f497e6f2ff40649710fada9ea2.txt
bench-report-de0ee392629a40f26837f40899e3e5bd9a24423e.txt
When we evaluate a program we essentially convert an expression to a value. But the evaluation of pslice actually produces an expression. I introduced a new constructor of Value, VExp, to store the result of pslice. The conceptual end effect is that values and expressions end up mixed.

At the moment all we can do with an expression that is the result of a pslice is visualize it, but what else might we want to do with it? Resugaring and pretty-printing sound like the obvious thing, which means we need a new language primitive for that.

I wonder where all of this leads. I feel something is wrong here, but I can't find a way to put it into words.
TraceTree is a module that is still a bit mysterious to me. It looks like the TraceTree data type records branching points during execution. This probably means it should have two new constructors to track whether the with block of a try-with was executed or not. I'm not sure whether it should also have a new constructor to record that an if condition raised an exception. Help appreciated.
These functions (from Language.Slicer.Monad.Eval) perform evaluation within an extended (withBinder) or replaced (withEnv) environment. The intention is to have easy control over scopes; e.g., the body of a let binding is evaluated like this:

evalM (Let x e1 e2) = do unbind x
                         v <- evalM' e1
                         withBinder x v (evalM' e2)

The intention is that x scopes only over e2, and once we're done evaluating e2 we revert to the previous evaluation state. This is entirely correct for environments, but the problem is that the current implementation discards all the changes made by e2 to the evaluation state, including the reference store! This is completely wrong and will manifest once we have references inside nested scopes. A simple solution is to introduce a bind function, analogous to unbind. We would then say:
evalM (Let x e1 e2) = do unbind x
                         v <- evalM' e1
                         bind x v
                         v2 <- evalM' e2
                         unbind x
                         return v2
That's tedious and error prone. A better solution would be to modify withBinder and withEnv to retain changes to the reference store, but I'm not sure whether that's possible. If it works, then we could use withoutBinder during evaluation (instead of unbind) in the same way we do during desugaring - see my fix to #1. We could then say:
evalM (Let x e1 e2) = do v <- withoutBinder x (evalM' e1)
                         withBinder x v (evalM' e2)
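A sketch of the "better solution": a withBinder that saves and restores only the environment, while any changes the body makes to the reference store survive. EvalState, EvalM and the field names are simplified stand-ins for the real evaluation monad.

```haskell
import Control.Monad.State
import qualified Data.Map as Map

data EvalState = EvalState
  { env   :: Map.Map String Int   -- stand-in for the variable environment
  , store :: Map.Map Int Int      -- stand-in for the reference store
  }

type EvalM a = State EvalState a

-- Restore ONLY the environment after the action; keep store changes
withBinder :: String -> Int -> EvalM a -> EvalM a
withBinder x v body = do
  savedEnv <- gets env                      -- save just the environment
  modify $ \s -> s { env = Map.insert x v (env s) }
  r <- body
  modify $ \s -> s { env = savedEnv }       -- revert env; the store stays as body left it
  return r
```

The key point is that the snapshot taken before running the body covers only env, so assignments performed inside the let body are not thrown away.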
leq (CBool _)   (CBool _)   = True
leq (CInt i)    (CInt j)    = i == j
leq (CString i) (CString j) = i == j
The case for CBool looks wrong. Why is the RHS True and not b == b'?
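If the comparison is indeed what was intended, the fix would be a one-line change; a minimal self-contained sketch (the Const type here is a stand-in for the real constant type):

```haskell
data Const = CBool Bool | CInt Int | CString String

-- Suspected fix: compare the booleans instead of returning True unconditionally
leq :: Const -> Const -> Bool
leq (CBool b)   (CBool b')   = b == b'
leq (CInt i)    (CInt j)     = i == j
leq (CString i) (CString j)  = i == j
leq _           _            = False
```

Whether the unconditional True is a bug or a deliberate lattice decision (any boolean approximates any boolean) is exactly the open question above.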
The current definition of TyDecl in Absyn is:
data TyDecl = TyDecl
{ name :: TyVar
, constrs :: [(Con, Type)]
} deriving (Show, Eq, Ord)
But it doesn't make sense to have a list of constructors when we only allow two. This is problematic for pattern matching. We should refactor TyDecl to allow exactly two constructors. We can revert to a list if we ever add support for more constructors.
The reference store is an IntMap, with store labels being Ints. There are some helper functions that work on a store (deref, storeInsert, insertStoreHole). This is all a bit messy. StoreLabel should be a newtype, not a type synonym; Store should be a record with a map and a reference counter (currently handled by the Eval monad); and it should be completely abstract. Concretely:

- make StoreLabel a newtype
- make StoreLabel an instance of Valuable
- make StoreLabels a set
- make Store a pair of store and reference count
- make Store an abstraction used in the evaluation and REPL monads

Having a benchmark suite would be nice. This should be fairly simple to do using criterion. We should probably expose some functions from the main module, so that we can load a file before running a benchmark and then measure only the time required for parsing, desugaring and evaluation.
Loading this single-line file into the REPL works:

let abs = fun abs (x : int) : int => if x > 0 then x else (0 - x) in

The problem is that it shouldn't. Notice the in at the end of the line: this kind of let binding is not allowed in the REPL and should be a parse error. Even more strangely, deleting the in also works, which means both forms of let-bindings are accepted. This is a nice feature, but according to my understanding of the source code it should not work this way. I want to understand why it does.
name is a commonly used variable name, but it is shadowed by a field of the TyDecl record. It needs renaming to avoid warnings about name shadowing.
I'm writing the definitions of lub and leq for the new tracing constructs:

| TRef    (Maybe StoreLabel) Trace
| TDeref  (Maybe StoreLabel) Trace
| TAssign (Maybe StoreLabel) Trace Trace

I wonder what to do with the store labels. I think guarding to ensure they are identical is the right thing to do.
TODO:

- writes (Fig. 4)
- lub for new language constructs
- pslice for trace slicing

Cleanup:

- rename pslice to bwdslice. Also rename the primitive
- VStar can only be introduced in an environment

This is a note-to-self ticket.
@jamescheney says:

    My suggestion would be to consider making the implementation closer to the
    paper by introducing a new type:

    data Result = RHole | RRet v | RRaise v

    and changing eval, forward and backward slicing to produce / consume
    Results rather than Values

This might be doable using pattern synonyms.
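A hedged sketch of how pattern synonyms could ease the migration: a bidirectional synonym lets existing code keep constructing and matching "plain values" while the underlying representation becomes the three-constructor Result from the suggestion. Value here is a stand-in type.

```haskell
{-# LANGUAGE PatternSynonyms #-}

data Value = VInt Int | VStar deriving (Eq, Show)

-- The type suggested above, specialised to Value
data Result = RHole | RRet Value | RRaise Value deriving (Eq, Show)

-- Old code that pattern matched on plain values can match Ret instead;
-- being bidirectional, Ret also works as a constructor
pattern Ret :: Value -> Result
pattern Ret v = RRet v

isExnR :: Result -> Bool
isExnR (RRaise _) = True
isExnR _          = False
```

Evaluation and slicing would then produce and consume Result, with RRaise finally making raised exceptions visible to functions like isExn (cf. the division-by-zero ticket).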
- Use State for the Int counter.
- Use Writer to accumulate nodes and edges. It is worth thinking about whether a writer makes sense here: on the one hand we will already be using State, which also allows us to accumulate nodes and edges; on the other hand we don't really care about the things in the state, just those in the writer. I think this could be dealt with by using a wrapper to run the monad.
- Use State to store the current graph colour. This would eliminate the need to pass around the node arguments.
- Trace, Exp and Syntax: type classes might help.

trace2gvG node edge t = traces2gvG (node, node, node) edge t t
If I write

1 + (2/0)

then this implicitly raises a "Division by zero" exception. This exception is not recorded in the trace in any way, which makes it impossible for isExn to spot a raised exception. I wonder when this becomes a problem, i.e. when it causes incorrect results. I'm unable to construct a test case that would expose the practical ramifications of this potential bug (possibly due to #47?).
(On branch js-really-separate-asts)

We don't need the MkSyntax type class; pattern synonyms do the job. Desugar makes use of functions from that class, but this can easily be fixed.
The current parsing of integers is hacky, since we explicitly handle the negation sign. We should be relying on Parsec combinators for this.
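One possible shape of the fix, sketched with Parsec's expression-parser machinery: negation becomes a prefix operator in the operator table rather than an ad-hoc sign handled while lexing the number. This is an assumption about the intended design, not the project's actual parser.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)
import qualified Text.Parsec.Expr as E

-- Lex only the unsigned digits
natural :: Parser Integer
natural = read <$> many1 digit <* spaces

-- '-' is a Prefix operator in the table, so "-5" is negate applied to 5;
-- no special-casing of the sign inside the number lexer
expr :: Parser Integer
expr = E.buildExpressionParser table natural
  where table = [ [E.Prefix (negate <$ (char '-' <* spaces))] ]
```

This also composes cleanly with the rest of the operator table, so precedence of unary minus relative to other operators is declared in one place.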
visualize("T9.pdf", Cons (1, (Cons (2, Cons (3, Nil)))))
visualizeDiff("T9-diff.pdf", Cons (1, (Cons (2, Cons (3, Nil)))), Cons (1, (Cons (2, Cons (3, Nil)))))
causes:

Pattern match failure in do expression at src/Language/Slicer/Desugar.hs:125:8-27

This should be gracefully rejected rather than causing a pattern match failure.
Per email discussions "Implementing forward slicing and evaluation" (14/02) and ""Writes" question" (20/02) we should not allow holes in source programs. This is not trivial to enforce, because we still have to allow holes in the arguments to built-in operators and technically arguments are allowed to be expressions that should be possible to execute.
At some point during code cleanup I parameterized EvalState by the type of values stored in the environment. This was motivated by code in the Annot module, which didn't store Values in the environment but used something else instead. Now that Annot is gone there is no need for that extra type argument. The code will be shorter and cleaner.
I recall asking about the motivation for having pp_partial, but I can't recall what it was. Do we really need pp_partial, or could we simplify the PP type class to contain only pp? Is it possible that pp_partial is a leftover from the time when visualize was not yet implemented?
The Roll and Unroll constructors are declared as:

| Roll   (Maybe TyVar) a
| Unroll (Maybe TyVar) a

In what situations can the first field be Nothing? I couldn't locate a single instance of such a case in the source code. It seems to always be a Just.
let t = trace (
let a = ref 1 in
let b = ref 1 in
let c = ref 1 in
b := !b + !a - !c;;
!b) in
bwdSlice (t, 1)
In this example the slice retains initial values for a and c but not b, so there may be a bug in the slicing of assignments.
At the moment the only values that can be raised as exceptions are strings. What we should probably have is an extensible type of exceptions. Implementing this might require rewriting the internal handling of data types.
With the refactoring of TraceGraph complete, it is now possible to graph expression trees. That is, the visualization functions support it, but it is otherwise impossible to do because desugaring rejects it during type checking (arguments to the visualization functions are required to be traces).

A useful use case for this would be visualizing a sliced program against the original program.