
ethereum / act


Smart contract specification language

License: GNU Affero General Public License v3.0

Haskell 81.02% Makefile 1.95% Python 0.73% Nix 2.65% Solidity 2.47% Logos 2.02% Yacc 4.01% Shell 0.06% Coq 4.89% SMT 0.19%

act's Introduction

Act

Act is a formal specification language, designed to allow for the construction of an exhaustive, mathematically rigorous description of EVM programs. Act allows diverse toolchains to interoperate on a single specification, with each generating and exchanging different kinds of knowledge. It has a built-in analysis engine that can automatically prove properties about the specification itself, as well as an integrated symbolic execution engine (based on hevm) that can prove equivalence between a specification and a given bytecode object. Finally, specifications can be exported into higher level reasoning tools (e.g. theorem provers, economic analysis tooling), allowing for the verification of properties of almost arbitrary complexity, all with a proof chain right down to the bytecode level.

It extends the previous Act project.

More in depth documentation can be found in The Act Book.

Building

With nix:

nix build

Developing

Enter a nix-shell to get the dependencies of the project:

nix develop

You can then use cabal as normal:

cd src
cabal build # build
cabal repl  # enter a repl instance

To execute the unit tests:

make test # run all tests
cd src && cabal v2-test # run haskell tests

To update the project dependencies run:

nix flake update

act's People

Contributors

andrevidela, d-xo, dennisdv24, endorphin, kjekac, leonardoalt, mrchico, msooseth, xwvvvvwx, zoep


act's Issues

SMT: Fix encoding for mappings

Currently the SMT backend encodes entries in mappings as individual values instead of as entries within an SMT array. This results in some weird behaviours. For example, the following is not provable:

constructor of C
interface constructor()

creates

   uint a := 0
   mapping(uint=>uint) b := [0 := 1]

invariants

  b[a] == 1

This is not provable because separate SMT variables are produced for a, b[0] and b[a], and although a is constrained to 0, no connection is made between a and b[a].

This should be fixed by encoding b as an SMT array and then treating b[a] and b[0] as the a-th and 0th entries of that array.
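
A minimal sketch of the intended encoding, assuming the backend emits raw SMT2 text (the declarations below are illustrative, not the backend's actual naming scheme):

-- Illustrative only: the kind of SMT2 we would like to emit for the example
-- above, with b declared as a single array rather than one constant per key.
mappingEncoding :: [String]
mappingEncoding =
  [ "(declare-const a Int)"
  , "(declare-const b (Array Int Int))"
  , "(assert (= a 0))"                   -- uint a := 0
  , "(assert (= (select b 0) 1))"        -- mapping entry b[0] := 1
  , "(assert (not (= (select b a) 1)))"  -- negated invariant b[a] == 1
  , "(check-sat)"                        -- expect unsat
  ]

Since a is asserted equal to 0, (select b a) and (select b 0) now denote the same array cell, so the negated invariant becomes unsatisfiable as expected.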

SMT backend env vars type constraints

constructor of AllEnvVars
interface constructor()

creates
	uint value := CALLVALUE

invariants
	value >= 0

I would expect this to always be true based on value's type, but:

$ ./result/bin/act prove --file tests/invariants/pass/all_env_vars.act 

============

Invariant "(AllEnvVars.value >= 0)" of "AllEnvVars": 

Counter example found!

SMTModel {modelObjectives = [], modelBindings = Nothing, modelAssocs = [("AllEnvVars_init_value",0 :: Integer),("AllEnvVars_init_value_post",-1 :: Integer),("AllEnvVars_init_CALLER",0 :: Integer),("AllEnvVars_init_CALLVALUE",-1 :: Integer),("AllEnvVars_init_CALLDEPTH",0 :: Integer),("AllEnvVars_init_ORIGIN",0 :: Integer),("AllEnvVars_init_BLOCKHASH","" :: String),("AllEnvVars_init_BLOCKNUMBER",0 :: Integer),("AllEnvVars_init_DIFFICULTY",0 :: Integer),("AllEnvVars_init_CHAINID",0 :: Integer),("AllEnvVars_init_GASLIMIT",0 :: Integer),("AllEnvVars_init_COINBASE",0 :: Integer),("AllEnvVars_init_TIMESTAMP",0 :: Integer),("AllEnvVars_init_THIS",0 :: Integer),("AllEnvVars_init_NONCE",0 :: Integer)], modelUIFuns = []}
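
The counterexample assigns CALLVALUE = -1, which suggests the integer encoding is missing the range assumptions implied by the environment variables' EVM word types. A sketch of the kind of side condition that appears to be missing (function and variable names here are illustrative, not existing code):

-- Assume every word-typed environment variable lies in [0, 2^256).
envVarBound :: String -> String
envVarBound v =
  "(assert (and (<= 0 " ++ v ++ ") (< " ++ v ++ " " ++ show (2 ^ 256 :: Integer) ++ ")))"

-- e.g. envVarBound "AllEnvVars_init_CALLVALUE"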

Integers & Bitvectors in Act and bytecode verifying backends

Every arithmetic expression in an act Behaviour is a true integer expression, and can therefore never overflow or underflow.
This makes it easier to express and prove interesting, higher level properties, and in most cases lets developers accurately capture the intended meaning of smart contracts.

However, since the EVM is always dealing with words instead of integers, there are currently two problems arising from this discrepancy:

  1. There are things expressible in act that cannot be implemented in the EVM. If we specify that a storage location changes:
x => x + a

the RHS of the rewrite can be larger than 2^256, and for it to be at all possible for a contract to realize this spec we must further assume that x + a does not overflow. It would be reasonable to add a stage to the act compiler which checks whether an overflow is possible for every subexpression in a behaviour, and warns the user if that is the case. Expressions that are topmost terms on the left hand side of a mod should be exempt from this check.

  2. Since the current hevm symbolic implementation deals with bitvectors instead of integers, all act claims have to be translated into bitvectors, which turns out to be quite detrimental for proving any spec containing safemath.

The bottleneck here is the integer to bitvector conversion in the SMT solver. A fairly trivial-looking claim like this:

(declare-fun a () (_ BitVec 256))
(declare-fun b () (_ BitVec 256))
(define-fun max () Int 115792089237316195423570985008687907853269984665640564039457584007913129639935)
(define-fun aNat () Int (bv2nat a))
(define-fun bNat () Int (bv2nat b))

(assert (< (+ aNat bNat) max))

(assert (not (= (bvadd a b) ((_ int2bv 256) (+ aNat bNat)))))

(check-sat)

is too difficult for z3 and cvc4 to resolve in a reasonable timeframe.

It appears that the only solution right now to make the hevm backend act compatible is to follow KEVM's suit and implement the EVM logic in terms of integers modulo 2^256 instead of bitvectors of size 256.
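
A sketch of the integer reading this would give to the claim above, assuming wrapped EVM addition is modelled as addition modulo 2^256:

word :: Integer
word = 2 ^ 256

-- EVM ADD modelled on unbounded integers
addWrapped :: Integer -> Integer -> Integer
addWrapped a b = (a + b) `mod` word

For a and b in [0, word), if a + b < word then addWrapped a b == a + b, so the safemath obligation reduces to ordinary integer arithmetic and the expensive bv2nat / int2bv conversions disappear from the query.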

Wrong treatment of `iff` conditions

Currently we assume the conjunction of the negations of the iff conditions (¬c1 ∧ ¬c2 ∧ ...) for fail specs, where we should be assuming the negation of their conjunction (¬(c1 ∧ c2 ∧ ...)). Consequently, we can prove specs even though they contain redundant iff clauses:

behaviour add of SafeAdd
interface add(uint256 x, uint256 y)

iff in range uint256

    x + y

iff

    CALLVALUE == 0
    x == 1      /// <- REDUNDANT CONDITION

returns x + y

The spec above is provable against the safemath example in hevm/pass/safemath, but the fail spec should not be.

Unclear semantics for environment variables in the invariant block

The semantics of invariants that refer to environment variables are currently very unclear. For example, does the following mean "value is always set to the CALLVALUE as it was in the constructor" or "value is always set to the CALLVALUE as it is during each method call":

constructor of C
interface constructor()

creates

    address value := CALLVALUE

invariants

    value == CALLVALUE

As currently implemented, the meaning is the second one (the value of the env var at the time of the method invocation).

After discussing this offline for a while (as well as in this pr), the consensus seems to be that the best approach is to separate the invariants block from the constructor definition and to have the environment variables refer to the value of the variable during each method call.

It can however sometimes be useful to be able to refer to the constructor arguments within the invariants block, for example:

constructor of C
interface constructor(uint _totalSupply)

creates

    uint totalSupply := _totalSupply

invariants

    totalSupply == _totalSupply

It would therefore perhaps be nice to allow invariants blocks to refer to constructor arguments in some way. Alternatively we may wish to investigate adding some kind of pre / post operators allowing us to reformulate the above totalSupply invariant as:

pre(totalSupply) == post(totalSupply)

[Act 0.1] Grammar

We should have an Act 0.1 specific grammar file.
@MrChico I guess this would also have implications for the parser?

Error message about type mismatch

constructor of AllEnvVars
interface constructor()

creates
    uint blockhash := BLOCKHASH
$ ./result/bin/act type --file tests/frontend/env/all_env_vars.act
Internal error
expected: int, got: bytes

The Act is indeed wrong, uint should be bytes32, but the message could be more detailed.

Unrolling loops with helper functions

When specifying the behaviour of functions that involve a loop, it is sometimes hard to express the post-state in closed form. A real life example is Pot.rpow, which calculates exponentials of fixed-point integers, or the deposit function of the Eth2.0 registration contract.

Here I propose a syntax which introduces a where construct that I think could handle many such cases. But first, I would like to point out the shortcomings of the proposal.

In the general case, the result of a loop is difficult to express in a functional language. Especially if it updates storage in unstructured ways. For example, something like

for (i = 0; i < N; i++) {
    assembly {
       sstore(i,i)
   }
}

seems hard to express elegantly in our current setting. However, such contracts are arguably pretty strange anyway.

But in most use cases, when the write accesses of a loop aren't determined dynamically, the behaviour of the loop can be expressed with the help of a recursive, pure function. To allow for such helper functions, I propose to add a new header, where, in which they can be defined.

For example, the semantics of:

bytes32[] zeroHashes;

function zeroHashN(uint256 n) public {
    bytes32 res = bytes32(0);
    for (uint256 i = 0; i <= n; i++) {
        res = keccak256(abi.encodePacked(res));
    }
    zeroHashes[n] = res;
}

is captured by:

behaviour zeroHashN of Arithmetic
interface zeroHashN(uint256 n)

storage

   zeroHashes[n] |-> _ => nth_hash_of(0, n)

where

   nth_hash_of(h, n) = if n == 0 then h else hash(nth_hash_of(h, n - 1))
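
The proposed helper is just an ordinary recursive, pure function. A direct Haskell reading (with hash left abstract as a stand-in for keccak256) unrolls exactly like the loop:

nthHashOf :: (bytes -> bytes) -> bytes -> Integer -> bytes
nthHashOf hash h n
  | n <= 0    = h
  | otherwise = hash (nthHashOf hash h (n - 1))

-- nthHashOf hash h 3 == hash (hash (hash h))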

Basic features

Creating this issue to track the basic features that the spec language should have.

Initial items that @MrChico and I came up with:

  • Contract invariants
  • Functions pre and post conditions
  • Blockchain context

Hopefully the result of discussions here will be the agreed basic set of features.

Further Refine AST

The following changes to the AST would probably be nice:

  • Add a Contract type that bundles all constructors / methods / invariants / storage declarations for a particular contract. It would be nice if this had some constraints on the constructors (e.g. only one passing constructor).
  • Add a System type that is a collection of related contracts. In the future this could perhaps contain system level invariants.
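
A rough sketch of the shape these types could take, assuming the Constructor, Behaviour and Invariant types of the refined AST (all names here are guesses rather than a concrete proposal):

data Contract = Contract
  { constructor :: Constructor   -- ideally constrained to exactly one passing constructor
  , behaviours  :: [Behaviour]
  , invariants  :: [Invariant]
  }

newtype System = System { contracts :: [Contract] }
-- a later version of System could also carry system level invariants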

Test: Negative QuickCheck Tests

The current quickcheck tests only produce behaviours that should typecheck. It would be cool to have a generator that produces behaviours that should fail to typecheck.

One possible approach would be to introduce some known bad mutations into the behaviours produced from the current generator. Some ideas:

  1. Multiple writes to the same storage location
  2. Duplicated behaviour names
  3. Duplicated names for storage entries in the constructor definition
  4. Duplicated names for calldata args

Many bad specs are impossible to generate at the level of the Refined AST (e.g. we cannot produce an expression adding a bool to an int), so perhaps in the future we may wish to investigate a generator that produces act as a string directly, which would give us a lot more freedom in producing incorrect acts.

State var can be used to initialize itself

constructor of LValue
interface constructor()

creates

    uint x := x
    
invariants

    x == 0

This is accepted by parse/type/prove.

The invariant is false (as would be any expression you put there), since I guess x is unconstrained initially.

I guess this should be disallowed in general?

SMT: Move invariant enrichment to Enrich.hs

The SMT backend has some logic that ensures that the appropriate type constraints are introduced for variables that are mentioned in the invariant, but not in the storage block of the behaviour they are being checked against.

This logic is incomplete (it doesn't account for env vars), and having it intermingled with the rest of the SMT backend is messy. This logic should be moved to Enrich.hs, simplifying the implementation of the SMT backend and potentially allowing other backends to make use of this information.

SMT backend multi transaction counterexample

constructor of C
interface constructor()

creates

    uint x := 0

invariants

    x < 2

behaviour f of C
interface f()

case x == 0:

    storage

        x => 1

behaviour g of C
interface g()

case x == 1:

    storage

        x => 2
============

Invariant "(C.x < 2)" of "C": 



---



---

Counter example found!

SMTModel {modelObjectives = [], modelBindings = Nothing, modelAssocs = [("C_g_x",1 :: Integer),("C_g_x_post",2 :: Integer),("C_g_CALLER",0 :: Integer),("C_g_CALLVALUE",0 :: Integer),("C_g_CALLDEPTH",0 :: Integer),("C_g_ORIGIN",0 :: Integer),("C_g_BLOCKHASH","" :: String),("C_g_BLOCKNUMBER",0 :: Integer),("C_g_DIFFICULTY",0 :: Integer),("C_g_CHAINID",0 :: Integer),("C_g_GASLIMIT",0 :: Integer),("C_g_COINBASE",0 :: Integer),("C_g_TIMESTAMP",0 :: Integer),("C_g_THIS",0 :: Integer),("C_g_NONCE",0 :: Integer)], modelUIFuns = []}

The counterexample is correct, but not complete. It tells us one step to reach x_post = 2 from x_pre = 1, but I'm wondering if we can get the whole trace.
I think one possible way is by running extra queries:

  1. Get the cex above, g() makes x_pre = 1 -> x_post = 2
  2. Ask how to get to x_post = 1, which gives us f() makes x_pre = 0 -> x_post = 1.
  3. Ask how to get to x_post = 0, which gives us init() (or similar) makes x_post = 0
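
Conceptually this is a backward search from the violating state. A fully generic sketch, with the state and behaviour types left abstract and stepTo standing in for the extra SMT queries described above:

-- Repeatedly ask "which behaviour produces the current prestate?" until we
-- reach a state with no predecessor (i.e. one established by the constructor).
traceBack :: (s -> Maybe (b, s)) -> s -> [b]
traceBack stepTo = go []
  where
    go acc s = case stepTo s of
      Nothing       -> acc
      Just (b, pre) -> go (b : acc) pre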

Container abstractions: arrays, mappings, ....

In many cases, we want specifications to be expressed in terms of the various containers present in the source language. Such containers include at least the following:

  • Arrays
  • Mappings
    (n.b. an Array can be represented as a mapping whose keys are integers, with an additional entry keeping its length)
    but future versions may also include abstractions for structs / tuples (products) and enums / abstract data types (coproducts) of appropriate types.

Of course, there are lots of possibilities here as far as syntax is concerned, but generally, we want to support at least:

  • map (a.k.a. fmap), applying a function to all of the container's values.
  • reduce (a.k.a. fold, accumulate...), applying a function to a container recursively, aggregating the result.

Some conveniences like sum, length and slice would also be nice.

To keep things practical, I suggest we only consider containers that are supported in the ABIEncoder for now.
For specificity, I'll keep a list below of supported HLL functions. In the following, container(A) denotes a container (array or mapping), whose type of values is A.

  • map : container(A) -> (A -> B) -> container(B)
  • reduce: container(A) -> (A -> A -> A) -> A
  • Lambda abstractions
  • Conveniences:
    length, sum, product, slice, reduce (overloaded to also take index)

Unify Error Handling

The parser & typechecker currently make use of the Err monad from ErrM. The analysis backends currently have no consistent error handling approach and mostly just call error if something unexpected happens.

We should generalise the Err monad so that it can be used by the various backends and rework them so that functions that can fail return an Err.
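
One possible shape for the generalisation, sketched here rather than taken from the existing ErrM module:

-- Parameterised over the error payload, so the typechecker can keep carrying
-- positions while each backend can use whatever error type suits it.
data Err e a = Bad e | Ok a

instance Functor (Err e) where
  fmap _ (Bad e) = Bad e
  fmap f (Ok a)  = Ok (f a)

instance Applicative (Err e) where
  pure = Ok
  Bad e <*> _ = Bad e
  Ok f  <*> x = fmap f x

instance Monad (Err e) where
  Bad e >>= _ = Bad e
  Ok a  >>= f = f a

Backend functions that can fail would then return Err SomeBackendError a (for some backend specific error type) instead of calling error.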

Lack of blanks makes source look confusing

constructor of LValue
interface constructor()

creates
	uint x := 0
	uint y := 2

invariants
	y == 2

behaviour f of LValue
interface f()

storage
	x => 0y => 2

This works all the way to prove, but x => 0y => 2 itself looks a bit confusing to me.
Maybe we should force separators?

Type errors for undeclared storage references

The following spec is currently accepted by the typechecker:

constructor of Env
interface constructor()

creates

    uint value := 1

behaviour f of Env
interface f()

returns 1

ensures

    value == 1

This spec is incorrect because value is not mentioned in a storage block in f, and attempts to prove the postcondition will fail as no storage variable is declared for value. The typechecker should ensure that each variable referenced in a behaviour is also mentioned in the storage block for that behaviour.

Pretty Print Counterexamples

The counterexamples are currently not pretty printed in any way, and look very messy. For example, for this spec:


Invariant "(C.x < 9)" of "C":



---



---



---

Counter example found!

SMTModel {modelObjectives = [], modelBindings = Nothing, modelAssocs = [("C_j_x",7 :: Integer),("C_j_x_post",100 :: Integer),("C_j_CALLER",0 :: Integer),("C_j_CALLVALUE",0 :: Integer),("C_j_CALLDEPTH",0 :: Integer),("C_j_ORIGIN",0 :: Integer),("C_j_BLOCKHASH","" :: String),("C_j_BLOCKNUMBER",0 :: Integer),("C_j_DIFFICULTY",0 :: Integer),("C_j_CHAINID",0 :: Integer),("C_j_GASLIMIT",0 :: Integer),("C_j_COINBASE",0 :: Integer),("C_j_TIMESTAMP",0 :: Integer),("C_j_THIS",0 :: Integer),("C_j_NONCE",0 :: Integer)], modelUIFuns = []}

We should tidy this up, perhaps something like the following:

Invariant C.x < 9 of C:

  Counterexample:

    prestate:

      calldata:

        j()
      
      storage:

        x = 7

      environment:

        THIS      = 0
        NONCE     = 0
        ORIGIN    = 0
        CHAINID   = 0
        GASLIMIT  = 0
        COINBASE  = 0
        TIMESTAMP = 0
        CALLVALUE = 0
        CALLDEPTH = 0

    poststate:

      x = 100

Internal error when type checking Real -> Int assignment

constructor of LValue
interface constructor()

creates
	uint x := 2.5

->

act: internal error: infer type of:Zoom (AlexPn 67 5 20) (IntLit 2) (IntLit 5)
CallStack (from HasCallStack):
  error, called at Type.hs:363:10 in main:Type

SMT checks of spec validity

Even after a simple type checking, there are a number of ways in which specs can be malformed, possibly leading to crashes in different prover back ends.

There are a few checks that I think can be implemented quite straightforwardly with the assistance of an SMT checker, for example using the sbv library:

  • Bound checking of expressions
    Can the expression assigned to a particular type fall out of the bounds of that type? Throw a warning to suggest that the expression be further constrained to stay in bounds.

  • Case exhaustiveness / redundancy
    Throw a warning if cases are not exhaustive, or if a case is redundant.

  • Array / mapping collisions
    In the classic token example:

     balanceOf[CALLER] => balanceOf[CALLER] - value
     balanceOf[to]     => balanceOf[to] + value

if CALLER == to, the same storage location has been assigned two distinct values, and the spec doesn't make any sense. Specs like this should not be accepted.
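
The collision check in particular seems cheap to pose as a satisfiability query: under the behaviour's preconditions, ask whether two written keys can coincide, and reject (or warn) if the solver answers sat. A sketch, assuming the backend emits raw SMT2 text and with illustrative names only:

-- pre: the behaviour's iff / iff in range conditions, already encoded as SMT2
-- k1, k2: the key expressions of two writes to the same mapping
collisionQuery :: [String] -> String -> String -> [String]
collisionQuery pre k1 k2 =
  map (\p -> "(assert " ++ p ++ ")") pre
    ++ [ "(assert (= " ++ k1 ++ " " ++ k2 ++ "))"
       , "(check-sat)"   -- sat here means the two keys (e.g. CALLER and to) can collide
       ]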

Define Language Scope for 0.1

This issue defines the language features in the scope of Act 0.1.

I'll start this comment with my suggestions, and we can update after discussions.

Types

uint8, uint16, ... uint256, uint
int8, int16, ... int256, int
bytes1, ... bytes32
address
bool
string

Sections

behaviour
interface
creates
invariants
case
storage
if
iff in range
iff
ensures

Env vars

CALLER
CALLVALUE
CALLDEPTH
ORIGIN
BLOCKHASH
BLOCKNUMBER
DIFFICULTY
CHAINID
GASLIMIT
COINBASE
TIMESTAMP
THIS
NONCE

Operators

:= (creates assignment)
=> implies or storage update
== eq
=/= neq
>= <= > < ite
+ - * / ^

Negative Integer Literals

The current implementation fails with a parse error on the following spec:

constructor of C
interface constructor()

creates

    int totalSupply := -1

We should implement negative integer literals.

Typechecker: Issue type errors for invalid constructor preconditions

The following spec is accepted by the typechecker even though it makes no sense (there is no prestate when the constructor is running):

constructor of C
interface constructor()

iff

    x == 7

creates

    uint x := 2

We should modify the typechecker to reject specs containing storage references in constructor precondition blocks.

nix fails to fetch dapptools

Running nix-shell or nix-build on master, nix fails to fetch dapptools:

fatal: not a tree object: b74e38ea56ec25f181dbd5b026437c22b581dc8f
error: program 'git' failed with exit code 128

dapphub/dapptools@b74e38e is clearly a valid commit hash, so I'm not sure what the issue is.

Separate SMT Analysis Stages

The SMT backend currently performs two checks:

  1. Invariants hold
  2. Postconditions are implied by preconditions

These checks are currently performed in a single pass, and we currently force the second to be run for all specs by inserting an implicit true invariant on all contracts. This is pretty hacky and makes the SMT output a little messy (as the implicit invariant shows up in the output). This also fails if the spec does not include a constructor.

We should rework the SMT backend to work in multiple passes. This will lay the foundation for future analysis stages (e.g. exhaustiveness, bounds checking), and make generating nice output from the SMT backend easier.

Loop invariants

@wilcoxjay and I talked today, and he suggested the idea of also expressing loop invariants in the spec.

The challenge there is that loops don't have an interface, so how can we talk about them inside a function in the spec such that they represent loop behaviour at different levels, e.g. Solidity / bytecode?

One use case is, for example, a tool that runs on the Solidity level inferring loop invariants and trying to communicate these properties to tools that target bytecode.

An idea would be to have the basic spec level of abstraction similar to the ABI, and have extensions to that, such as spec that can also reason about bytecode.

SMT: further simplify exponentiation expressions

The SMT backend currently performs some very simple preprocessing of expressions involving exponentiation (see here). These could be extended to allow the SMT encoding of any exponentiation where the RHS is concrete, by either:

  • fully evaluating the result of the exponentiation if both the RHS & LHS are concrete
  • encoding the exponentiation as a chain of multiplications if the RHS is concrete, but the LHS is symbolic

We will need to write a simple evaluator for act expressions to do this properly.
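
A sketch of both cases over a toy expression type (the real act expression type differs, so this is purely illustrative):

data Expr = Lit Integer | Var String | Mul Expr Expr | Pow Expr Expr
  deriving Show

simplifyPow :: Expr -> Expr
simplifyPow (Pow (Lit b) (Lit n)) | n >= 0 = Lit (b ^ n)  -- both sides concrete: evaluate fully
simplifyPow (Pow x (Lit n)) | n >= 1 =
  foldr1 Mul (replicate (fromInteger n) x)                -- concrete RHS: a chain of multiplications
simplifyPow e = e

With this rewrite only multiplication has to be encoded, so any exponentiation with a concrete RHS becomes expressible to the solver.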

Test: JSON Roundtrip

It would be cool to have some quickcheck tests that:

  1. Generate some behaviours
  2. Write those behaviours to JSON
  3. Parse the JSON into a behaviour
  4. Assert equality of the parsed behaviour to the original

To do this we would need to implement FromJSON instances in RefinedAST, which adds some complexity, but it's probably required anyway if we want to allow external tools to programmatically generate act.
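
With aeson and QuickCheck the property itself is a one-liner, assuming FromJSON, ToJSON, Arbitrary (and Eq / Show) instances for Behaviour, where the FromJSON and Arbitrary parts are exactly the missing pieces described above:

import Data.Aeson (decode, encode)
import Test.QuickCheck

prop_jsonRoundtrip :: Behaviour -> Property
prop_jsonRoundtrip b = decode (encode b) === Just b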

Fail specs should not contain stateUpdates

fail specs currently contain state updates, but they should instead leave the state exactly as it was before, e.g.:

"stateUpdates": [
  {
    "Right": {
      "location": {
        "arity": 3,
        "args": [
          "Token",
          "balanceOf",
          [ { "expression": "dst", "sort": "int" } ]
        ],
        "symbol": "lookup"
      },
      "value": {
        "arity": 2,
        "args": [
          {
            "arity": 3,
            "args": [
              "Token",
              "balanceOf",
              [ { "expression": "dst", "sort": "int" } ]
            ],
            "symbol": "lookup"
          },
          "amount"
        ],
        "symbol": "+"
      }
    }
  },
  {
    "Right": {
      "location": {
        "arity": 3,
        "args": [
          "Token",
          "balanceOf",
          [ { "expression": "src", "sort": "int" } ]
        ],
        "symbol": "lookup"
      },
      "value": {
        "arity": 2,
        "args": [
          {
            "arity": 3,
            "args": [
              "Token",
              "balanceOf",
              [ { "expression": "src", "sort": "int" } ]
            ],
            "symbol": "lookup"
          },
          "amount"
        ],
        "symbol": "-"
      }
    }
  },
  {
    "Right": {
      "location": {
        "arity": 3,
        "args": [
          "Token",
          "allowance",
          [
            { "expression": "Caller", "sort": "int" },
            { "expression": "src", "sort": "int" }
          ]
        ],
        "symbol": "lookup"
      },
      "value": {
        "arity": 2,
        "args": [
          {
            "arity": 3,
            "args": [
              "Token",
              "allowance",
              [
                { "expression": "Caller", "sort": "int" },
                { "expression": "src", "sort": "int" }
              ]
            ],
            "symbol": "lookup"
          },
          "amount"
        ],
        "symbol": "-"
      }
    }
  }
],
"kind": "Behaviour",
"mode": "Fail",
"creation": false,
"name": "transferFrom",
SMT: Destringify Mapping Keys

The SMT backend currently stores storage vars, environment vars, and calldata vars in mappings that are keyed by strings. The key for a particular variable is determined by running a pretty printing routine.

This is a little messy and makes pretty printing of the counter examples difficult (we would have to parse these key names). This can be tidied by:

  1. Adding an Ord instance for the relevant pieces of the AST
  2. Changing the types of the mapping keys to be the concrete variables in question (e.g. the mapping for the storage vars would be TEntry -> SMType).

This should also make pretty printing the counter examples much easier as we could easily reuse the existing (well tested) pretty printer in Print.hs.

Error in `prove` when assigning a state variable to another

constructor of LValue
interface constructor()

creates
	uint x := 0
	uint y := 2

invariants
	x == 0

behaviour f of LValue
interface f()

storage
	x => y

Not sure if this is supposed to work, but:

act: "LValue_f_y" not found in fromList [("LValue_f_x",<symbolic> :: SInteger)]
CallStack (from HasCallStack):
  error, called at Prove.hs:457:28 in main:Prove

Internal error while parsing invalid code

constructor of LValue
interface constructor()

creates

	uint x
constructor of LValue
interface constructor()

creates

	uint

both lead to

Internal error
no valid tokens

Internal error when assigning `_` to a variable

constructor of LValue
interface constructor()

creates
	uint x := _

->

act: internal error: infer type of:WildExp
CallStack (from HasCallStack):
  error, called at Type.hs:363:10 in main:Type

Invariants ignored by `prove`

constructor of LValue
interface constructor()

creates
	uint x := 0
	uint y := 2

invariants
	x == 0
	y == 2

behaviour f of LValue
interface f()

storage
	x => 0
	y => 2

Also not sure what the expected behavior here is, but I get:

$ ./result/bin/act prove --file lvalue.act

============

Invariant "(LValue.x == 0)" of "LValue": Q.E.D

Shouldn't y == 2 also be proved? Or am I missing something?

Ambiguous parse for read only storage blocks

Consider the following spec:

behaviour f of C
interface f()

iff in range uint256

    x + y

storage

    x 
    y

returns x + y

The storage block is currently parsed into:

[ExtStorage "x" [Constant (Entry (AlexPn 168 19 5) "y" [])]]

When it should be parsed into:

[Constant (Entry (AlexPn 162 18 5) "x" []),Constant (Entry (AlexPn 173 19 5) "y" [])]

Perhaps we need to make the parser whitespace aware, or introduce some delimiters?

Introduce & enforce pre / post operators

After some thought and discussion in #92, we have decided to introduce pre and post operators for storage references and also to enforce their use for all storage references.

While this is somewhat more verbose, it has the benefit of disambiguating the specification language, and should also help to simplify analysis based on the AST, since this metadata no longer has to be inferred from the surrounding context (i.e. each storage ref in the AST will have a data field declaring whether it refers to the pre or post state).

Position information for backend errors

Parse / typecheck errors have nice position information and can be pretty printed, however this position information is not passed down the chain past the typechecker. This means that errors in the SMT backend (e.g. here) cannot provide a nice pretty printed output showing the user the source of the issue.

I'm not sure what the best way to deal with this is; dragging around the position information in the refined AST seems overly clunky. @kjekac @MrChico do you have any suggestions?

Decide on Semantics for Variable References in the Ensures Block

As currently implemented, storage references in the ensures blocks refer to the value of that variable in the prestate. This semantics was chosen to keep the meaning of an expression consistent throughout the behaviour. This does however allow the construction of some perhaps surprising behaviours.

For example the following spec will fail when run with act prove:

behaviour f of C
interface f()

storage

  x => 2

ensures

  x == 2

This is because in the expression in the ensures block, x is referring to the value of that storage slot in the prestate. While a consistent semantics for expressions throughout a behaviour is a nice property, the current situation is quite unintuitive. I can see a few options:

  1. Change the semantics of the ensures block to make storage references refer to the poststate
  2. Introduce syntax that allows users to explicitly refer to the poststate (e.g. post(x))
  3. Keep the current semantics, and instead allow users to define per spec variables (in a where clause). This would allow specs to look something like the following:
behaviour f of C
interface f(uint a)

storage

  x => xPost

ensures

  xPost >= 2

where

  xPost := 2 + a * 5

@MrChico @leonardoalt interested to hear your thoughts here.

SMT: Remove Local Prestate for Constructor Queries

We currently generate both a pre and a post state for storage variables in constructor queries. This makes no sense for variables local to the contract since they don't exist at that point.

We should rework the SMT backend so that no constructor prestate is generated.

Polymorphic Equality

The equality operator is currently only defined over integers (ref). We should allow Eq over at least booleans as well.

Specifying behaviour with calls to unknown code

Calls to contracts with unknown code are a common occurrence in smart contracts, whether for executing a simple eth payment, proxy/multisig actions or delegating control for a flash loan.

Such behaviour is naturally hard to specify, since the entire point is that these functions admit unspecified behaviour. Nevertheless, we want to be able to specify exactly what these functions do with respect to the domain for which their behaviour is defined, and express that beyond this domain, anything is allowed.

As an example, if we have a function with a flash loan, we have a clear expectation on the poststate of the contract itself, expressing that the loan needs to be paid back, while allowing any other part of the world state to change in ways we do not know.

Here's a suggested syntax for making such claims:

behaviour flash of A
interface flash(address flasher, uint amount)

iff in range uint256
 
       token.balanceOf[A] + flashFee

storage A.token
    balanceOf[A] => balanceOf[A] + A.flashFee

storage _
    _

the meaning being that a call to A.flash(flasher, amount) will increment the A.token balance of A by whatever the storage variable flashFee is set to, and that, for any contract that isn't A or A.token, arbitrary changes to storage may occur.

In other words, the syntax

storage _
_

would be shorthand for something along the lines of "for all addresses X that are not the main contract of the spec nor any of the other contracts mentioned in the spec, the storage contents of X can change arbitrarily".

Before I get to how to prove claims like this, let me say some things about the syntax.

First of all, I think we need a more general keyword than storage, since nonces and balances may also change, new addresses can be created, and so on. state is an obvious candidate that was also brought up before. Can every instance of storage be changed into state? Should storage be a subheader of state?

If we continue with the EVM opcodes in CAPS idea, then maybe just generalizing storage to state and allowing references to BALANCE and NONCE under this header could be nice:

behaviour f of A
interface f()

state
   x => x + 1
   BALANCE => BALANCE + 1

Second of all, it needs to be clarified what a "contract mentioned in the spec" means. Every contract with a storage/state header present? In the current syntax, we can actually read and refer to variables of other contracts without introducing a storage header for them. Should we allow storage writes to read-only contracts? I don't think so... Can this be made clearer in the syntax?

Lastly, I would like to make a note on proving strategies for these types of specs. This is a crude write up and I apologize if it's hard to follow, but I needed to get this down. Refinements can come later.

In order to determine which branches to explore, we first need to establish a functional dependency for all of the specs of all of the "contracts mentioned in the spec" for which behaviour is defined. A spec S is functionally dependent on all the specs which can write to the reads of S.

Now, when encountering a call to an unknown contract, we branch and explore:

  • The case in which the external call does not enter any of the "contracts mentioned in the spec", and we simply carry on with exploring the rest of the function after the call.
  • For every "contract mentioned in the spec" and any spec which can modify state, apply the spec and notice which storage items can change (the changeset). We now branch again into
    • the case where the external call is done and we are back in the main function with all the storage items in the changeset set to arbitrary values.
    • explore any of the specs again, now with any storage item in the changeset set to an arbitrary value. If more variables can change, add them to the changeset and repeat. If the changeset has not grown, we do not need to continue looping here.
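
The changeset computation in the last step is a straightforward fixpoint. A generic sketch, modelling each spec abstractly as a function from a set of possibly-changed locations to the locations it can then write:

import qualified Data.Set as Set

-- Grow the set of storage locations an unknown call may have touched until
-- no spec can enlarge it any further.
changeset :: Ord loc => [Set.Set loc -> Set.Set loc] -> Set.Set loc -> Set.Set loc
changeset specs s0 =
  let s1 = Set.unions (s0 : map ($ s0) specs)
  in if s1 == s0 then s0 else changeset specs s1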

If vs Case: globalizing pre- & postconditions

In the current version of act, the two headers if and iff have the following semantics:

The if header specifies preconditions that need to be met in order for the specification to be defined. Whatever happens in the case where the preconditions are not met is considered outside the scope of the current specification and must be defined elsewhere in order for the specification to be complete.

The iff header on the other hand, asserts that the function at hand will behave according to the specification when the conditions are met, but also asserts that it must REVERT otherwise. In other words, specifications with an iff header actually result in two proof obligations, a pass case where the assumptions are met, and a fail case where they are not.

In images:
[screenshot "Screen Shot 2019-11-25 at 17 21 36" omitted]

Here is an alternative approach which roughly changes the if header to a case pattern, allowing the remaining headers to still apply across all cases. Here's a conceptual example:

behaviour transfer of Token
interface transfer(uint256 value, address to)

iff in range uint256

     balanceOf[CALLER] - value

iff

     CALLVALUE == 0
     to =/= CALLER => balanceOf[to] + value < 2^256

case to =/= CALLER:
storage
     balanceOf[CALLER] := balanceOf[CALLER] - value
     balanceOf[to]     := balanceOf[to] + value

case _:

     noop

end

Fixed point types

Almost all real world smart contracts make use of fixed point arithmetic in some form. Current limitations in Solidity mean that these operations have to be carried out directly on integers, a fiddly and error prone procedure that often results in bugs (most recently causing the failure of YAM). It would be nice if we could use act to verify the correctness of these calculations.

Fixed point types are already an atomic ABI type, but operations on them are not supported in Solidity. Would it make sense to reuse these types in act? They are represented in the ABI encoding as simple ints or uints depending on their signedness, but the name of the type is used when calculating the function selector.

Since these types are not currently useful in solidity smart contracts, any solidity implementation would have to use uints or ints directly, including in the interface, meaning that we would probably have to implicitly convert the types to their underlying representation (int/uint) when computing function selectors and the like in bytecode proofs, which feels a little dirty...
