Mi-Cho-Coq

This repository is a formalisation of the Michelson language using the Coq interactive theorem prover.

Installation instructions

The simplest way to install the project is to use the OPAM package manager:

# Add the required opam repositories for Mi-Cho-Coq and its dependencies
opam repo add coq-released https://coq.inria.fr/opam/released
opam repo add coq-extra-dev https://coq.inria.fr/opam/extra-dev

# Install mi-cho-coq and its dependencies
opam install coq-mi-cho-coq

Introduction

The following parts of the language are formalized:

the type system
the syntax
the semantics
the lexing
the parsing

This formalisation is entirely based on the specification of Michelson.

Syntax

Type system

In Michelson, an important distinction is made between comparable types and non-comparable ones. Comparable types can be used to index maps and only sets of elements in comparable types are allowed.

Here is the list of comparable types defined in ./src/michocoq/syntax_type.v:

Inductive simple_comparable_type : Set :=
| string
| nat
| int
| bytes
| bool
| mutez
| address
| key_hash
| timestamp.

Inductive comparable_type : Set :=
| Comparable_type_simple : simple_comparable_type -> comparable_type
| Cpair : simple_comparable_type -> comparable_type -> comparable_type.

Each comparable type is trivially a Michelson type, here is the full list of Michelson types as defined in ./src/michocoq/syntax_type.v:

Inductive type : Set :=
| Comparable_type (_ : simple_comparable_type)
| key
| unit
| signature
| option (a : type)
| list (a : type)
| set (a : comparable_type)
| contract (a : type)
| operation
| pair (a : type) (b : type)
| or (a : type) (_ : annot_o) (b : type) (_ : annot_o)
| lambda (a b : type)
| map (k : comparable_type) (v : type)
| big_map (k : comparable_type) (v : type)
| chain_id.

Coq lets us see the type simple_comparable_type as a subtype of comparable_type and comparable_type as a subtype of the type type using implicit coercions.

Instructions

The syntax and typing of Michelson instructions is defined in file ./src/michocoq/syntax.v. We use a dependent inductive type to rule out ill-typed instructions.

Inductive instruction :
  forall (self_type : Datatypes.option type) (tail_fail_flag : Datatypes.bool) (A B : Datatypes.list type), Set :=
| NOOP {A} : instruction A A
| FAILWITH {A B a} : instruction (a ::: A) B
| SEQ {A B C} : instruction A B -> instruction B C -> instruction A C
| IF_ {A B} : instruction A B -> instruction A B -> instruction (bool ::: A) B
| LOOP {A} : instruction A (bool ::: A) -> instruction (bool ::: A) A
...

The self_type index represents the type of the parameter of the whole contract, this index is needed for typing SELF, CREATE_CONTRACT and LAMBDA correctly.

The tail_fail_flag index is a flag indicating whether or not the instruction output type can be inferred from its input type. Most instructions have this flag set to false meaning that the type of their input determines the type of their output. FAILWITH, sequences ending in FAILWITH, and branching instructions (such as IF) with all branches ending in FAILWITH have their tail_fail_flag set to true. Instruction sequencing { A ; B } (also written SEQ A B) requires that A has its tail_fail_flag set to false. This restriction is also present in the reference Michelson typechecker; it permits efficient typechecking in a single pass through the contract code.

Finally the A and B indexes are the types of respectively the input and output stacks of the instruction.

Semantics

The semantics of types is defined by interpreting them by predefined Coq types using the following table (functions comparable_data in ./src/michocoq/comparable.v data in file ./src/michocoq/semantics.v:

Michelson	Coq
`int`	`Z`
`nat`	`N`
`string`	`string`
`bytes`	`string`
`timestamp`	`Z`
`mutez`	`int63`
`bool`	`bool`
`unit`	`unit`
`pair a b`	`a * b`
`option a`	`option a`
`or a b`	`sum a b`
`list a`	`list a`
`set a`	`set a (lt a)`
`map a b`	`map a b (lt a)`
`bigmap a b`	idem
`lambda a b`	`{tff & instruction None tff [a] [b]}`
anything else	axiomatized

with

Definition int63 :=
  {t : int64.int64 | int64.sign t = false}
Definition set a lt :=
  {l : list A | Sorted.StronglySorted lt l}
Definition map a b lt :=
  set (a * b) (fun x y => lt (fst x) (fst y))

We formalize the semantics of Michelson by defining an evaluator in file ./src/michocoq/semantics.v. Since evaluating a Michelson instruction might fail (which Coq functions cannot), we wrap the return type of this evaluator in the exception monad defined in file ./src/michocoq/error.v. Moreover, Coq forbids non-terminating function so we use a common Coq trick to define the evalator on diverging instructions such as LOOP: we make the evaluator structurally recursive on an extra argument of type Datatypes.nat called the fuel of the evaluator.

Fixpoint eval {self_type} {tff} {env} {A : stack_type} {B : stack_type}
         (i : instruction self_type tff A B) (fuel : Datatypes.nat) {struct fuel} :
  stack A -> M (stack B) :=
  match fuel with
  | O => fun SA => Failed _ Out_of_fuel
  | S n =>
    match i, SA, env with
    ...

The env parameter to eval is a record providing the semantics of the following instructions:

forging operations: CREATE_CONTRACT, SET_DELEGATE, and TRANSFER_TOKENS: these cannot be defined directly because the operation type is axiomatized.
requesting information about the context: AMOUNT, BALANCE, CHAIN_ID, NOW, SELF, SENDER, and SOURCE
cryptography: HASH_KEY, BLAKE2B, SHA256, SHA512, and CHECK_SIGNATURE (not yet formalized)
serialization: PACK and UNPACK (not yet formalized)
manipulation of addresses: ADDRESS, CONTRACT, and IMPLICIT_ACCOUNT (not yet formalized)

Overloading

A Michelson instruction is called overloaded when it can be assigned several types. For example, the NEG instruction which replaces a number at the top of the stack by its opposite can have either the type int : 'S -> int : 'S or the type nat : 'S -> int : 'S.

To handle this source of ambiguity in our typed AST instruction, we use canonical structures, a Coq feature that let the Coq refiner solve the ambiguity by inferring the missing piece of information from the context.

Both versions of the NEG instruction are defined in ./src/michocoq/syntax.v as follows:

(* NEG takes either a nat or an int as argument *)
Inductive neg_variant : type -> Set :=
| Neg_variant_nat : neg_variant nat
| Neg_variant_int : neg_variant int.
Structure neg_struct (a : type) := Mk_neg { neg_variant_field : neg_variant a }.
Canonical Structure neg_nat : neg_struct nat :=
  {| neg_variant_field := Neg_variant_nat |}.
Canonical Structure neg_int : neg_struct int :=
  {| neg_variant_field := Neg_variant_int |}.

The structure neg_struct is then used in the instruction datatype as follows:

Inductive instruction : ... :=
...
| NEG {n} {s : neg_struct n} {S} : instruction (n ::: S) (int ::: S)
...

so the NEG instruction receives a neg_struct as an implicit argument which is going to be provided by the canonical structure mechanism.

Finally, the evaluator in file ./src/michocoq/semantics.v uses this implicit argument to call the correct function depending on the chosen type for the NEG instruction:

Definition neg a (v : neg_variant a) : data a -> data int :=
  match v with
  | Neg_variant_nat => fun x => (- Z.of_N x)%Z
  | Neg_variant_int => fun x => (- x)%Z
  end.

...

Fixpoint eval ... :=
  match fuel with
  | O => fun SA => Failed _ Out_of_fuel
  | S n =>
    match i, SA, env with
    ...
    | @NEG _ _ s, (x, SA), _ => Return (neg _ (neg_variant_field _ s) x, SA)
    ...

Typechecking

An untyped version of the instruction inductive type is provided in file ./src/michocoq/untyped_syntax.v. Type checking and type inference functions are defined in file ./src/michocoq/typer.v. They are proved correct in file ./src/michocoq/untyper.v by proving they are inverses of the trivial type-erasure function called untype_instruction.

Micheline lexing, parsing, and pretty-printing

A Micheline parser is defined using Menhir in file ./src/michocoq/micheline_parser.vy; its input is streams of Micheline tokens that can be produced from a mere string using the lexer in ./src/michocoq/micheline_lexer.v. Micheline annotations are currently not supported.

To go in the other direction, pretty-printing functions are defined in file ./src/michocoq/micheline_pp.v.

Finally, the bridge between parsed Micheline nodes and untyped instructions is defined in files ./src/michocoq/micheline2michelson.v (that also unfolds macros) and ./src/michocoq/michelson2micheline.v.

Functional verification of smart contracts

The ./src/contracts_coq/ directory contains some examples of verified Michelson smart contracts.

./src/contracts_coq/boomerang.v is a simple example of a contract that sends back whatever amount is receives to the address that called it. It can be used for example to test a wallet’s ability to send funds without taking the risk to lose them.
./src/contracts_coq/deposit.v is an example of address-based authentication. The contract stores the address of its manager (that can either be an implicit account or a smart contract with a parameter of type unit). This contract has two entrypoints: the one on the left does nothing and can be called by anybody, it can be used to send tokens to the deposit contract; the one on the right can only be called by the manager, it takes an amount in parameter and sends to the manager the requested amount.
./src/contracts_coq/manager.v is the default smart contract that was originated in place of all scriptless KT1 accounts during the Babylon migration. Because of this, it is an extremely common smart contract on the Tezos blockchain. It is quite similar to the deposit contract; it stores the address of its manager (that has to be an implicit account) and has two entrypoints, the default entrypoint does nothing and can be used to send tokens to the contract from anywhere; the other entrypoint, called %do can only be called by the manager. There are two differences with the deposit contract in the behaviour of the %do entrypoint: first, it checks that no tokens was sent (the default entrypoint should be used for that), second its parameter is not simply an amount to send to the manager but a piece of code (of type lambda unit (list operation)) to execute and emit as a result. This provides more genericity as it permits for example to change the delegate of the contract, to call other smart contract expecting a parameter, and to perform several transfers atomically.
./src/contracts_coq/multisig.v is the multisig contract supported by tezos-client. A multisig contract is a smart contract that can be used to share ownership over a bag of tokens. It is an example of signature-based authentication. The contract stores a list of public keys that represent the owners of the contract, a number called the threshold and an anti-replay counter. It features three possible actions:
- spending the contract tokens (by sending them to a contract unit)
- updating the delegate (including stopping delegation)
- updating both the threshold and the keys of the owners
To run any of these actions, the following data has to be sent to the contract:
- the current value of the anti-replay counter
- the description of the action
- a list of optional signatures
The signatures must be given in the same order as the stored keys, all provided signature must be valid and the number of provided signatures must be greater or equal to the threshold. The signed data contains the chain_id of the chain running the contract, the address of the contract, the anti-replay counter, and the description of the action.
./src/contracts_coq/generic_multisig.v is to the multisig contract what the manager contract is to the deposit contract: it features a universally-callable default entrypoint that does nothing (to receive tokens), it checks that no token is sent on the main entrypoint, and it generalizes the spending and delegation actions using a piece of code of type lambda unit (list operation).
./src/contracts_coq/vote.v is an example of a contract manipulating a map and implementing a paywall. It is a vote contract that stores a map from the available vote options (represented as strings) to their tallies. To vote for one of the available options, a transfer of 5tz (or more) to the contract is needed.

Ott semantics

See ./src/michocott/README.txt.

spruceid / mi-cho-coq Goto Github PK

mi-cho-coq's Introduction

Mi-Cho-Coq

Installation instructions

Introduction

Syntax

Type system

Instructions

Semantics

Overloading

Typechecking

Micheline lexing, parsing, and pretty-printing

Functional verification of smart contracts

Ott semantics

mi-cho-coq's People

Contributors

Stargazers

Watchers

mi-cho-coq's Issues

Recommend Projects

Recommend Topics

Recommend Org