weiroll's Introduction

Weiroll

Weiroll is a simple and efficient operation-chaining/scripting language for the EVM.

Overview

The input to the Weiroll VM is an array of commands and an array of state variables. The Weiroll VM executes the list of commands from start to finish. There is no built-in branching or looping, though these can be added externally.

State elements are bytes values of arbitrary length. The VM supports up to 127 state elements.

Commands are bytes32 values that encode a single operation for the VM to take. Each operation consists of taking zero or more state elements and using them to call (via delegatecall) a smart contract function specified in the command. The return value(s) of the function are then unpacked back into the state.

This simple architecture makes it possible for the output of one operation to be used as an input to any other, as well as allowing static values to be supplied by specifying them as part of the initial state.

Command structure

Each command is a bytes32 containing the following fields (MSB first):

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
┌───────┬─┬───────────┬─┬───────────────────────────────────────┐
│  sel  │f│    in     │o│              target                   │
└───────┴─┴───────────┴─┴───────────────────────────────────────┘
  • sel is the 4-byte function selector to call
  • f is a flags byte that specifies calltype, and whether this is an extended command
  • in is an array of six 1-byte argument specifications, described below, for the input arguments
  • o is the 1-byte argument specification described below, for the return value
  • target is the address to call
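The field layout above can be illustrated by splitting a 32-byte command into its parts. This is a minimal sketch, not code from the weiroll repository: decodeCommand and DecodedCommand are hypothetical names, and the sample command is the second entry from the "Index out of bounds" issue further down this page.

```typescript
// Hypothetical helper: field names follow the README, but this is not part of weiroll.js.
interface DecodedCommand {
  selector: string; // 4-byte function selector (bytes 0-3)
  flags: number;    // 1-byte flags value (byte 4)
  inputs: number[]; // six 1-byte argument specifiers (bytes 5-10)
  output: number;   // 1-byte return-value specifier (byte 11)
  targetAddress: string; // 20-byte address to call (bytes 12-31)
}

function decodeCommand(command: string): DecodedCommand {
  const hex = command.slice(0, 2) === "0x" ? command.slice(2) : command;
  if (hex.length !== 64) {
    throw new Error("a command must be exactly 32 bytes (64 hex characters)");
  }
  // Read `count` bytes starting at byte offset `start` as a hex string.
  const bytesAt = (start: number, count: number): string =>
    hex.slice(start * 2, (start + count) * 2);

  const inputs: number[] = [];
  for (let i = 0; i < 6; i++) {
    inputs.push(parseInt(bytesAt(5 + i, 1), 16));
  }
  return {
    selector: "0x" + bytesAt(0, 4),
    flags: parseInt(bytesAt(4, 1), 16),
    inputs: inputs,
    output: parseInt(bytesAt(11, 1), 16),
    targetAddress: "0x" + bytesAt(12, 20),
  };
}

const decoded = decodeCommand(
  "0x70a082310201ffffffffff01a0b86991c6218b36c1d19d4a2e9eb0ce3606eb48"
);
```

Here decoded.flags is 0x02 (STATICCALL), the only input comes from state element 1, and the return value is written back to state element 1.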

Flags

The 1-byte flags argument f has the following field structure:

  0   1   2   3   4   5   6   7
┌───┬───┬───────────────┬────────┐
│tup│ext│   reserved    │calltype│
└───┴───┴───────────────┴────────┘

If tup is set, the return for this command will be assigned to the state slot directly, without any attempt at processing or decoding.

The ext bit signifies that this is an extended command: the next command word should be treated as a 32-byte in list of indices, replacing the 6-byte list in the packed command struct.

Bits 2-5 are reserved for future use.

The 2-bit calltype is treated as an unsigned integer that specifies the type of call. The table below lists the value that selects each call type:

   ┌──────┬───────────────────┐
   │ 0x00 │  DELEGATECALL     │
   ├──────┼───────────────────┤
   │ 0x01 │  CALL             │
   ├──────┼───────────────────┤
   │ 0x02 │  STATICCALL       │
   ├──────┼───────────────────┤
   │ 0x03 │  CALL with value  │
   └──────┴───────────────────┘

If calltype equals CALL with value, then the first argument in the in input list is taken to be the amount of ETH that will be supplied to the call, and the rest of the arguments are the arguments to the called function, both processed as described below.
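As a rough illustration, the flags byte can be unpacked with a few bit masks. The sketch below follows the diagram in this section (tup in the most significant bit, ext next to it). Note that the "Flags encoding inconsistency" issue further down this page reports that VM.sol uses the opposite order for those two bits, so the 0x80 and 0x40 masks here are an assumption. decodeFlags and callTypeName are hypothetical names.

```typescript
// Maps the 2-bit calltype value to the names used in the table above.
function callTypeName(calltype: number): string {
  switch (calltype) {
    case 0x00: return "DELEGATECALL";
    case 0x01: return "CALL";
    case 0x02: return "STATICCALL";
    default:   return "CALL with value"; // 0x03, the only remaining 2-bit value
  }
}

interface DecodedFlags {
  tup: boolean;     // assumed mask 0x80, following the README diagram
  ext: boolean;     // assumed mask 0x40, following the README diagram
  calltype: number; // low two bits
}

function decodeFlags(flags: number): DecodedFlags {
  return {
    tup: (flags & 0x80) !== 0,
    ext: (flags & 0x40) !== 0,
    calltype: flags & 0x03,
  };
}

const staticFlags = decodeFlags(0x02);
const tupleValueCall = decodeFlags(0x83);
```

The calltype field sits in the two least significant bits in both the README and the quoted code, so that part of the decoding does not depend on which bit order is correct.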

Input/output list (in/o) format

Each 1-byte argument specifier value describes how each input or output argument should be treated, and has the following fields (MSB first):

  0   1   2   3   4   5   6   7
┌───┬───────────────────────────┐
│var│           idx             │
└───┴───────────────────────────┘

The var flag indicates whether the indexed value should be treated as fixed- or variable-length. If var == 0b0, the argument is fixed-length, and idx is treated as the index into the state array at which the value is located. The state entry at that index must be exactly 32 bytes long.

If var is set (the specifier byte has bit 0b10000000 set), the indexed value is treated as variable-length, and idx is treated as the index into the state array at which the value is located. The value must be a multiple of 32 bytes long.

The VM handles the "head" part of ABI encoding and decoding for variable-length values, so the state elements for these should hold the "tail" part of the encoding. For example, a string encodes as a 32-byte length field followed by the string data, padded to a 32-byte boundary, and an array of uints is a 32-byte count followed by the concatenation of all the uints.
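To make the "tail" format concrete, here is a small sketch that builds the state element for a string and for an array of uints. The helper names (word, encodeAsciiStringTail, encodeUintArrayTail) are hypothetical, and the string helper is restricted to ASCII to keep the example short.

```typescript
// Encodes a non-negative integer as one 32-byte big-endian word (64 hex characters).
function word(value: number): string {
  if (!(value >= 0) || Math.floor(value) !== value || value > 9007199254740991) {
    throw new Error("expected a non-negative safe integer");
  }
  let hex = value.toString(16);
  while (hex.length < 64) hex = "0" + hex;
  return hex;
}

// Tail of an ABI-encoded string: length word, then the bytes padded to 32 bytes.
function encodeAsciiStringTail(text: string): string {
  let data = "";
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (code > 0x7f) throw new Error("this sketch only handles ASCII");
    data += (code < 0x10 ? "0" : "") + code.toString(16);
  }
  while (data.length % 64 !== 0) data += "00"; // right-pad to a 32-byte boundary
  return "0x" + word(text.length) + data;
}

// Tail of an ABI-encoded uint array: count word, then one word per element.
function encodeUintArrayTail(values: number[]): string {
  return "0x" + word(values.length) + values.map((v) => word(v)).join("");
}

const helloTail = encodeAsciiStringTail("hello");
const arrayTail = encodeUintArrayTail([7, 9]);
```

Both results are a multiple of 32 bytes long, which is what the VM requires of a variable-length state element.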

Two special values of idx modify the encoder behavior, as listed in the table below:

   ┌──────┬───────────────────┐
   │ 0xfe │  USE_STATE        │
   ├──────┼───────────────────┤
   │ 0xff │  END_OF_ARGS      │
   └──────┴───────────────────┘

If idx equals USE_STATE inside of an in list byte, then the parameter at that position is constructed by feeding the entire state array into abi.encode and passing it to the function as a single argument. If it is specified as the o output target, then the output of that command is decoded via abi.decode and replaces the state array.

The special idx value END_OF_ARGS indicates the end of the parameter list: no encoding action is taken for it, and all further bytes in the list are ignored. If the first byte in the input list is END_OF_ARGS, then the function will be called with no parameters. If o equals END_OF_ARGS, then the command's return value is ignored.
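The rules for a single specifier byte and for the in list can be sketched as follows. This assumes the special values are compared against the whole specifier byte, since 0xfe and 0xff do not fit in the 7-bit idx field. The names ArgSpec, decodeArgSpec and parseInputList are hypothetical.

```typescript
type ArgSpec =
  | { kind: "end" }   // END_OF_ARGS (0xff)
  | { kind: "state" } // USE_STATE (0xfe)
  | { kind: "slot"; variable: boolean; index: number };

function decodeArgSpec(spec: number): ArgSpec {
  if (spec === 0xff) return { kind: "end" };
  if (spec === 0xfe) return { kind: "state" };
  // var is the most significant bit, idx the remaining seven bits.
  return { kind: "slot", variable: (spec & 0x80) !== 0, index: spec & 0x7f };
}

// Collects the arguments of an in list, stopping at the first END_OF_ARGS.
function parseInputList(specs: number[]): ArgSpec[] {
  const args: ArgSpec[] = [];
  for (const spec of specs) {
    const arg = decodeArgSpec(spec);
    if (arg.kind === "end") break; // all further bytes are ignored
    args.push(arg);
  }
  return args;
}

const twoFixedArgs = parseInputList([0x00, 0x01, 0xff, 0xff, 0xff, 0xff]);
const noArgs = parseInputList([0xff, 0x00, 0x01, 0xff, 0xff, 0xff]);
```

In the second call the list starts with END_OF_ARGS, so the function would be called with no parameters even though later bytes look like valid indices.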

Examples

Fixed length input and output values

Suppose you want to construct a command to call the following function:

function add(uint a, uint b) external returns (uint);

sel should be set to the function selector for this function, and target to the address of the deployed contract containing this function.

f should specify that this is a delegatecall (0x00). in needs to specify two input values of fixed length (var == 0b0). The remaining four input parameters are unneeded and should be set to 0xff. Supposing the two inputs should come from state elements 0 and 1, the encoded in data is thus 0x0001ffffffff; together with the leading flags byte, f and in read 0x000001ffffffff.

o needs to specify that the output value is fixed length (var == 0b0). Supposing the output should be written to state element 2, the encoded o data is thus 0x02.
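Putting the pieces of this example together, the full 32-byte command can be assembled as sketched below. The selector 0xaabbccdd and the target address are placeholders, not real values; a real selector would be the first four bytes of the keccak256 hash of the function signature. encodeCommand and toHexByte are hypothetical helpers.

```typescript
function toHexByte(value: number): string {
  const hex = (value & 0xff).toString(16);
  return hex.length === 1 ? "0" + hex : hex;
}

function stripPrefix(hex: string): string {
  return hex.slice(0, 2) === "0x" ? hex.slice(2) : hex;
}

// Packs sel | f | in (6 bytes) | o | target into one bytes32 hex string.
function encodeCommand(
  selector: string,
  flags: number,
  inputs: number[],
  output: number,
  targetAddress: string
): string {
  const sel = stripPrefix(selector);
  const addr = stripPrefix(targetAddress);
  if (sel.length !== 8) throw new Error("selector must be 4 bytes");
  if (addr.length !== 40) throw new Error("target must be 20 bytes");
  if (inputs.length > 6) throw new Error("at most six inputs fit in a packed command");
  const padded = inputs.slice();
  while (padded.length < 6) padded.push(0xff); // unused slots are END_OF_ARGS
  const inHex = padded.map((spec) => toHexByte(spec)).join("");
  return "0x" + sel + toHexByte(flags) + inHex + toHexByte(output) + addr;
}

// add(a, b): delegatecall, inputs from state elements 0 and 1, output to element 2.
const addCommand = encodeCommand(
  "0xaabbccdd",                                 // placeholder selector
  0x00,                                         // DELEGATECALL
  [0x00, 0x01],
  0x02,
  "0x1111111111111111111111111111111111111111" // placeholder target
);
```

The flags byte followed by the six in bytes reads 00 | 00 01 ff ff ff ff, and the o byte is 02.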

Variable length input and output values

Suppose you want to construct a command to call the following function:

function concatBytes32(bytes32[] inputs) external returns (bytes);

sel should be set to the function selector for this function, and target to the address of the deployed contract containing this function.

f should specify that this is a delegatecall (0x00). in needs to specify one input value of variable length (var == 0b10000000). The remaining five input parameters are unneeded and should be set to 0xff. Supposing the input comes from state element 0, the encoded in data is thus 0x80ffffffffff; together with the leading flags byte, f and in read 0x0080ffffffffff.

o needs to specify that the output value is variable length (var == 0b10000000). Supposing the output value should be written to state element 1, the encoded o data is thus 0x81.
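Under the argument-specifier layout described in "Input/output list (in/o) format" (a 1-bit var flag followed by a 7-bit idx), the specifier bytes for this example can be computed as in the sketch below. argSpecifier is a hypothetical helper; rejecting results that collide with 0xfe and 0xff is an assumption made here to keep the special values unambiguous.

```typescript
// Builds a 1-byte argument specifier from a state index and a variable-length flag.
function argSpecifier(index: number, variable: boolean): number {
  if (Math.floor(index) !== index || index < 0 || index > 0x7f) {
    throw new Error("idx must fit in 7 bits");
  }
  const spec = (variable ? 0x80 : 0x00) | index;
  if (spec === 0xfe || spec === 0xff) {
    throw new Error("specifier collides with USE_STATE or END_OF_ARGS");
  }
  return spec;
}

const variableInput = argSpecifier(0, true);  // bytes32[] read from state element 0
const variableOutput = argSpecifier(1, true); // bytes written to state element 1
const fixedOutput = argSpecifier(2, false);   // for comparison: a fixed-length slot
```

With these values the in list is 80 ff ff ff ff ff and the o byte is 81.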

Command execution

Command execution takes place in 4 stages:

  1. Command decoding
  2. Input encoding
  3. Call
  4. Output decoding

Command decoding is straightforward and described above in "Command structure".

Input encoding

Input arguments must be collected from the state and assembled into a valid ABI-encoded string to be passed to the function being called. The vm allocates an array large enough to store the input data. Observing the var flag on each input argument specifier, it then either copies the value directly from the relevant state index to the input array, or writes out a pointer to the value, and appends the value to the array. The result is a valid ABI-encoded byte string. The function selector is inserted at the beginning of the input data in this stage.
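The head/tail assembly described above can be sketched in a few lines. This is an illustration of the ABI layout, not a port of CommandBuilder.sol: buildCalldata, word and the EncodedArg shape are hypothetical, and values are handled as hex strings without a 0x prefix.

```typescript
// One 32-byte big-endian word as 64 hex characters.
function word(value: number): string {
  let hex = value.toString(16);
  while (hex.length < 64) hex = "0" + hex;
  return hex;
}

interface EncodedArg {
  variable: boolean; // the var flag from the argument specifier
  data: string;      // state element contents, hex without 0x
}

function buildCalldata(selector: string, args: EncodedArg[]): string {
  const headSize = args.length * 32; // one word per argument in the head
  let head = "";
  let tail = "";
  args.forEach((arg) => {
    if (arg.variable) {
      if (arg.data.length % 64 !== 0) throw new Error("variable-length values must be a multiple of 32 bytes");
      // Write a pointer (offset from the start of the arguments), then append the value.
      head += word(headSize + tail.length / 2);
      tail += arg.data;
    } else {
      if (arg.data.length !== 64) throw new Error("fixed-length values must be exactly 32 bytes");
      head += arg.data; // copied directly from the state
    }
  });
  return "0x" + selector + head + tail;
}

// f(uint256 x, uint256[] xs) with x = 1 and xs = [7, 9]; "aabbccdd" is a placeholder selector.
const calldata = buildCalldata("aabbccdd", [
  { variable: false, data: word(1) },
  { variable: true, data: word(2) + word(7) + word(9) },
]);
```

The second head word is 0x40 because the array's tail starts 64 bytes after the beginning of the arguments, right behind the two head words.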

Call

Next, the vm calls the target contract with the encoded input data. A delegatecall is normally used for vm library contracts, meaning the execution takes place in the vm's context rather than the contract's own, and a normal call is used for calling out to external contracts directly (like to an ERC20.transfer function). The intention is that users of the executor will themselves delegatecall it, meaning that all operations take place in the user's contract's context, or will seem to come directly from a user's contract address for external calls.

Output decoding

Finally, the return data is decoded by following the output argument specifier, in the same fashion as the 'input encoding' stage. Only one return value is supported.

weiroll's People

Contributors

arachnid, decanus, georgercarder, imbenwolf, mattdf, nmushegian, obatirou, timdaub, xmxanuel

weiroll's Issues

Ensure that Executor can only ever be delegatecalled

If Executor is directly called without delegatecall, an operation it calls out to could call selfdestruct() and destroy the executor contract.

A guard has to be built in via a code-stored immutable var that holds the deployed contract address and checks address(this) != self prior to running any commands.

Use simple EOA proxy contract and test Executor through that

Currently all the tests just have the EOA call directly into Executor, but this isn't how it should be used in the general case as otherwise Executor is msg.sender if it calls out directly to a contract, and it is msg.sender also when any Libs call out to contracts.

The msg.sender for the receiver of an Executor outcall should be the user's contract wallet.

A simple contract like the following should suffice for test purposes:

contract WalletProxy {
    
    function forward(address target, bytes memory execdata) public returns (bytes memory){
        (bool success, bytes memory out) = target.delegatecall(execdata);
        if (!success){
            revert("Executor call failed"); // or grab the revert reason from delegatecall
        }
        return out;
    }
}

We can just use a wrapper to redirect the final planner.plan() call in the tests to set up the right args/target.

ERC721 library

We should have a library with common ERC721 operations:

  • Set approval for one token
  • Set approval for all
  • Transfer/safe transfer/transfer from
  • Get owner

ERC1155 library

We should have a library with common ERC1155 operations:

  • Set approval
  • Transfer/transferfrom
  • Check balance

binary literal notation is confusing

  • maybe it's me but I've never seen a binary notation where var == 0b implicitly means "0" but var == 1b essentially means "10000000".
  • Additionally, the user must understand ws == 1b is the idx of an input and so it is actually "1000000" (length=7 and not length=8)
  • The readme already uses the "0x" prefix for hexadecimal and it wouldn't assume that e.g. "0x01" means "0x0100"
  • ECMASCRIPT 6 has introduced a nicely-compatible notation for binary values. It is "0b" where I assume b stands for "binary" and where in "0x" x stands for hexadecimal.

how to encode returned data in before passing it to another command

Hi,

Maybe I am misunderstanding something, but I don't know how to do this..

contract First {
    struct Perm {
       address a1;
       bytes32 a2;
    }

   function callFirst() public returns(Perm[] memory perm) {
        perm = new Perm[](1);
        perm[0] = Perm(address(0), bytes32(0));
   }
}
contract Second {
    struct Perm {
       address a1;
       bytes32 a2;
    }

   function callSecond(uint b1, Perm[] memory p) public modifierProtected {
        // use parameters
   }
}
contract Third {
    struct Action {
        address to;
        uint256 value;
        bytes data;
    }

    function great(uint k, Action[] memory ac) public {
        for (uint i = 0; i < ac.length; i++) {
            // Forward each action's calldata (and value) to its target.
            ac[i].to.call{value: ac[i].value}(ac[i].data);
        }
    }
}

So the idea is: the first command calls callFirst, which returns Perm[]. I can't call Second directly, only Third can, so the second command must be great. But I don't know how to encode ret inside Action's data.

let ret = planner.add(firstContract.callFirst()); 
planner.add(thirdContract.great([{
    to: "address of second contract",
    value: 0,
    // NOTE it fails here as it can't do encode.
    data: ethers.utils.defaultAbiCoder.encode(["uint", "tuple(uint8,address,address,bytes32)[]"], [5, ret])
}]));

Isn't this possible? The problem is that Third can't accept Perm[] directly (I can't change it), and I also don't want to create and deploy a middle contract that receives the call, encodes the data, and redirects it back to Third, which then redirects to Second.

Any idea?

Flags encoding inconsistency between README and code

In README, tup is at MSB, and ext is the next:

  0   1   2   3   4   5   6   7
┌───┬───┬───────────────┬────────┐
│tup│ext│   reserved    │calltype│
└───┴───┴───────────────┴────────┘

but in the code, ext is at MSB, and tup is the next:

weiroll/contracts/VM.sol

Lines 12 to 13 in ebc6d70

uint256 constant FLAG_EXTENDED_COMMAND = 0x80;
uint256 constant FLAG_TUPLE_RETURN = 0x40;

Users who encode flags following the README would experience unexpected behaviors.
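The practical effect of the mismatch can be shown by decoding the same flags byte under both layouts. The mask values for the code layout are the two constants quoted above; the README layout is read off the diagram. FlagLayout and interpretFlags are hypothetical names used only for this sketch.

```typescript
interface FlagLayout {
  tupMask: number;
  extMask: number;
}

// tup in the MSB, ext next, as drawn in the README diagram.
const README_LAYOUT: FlagLayout = { tupMask: 0x80, extMask: 0x40 };
// FLAG_EXTENDED_COMMAND = 0x80 and FLAG_TUPLE_RETURN = 0x40, as quoted from VM.sol.
const CODE_LAYOUT: FlagLayout = { tupMask: 0x40, extMask: 0x80 };

function interpretFlags(flags: number, layout: FlagLayout): { tup: boolean; ext: boolean } {
  return {
    tup: (flags & layout.tupMask) !== 0,
    ext: (flags & layout.extMask) !== 0,
  };
}

// A user following the README sets 0x80 to request a raw tuple return.
const asReadme = interpretFlags(0x80, README_LAYOUT);
const asCode = interpretFlags(0x80, CODE_LAYOUT);
```

Under the README layout 0x80 means tup, while under the quoted constants the same byte means ext, so the next command word would be consumed as an extended in list.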

Fixtures in readme for examples are wrong length

Tests for CommandBuilder

We should write a thorough set of CommandBuilder tests based on wrapping it in a contract to expose the necessary functions.

Ether library

We should have a library that allows the caller to operate on their ether balance - send it, possibly other operations?

Perhaps this library should include WETH wrap/unwrap operations?

Write events library

We should have a library that emits various events when called. We should then remove the event from the executor and update all the tests to use the library instead.

Change how we handle raw calls

Instead of having a special 'raw call' function signature, we could reserve a register value as meaning 'the whole state'. When we see it, serialize the state (for call arguments) or deserialize and replace it (for return values).

Fails when a struct only contains types that are 32 bytes

Structs that only contain 32-byte types are not encoded with a pointer. So when the planner calls hexDataSlice, it actually removes the first value of the struct. Then, when the pointer is added back in CommandBuilder.sol, it replaces the first value of the struct with the pointer value.

For example:

struct Numbers {
   uint256 a;
   uint256 b;
}

This value: { a: 1, b: 2 }, gets encoded like this:

0x
0000000000000000000000000000000000000000000000000000000000000001
0000000000000000000000000000000000000000000000000000000000000002

As you can see there are just the two uint256's and no pointer. After passing through planner.ts and CommandBuilder.sol, this is the return value (I've removed the sig hash for clarity):

0x
0000000000000000000000000000000000000000000000000000000000000020
0000000000000000000000000000000000000000000000000000000000000002

You can see failing tests on this commit: https://github.com/PeterMPhillips/weiroll/commit/5a5da12f55147ff3eca6d18816d4bdc5b0f66892.
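The two steps described in this issue can be replayed on plain hex strings to see where the first field is lost. This is a simplified sketch of the reported behavior, not the actual planner.ts or CommandBuilder.sol code; word is a hypothetical helper.

```typescript
// One 32-byte big-endian word as 64 hex characters.
function word(value: number): string {
  let hex = value.toString(16);
  while (hex.length < 64) hex = "0" + hex;
  return hex;
}

// A static struct { a: 1, b: 2 } is ABI-encoded inline: two words, no leading pointer.
const encodedStruct = word(1) + word(2);

// Step 1, as described for the planner: drop the first 32 bytes,
// on the assumption that they hold a pointer.
const withoutFirstWord = encodedStruct.slice(64);

// Step 2, as described for CommandBuilder.sol: write a pointer in front of the value.
const rebuiltStruct = word(0x20) + withoutFirstWord;
```

rebuiltStruct is the pointer word 0x20 followed by the word for 2, matching the output shown above: field a has been replaced by the pointer.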

A similar issue can be seen when we have a struct that is a mixture of address and uint256, for example:

struct Mixed {
   uint256 a;
   address b;
}

Or the structs that are used by Uniswap V3: https://github.com/Uniswap/v3-periphery/blob/v1.0.0/contracts/interfaces/ISwapRouter.sol#L10

Handle often-used opcodes inside Executor rather than through libraries

There are a few core functionalities called through specific opcodes that are often used and Executor takes a large gas hit of 5k per call if these are implemented in libraries, whereas most of the opcodes have negligible gas cost if called directly.

List to consider implementing inside Executor directly as special case functions:

  • SHA3
  • ADDRESS
  • BALANCE
  • CALLER
  • CALLVALUE
  • GASPRICE
  • EXTCODEHASH
  • BLOCKHASH
  • COINBASE
  • TIMESTAMP
  • NUMBER
  • GASLIMIT
  • GAS
  • CREATE
  • CREATE2

Since many of these opcodes take no arguments or just one fixed-size argument, we could extend the flags bit to specify opcode call, and use immediate of the next command as the word for the argument if needed.

Main exceptions are CREATE/CREATE2 opcodes and SHA3, which can take arbitrary length data.

SHA3 costs 30 gas + 6*#words for hashing data, so for most calls it is far less than 5k, and probably worth special casing.
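The comparison can be checked with a short calculation. The sketch below uses only the two figures given in this issue (30 gas plus 6 gas per 32-byte word for SHA3, and roughly 5k gas of overhead for a library call) and ignores memory expansion and other costs; sha3Gas and LIBRARY_CALL_OVERHEAD are hypothetical names.

```typescript
// Overhead per library call, as estimated in this issue.
const LIBRARY_CALL_OVERHEAD = 5000;

// SHA3 cost as stated above: 30 gas plus 6 gas per 32-byte word of input.
function sha3Gas(byteLength: number): number {
  const words = Math.ceil(byteLength / 32);
  return 30 + 6 * words;
}

const hashTwoWords = sha3Gas(64); // 30 + 6 * 2 = 42
// Largest input (in words) for which the opcode alone stays below the call overhead.
const breakEvenWords = Math.floor((LIBRARY_CALL_OVERHEAD - 30 - 1) / 6);
```

Hashing two words costs 42 gas, and the opcode cost stays under 5000 gas for inputs of up to 828 words (about 26 KB).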

CREATE/CREATE2 might not be worth it, and the use case of creating a contract through executor is dubious.

Alternate state/input/output model for writing long scripts with same compact encoding

Instead of indexing into a single 127-value state array, you can have an arbitrary length state array and allow indexing from either the start or from the end, up to 64 items.

The basic idea is that most terms get reduced, and the terms that are reused frequently are computed early in the trace. Essentially you have 64 'globals' (including arguments) and 64 'locals'. The 'locals' fall out of scope after 64 operations. The globals stay accessible for the whole script.
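One possible encoding of this idea is sketched below. The issue does not specify a bit layout, so everything here is an assumption: bit 0x40 of the 7-bit idx selects "count from the end" (a local), and the low six bits give the offset. resolveStateIndex is a hypothetical name.

```typescript
// Resolves a 7-bit idx against a state array of arbitrary length.
// Assumed layout: bit 0x40 clear -> global, offset from the start;
//                 bit 0x40 set   -> local, offset back from the last element.
function resolveStateIndex(idx: number, stateLength: number): number {
  if (Math.floor(idx) !== idx || idx < 0 || idx > 0x7f) {
    throw new Error("idx must fit in 7 bits");
  }
  const offset = idx & 0x3f;
  const fromEnd = (idx & 0x40) !== 0;
  const resolved = fromEnd ? stateLength - 1 - offset : offset;
  if (resolved < 0 || resolved >= stateLength) {
    throw new Error("resolved index is outside the state array");
  }
  return resolved;
}

// With 200 state entries: globals are entries 0..63, locals are entries 136..199.
const firstGlobal = resolveStateIndex(0x00, 200);
const newestLocal = resolveStateIndex(0x40, 200);
const oldestLocal = resolveStateIndex(0x7f, 200);
```

As each command appends its result to the state, an entry's distance from the end grows by one, which is how a local "falls out of scope" after 64 operations under this scheme.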

Integration Question

First, thanks for sharing this project! I can't wait to try it out.

Second, I plan to integrate this project in the future with my own, especially with a rules engine written in Solidity called Wonka. There are several iterations of the engine contract(s), with the latest intended for deployment to Optimism (if that's possible or feasible). Basically, there are standard operators built into this engine (like addition) and custom operators (i.e., calling other contract methods) that can both be instantiated and used by the user of the engine, with the custom operators being defined by the user beforehand.

So my plan would be to integrate the weiroll contracts to enhance the custom operators, since it's using an inefficient approach right now. Out of curiosity, what do you think of such a plan? Do any immediate problems come to mind? Other than building a rules engine in Solidity seems somewhat crazy. :)

Thanks for any feedback!

'destructure' bit in call

Suggest defining ext to give 16 inputs and 16 outputs in the next word, rather than 32 extra inputs

Propagate revert reasons

When a call reverts, we should propagate the reason, along with some information about which command reverted (the index). Perhaps using the new Solidity 0.8.5 error objects?

Support capturing call success value ("catch exceptions")

One of the reserved bits could mean "don't rethrow exceptions, instead record result type"
It could be encoded as a pair, or it could be encoded using multiple return[1], or even 'flat' multiple return with call result flag as output 0 and the actual output at 1-n

#67

Tests for executor

We should write a thorough set of tests for the executor, particularly testing out edge/failure cases.

Refactor assembly code

Most of the remaining assembly is one of a few operations such as reading or writing a bytes32 to/from a bytes string. We should abstract these into helper functions so the code is clearer.

ERC20 library

We should have a library with common ERC20 ops:

  • Set approval
  • Transfer/transfer from
  • Read balance

Reuse dynamic state entries when possible

If the new entry is the same length as the existing one, we can overwrite it instead of allocating a new one.

If the new entry is shorter, we can also overwrite it - and update the length field.

Seeking reviewers for failing weiroll JS implementation in eth-fun

  • I'm the maintainer of eth-fun, a functional library for querying ETH RPC nodes
  • For us, weiroll is mainly interesting as it allows us to do multiple static calls in one eth_call - which can be nice when having to call many functions on an Ethereum node concurrently (we're using it to index with https://neume.network).
  • Anyways, I did a custom implementation of weiroll given the specification in this repository: attestate/eth-fun#46 as I didn't want to adopt the ethersjs dependency of weiroll.js
  • However, although I think I've understood everything well, a real call to WETH.totalSupply fails with a random portal VM I've found on mainnet: https://github.com/rugpullindex/eth-fun/pull/46/files#diff-a7b32e8ee1c22c5bcf754115ac539b0761a3f4cf8a86104ffc618027f46bb29bR16
  • I'm seeking reviewers for this implementation that can potentially spot the mistake I made that leads to the call reverting
  • ofc feel free to close this issue as it isn't really about the code here

Index out of bounds when writing tuple using WriteTuple

Decoded structure of below test case planner call.

         // Commands Array
    
           selector    flag       inputs       output             target address
        [  ----4-----   -1   --------6--------   -1   ----------------20----------------------
          '0xe63697c8   01   00 01 02 ff ff ff   ff   5f18c75abdae578b483e5f43f12a39cf75b973a9',  --- call
          '0x70a08231   02   01 ff ff ff ff ff   01   a0b86991c6218b36c1d19d4a2e9eb0ce3606eb48',  --- staticcall
          '0x85f45c88   81   03 ff ff ff ff ff   83   e909128d38077bebd996692916865a5c24bfb522',  --- call
          '0x77e2eefa   02   83 01 00 ff ff ff   ff   c6e7df5e7b4f2a278906862b61205850344d4e7d'   --- staticcall
        ]
        
       // State array
        [
          '0x0000000000000000000000000000000000000000000000000000002df2362124',
          '0x00000000000000000000000059b670e9fa9d0a427751af201d676719a970857b', 
          '0x0000000000000000000000000000000000000000000000000000000000000000',
          '0x00000000000000000000000070997970c51812dc3a010c7d01b50e0d17dc79c8'
        ]

        ----------------------------------------------------------------------------------------------
        Command 1 input - 

        [0x0000000000000000000000000000000000000000000000000000002df2362124, -- shares
        0x00000000000000000000000059b670e9fa9d0a427751af201d676719a970857b, -- position address
        0x0000000000000000000000000000000000000000000000000000000000000000   -- 0
        ]
       
       output - neglected

        ----------------------------------------------------------------------------------------------
        Command 2 input - 
        [0x00000000000000000000000059b670e9fa9d0a427751af201d676719a970857b] -- position address

        output - 

        Output added at index 1 because of `01` output flag value
        ----------------------------------------------------------------------------------------------
       Command 3 input - 

        [0x00000000000000000000000070997970c51812dc3a010c7d01b50e0d17dc79c8] -- user address


        output - 

    It should update the state at index 3 because of the `0x83` output specifier, i.e. 10000011, where the MSB `1` marks a dynamic-length value and `0000011` is the index into the state array, i.e. 3. However, the code at
https://github.com/weiroll/weiroll/blob/bec31fdebf1eac6ed0c8e9e6aa768ec361e2c9c7/contracts/VM.sol#L115 uses the
 full decimal value of `0x83` (131) as the index instead of the 7 LSB.
        ----------------------------------------------------------------------------------------------

        Command 4 input is 

        [
            <Output from command 3>,                                              -- tuple value
            <Output from command 2>,                                             -- balance queried in 2nd command
            0x0000000000000000000000000000000000000000000000000000002df2362124      -- shares
        ]
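The difference between the two readings of the output specifier in command 3 comes down to one mask. The sketch below only illustrates the arithmetic described in this issue and is not the VM.sol code; outputIndex and the constant names are hypothetical.

```typescript
const VARIABLE_LENGTH_BIT = 0x80; // MSB of the specifier: variable-length value
const INDEX_MASK = 0x7f;          // the 7 least significant bits: state index

// Index into the state array encoded by an output specifier.
function outputIndex(specifier: number): number {
  return specifier & INDEX_MASK;
}

const specifier: number = 0x83;
const stateSize: number = 4; // the state array in this report has four entries

const isVariableLength = (specifier & VARIABLE_LENGTH_BIT) !== 0;
const maskedIndex = outputIndex(specifier); // 3, inside the state array
const unmaskedIndex = specifier;            // 131, far outside it
const unmaskedInBounds = unmaskedIndex < stateSize;
```

With the mask applied the write lands on state element 3; without it the VM tries to write element 131 of a four-element array, which matches the reported out-of-bounds error.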
