a16z / halmos Goto Github PK

A symbolic testing tool for EVM smart contracts

License: GNU Affero General Public License v3.0

Python 70.74% Solidity 28.39% Shell 0.24% Dockerfile 0.64%

halmos's Introduction

Halmos

Halmos is a symbolic testing tool for EVM smart contracts. A Solidity/Foundry frontend is currently offered by default, with plans to provide support for other languages, such as Vyper and Huff, in the future.

You can read more in our post: "Symbolic testing with Halmos: Leveraging existing tests for formal verification."

Join the Halmos Telegram Group for any inquiries or further discussions.

Installation

pip install halmos

Or, if you want to try out the nightly build version:

pip install git+https://github.com/a16z/halmos

Usage

cd /path/to/src
halmos

For more details:

halmos --help

Examples

Refer to the getting started guide and the examples directory.

Contributing

Refer to the contributing guidelines, and explore the list of issues labeled "good first issue" or "help wanted."

Disclaimer

These smart contracts and code are being provided as is. No guarantee, representation or warranty is being made, express or implied, as to the safety or correctness of the user interface or the smart contracts and code. They have not been audited and as such there can be no assurance they will work as intended, and users may experience delays, failures, errors, omissions or loss of transmitted information. THE SMART CONTRACTS AND CODE CONTAINED HEREIN ARE FURNISHED AS IS, WHERE IS, WITH ALL FAULTS AND WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING ANY WARRANTY OF MERCHANTABILITY, NON-INFRINGEMENT OR FITNESS FOR ANY PARTICULAR PURPOSE. Further, use of any of these smart contracts and code may be restricted or prohibited under applicable law, including securities laws, and it is therefore strongly advised for you to contact a reputable attorney in any jurisdiction where these smart contracts and code may be accessible for any questions or concerns with respect thereto. Further, no information provided in this repo should be construed as investment advice or legal advice for any particular facts or circumstances, and is not meant to replace competent counsel. a16z is not liable for any use of the foregoing, and users should proceed with caution and use at their own risk. See a16z.com/disclosures for more info.

halmos's People

Contributors

Stargazers

Watchers

Forkers

web3dopamine sarvex 0xfuje viraj124 karmacoma-eth le-kag zobront gacwr ncitron apeworx wbnns henryding0 plotchy baolean qgarchery waifu-capital davifbtc altyni86 open-consulting sentry0x coreyjj1679 decentered rappie web3secresearch lowprivuser luksgrin whbjzzwjxq nick199910 aviggiano 0kage-eth bmddn9448 halilkaraca8788 cryptuon henry-hz shanzson 0xzra baris-h cos sabnock01 saurabhchalke pillip bunsdev emperororokusaki rndquu 0xgoerlimainnet ohhkaneda c6ai thesdes orangesantra princeleonal tr991tr siborn art-aaaa ali-kh9 iris-jsx 51n44 byungheonlee silur

halmos's Issues

non-hexadecimal number found in fromhex() arg

Describe the bug

Simple test (below) leading to the following error:

Error: None failed: ValueError: non-hexadecimal number found in fromhex() arg at position 6242

To Reproduce

function testZach__indexCannotBeZero(uint h) public {
    vm.assume(h != 0);
    uint index = (1 & h) + ((1 ^ h) >> 1);
    assert(index != 0);
}

Environment:

OS: macOS
Python version: 3.11.2
Halmos version: 0.1.1

Additional context
Add any other context about the problem here.

earlier processing @custom:halmos tags

Currently, the @custom:halmos tags are processed after the test functions have been identified, making some options (e.g., --function) not applicable within the tags. To make all the options available, the halmos tags need to be processed earlier.

Counterexample: unknown

Describe the bug

I was performing a comparison between halmos and forge test against SMTChecker. I have used an adaptation of the following code

// SPDX-License-Identifier: GPL-3.0
pragma solidity >=0.8.0;

contract Monotonic {
function f(uint x) internal pure returns (uint) {
require(x < type(uint128).max);
return x * 42;
}

function inv(uint a, uint b) public pure {
    require(b > a);
    assert(f(b) > f(a));
}

}
as found in https://docs.soliditylang.org/en/v0.8.17/smtchecker.html.

To Reproduce
I ran the command "forge init" in a folder and added this function

function f(uint x) public pure returns (uint) {
    if(x < type(uint128).max)
        return x * 42;
    else return x;
}

to the file Counter.sol and the test function

function testf(uint a, uint b) public view {
if(b > a)
assert(counter.f(b) > counter.f(a));
else
assert(true);
}
to the file Counter.t.sol. Thus I ran "forge test" and halmos and got this.

acmLL % forge test
[⠒] Compiling...
No files changed, compilation skipped
Running 3 tests for test/Counter.t.sol:CounterTest
[PASS] testIncrement() (gas: 28435)
[PASS] testSetNumber(uint256) (runs: 256, μ: 27908, ~: 28375)
[PASS] testf(uint256,uint256) (runs: 256, μ: 3601, ~: 6448)
Test result: ok. 3 passed; 0 failed; finished in 54.41ms

acmLL % halmos
Running 3 tests for test/Counter.t.sol:CounterTest
[PASS] testIncrement() (paths: 1/2, time: 0.21s, bounds: [])
[PASS] testSetNumber(uint256) (paths: 1/2, time: 0.26s, bounds: [])
[FAIL] testf(uint256,uint256) (paths: 4/9, time: 61.82s, bounds: [])
Counterexample: unknown
Symbolic test result: 2 passed; 1 failed

Environment:

macOS 12.6.5
Python version: 3.9.6
Halmos and other dependency versions:
Package Version

altgraph 0.17.2
cbor2 5.4.6
crytic-compile 0.3.1
future 0.18.2
halmos 0.0.7
macholib 1.15.2
packaging 23.1
pip 21.2.4
pycryptodome 3.17
setuptools 58.0.4
six 1.15.0
solc-select 1.0.3
wheel 0.37.0
z3-solver 4.12.1.0

Additional context
Initially I used the require version (instead of the adapted if-based version) as presented in https://docs.soliditylang.org/en/v0.8.17/smtchecker.html, forge test detected a revert case and halmos gave the same report.

Add vyper support

Halmos is an EVM-level tool, which means that in principle we can target any bytecode, regardless of the source language used (Solidity, Vyper, Huff, ...)

However at the moment at the moment we only provide a Foundry frontend, which means that halmos tests need to be expressed in foundry-style Solidity contracts.

This task is about adding a vyper frontend -- here is a rough sketch of what will be needed:

detecting the project framework
invoking the right tool to build the project
parse the build output (we need the contract names and their bytecode, as well as function names and custom annotations if these are supported)
specify the search target -- halmos currently looks for assertion failures, i.e. executions that finish with Panic(1). Does this work the same in vyper or does it need to be changed?
halmos-cheatcodes are available as a solidity library, do we need an equivalent in vyper to write tests comfortably?

So mostly this is about wiring the inputs and outputs correctly, I don't expect changes in the core sevm execution engine.

Any help from devs proficient in vyper would be appreciated! 🙏

missing CREATE failure semantics

Describe the bug
missing semantics of CREATE opcode in case that the contract creation code reverts.

related to #109

Introduce EVM version setting

Problem

We don't have a way to specify what version of the EVM we support. So for instance when support for PUSH0 was added, it means that we now implicitly default to the Shanghai fork.

Solution

Ideally we would have a proper EVM version setting somewhere, and while defaulting to Shanghai or whatever other version is fine, it might become important to specify against what version you're testing (especially for contracts meant to be deployed on other EVM chains).

vm.prank doesn't work during contract creation

Describe the bug

During contract creation in a test, it seems that msg.sender is always aaaa0001 and not the pranked address

To Reproduce

// SPDX-License-Identifier: AGPL-3.0
pragma solidity >=0.8.0 <0.9.0;

import {Test} from "forge-std/Test.sol";

contract Dummy {
    address public caller;

    constructor() {
        caller = msg.sender;
    }
 }

contract PrankTest is Test {
    function checkPrankConstructor(address user) public {
        vm.prank(user);
        Dummy dummy = new Dummy();

        assertEq(dummy.caller(), user);
    }
}

Running 1 tests for test/Prank.t.sol:PrankTest
[FAIL] testPrankConstructor(address) (paths: 1/4, time: 0.17s, bounds: [])
Counterexample: [
    p_user_address = 0x80000000000000000000000000000000aaaa0001]

Environment:

OS: macOS
Python version: 3.10.9
Halmos and other dependency versions: main branch

Additional context
Add any other context about the problem here.

vm.createFork() not supported

Describe the bug
halmos threw an error when the setUp() function invokes vm operations.

The error on the console was

Traceback (most recent call last):
File "/Users/.../Library/Python/3.9/bin/halmos", line 8, in
sys.exit(main())
File "/Users/.../Library/Python/3.9/lib/python/site-packages/halmos/main.py", line 474, in main
setup_ex = setup(hexcode, abi, srcmap, srcs, args, setup_sig, options)
File "/Users/.../Library/Python/3.9/lib/python/site-packages/halmos/main.py", line 132, in setup
if (setup_opcode != 'STOP' and setup_opcode != 'RETURN') or setup_ex.failed: raise ValueError('setUp() failed')
ValueError: setUp() failed

To Reproduce

Create a simple test in a foundry project as below

// SPDX-License-Identifier: AGPL-3.0-only
pragma solidity ^0.8.13;

import "forge-std/Test.sol";

contract MyTest is Test {

    function setUp() public {
       vm.createFork(vm.envString("RPC_URL_MAINNET"));
    }

    function testFailX() pure public {
        revert();
    }
}

run forge test
run halmos

Environment:

OS: macOS
Python version: 3.9.6
Halmos and other dependency versions:
- halmos 0.0.4
- crytic-compile 0.3.0
- z3-solver 4.12.1.0

Additional context

If we comment out the vm call like this:

   function setUp() public {
       // vm.createFork(vm.envString("RPC_URL_MAINNET"));
    }

halmos can do the verification, but it produces opposite result for testFail... tests. When foundry reports a PASS, halmos reports a FAILURE, and vice versa.

Support for static storage arrays

Describe the bug
When trying to use a custom getter of an array of integers in storage, halmos fails with:
ValueError: invalid literal for int() with base 10: 'p_i_uint256'

To Reproduce
Run halmos on the following contract:

contract GetterTest {
    uint256[3] v;

    function getV(uint256 i) public view returns (uint256) {
        if (i >= 3) return 0;
        return v[i];
    }

    function testGetter(uint256 i) public view {
        assert(i >= 3 || getV(i) == v[i]);
    }
}

Environment:

OS: linux
Python version: 3.11.13
Halmos and other dependency versions: halmos 0.0.8

Unable to install halmos on macOs ventura 13.0.1

I installed halmos with pip3 install halmos but on running halmos I get zsh command not found I checked my path in zrsh file and it is correct is there something that i am missing ?

Thanks

Display foundry-style traces

Problem

When you run forge test -vvv, you get these nice traces:

This is great devex because it helps the developer make sense of how the error happens and the path to get there.

The thing is, halmos deals with not just one but many paths and it would be super helpful to visualize the explored paths in this way.

Solution

At the right level of verbosity, we should display not just the counterexample but also the path that got there.

At the right debug level, we could also display all explored paths like this (1 path = 1 trace), to make sure the path exploration phase is hitting the right areas.

Bonus points for displaying:

nice contract names
nice function names
pretty printed calldata
logs

The toy tests are missing with the new halmos version

Describe the bug
Toy tests are missing check_ keyword so there is nothing to test

whitehat@fb6550204d45:~/halmos/examples/toy$ halmos
[⠊] Compiling...
[⠢] Compiling 2 files with 0.8.21
[⠆] Solc 0.8.21 finished in 90.13ms
Compiler run successful!
Error: No tests with the prefix `check_`

To Reproduce

git clone https://github.com/a16z/halmos/
cd halmos/examples/toy
halmos

Environment:

OS: Ubuntu 22:04
Python version: 3.9
Halmos and other dependency versions: [e.g., pip list]

whitehat@fb6550204d45:~/halmos/examples/toy$ pip list
Package                   Version
------------------------- -------------
abch-tree-sitter          1.1.1
abch-tree-sitter-solidity 1.2.1
aiofiles                  0.8.0
aiohttp                   3.8.5
aiosignal                 1.3.1
asttokens                 2.0.5
async-timeout             4.0.3
attrs                     23.1.0
backcall                  0.2.0
bitarray                  2.8.1
blinker                   1.4
charset-normalizer        3.2.0
click                     8.1.6
cryptography              3.4.8
cytoolz                   0.12.2
dbus-python               1.2.18
decorator                 5.1.1
distro                    1.7.0
eth-abi                   4.1.0
eth-account               0.8.0
eth-hash                  0.5.2
eth-keyfile               0.6.1
eth-keys                  0.4.0
eth-rlp                   0.3.0
eth-typing                3.4.0
eth-utils                 2.2.0
exceptiongroup            1.1.3
executing                 1.2.0
frozenlist                1.4.0
graphviz                  0.19.2
hexbytes                  0.3.1
httplib2                  0.20.2
idna                      3.4
importlib-metadata        4.6.4
iniconfig                 2.0.0
intervaltree              3.1.0
ipdb                      0.13.13
ipython                   8.14.0
jedi                      0.19.0
jeepney                   0.7.1
keyring                   23.5.0
launchpadlib              1.10.16
lazr.restfulclient        0.14.4
lazr.uri                  1.0.6
markdown-it-py            3.0.0
matplotlib-inline         0.1.6
mdurl                     0.1.2
more-itertools            8.10.0
multidict                 6.0.4
networkx                  2.8.8
oauthlib                  3.2.0
packaging                 23.1
parsimonious              0.9.0
parso                     0.8.3
pathvalidate              2.5.2
pexpect                   4.8.0
pickleshare               0.7.5
pip                       22.0.2
pluggy                    1.2.0
prompt-toolkit            3.0.39
ptyprocess                0.7.0
pure-eval                 0.2.2
pycryptodome              3.18.0
pydantic                  1.10.12
Pygments                  2.16.1
PyGObject                 3.42.1
PyJWT                     2.3.0
pyparsing                 2.4.7
pytest                    7.4.0
python-apt                2.4.0+ubuntu2
regex                     2023.8.8
rich                      13.5.2
rich-click                1.6.1
rlp                       3.0.0
SecretStorage             3.3.1
setuptools                59.6.0
six                       1.16.0
sortedcontainers          2.4.0
stack-data                0.6.2
tblib                     1.7.0
tomli                     2.0.1
toolz                     0.12.0
traitlets                 5.9.0
typing_extensions         4.7.1
unicorn                   1.0.2
wadllib                   1.3.6
watchdog                  2.2.1
wcwidth                   0.2.6
websocket-client          1.6.1
wheel                     0.37.1
woke                      3.5.0
yarl                      1.9.2
z3-solver                 4.12.2.0
zipp                      1.0.0

Additional context
I was performing some tests to check that halmos was well configured at the auditor-toolbox docker image and notice the toy version is old.

Recommendation: Add the check_ keyword to all the toy tests

Support for portfolio solving

Problem

We currently only ever invoke Z3 to solve both path queries and assertion queries. This works well in general, but there is evidence in the literature that different heuristics can perform better on specific queries, so we can expect that some other solvers (e.g. yices, cvc, boolector, bitwuzla...) could potentially give results much sooner than Z3.

Solution

When we have a query, we should:

dump it to smtlib2 format on disk
invoke z3 and the other available solvers on it concurrently
as soon as we get a result from a solver (sat/unsat -- ignore unknown) we can cancel the other ones, they lost the race

Finer points

I operate on the assumption that most path queries are fast, and assertion queries can get expensive, therefore this would only apply to assertion queries. But maybe that's wrong? Might be interesting to see if there's any benefit to applying this to potentially expensive path queries too, who knows
We may also not want to cancel the other ones right away for research purposes, might be interesting to get some insight if some solvers are better on specific kinds/sizes or queries, or maybe there's one that consistently outperforms the others
packaging and distributing solvers sounds tricky, we may want to rely on automatically discovering what solvers are available locally? We would be looking in the PATH for available well-known binaries and invoke them, which outsources installing/building them to developers, as least initially.

Uncaught exceptions

Running 9 tests for test/Vault.t.sol:SandclockLUSDTest
Traceback (most recent call last):
File "/Users/alexandremota/Library/Python/3.9/bin/halmos", line 8, in
sys.exit(main())
File "/Users/alexandremota/Library/Python/3.9/lib/python/site-packages/halmos/main.py", line 445, in main
exitcode = run(hexcode, abi, srcmap, srcs, funname, funsig, funselector, arrlen, args, setup, options)
File "/Users/alexandremota/Library/Python/3.9/lib/python/site-packages/halmos/main.py", line 215, in run
(setup_exs, setup_steps) = sevm.run(setup_ex)
File "/Users/alexandremota/Library/Python/3.9/lib/python/site-packages/halmos/sevm.py", line 962, in run
self.create(ex, stack, step_id, out)
File "/Users/alexandremota/Library/Python/3.9/lib/python/site-packages/halmos/sevm.py", line 649, in create
(new_exs, new_steps) = self.run(self.mk_exec(
File "/Users/alexandremota/Library/Python/3.9/lib/python/site-packages/halmos/sevm.py", line 977, in run
ex.sstore(ex.st.pop(), ex.st.pop())
File "/Users/alexandremota/Library/Python/3.9/lib/python/site-packages/halmos/sevm.py", line 256, in sstore
slot, keys = int(str(offsets[0])), offsets[1:]
ValueError: invalid literal for int() with base 10: 'sha3_var1'

Z3 Type

I have typed: halmos, and got this

Running 1 tests for test/Vault.t.sol:SLUSDTest
Traceback (most recent call last):
File "/usr/local/bin/halmos", line 8, in
sys.exit(main())
File "/usr/local/lib/python3.10/site-packages/halmos/main.py", line 391, in main
exitcode = run(hexcode, abi, srcmap, srcs, funname, funsig, funselector, arrlen, args, options)
File "/usr/local/lib/python3.10/site-packages/halmos/main.py", line 206, in run
gen_model(args, models, idx, ex)
File "/usr/local/lib/python3.10/site-packages/halmos/main.py", line 273, in gen_model
sol3.from_string(query)
File "/usr/local/lib/python3.10/site-packages/z3/z3.py", line 7214, in from_string
Z3_solver_from_string(self.ctx.ref(), self.solver, s)
File "/usr/local/lib/python3.10/site-packages/z3/z3core.py", line 3974, in Z3_solver_from_string
elems.Check(a0)
File "/usr/local/lib/python3.10/site-packages/z3/z3core.py", line 1482, in Check
raise self.Exception(self.get_error_message(ctx, err))
z3.z3types.Z3Exception: b'(error "line 250 column 75: unknown constant evm_div (( BitVec 256) (_ BitVec 256)) ")\n'

Features to support

Features to support:

Known external contract calls (#21)
Known external account interactions
Pretty-print counterexamples (storage variable names, external calls, etc)
Immutable storage variables (#21)
Struct type parameters
Overwrite bytecode for new accounts created by setup (#75)

Cheatcodes to support:

prank (#62)
expectRevert / testFail
expectEmit

Support vm.addr(privateKey) cheatcode

Is your feature request related to a problem? Please describe.

Using makeAddr("name") (from forge-std) in a test results in the following error:

Warning: setUp() execution encountered an issue at STATICCALL: Unsupported cheat code: calldata = Concat(0xffa18649, sha3_var_00)

Describe the solution you'd like

Supporting non-symbolic labeled addresses would be a nice little quality of life upgrade

Trouble running halmos after a fresh install

Describe the bug
Getting this error after install;

(base) ➜  curta git:(master) ✗  halmos 
Traceback (most recent call last):
  File "/home/eugenio/.local/bin/halmos", line 5, in <module>
    from halmos.__main__ import main
  File "/home/eugenio/.local/lib/python3.10/site-packages/halmos/__main__.py", line 19, in <module>
    sys.set_int_max_str_digits(0)
AttributeError: module 'sys' has no attribute 'set_int_max_str_digits'
(base) ➜  curta git:(master) ✗

To Reproduce
I have install halmos using pip3.10 install halmos, the pip install halmos wasnt working for me

Environment:

OS: Linux ubuntu
Python version: [e.g., python --version]
Halmos and other dependency versions: [e.g., pip list]

Additional context
I also try creating a venv with no luck :(

bug: false positive for equivalence of two calls

Describe the bug
Halmos reports two counterexamples for testing equivalence on a function with an external call even though the call is the same pure function.

To Reproduce

pragma solidity ^0.8.0;

import "forge-std/Test.sol";

contract TestArithmetic is Test {
    function test_add_uint256(uint256 a, uint256 b) public {
        assertEqCall(
            address(this), abi.encodeCall(this.solidityAddUint, (a, b)), abi.encodeCall(this.solidityAddUint, (a, b))
        );
    }

    function solidityAddUint(uint256 a, uint256 b) public pure returns (uint256 c) {
        c = a + b;
    }
}

Environment:

OS: macOS
Python version: Python 3.9.16
Halmos and other dependency versions:

Package          Version  Editable project location
---------------- -------- ---------------------------
cbor2            5.4.6
crytic-compile   0.3.0
halmos           0.0.4
packaging        23.0
pip              23.0.1
prettytable      3.6.0
pycryptodome     3.17
setuptools       65.6.3
slither-analyzer 0.9.2    /Users/lain/git/tob/slither
wcwidth          0.2.6
z3-solver        4.12.1.0

Additional context
The output generated (didn't include the second counterexample):

halmos -v

Running 1 tests for test/Arithmetic.t.sol:TestArithmetic
[FAIL] test_add_uint256(uint256,uint256) (paths: 2/5, time: 0.40s, bounds: [])
Counterexample: [msg_sender = 0, msg_value = 0, p_a_uint256 = 0, p_b_uint256 = 0, this_address = 0]
# 3 / 5
PC: this_address 352 STOP - 0 <generated>
Stack: [3074223460]
Memory:
- 0x0: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
- 0x20: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
- 0x40: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 76]
- 0x60: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
- 0x80: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 68]
- 0xa0: [192, 246, 56, 242, Extract(255, 248, p_a_uint256), Extract(247, 240, p_a_uint256), Extract(239, 232, p_a_uint256), Extract(231, 224, p_a_uint256), Extract(223, 216, p_a_uint256), Extract(215, 208, p_a_uint256), Extract(207, 200, p_a_uint256), Extract(199, 192, p_a_uint256), Extract(191, 184, p_a_uint256), Extract(183, 176, p_a_uint256), Extract(175, 168, p_a_uint256), Extract(167, 160, p_a_uint256), Extract(159, 152, p_a_uint256), Extract(151, 144, p_a_uint256), Extract(143, 136, p_a_uint256), Extract(135, 128, p_a_uint256), Extract(127, 120, p_a_uint256), Extract(119, 112, p_a_uint256), Extract(111, 104, p_a_uint256), Extract(103, 96, p_a_uint256), Extract(95, 88, p_a_uint256), Extract(87, 80, p_a_uint256), Extract(79, 72, p_a_uint256), Extract(71, 64, p_a_uint256), Extract(63, 56, p_a_uint256), Extract(55, 48, p_a_uint256), Extract(47, 40, p_a_uint256), Extract(39, 32, p_a_uint256)]
- 0xc0: [Extract(31, 24, p_a_uint256), Extract(23, 16, p_a_uint256), Extract(15, 8, p_a_uint256), Extract(7, 0, p_a_uint256), Extract(255, 248, p_b_uint256), Extract(247, 240, p_b_uint256), Extract(239, 232, p_b_uint256), Extract(231, 224, p_b_uint256), Extract(223, 216, p_b_uint256), Extract(215, 208, p_b_uint256), Extract(207, 200, p_b_uint256), Extract(199, 192, p_b_uint256), Extract(191, 184, p_b_uint256), Extract(183, 176, p_b_uint256), Extract(175, 168, p_b_uint256), Extract(167, 160, p_b_uint256), Extract(159, 152, p_b_uint256), Extract(151, 144, p_b_uint256), Extract(143, 136, p_b_uint256), Extract(135, 128, p_b_uint256), Extract(127, 120, p_b_uint256), Extract(119, 112, p_b_uint256), Extract(111, 104, p_b_uint256), Extract(103, 96, p_b_uint256), Extract(95, 88, p_b_uint256), Extract(87, 80, p_b_uint256), Extract(79, 72, p_b_uint256), Extract(71, 64, p_b_uint256), Extract(63, 56, p_b_uint256), Extract(55, 48, p_b_uint256), Extract(47, 40, p_b_uint256), Extract(39, 32, p_b_uint256)]
- 0xe0: [Extract(31, 24, p_b_uint256), Extract(23, 16, p_b_uint256), Extract(15, 8, p_b_uint256), Extract(7, 0, p_b_uint256), 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
- 0x100: [0, 0, 0, 68, 192, 246, 56, 242, Extract(255, 248, p_a_uint256), Extract(247, 240, p_a_uint256), Extract(239, 232, p_a_uint256), Extract(231, 224, p_a_uint256), Extract(223, 216, p_a_uint256), Extract(215, 208, p_a_uint256), Extract(207, 200, p_a_uint256), Extract(199, 192, p_a_uint256), Extract(191, 184, p_a_uint256), Extract(183, 176, p_a_uint256), Extract(175, 168, p_a_uint256), Extract(167, 160, p_a_uint256), Extract(159, 152, p_a_uint256), Extract(151, 144, p_a_uint256), Extract(143, 136, p_a_uint256), Extract(135, 128, p_a_uint256), Extract(127, 120, p_a_uint256), Extract(119, 112, p_a_uint256), Extract(111, 104, p_a_uint256), Extract(103, 96, p_a_uint256), Extract(95, 88, p_a_uint256), Extract(87, 80, p_a_uint256), Extract(79, 72, p_a_uint256), Extract(71, 64, p_a_uint256)]
- 0x120: [Extract(63, 56, p_a_uint256), Extract(55, 48, p_a_uint256), Extract(47, 40, p_a_uint256), Extract(39, 32, p_a_uint256), Extract(31, 24, p_a_uint256), Extract(23, 16, p_a_uint256), Extract(15, 8, p_a_uint256), Extract(7, 0, p_a_uint256), Extract(255, 248, p_b_uint256), Extract(247, 240, p_b_uint256), Extract(239, 232, p_b_uint256), Extract(231, 224, p_b_uint256), Extract(223, 216, p_b_uint256), Extract(215, 208, p_b_uint256), Extract(207, 200, p_b_uint256), Extract(199, 192, p_b_uint256), Extract(191, 184, p_b_uint256), Extract(183, 176, p_b_uint256), Extract(175, 168, p_b_uint256), Extract(167, 160, p_b_uint256), Extract(159, 152, p_b_uint256), Extract(151, 144, p_b_uint256), Extract(143, 136, p_b_uint256), Extract(135, 128, p_b_uint256), Extract(127, 120, p_b_uint256), Extract(119, 112, p_b_uint256), Extract(111, 104, p_b_uint256), Extract(103, 96, p_b_uint256), Extract(95, 88, p_b_uint256), Extract(87, 80, p_b_uint256), Extract(79, 72, p_b_uint256), Extract(71, 64, p_b_uint256)]
- 0x140: [Extract(63, 56, p_b_uint256), Extract(55, 48, p_b_uint256), Extract(47, 40, p_b_uint256), Extract(39, 32, p_b_uint256), Extract(31, 24, p_b_uint256), Extract(23, 16, p_b_uint256), Extract(15, 8, p_b_uint256), Extract(7, 0, p_b_uint256), 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
- 0x160: [0, 0, 0, 0, 0, 0, 0, 96, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 113, 9, 112, 158, 207, 169, 26, 128, 98, 111, 243, 152]
- 0x180: [157, 104, 246, 127, 91, 29, 209, 45, 102, 97, 105, 108, 101, 100, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
- 0x1a0: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
- 0x1c0: [0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
- 0x1e0: [0, 0, 0, 0, 0, 0, 0, 100, 112, 202, 16, 187, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 113, 9, 112, 158, 207, 169, 26, 128]
- 0x200: [98, 111, 243, 152, 157, 104, 246, 127, 91, 29, 209, 45, 102, 97, 105, 108, 101, 100, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
- 0x220: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
- 0x240: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 112, 202, 16, 187, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 113, 9, 112, 158]
- 0x260: [207, 169, 26, 128, 98, 111, 243, 152, 157, 104, 246, 127, 91, 29, 209, 45, 102, 97, 105, 108, 101, 100, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
- 0x280: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
- 0x2a0: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
- 0x2c0: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Storage:
- this_address: {0: {0: Concat(Extract(255, 16, storage_slot_0_0), 1, Extract(7, 0, storage_slot_0_0))}}
Balance:
- this_address: this_balance + msg_value - 0 - 0 - 0
Path:
- msg_value == 0
- Not(call1 == 0)
- call2 == 0
- Not(call1 == 0)
- Not(call1 == 0)
- Not(call1 == 0)
- Not(call1 == 0)
- Not(call1 == 0)
- Not(call1 == 0)
- call2 == 0
- Not(extcodesize(645326474426547203313410069153905908525362434349) == 0)
Output: None
Log: [([29485693714967335757563038618841744472215891622979272243827124658718922284880], 429049853758163107186368799942587076079339706258956588087153966199096448962356661055866306375861046696051322400401741685125038281870778983368639235038904320), ([95180208686301398394554031582401412094073538926388354310127026171774931619608], 11505236063118821809467553221049758295155052665230762065499525194094891251552260362809157039377781374726210488638842056300353482370348462500409570082870200511689663829945559818529136259043050327177290308227941601695126878464399625111430179431737320830335505806068516514195737649102622530195829220243517141417984), ([95180208686301398394554031582401412094073538926388354310127026171774931619608], 11505236063118821809467553221049758295155052665230762065499525194094891251552260362809157039377781374726210488638842056300353482370348462500409570082870200511699916357724358568980581405894333766482111596248363362125840459118798500724025206904028642650840132123536673437250057636019259959519479623151633274241024)]
Storage updates:
SHA3 hashes:
External calls:
- (call1, call_544(1, gas(1), Concat(0, Extract(159, 0, this_address)), 0, Concat(3237361906, p_a_uint256, p_b_uint256)), None)
- (call2, call_544(2, gas(2), Concat(0, Extract(159, 0, this_address)), 0, Concat(3237361906, p_a_uint256, p_b_uint256)), None)
- (call3, call_800(3,
         gas(3),
         645326474426547203313410069153905908525362434349,
         0,
         2937815620610723459522636952510386841785114031644826774648436411116173207964843441748423061353483029633030135072093998600777156395894761540308991058809448955476838734537982926037376347404276804152882314295846787711769047727657335210003922945), None)

Not reverting storage states

Describe the bug
Failed to revert storage states.

[Reported by Piotr at Halborn]

To Reproduce

contract C {
  uint256 public num;
  function set1(uint256 x) public {
    num = x;
    revert("blah");
  }
}

contract CTest {
    function checkRevert1(uint256 x) public {
        C c = new C();
        require(x != 0);
        (bool result, ) = address(c).call(abi.encodeWithSignature("set1(uint256)", x));
        assert(!result);
        assert(c.num() != x); // fail
    }
}

add div axioms

for each uninterpreted abstraction t = f_div(x,y):

add the following axioms:

t == 0  if y == 0

t == 0  if x < y
t == 1  if x == y > 0

t * y <= x
t * y > x - y  if x >= y > 0

no overflow in bitvector arithmetic.

need to evaluate different ways of instantiating conditional (in)equalities.

Add support for symbolic storage

Describe the bug

External call encountered an issue at SSTORE: symbolic storage base slot: Concat(0, Extract(159, 0, p_alice_address))

To Reproduce

// SPDX-License-Identifier: MIT
pragma solidity ^0.8.15;

import "forge-std/Test.sol";
import {SymTest} from "halmos-cheatcodes/SymTest.sol";

contract Dummy {}

/// @notice don't use, this is very buggy on purpose
contract SmolWETH {
    function deposit() external payable {
        // bug: we stomp any existing balance
        assembly {
            sstore(caller(), callvalue())
        }
    }

    function withdraw(uint256) external {
        assembly {
            // revert if msg.value > 0
            if gt(callvalue(), 0) {
                revert(0, 0)
            }

            let amount := sload(caller())
            let success := call(gas(), caller(), amount, 0, 0, 0, 0)

            // bug: we always erase the balance, regardless of transfer success
            // bug: we should erase the balance before making the call
            sstore(caller(), 0)
        }
    }

    function balanceOf(address account) external view returns (uint256) {
        assembly {
            mstore(0, sload(account))
            return(0, 0x20)
        }
    }
}

contract SmolWETHTest is Test, SymTest {
    SmolWETH weth;

    function setUp() public {
        weth = new SmolWETH();
    }

    function test_deposit_once(address alice, uint256 amount) external {
        // fund alice
        vm.deal(alice, amount);

        // alice deposits
        vm.prank(alice);
        weth.deposit{value: amount}();

        // alice's balance is updated
        assertEq(weth.balanceOf(alice), amount);
    }
}

halmos --contract SmolWETHTest --function test_deposit_once

Struct type parameters not supported

Applied halmos on the attached (flat version) project and got this error message

Running 4 tests for contracts/test/TestCurveSwapper.sol:TestCurveSwapper
Traceback (most recent call last):
File "/usr/local/bin/halmos", line 8, in
sys.exit(main())
^^^^^^
File "/usr/local/lib/python3.11/site-packages/halmos/main.py", line 478, in main
exitcode = run(setup_ex, abi, funname, funsig, funselector, arrlen, args, options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/halmos/main.py", line 171, in run
raise ValueError('Not supported', param_type) # TODO: support struct types
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: ('Not supported', 'tuple')

Thanks!!!

TestCurveSwapper_FlatVersion.sol.zip

Add support for multiple compilation units

I tried running halmos on a project with multiple solidity versions and it failed with this error. Looks to be dependent on crytic fixing their issue: crytic/slither#731

Traceback (most recent call last):
  File "/Users/goldfinch/Library/Python/3.10/bin/halmos", line 8, in <module>
    sys.exit(main())
  File "/Users/goldfinch/Library/Python/3.10/lib/python/site-packages/halmos/__main__.py", line 352, in main
    cryticCompile = CryticCompile(**vars(args))
  File "/Users/goldfinch/Library/Python/3.10/lib/python/site-packages/crytic_compile/crytic_compile.py", line 110, in __init__
    self._compile(**kwargs)
  File "/Users/goldfinch/Library/Python/3.10/lib/python/site-packages/crytic_compile/crytic_compile.py", line 530, in _compile
    self._platform.compile(self, **kwargs)
  File "/Users/goldfinch/Library/Python/3.10/lib/python/site-packages/crytic_compile/platform/foundry.py", line 90, in compile
    hardhat_like_parsing(crytic_compile, self._target, build_directory, self._target)
  File "/Users/goldfinch/Library/Python/3.10/lib/python/site-packages/crytic_compile/platform/hardhat.py", line 49, in hardhat_like_parsing
    os.listdir(build_directory), key=lambda x: os.path.getmtime(Path(build_directory, x))
FileNotFoundError: [Errno 2] No such file or directory: '/Users/goldfinch/dev/mono/packages/protocol/out/build-info'
goldfinch@Daltons-MBP-2 protocol % halmos --foundry-out-directory artifacts
Traceback (most recent call last):
  File "/Users/goldfinch/Library/Python/3.10/bin/halmos", line 8, in <module>
    sys.exit(main())
  File "/Users/goldfinch/Library/Python/3.10/lib/python/site-packages/halmos/__main__.py", line 356, in main
    if len(cryticCompile.compilation_units) > 1: raise ValueError('Multiple compilation units', cryticCompile.compilation_units)
ValueError: ('Multiple compilation units', {'8615e798fbbb13200c971b7468648a8a': <crytic_compile.compilation_unit.CompilationUnit object at 0x1005ee7a0>, '20dd466e29c686f926c505a5ccb8a769': <crytic_compile.compilation_unit.CompilationUnit object at 0x115ca8400>, 'af95935083bd0837625e8f97dcbc31d1': <crytic_compile.compilation_unit.CompilationUnit object at 0x123db9570>})

Provide a halmos github action

Problem

If you're a halmos user, how hard is it to integrate in your CI?

Solution

Maybe having a nicely packaged halmos GitHub Action would help? Real question, I haven't really thought too deeply about it.

Library linking

Describe the bug
Halmos doesn't resolve library linking.

To Reproduce

library Lib {                                                                                                                          
  function foo() public returns (uint) { return 1; }                                                                                   
  function bar() internal returns (uint) { return 2; }                                                                                 
}                                                                                                                                      
                                                                                                                                       
contract LibPublicTest {                                                                                                                     
  function testFoo() public {                                                                                                          
    assert(Lib.foo() == 1); // library linking placeholder error                                                                       
  }                                                                                                                                    
}

contract LibInternalTest {                                                                                                                     
  function testBar() public {                                                                                                          
    assert(Lib.bar() == 2); // this is fine because internal library functions are inlined                                             
  }                                                                                                                                    
}

failed to run halmos in windows

@daejunpark Hello ser, would you please shed a light how to solve this issue? Really appreciate!

Describe the bug
I clone the repo and install halmos but it complained about the path

To Reproduce
run halmos

Environment:

OS: [Windows]
Python version: [3.9.7]
Halmos and other dependency versions: [0.0.7]

Additional context
it complain about the directory path as following:

(base) c:\xxx\code\halmos-example>halmos
Traceback (most recent call last):
  File "c:\xxx\Anaconda3-2021-11\lib\site-packages\halmos\__main__.py", line 532, in main
    cryticCompile = CryticCompile(**vars(args))
  File "c:\xxx\Anaconda3-2021-11\lib\site-packages\crytic_compile\crytic_compile.py", line 110, in __init__
    self._compile(**kwargs)
  File "c:\xxx\Anaconda3-2021-11\lib\site-packages\crytic_compile\crytic_compile.py", line 530, in _compile
    self._platform.compile(self, **kwargs)
  File "c:\xxx\Anaconda3-2021-11\lib\site-packages\crytic_compile\platform\solc.py", line 154, in compile
    targets_json = _get_targets_json(compilation_unit, self._target, **kwargs)
  File "c:\xxx\Anaconda3-2021-11\lib\site-packages\crytic_compile\platform\solc.py", line 283, in _get_targets_json
    return _run_solc(
  File "c:\xxx\Anaconda3-2021-11\lib\site-packages\crytic_compile\platform\solc.py", line 483, in _run_solc
    raise InvalidCompilation(f"{filename} does not exist (are you in the correct directory?)")
crytic_compile.platform.exceptions.InvalidCompilation: c:\xxx\code\halmos-example does not exist (are you in the correct directory?)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "c:\xxx\Anaconda3-2021-11\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "c:\xxx\Anaconda3-2021-11\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "c:\xxx\Anaconda3-2021-11\Scripts\halmos.exe\__main__.py", line 7, in <module>
  File "c:\xxx\Anaconda3-2021-11\lib\site-packages\halmos\__main__.py", line 534, in main
    raise ValueError('Parse error', e)
ValueError: ('Parse error', InvalidCompilation('c:\\xxx\\code\\halmos-example does not exist (are you in the correct directory?)'))

Counterexample in json output

Would be great to have the counterexample for the failed test case in the extended json output

Print version number with `--version`

Is your feature request related to a problem? Please describe.
I'd like to get the version of halmos so so when benchmarking using https://github.com/eth-sc-comp/benchmarks I can get the version programmatically. This would allow me to compare the performance of different halmos versions with ease

Describe the solution you'd like
A new option --version that prints the current version number. A git hash would also be nice, but not necessary.

Describe alternatives you've considered
I could get the version from nix I think. But it may be helpful to users to know their version number, too.

Additional context
I am trying to check the performance of different symbolic/concolic EVM execution engines.

Halmos Keccak producing incorrect values

Describe the bug
Halmos keccak256 seems to mismatch the underlying bytecode representation

To Reproduce

function check_keccak_matching() public {
    bytes32 a = keccak256(hex"01");
    bytes32 b = keccak256(abi.encodePacked(hex"01"));
    assert(a == b);
}

function check_keccak_matching_2() public {
    bytes32 a = keccak256(hex"01");
    bytes32 b = keccak256(abi.encodePacked(uint8(1)));
    assert(a == b);
}

halmos --contract HERC721Test --print-full-model --print-failed-states

Running 2 tests for test/H_ERC721.t.sol:HERC721Test
[FAIL] check_keccak_matching() (paths: 1/3, time: 0.01s, bounds: [])
Counterexample: 
    msg_value = 0x0
    sha3_8 = [else -> 0xdfe7f977e71dba2ea1a68e21057beebb9be2ac30c6410aa38d4f3fbe41dcffd2]
# 2 / 3
PC: 0xaaaa0001 0x34d REVERT
Stack: [0x181, sha3_8(0x1), 0x5fe7f977e71dba2ea1a68e21057beebb9be2ac30c6410aa38d4f3fbe41dcffd2, 0xda, 0x50cb53c7]
Balance: balance_0
Storage:
- 0xaaaa0001: {0x0: {0x0: 0x10001}}
Path:
- msg_value == 0x0
- Not(sha3_8(0x1) == 0x5fe7f977e71dba2ea1a68e21057beebb9be2ac30c6410aa38d4f3fbe41dcffd2)
Output: 4e487b710000000000000000000000000000000000000000000000000000000000000001
Log: []
Balance updates:
Storage updates:
SHA3 hashes:
- 0x0: sha3_8(0x1)
External calls:

[FAIL] check_keccak_matching_2() (paths: 1/3, time: 0.01s, bounds: [])
Counterexample: 
    msg_value = 0x0
    sha3_8 = [else -> 0xdfe7f977e71dba2ea1a68e21057beebb9be2ac30c6410aa38d4f3fbe41dcffd2]
# 2 / 3
PC: 0xaaaa0001 0x34d REVERT
Stack: [0x181, sha3_8(0x1), 0x5fe7f977e71dba2ea1a68e21057beebb9be2ac30c6410aa38d4f3fbe41dcffd2, 0xda, 0x72430ece]
Balance: balance_0
Storage:
- 0xaaaa0001: {0x0: {0x0: 0x10001}}
Path:
- msg_value == 0x0
- Not(sha3_8(0x1) == 0x5fe7f977e71dba2ea1a68e21057beebb9be2ac30c6410aa38d4f3fbe41dcffd2)
Output: 4e487b710000000000000000000000000000000000000000000000000000000000000001
Log: []
Balance updates:
Storage updates:
SHA3 hashes:
- 0x0: sha3_8(0x1)
External calls:

Symbolic test result: 0 passed; 2 failed; time: 0.03s

seems that halmos is producing 0xdfe7f977e71dba2ea1a68e21057beebb9be2ac30c6410aa38d4f3fbe41dcffd2 whereas evm produces 0x5fe7f977e71dba2ea1a68e21057beebb9be2ac30c6410aa38d4f3fbe41dcffd2

Environment:

OS: macOS
Python version: Python 3.11.3
Halmos and other dependency versions:
Package Version

halmos 0.1.2
pip 23.1.2
setuptools 67.7.2
wheel 0.40.0
z3-solver 4.12.2.0

Values for storage in counterexample

Would be cool to have a value for relevant storage slots in the counter example when using --symbolic-storage

External call encountered an issue at DELEGATECALL

Issue Description
I have a simple toy example which includes a backdoor proof, and it works as expected. However, if I make this contract upgradeable using OpenZeppelin UUPSUpgradeable, I get the following warning after the backdoor prove fails:
WARNING:Halmos:CALL failed: External call encountered an issue at DELEGATECALL: External call encountered an issue at CALLDATALOAD: symbolic CALLDATALOAD offset: 4 + Extract(7935, 7680, halmos_data_bytes_02).

I attached 2 contracts: SimpleContract would pass in all cases, but SimpleContractUpgradeable will fail at prove4_NoBackdoor.

Each contract also have a commented-out method named backdoor, which should cause the NoBackdoor prove to fail in both cases. For SimpleContract, it fails cleanly as expected with a counter example. For SimpleContractUpgradeable it similarly fails with a counter example, but it also includes the same error as above. Hopefully this also give some perspective on the root cause.

To Reproduce
Attachements SimpleContract.sol and SimpleContractUpgradeable.sol (with txt file extensions)

Environment:

OS: Ubuntu 22.04 VM
Python version: Python 3.11.4
Halmos and other dependency versions:
Package Version

aiohttp 3.8.4
aiosignal 1.3.1
async-timeout 4.0.2
attrs 23.1.0
bitarray 2.7.6
blinker 1.4
cbor2 5.4.6
certifi 2023.5.7
charset-normalizer 3.1.0
command-not-found 0.3
cryptography 3.4.8
crytic-compile 0.3.2
cytoolz 0.12.1
dbus-python 1.2.18
distro 1.7.0
distro-info 1.1build1
eth-abi 4.1.0
eth-account 0.9.0
eth-hash 0.5.2
eth-keyfile 0.6.1
eth-keys 0.4.0
eth-rlp 0.3.0
eth-typing 3.4.0
eth-utils 2.1.1
frozenlist 1.3.3
halmos 0.1.2
hexbytes 0.3.1
httplib2 0.20.2
idna 3.4
importlib-metadata 4.6.4
jeepney 0.7.1
jsonschema 4.17.3
keyring 23.5.0
launchpadlib 1.10.16
lazr.restfulclient 0.14.4
lazr.uri 1.0.6
lru-dict 1.2.0
more-itertools 8.10.0
multidict 6.0.4
netifaces 0.11.0
oauthlib 3.2.0
packaging 23.1
parsimonious 0.9.0
pip 22.0.2
prettytable 3.8.0
protobuf 4.23.3
pycryptodome 3.18.0
PyGObject 3.42.1
PyJWT 2.3.0
pyparsing 2.4.7
pyrsistent 0.19.3
python-apt 2.4.0+ubuntu1
PyYAML 5.4.1
regex 2023.6.3
requests 2.31.0
rlp 3.0.0
SecretStorage 3.3.1
setuptools 59.6.0
six 1.16.0
slither-analyzer 0.9.5
solc-select 1.0.4
systemd-python 234
toolz 0.12.0
ubuntu-advantage-tools 8001
ufw 0.36.1
unattended-upgrades 0.1
UNKNOWN 0.1.4
urllib3 2.0.3
wadllib 1.3.6
wcwidth 0.2.6
web3 6.5.0
websockets 11.0.3
wheel 0.37.1
yarl 1.9.2
z3-solver 4.12.2.0
zipp 1.0.0

Additional context
The *.sol files uses Hardhat import references.
SimpleContract.t.sol.txt
SimpleContractUpgradeable.t.sol.txt

Add coverage report

Halmos systematically explores as many paths as possible, we should add some option to produce coverage reports.

Some unordered thoughts:

we operate at the bytecode level, should we print a report about bytecode-level coverage or try to report something about source level information? (lines, functions, branches, etc)
what about bytecode metadata, constructor arguments and other non-executable stuff?
how hard would it be to produce an lcov report for integration in CI? for instance the farcaster repo has a great coverage CI integration 👌
at a minimum, a summary like this would be neat:

 Summary coverage rate:
  lines......: 98.8% (238 of 241 lines)
  functions..: 98.5% (64 of 65 functions)
  branches...: 98.5% (128 of 130 branches)

Support for missing EVM opcodes

Opcodes to support:

Support for Foundry EVM cheatcodes

Does Halmos support Foundry EVM cheat codes?
Error: unknown constant evm_div

Support configuration via halmos.toml

Problem

Currently, halmos only supports configuration via command line arguments. There are problems with this:

the number of arguments is growing quite a bit, and likely to continue doing so
after 3 or 4 arguments, the command line gets unwieldy
knowledge about how to run halmos on a given repository gets lost

Solution

Having a config file that one can check in to version control would help with all these.

Halmos would first look for that config file, load these settings as a baseline, and then use command line arguments as an override.

Alternatives

Couple options:

use halmos.toml, modeled after foundry.toml. We can similarly imagine having different profiles, where we can do light checks on dev machines but longer checks on CI
use a [symbolic] profile in foundry.toml, an approach seen in https://github.com/baolean/symexec-bench/blob/symtest/SymTest/foundry.toml

(2.) is nice because it lets developers use a well known format, and the [symbolic] section feels right at home next to [fuzz], but I think overall I prefer (1.) because it reduces coupling with foundry. Foundry is nice, but it would seem weird to force users to use it if we support different frontends in the future (e.g. vyper)

add int blasting

improve nonlinear arithmetic reasoning using int-blasting

Support multiple setup states, possibly in parallel

Problem

Currently, the setUp() function is required to produce a single non-reverting output state, which is then fed into every test as the input state. If multiple states are generated by setUp(), one of them is randomly chosen, while the others are ignored, and a warning message is generated.

Support for multiple setup states is necessary, as certain initialization functions may branch over their symbolic parameters. Also, we may want to nondeterministically choose which configuration functions are called (e.g., whether the target contract is paused or not).

However, there is a concern with multiple setup states due to performance issues. The total number of paths for each test is multiplied by the number of setup states. If many setup states are accidentally generated, the test can take extremely long to complete.

Proposal

Allow multiple setup states, but run them in parallel by default. Specifically, for each setup state, a new process or thread is spawned to run the tests. If the --test-parallel option is provided, this will lead to N * M parallel processes or threads, where N is the number of setup states, and M is the number of tests.

To prevent accidental path explosion due to incorrect setup, a configurable upper bound can be set. If the number of setup states exceeds this bound, the test will be stopped.

Technical consideration

If both the --test-parallel and --solver-parallel options are provided, three layers of parallelism will be introduced: one for each setup state, one for each test, and one for each assertion solving. Would it be technically feasible to have this multi-layered parallel processing?

If only the current two-layer approach is feasible, then a new process could be spawned for each (setup, test) pair, and a new thread could be spawned for each assertion solving. What do you think, @karmacoma-eth?

Halmos throws `Counterexample: unknown` warning on solc versions < 0.8.17

Describe the bug
I was recently trying to replicate Solmate's SignedWadMath.sol contract's wadMul() bug (see @karmacoma-eth's tweet). When I ran halmos --smt-div the tests PASSED with the following output:

Expected behavior: The test should fail.

To Reproduce

Use this code: https://gist.github.com/PraneshASP/352924cdbf6935cb73631eb73bd5f273
Set solc version in your foundry.toml to 0.8.15.
Run halmos --smt-div

Environment:

OS: Linux - Ubuntu 20.04
Python version: Python 3.9.17
Halmos and other dependency versions: Halmos 0.1.3.dev3+gd2c5d27

Additional context
Halmos produced expected counterexamples when solc version was >= 0.8.17

Improve testing of EVM correctness

Problem

Halmos is essentially a light EVM interpreter (without gas), and because we're in the business of looking for corner cases of EVM bytecode, it is actually mission critical for our modeling to be as correct and complete as possible. However, we're not doing a ton to ensure that we're actually conforming to the spec.

Solution

The most complete thing we could do is make sure that we have good coverage of the EVM state transition tests, which is something github.com/ethereum/execution-specs/ does (as well as KEVM from what I've heard)

However this is going to be difficult in practice because:

the state tests are big and hairy
we don't implement gas and are not likely to, so we would need to focus on the subset of official tests that don't require that (if any)

So on the spectrum from very-lightweight to fully-compliant testing, we are currently too close to very-lightweight and would like to move meaningfully towards fully-compliant, while remaining pragmatic.

Alternatives

Maybe we can get away with unit testing for specific corner cases at the instruction level, e.g.:

dividing by 0 returns 0
reading past calldatasize returns 0s
reading past codesize returns 0s
a failed ecrecover returns 0 bytes (via @0xPhaze tweet)
...

Extra context

This is actually a massive task, but marked good first issue because there is a well defined spec out there and it is possible to make incremental progress while keeping halmos basically a black box.

Ability to ignore tests

As per #15 and #30 there are certain features that halmos doesn't currently support.

Refactoring fuzz tests to a form that halmos supports is not always beneficial or possible, for example in #30 the suggestion was to avoid dynamic arrays and use fixed size arrays instead. In the code I was testing this would result in loss of generality of the space the fuzz tester is scanning from all possible length arrays to one or several examples of specific length arrays.

I'd like to be able to keep my more general fuzz test, that can go more places that halmos can, albeit with weaker guarantees, but then tell halmos to ignore those tests.

I could then use halmos to provide stronger proofs for the tests that it does support, which is better than not being able to use halmos at all (seems to be current state of things) if i fuzz something unsupported anywhere in my repo.

Path constraints in json output

Would be great to have the whole end state in some nice json format, when using --extended-json-output. Especially the path, the constraints that led to the counterexample

{
"Stack": ["0xc54446db"],
"Balance: "balance_04",
"Path": ["ULE(0x4, some_paramater)", ...]
... other attribtues...
}

Better support for string comparison patterns

when a symbolic string is compared with different-sized strings, fixing the size of symbolic strings is not convenient, e.g.,:

function check_foo(string memory x) {
    if (keccak256(abi.encodePacked(x)) == keccak256(abi.encodePacked("ab"))) ...
    else if (keccak256(abi.encodePacked(x)) == keccak256(abi.encodePacked("cdef"))) ...

special handling of string type symbols or support for symbolic sized arrays is desired.

Improve unsupported cheatcode error messages

When an unsupported cheatcode is used, the error message is not very helpful in determining which cheatcode is causing the issue.

For example, if vm.expectRevert() is used, the following error message is printed:

Not supported: CALL Unsupported cheat code: calldata: 4102309908

This can be improved by creating a map from calldata to cheatcode.

default extended json output

--extended-json-output is more useful in most cases. make it default, and have a separate option that prints a minimal json output.

Support for custom NatSpec annotations

Problem

Suppose you have a codebase with multiple test contracts:

you can run the tests in ContractA with the default settings
one test in ContractB requires --loop 10
one test in ContractB requires --smt-div

In order for halmos to run the entire test suite, you resort to running halmos --loop 10 --smt-div which makes the tests in ContractA 10x slower.

Solution

It would be amazing if we could support something like custom NatSpec tags at the contract and function level, hypothetical example:

/// @custom:halmos --loop 10
contract B {
    function check_loops() public {
        ...
    }

    /// @custom:halmos --smt-div
    function check_loopsAndDiv() public {
        ...
    }
}

(not sure about the exact syntax we want -- just a suggestion)

The main point is that this would enable these settings only for the relevant functions and contracts, not the entire test suite.

So in our hypothetical example, you could just run halmos, and:

the tests in ContractA would run fast with the default settings
all the tests in ContractB would run with --loop 10
only check_loopsAndDiv would run with --loop 10 --smt-div

Running with python3.8 results in error

The following error, specifically:

/usr/lib/python3/dist-packages/requests/__init__.py:89: RequestsDependencyWarning: urllib3 (1.26.13) or chardet (3.0.4) doesn't match a supported version!
  warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
Traceback (most recent call last):
  File "/home/hjort/.local/bin/halmos", line 8, in <module>
    sys.exit(main())
  File "/home/hjort/.local/lib/python3.8/site-packages/halmos/__main__.py", line 316, in main
    args = parse_args()
  File "/home/hjort/.local/lib/python3.8/site-packages/halmos/__main__.py", line 31, in parse_args
    parser.add_argument('--use-srcmap', action=argparse.BooleanOptionalAction, default=True, help='use source mappings')
AttributeError: module 'argparse' has no attribute 'BooleanOptionalAction'

It seems at least python3.9 should be required, based on this:
https://nono.ma/attributeerror-module-argparse-has-no-attribute-booleanoptionalaction

test contracts initialization

Describe the bug
Halmos doesn't properly initialize test contracts.

To Reproduce

contract FooTest {
  uint public value = 1;
  function testValue() public {
    assert(value == 1); // fail
  }
}

Additional context
Executing the contract creation code for test contracts is needed.

Create a test that demonstrates the payable multicall vulnerability

The issue is described here:

Uniswap/v3-periphery#52

can halmos find the problem and produce a counterexample? If it does, we should consider it for inclusion under tests/

Add option to specify when we expect a counterexample

Currently:

tests with no counterexamples return 0 and a green [PASS]
tests with counterexamples, or unknown results return non-0 and a red [FAIL]

This makes running the test suite a little scary, because we see a mix of [PASS] and [FAIL] but sometimes [PASS] can be bad (if we expected a counterexample) and [FAIL] can be bad (if we didn't expect a counterexample).

We should add the --expect-counterexample option (open to better naming suggestions):

tests that do have 1 or more counterexamples return exitcode 0 and a green [XFAIL] (for "expected fail" -- we found the thing we're looking for)
tests that fail with no counterexamples are still a red [FAIL] (it fails for unexpected reasons), they return exitcode != 0
tests that would normally be a green [PASS] are now a red [XPASS] (for "unexpected pass" -- we didn't find the thing we're looking for), they return exitcode != 0

Similarly, we should have an --expect-fail option that behaves similarly but for tests that fail with no counterexamples.

Then we can add the right /// @custom:halmos --expect-counterexample annotations to solidity tests and running halmos --root tests should output a mix of green [PASS] and [XFAIL] and return 0

a16z / halmos Goto Github PK

halmos's Introduction

Halmos

Installation

Usage

Examples

Contributing

Disclaimer

halmos's People

Contributors

Stargazers

Watchers

Forkers

halmos's Issues

Problem

Solution

Problem

Solution

Problem

Solution

Finer points

Problem

Solution

Problem

Solution

Alternatives

Problem

Solution

Alternatives

Extra context

Problem

Solution

Recommend Projects

Recommend Topics

Recommend Org