The execution-spec-tests from ethereum

feature: Fill using multiple `evm` binaries and compare results

To mitigate the lack of a state test format, I propose allowing all tests to be filled with multiple evm binaries and comparing the results from every binary on each call.

This would allow us to indirectly test evmone by filling the test and comparing the results against geth's evm.

We can compare the state root and all accounts, as well as any other values deemed necessary.

I'd like to hear your thoughts on whether this would be useful or not @chfast @winsvega
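
A minimal sketch of the comparison step, assuming the output of each evm binary has already been parsed into a dictionary with a `stateRoot` entry and an `alloc` mapping. The function name `compare_results` and this result layout are hypothetical and not part of the existing tooling.

```python
from typing import Dict, List


def compare_results(results: Dict[str, dict]) -> List[str]:
    """Compare every tool's result against the first tool's result.

    `results` maps a tool name (e.g. "geth", "evmone") to its parsed output.
    Returns a list of human-readable mismatch descriptions (empty if all agree).
    """
    if not results:
        return []
    mismatches: List[str] = []
    names = list(results)
    reference_name = names[0]
    reference = results[reference_name]
    for name in names[1:]:
        other = results[name]
        if other.get("stateRoot") != reference.get("stateRoot"):
            mismatches.append(f"stateRoot: {reference_name} != {name}")
        # Compare every account that appears in either allocation.
        ref_alloc = reference.get("alloc", {})
        other_alloc = other.get("alloc", {})
        for address in sorted(set(ref_alloc) | set(other_alloc)):
            if ref_alloc.get(address) != other_alloc.get(address):
                mismatches.append(f"account {address}: {reference_name} != {name}")
    return mismatches
```

Reporting per-account differences in addition to the state root makes it easier to see where two binaries diverge, since a differing state root alone does not say which account is affected.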

Fixtures.zip

While importing the tests into the geth repository, I noticed that the release provides only the Linux version of the fixtures, not a Windows version.
Additionally, it would be great to include the version number in the fixture names, e.g. fixtures-0.2.4.tar.gz and fixtures-0.2.4.zip.

Rel: ethereum/go-ethereum#26985

Altering default `timestamp` value for the environment in `StateTest` breaks pre-merge tests

Currently the Environment property is supposed to describe the environment where the transactions are executed, and in the case of the state tests, this is Block 1.

If the test specifies an env.Timestamp or env.Difficulty in a state test, the filler should be able to come up with the appropriate genesis values such that the requested environment values are valid for block 1.

Currently it does not do that, but the tests nevertheless work because the default values are env.Difficulty=0x20000, which is the minimum difficulty value, and env.Timestamp=1000.

EIP-4788 Spec Change Tracker

Spec: https://eips.ethereum.org/EIPS/eip-4788

To change in tests.

Update to Precompile Address: ethereum/EIPs#7173
Bound state growth of beacon root contract: https://github.com/ethereum/EIPs/pull/7126/files
Bound precompile storage: https://github.com/ethereum/EIPs/pull/7178/files

single test execution command

tf --output="fixtures" --testfile ./fillers/withdrawals/withdrawals.py --outputfolder ./filled/withdrawals

This would be used for single test generation or debugging.

feat(tools): improve solc error messages

If Yul or Solidity code fails to compile, the error messages should be as follows:

A message if solc is not detected when filling a test filler that requires solc.
A message that the test requires a different solc version from the one that was found.
A message with the exact code that failed to compile with the solc version that was found.

I suggest keeping all tests on the latest version of solc to avoid managing solc versions.
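
The three cases above could be sketched as a single message-building helper. This is only an illustration: the function name `solc_error_message` and its parameters are hypothetical, and how the solc path and version are detected is left to the caller.

```python
from typing import Optional


def solc_error_message(
    solc_path: Optional[str],
    found_version: Optional[str],
    required_version: Optional[str],
    source: str,
    compiler_output: str,
) -> str:
    """Build an error message for one of the three failure cases."""
    # Case 1: no solc binary was detected at all.
    if solc_path is None:
        return "solc was not found, but this test filler requires solc to compile its code."
    # Case 2: a solc binary exists, but not in the version the test needs.
    if required_version is not None and found_version != required_version:
        return (
            f"This test requires solc {required_version}, "
            f"but solc {found_version} was found at {solc_path}."
        )
    # Case 3: the right solc was found, but the code did not compile.
    return (
        f"solc {found_version} failed to compile the following code:\n"
        f"{source}\n"
        f"Compiler output:\n{compiler_output}"
    )
```

The `solc_path` argument could, for example, come from `shutil.which("solc")`, which returns `None` when no binary is on the PATH.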

Expected exception format

                "expectException" : "Transaction without funds",

This string is a variable name from the map: https://github.com/ethereum/retesteth/blob/master/retesteth/configs/clientconfigs/t8ntool.cpp#L54

The actual value is taken from the map. Why a map? Because different clients might have different exception texts. So if a dev wants to check the exception, they make their own map that translates, for example, the TR_NoFunds code into "Transaction without funds".
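
A minimal sketch of such a per-client map, written in Python rather than the C++ used by retesteth. The client names, the second message text, and the helper `exception_matches` are illustrative assumptions; only the `TR_NoFunds` identifier and the "Transaction without funds" text come from the description above.

```python
# Hypothetical per-client maps from an exception identifier used in tests
# to the message text that the client actually reports.
EXCEPTION_MAPS = {
    "client_a": {"TR_NoFunds": "Transaction without funds"},
    "client_b": {"TR_NoFunds": "insufficient funds for gas * price + value"},
}


def exception_matches(client: str, exception_id: str, actual_message: str) -> bool:
    """Check a client's actual error text against the text mapped to the identifier."""
    expected_text = EXCEPTION_MAPS[client][exception_id]
    return expected_text in actual_message
```

With this layout a test only names the identifier, and each client maintainer supplies the translation for their own error strings.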

Create multiple transaction classes for each type

          We could create separate class types for each Transaction at some point in the future, i.e TransactionType0, ..., TransactionType3 that are members of Transaction to avoid duplication

This could be useful when writing tests later down the line to improve readability, maintenance, etc

Originally posted by @spencer-tb in #167 (comment)
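
One possible shape for this idea, sketched with dataclasses. The class and field names below are hypothetical and cover only two transaction types; they are not the existing `Transaction` class from ethereum_test_tools.

```python
from dataclasses import dataclass
from typing import ClassVar


@dataclass
class Transaction:
    """Fields shared by every transaction type."""

    ty: ClassVar[int] = 0  # transaction type, overridden by subclasses
    nonce: int = 0
    gas_limit: int = 21000
    to: str = ""
    value: int = 0
    data: bytes = b""


@dataclass
class TransactionType0(Transaction):
    """Legacy transaction: priced with a single gas price."""

    ty: ClassVar[int] = 0
    gas_price: int = 10


@dataclass
class TransactionType2(Transaction):
    """Fee market transaction: priced with max fee and priority fee."""

    ty: ClassVar[int] = 2
    max_fee_per_gas: int = 10
    max_priority_fee_per_gas: int = 0
```

Keeping the shared fields in the base class means each type only declares the fields that are specific to it, which is where the duplication would otherwise appear.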

output fixtures have tests with the same names

withdrawals_use_value_in_contract.json has test vector "000_shanghai" : {
withdrawals_many_withdrawals.json has test vector "000_shanghai" : {

This will cause confusion when we try to talk about errors in the 000_shanghai test.

Release fixture renaming

It would be great to modify the naming of the fixtures to include the version number. E.g. fixtures-0.2.4.tar.gz

Update withdrawals fillers to use gwei amounts instead of wei

PR ethereum/EIPs#6325 updates the amounts in a withdrawal to match the units used by the CL, gwei.

This change must be accounted for in the balance increments for all tests in fillers/withdrawals/withdrawals.py.

Fix the way BlockchainTest and StateTest handle invalid transactions

Right now with #2, intrinsically invalid transactions (not reverting transactions) are removed from the block before execution.

We need to change this behavior to leave the transactions in the block to properly test that blocks with a certain type of invalid transactions are correctly rejected.

This affects the way that StateTest operates, as it bundles all transactions into a single block and then compares the cumulative post-state that results from the execution of all transactions in sequence.

This means that if we leave invalid transactions in the block, the entire block will be rejected and nothing will be tested.

We could solve this issue in two ways.

Assuming that we have Tx1..TxN within a StateTest, and also each Tx is contained in a separate block, we can have:

a)

flowchart LR;
G --> B1[Tx1];
B1[Tx1] --> B2[Tx2];
B2[Tx2] --> ...;
... --> BN[TxN];

Where each transaction block is on top of the previous transaction.

b)

flowchart TD;
G --> B1[Tx1];
G --> B1'[Tx2];
G --> ...;
G --> B1'''[TxN];

Where all blocks are at height G+1

(a) makes writing state tests more complicated, because the tester has to account for the previous transactions to calculate the full post-state. With (b), every transaction is isolated and writing the verification post-state is easier (every transaction could have a different verification post-state).
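
The difference between the two options can be sketched as the parent each transaction block is built on. The function names and the `(parent, block)` pair representation are hypothetical and only illustrate the two layouts.

```python
from typing import List, Tuple

GENESIS = "G"


def chained_layout(txs: List[str]) -> List[Tuple[str, str]]:
    """Option (a): each transaction block builds on the previous one."""
    layout = []
    parent = GENESIS
    for tx in txs:
        layout.append((parent, tx))
        parent = tx  # the next block is stacked on top of this one
    return layout


def isolated_layout(txs: List[str]) -> List[Tuple[str, str]]:
    """Option (b): every transaction block builds directly on genesis."""
    return [(GENESIS, tx) for tx in txs]
```

In the chained layout the post-state of block k depends on all of Tx1..Txk, whereas in the isolated layout each post-state depends only on the genesis state and a single transaction.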

output fixtures have empty pre or postState

in withdrawals_use_value_in_tx.json
in "000_shanghai" : {

The pre and postState are empty.
The current test base has no empty states whatsoever.
It would be good to define the sender account explicitly, as empty or as having 0 balance, to allow explicit checks.

Update exact gas specified for warm coinbase contract calls in eip3651 fillers

Commit 0670954 refactored the call functions under test in test_warm_coinbase_call_out_of_gas to use the opcode format instead of specifying the calls as Yul. The exact gas usage specified for each call function now appears to be too generous as the opcode format's bytecode is more optimized than that generated by solc from the Yul format (I assume).

Although this is unlikely to affect the test outcome (as the difference between calls to cold and warm accounts is much larger than the discrepancy described above; 2600 vs 100 gas), it seems worthwhile to update the exact gas usage in this test.

FYI: The pytest port of this test introduced a new test that additionally tests that these calls indeed fail if exact_gas_usage - 1 is specified as the call gas for the subcall, see 79ed2a2

Feature request: forknames, chainid as options

The Ethereum Classic (ETC) team needs this feature. They use different fork names and a different chain ID.
It would be nice to have config file options that specify the following:

The t8n tool to use (a path or similar).
The chain ID to generate transactions with.
A fork names map (testName -> actualName that is sent to t8n).

Alternatively, these could be supported as command-line options when running the test generation.
This is a low-priority feature request.
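
A sketch of what such a configuration could hold once loaded. The class `FillerConfig`, its field names, and the fork name used in the usage note are hypothetical; no such config exists in the tool at the time of this request.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class FillerConfig:
    """Hypothetical settings for running the filler against a different chain."""

    t8n_path: str = "evm"  # path to the t8n tool binary
    chain_id: int = 1  # chain ID used when generating transactions
    fork_names: Dict[str, str] = field(default_factory=dict)  # testName -> actualName

    def t8n_fork_name(self, test_fork: str) -> str:
        """Translate a fork name used in tests to the name sent to t8n."""
        # Forks without an entry in the map are passed through unchanged.
        return self.fork_names.get(test_fork, test_fork)
```

For example, `FillerConfig(chain_id=61, fork_names={"Shanghai": "SomeOtherName"})` would send `SomeOtherName` to t8n for Shanghai tests while leaving all other fork names untouched; both values here are placeholders.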

the genesis state root is incorrect (state reward must be -1)

Apparently there is a difference in what genesis is used in the generated blockchain tests:

Error: Importing raw RLP block, block was expected to be valid! (if it was intended, check that it is not in Valid blocks test suite) Error importing raw rlp block: ToolChainManager:: unknown parent hash 0x01e7fb1a4d848f80871bfea58f87511cc1cf8a09afaaeab66fd35281aeb76491

I reject the block because the genesis block hash is different. The genesis block hash is important as it contains the pre state. If you calculate your block on a different genesis, we disagree.

To compute the genesis block hash, I take what is needed for the env file from this info:

        "genesisBlockHeader" : {
            "bloom" : "0x00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000",
            "coinbase" : "0x0000000000000000000000000000000000000000",
            "difficulty" : "0x020000",
            "extraData" : "0x00",
            "gasLimit" : "0x016345785d8a0000",
            "gasUsed" : "0x00",
            "hash" : "0x01e7fb1a4d848f80871bfea58f87511cc1cf8a09afaaeab66fd35281aeb76491",
            "mixHash" : "0x0000000000000000000000000000000000000000000000000000000000000000",
            "nonce" : "0x0000000000000000",
            "number" : "0x00",
            "parentHash" : "0x0000000000000000000000000000000000000000000000000000000000000000",
            "receiptTrie" : "0x56e81f171bcc55a6ff8345e692c0f86e5b48e01b996cadc001622fb5e363b421",
            "stateRoot" : "0x9d539b791e8d942b88859bc2954a489e2102b00b7fd7106c4f769e0e860596fd",
            "timestamp" : "0x00",
            "transactionsTrie" : "0x56e81f171bcc55a6ff8345e692c0f86e5b48e01b996cadc001622fb5e363b421",
            "uncleHash" : "0x1dcc4de8dec75d7aab85b567b6ccd41ad312451b948a7413f0a142fd40d49347"
        },

I then feed it to the tool to calculate the stateRoot from the given pre state, and then calculate the block hash.

Transaction data export in .json is wrong

Here my check fails:
in the withdrawals test self_destructing_account.json, the transaction JSON is

 "transactions" : [
                    {
                        "data" : "0x0200",
                        "gasLimit" : "0x0186a0",
                        "gasPrice" : "0x0a",
                        "nonce" : "0x00",
                        "secretKey" : "0x45a915e4d060149eb4365960e6a7a45f334393093061116b197e3240065ff2d8",
                        "to" : "0x0000000000000000000000000000000000000100",
                        "value" : "0x00"
                    }
                ],

but the one encoded in RLP is

RLP transaction number: 1
{
    "data" : "0x0000000000000000000000000000000000000000000000000000000000000200",
    "gasLimit" : "0x0186a0",
    "gasPrice" : "0x0a",
    "nonce" : "0x00",
    "to" : "0x0000000000000000000000000000000000000100",
    "value" : "0x00",
    "v" : "0x26",
    "r" : "0xfd75940662272e5daac655a822b10f101866916a2975c3821111ad4ab481d48b",
    "s" : "0x67a11d0fcd31933ad4b62b4d6fd4f1540f93406d501e41b82ff59223303f60c5"
}

pyspec removed the leading zeroes in the data field when writing the RLP data to JSON.
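
One way such a mismatch can arise is when byte data is converted to hex as if it were a number. The sketch below reproduces the two encodings shown above; the helper names are hypothetical, and whether this is the actual cause in pyspec is an assumption.

```python
def hex_from_int(data: bytes) -> str:
    """Lossy: treats the bytes as a number, which drops leading zero bytes."""
    digits = format(int.from_bytes(data, "big"), "x")
    if len(digits) % 2:
        digits = "0" + digits  # pad to a whole number of bytes
    return "0x" + digits


def hex_from_bytes(data: bytes) -> str:
    """Lossless: every byte, including leading zeroes, is kept."""
    return "0x" + data.hex()


# 32 bytes of calldata whose only non-zero bytes are the last two (0x0200).
calldata = bytes.fromhex("00" * 30 + "0200")
print(hex_from_int(calldata))
print(hex_from_bytes(calldata))
```

The first encoding yields the short form seen in the JSON transaction, the second yields the full 32-byte value seen in the RLP. Transaction data is a byte string, not a number, so its length is significant and only the lossless form round-trips.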

`genesisRLP` is missing withdrawals

The genesisRLP of the Shanghai withdrawal tests decodes to 3 items rather than 4, i.e. it might be missing the withdrawals list.

See the tests under fixtures/withdrawals/withdrawals in release v0.2.2.
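
The item count can be checked without a full RLP library. The following is a minimal sketch that only walks the top-level list of a block RLP; the function names are hypothetical and the example inputs are small synthetic encodings, not the fixture data.

```python
from typing import Tuple


def item_span(data: bytes, pos: int) -> Tuple[int, int]:
    """Return (header_length, payload_length) of the RLP item starting at pos."""
    prefix = data[pos]
    if prefix < 0x80:  # single byte that encodes itself
        return 0, 1
    if prefix < 0xB8:  # short string
        return 1, prefix - 0x80
    if prefix < 0xC0:  # long string: length stored in the next n bytes
        n = prefix - 0xB7
        return 1 + n, int.from_bytes(data[pos + 1 : pos + 1 + n], "big")
    if prefix < 0xF8:  # short list
        return 1, prefix - 0xC0
    n = prefix - 0xF7  # long list: length stored in the next n bytes
    return 1 + n, int.from_bytes(data[pos + 1 : pos + 1 + n], "big")


def count_top_level_items(block_rlp: bytes) -> int:
    """Count the items directly inside the outermost RLP list."""
    if not block_rlp or block_rlp[0] < 0xC0:
        raise ValueError("expected an RLP list")
    header, length = item_span(block_rlp, 0)
    pos, end = header, header + length
    count = 0
    while pos < end:
        item_header, item_length = item_span(block_rlp, pos)
        pos += item_header + item_length  # skip over the whole item
        count += 1
    return count
```

A Shanghai block is expected to give 4 here (header, transactions, uncles, withdrawals), while a block encoded without the withdrawals list gives 3.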

Attempting to fill 4844 tests raises exception

command: tf --filler-path="fillers" --output="fixtures" --test-categories eips.eip4844 --force-refill

result:

Exception during test 'test_invalid_blob_txs type_3_tx_pre_fork'
Traceback (most recent call last):
  File "/Users/jwasinger/projects/execution-spec-tests/venv/bin/tf", line 8, in <module>
    sys.exit(main())
  File "/Users/jwasinger/projects/execution-spec-tests/src/ethereum_test_filling_tool/main.py", line 110, in main
    filler.fill()
  File "/Users/jwasinger/projects/execution-spec-tests/src/ethereum_test_filling_tool/filler.py", line 67, in fill
    future.result()
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/concurrent/futures/_base.py", line 451, in result
    return self.__get_result()
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/Users/jwasinger/projects/execution-spec-tests/src/ethereum_test_filling_tool/filler.py", line 188, in fill_fixture
    fixture = filler(t8n, b11r, "NoProof", module_spec)
  File "/Users/jwasinger/projects/execution-spec-tests/src/ethereum_test_tools/filling/decorators.py", line 45, in inner
    return fill_test(
  File "/Users/jwasinger/projects/execution-spec-tests/src/ethereum_test_tools/filling/fill.py", line 57, in fill_test
    raise e
  File "/Users/jwasinger/projects/execution-spec-tests/src/ethereum_test_tools/filling/fill.py", line 47, in fill_test
    (blocks, head, alloc) = test.make_blocks(
  File "/Users/jwasinger/projects/execution-spec-tests/src/ethereum_test_tools/spec/blockchain_test.py", line 265, in make_blocks
    fixture_block, env, alloc, head = self.make_block(
  File "/Users/jwasinger/projects/execution-spec-tests/src/ethereum_test_tools/spec/blockchain_test.py", line 139, in make_block
    (next_alloc, result, txs_rlp) = t8n.evaluate(
  File "/Users/jwasinger/projects/execution-spec-tests/src/evm_transition_tool/__init__.py", line 222, in evaluate
    raise Exception("failed to evaluate: " + result.stderr.decode())
Exception: failed to evaluate: ERROR(10): failed unmarshaling stdin: transaction type not supported

Using master branch of geth (dde2da0efb8e9a1812f470bc43254134cd1f8cc0) to fill tests.

Refactor Blockchain/StateTests (ethereum_test_tools/spec) for clearer Environment Abstraction

From #37, via marioevz:

Looking at my code on make_block in BlockchainTests, it seems like it was really easy to make this mistake in the first place because of how the block and env are just mixed all over the place. I think it would be nice to refactor this a bit in a different PR to make a clear distinction that the block structure is the source of the test information, but env is the structure that should contain all the correct information at some point.

I think this same mistake could be present in the StateTests, it's just that we don't have any withdrawals tests written in the StateTests format.

Define standard template scenarios for expected test outcomes (for use in test case descriptions)

About the x, here is the idea.
x is a set of actions that we want to test. Y is a vector of post conditions such that F(x)=Y. We need a template that takes x and executes it in different contexts (creating tests). For each context i we need to verify that Fi(x)=Yi (the transition for the set of actions x on template i produces post state Yi).
The contexts are templates, including (but not limited to):

X is in transaction deployment code (create transaction)
X is in CREATE init code 
X is in CREATE2 init code 

X is in the contract code (being called by transaction)   and [OOG, REVERT, SELFDESTRUCT, NOTHING] happens
X is in the contract code (being called by CALL) and [OOG, REVERT, SELFDESTRUCT, NOTHING] happens
X is in the contract code (being called by CALLCODE) and [OOG, REVERT, SELFDESTRUCT, NOTHING] happens
X is in the contract code (being called by DELEGATECALL) and [OOG, REVERT, SELFDESTRUCT, NOTHING] happens
X is in the contract code (being called by STATICCALL) and [OOG, REVERT, SELFDESTRUCT, NOTHING] happens
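
The list of contexts above is a small cross product and could be enumerated rather than written by hand. A sketch, in which the function name and the description strings are hypothetical:

```python
from itertools import product
from typing import List

CREATION_CONTEXTS = ["create transaction", "CREATE init code", "CREATE2 init code"]
CALLERS = ["transaction", "CALL", "CALLCODE", "DELEGATECALL", "STATICCALL"]
OUTCOMES = ["OOG", "REVERT", "SELFDESTRUCT", "NOTHING"]


def template_contexts() -> List[str]:
    """Enumerate every context in which the actions x should be executed."""
    contexts = [f"x in {creation}" for creation in CREATION_CONTEXTS]
    # Every caller is combined with every outcome.
    contexts += [
        f"x in contract code called by {caller}, then {outcome}"
        for caller, outcome in product(CALLERS, OUTCOMES)
    ]
    return contexts
```

With the lists as given this produces 3 creation contexts plus 5 x 4 call contexts, and each entry would need its own expected post state Yi.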

generate single .py file

I see. This one worked:

tf --output="./filled"   --filler-path="./src/"  --test-module=withdrawals

So --test-module refers to a directory with .py files,
if it is found inside ./src.

But how do I execute a particular .py file?
Also, the output goes to withdrawals/withdrawals/[tests].
What would be nice to have is this:

--output="./filled"
--filler-file="./src/test.py"

and get the output in ./filled/[test1, test2, test3],
with all produced tests there.
What I am trying to do is connect the .py file with the generated .json files for version tracking.

Investigate long compilation time of long bytecode with `solc --evm-version shanghai`

Upon using a newer version of solc and compiling the "long bytecode" test with --evm-version=shanghai, the test took much longer to complete. It actually looked like it had completely hung, but it does terminate, cf. #147.

Remove balance overflow test on withdrawals test suite

Test test_withdrawals_overflowing_balance attempts to overflow the balance of an account on a withdrawal, but the behavior is not defined in spec, and furthermore it's unrealistic.

Test should be removed in order to not cause confusion because this behavior is undefined and unrealistic.

4844: Add tests for invalid blob-related header fields

Tasks

Add tests where either the excessDataGas or dataGasUsed fields exceed 64-bit length
Add tests where the type-3 tx rlp contains over-length fields
Add tests where type-3 tx has a nil to address

EIP-4844 Spec Change Tracker

Spec changes tracker in order to keep track of the updated tests.

All tasks need to update legacy TF tests and pytest, in case pytest has not been merged yet.

Create type classes for `Address`, `Bytes`, etc.

          I really like these functions in the separate file!!

I think at some point we should consolidate on these conversions, and maybe create class types for Address & Hash to make conversions easier

I added some suggested changes/ideas at the end of types.py but I think in the long term we should try to make these as concise as possible or change some of the naming to avoid errors - i.e hex_or_none returns a hex string in one case so we could call it hex_string_or_none etc

Originally posted by @spencer-tb in #167 (comment)
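
A sketch of what such class types could look like. The names `FixedBytes`, `Address`, `Hash` and the method `hex_string` are hypothetical; this is not the code in types.py.

```python
class FixedBytes(bytes):
    """Byte string of a fixed length that accepts int, hex string or bytes input."""

    byte_length = 0  # set by subclasses

    def __new__(cls, value):
        if isinstance(value, int):
            raw = value.to_bytes(cls.byte_length, "big")
        elif isinstance(value, str):
            digits = value[2:] if value.startswith("0x") else value
            if len(digits) % 2:
                digits = "0" + digits  # allow odd-length input such as "0x100"
            raw = bytes.fromhex(digits).rjust(cls.byte_length, b"\x00")
        else:
            raw = bytes(value).rjust(cls.byte_length, b"\x00")
        if len(raw) != cls.byte_length:
            raise ValueError(f"{cls.__name__} must be {cls.byte_length} bytes long")
        return super().__new__(cls, raw)

    def hex_string(self) -> str:
        """Return the zero-padded, 0x-prefixed hex representation."""
        return "0x" + bytes(self).hex()


class Address(FixedBytes):
    byte_length = 20


class Hash(FixedBytes):
    byte_length = 32
```

Because the conversion lives in the constructor, call sites can pass `0x100`, `"0x100"` or raw bytes and always get the same padded value, which is the kind of consolidation suggested above.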

Tox Failing: GitHub Actions Needs Payment or Increased Spending Limit

How do I get this working?

I followed the directions to install, and now when I try to run a test I get:

(venv) qbzzt1@spec-tests:~/execution-spec-tests$ tf --output="fixtures" --test-case dup
DEBUG:ethereum_test_filling_tool.main:searching example.example for fillers
DEBUG:ethereum_test_filling_tool.main:searching vm.chain_id for fillers
DEBUG:ethereum_test_filling_tool.main:searching vm.dup for fillers
DEBUG:ethereum_test_filling_tool.main:searching withdrawals.withdrawals for fillers
DEBUG:ethereum_test_filling_tool.main:searching eips.eip3651 for fillers
DEBUG:ethereum_test_filling_tool.main:searching eips.eip3855 for fillers
DEBUG:ethereum_test_filling_tool.main:searching eips.eip3860 for fillers
INFO:ethereum_test_filling_tool.main:collected 1 fillers
DEBUG:ethereum_test_filling_tool.main:filling dup
Exception during test ''
Traceback (most recent call last):
  File "/home/qbzzt1/execution-spec-tests/venv/bin/tf", line 8, in <module>
    sys.exit(main())
  File "/home/qbzzt1/execution-spec-tests/src/ethereum_test_filling_tool/main.py", line 180, in main
    filler.fill()
  File "/home/qbzzt1/execution-spec-tests/src/ethereum_test_filling_tool/main.py", line 140, in fill
    fixture = filler(t8n, b11r, "NoProof")
  File "/home/qbzzt1/execution-spec-tests/src/ethereum_test_tools/filling/decorators.py", line 67, in inner
    return fill_test(
  File "/home/qbzzt1/execution-spec-tests/src/ethereum_test_tools/filling/fill.py", line 55, in fill_test
    raise e
  File "/home/qbzzt1/execution-spec-tests/src/ethereum_test_tools/filling/fill.py", line 45, in fill_test
    (blocks, head, alloc) = test.make_blocks(
  File "/home/qbzzt1/execution-spec-tests/src/ethereum_test_tools/spec/state_test.py", line 111, in make_blocks
    (alloc, result, txs_rlp) = t8n.evaluate(
  File "/home/qbzzt1/execution-spec-tests/src/evm_transition_tool/__init__.py", line 209, in evaluate
    raise Exception("failed to evaluate: " + result.stderr.decode())
Exception: failed to evaluate: ERROR(3): EIP-1559 config but missing 'currentBaseFee' in env section

How do I add the proper field to the env? Better yet, how do I add it to the default environment?

add default vscode settings files

The execution-spec-tests Python code is black formatted which is checked via tox (and enforced in Github Actions). Without proper tooling this check is tough, especially for non-Python developers, who may not even be used to an enforced coding style (black).

VS Code's ms-python.python and ms-python.black-formatter allow the user to format the current file with black formatting in the editor, even automatically upon auto-save. Documenting and providing reasonable default settings that allow this could save a lot of pain for developers looking to contribute to tests, but don't use Python day-to-day.

Todo:

Reasonable default settings.
How to handle .vscode/ in .gitignore.
Settings file name: .vscode/settings.json or .vscode/settings.{default,recommended}.json, see also danceratopz#11 (comment).
Check usefulness of specifying an editorconfig to offer a solution to developers who don't use VS Code.
Can we and do we want to set a default Python interpreter/environment? One that aligns with ./venv in the docs, for example.

danceratopz#11 introduced a first version of the settings (which seems to work well with no user settings in $HOME/.config/Code/User/settings.json); this ticket is to remind us to complete the job.

Fixtures missing from JSON when using pytest-xdist

The pytest-xdist plugin allows concurrent execution of test cases. If we enable xdist when running pytest, certain fixtures are not written to file even though the same number of tests is reported as executed.

To reproduce:

rm -rf fixtures-without-xdist fixtures-with-xdist
fill -v --output=fixtures-without-xdist
fill -v -n auto --output=fixtures-with-xdist
ack -c "0\d\d-fork=" fixtures-without-xdist/ | awk -F: '{ total += $2 } END { print total }'
# -> 134
ack -c "0\d\d-fork=" fixtures-with-xdist/ | awk -F: '{ total += $2 } END { print total }'
# -> 35

Additionally, we should make the fork+test ordering and naming deterministic within fixtures. The top-level fork+test entries in the JSON fixtures are not deterministically ordered by fork+test. The goal would be to generate fixtures that are identical to those generated without using the xdist plugin.

Originally spotted by @spencer-tb!
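
For the ordering part of this issue, one simple approach is to sort the top-level entries when the fixture file is written. A sketch, assuming the fixtures for one file are collected in a dictionary keyed by the fork+test name; the function name is hypothetical.

```python
import json
from typing import Any, Dict


def dump_fixtures(fixtures: Dict[str, Any]) -> str:
    """Serialize fixtures with top-level entries in sorted key order."""
    # Only the top-level keys are reordered; nested field order is preserved.
    ordered = {name: fixtures[name] for name in sorted(fixtures)}
    return json.dumps(ordered, indent=4)
```

This makes the output independent of the order in which workers finish, but it does not address the missing fixtures themselves, which depends on how results from the xdist workers are gathered.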

Cancun Tracker

Tracker for CFI'd EIPs to be included in Cancun hardfork:

Finalized EIPS

EIP-4844: Shard Blob Transactions, Secondary Tracker for spec changes: #130
EIP-1153: Transient storage opcodes
EIP-6780: SELFDESTRUCT only in same transaction
EIP-4788: Beacon block root in the EVM
EIP-5656: MCOPY - Memory copying instruction
EIP-7516: BLOBBASEFEE opcode

proposal: increase maximum allowed python line length to 100 or 120

Currently, execution-spec-tests' black and isort are configured with line-length=79. This is the recommended value in PEP8, and the Google Python Style Guide uses a length of 80. PEP8 does state, however:

Some teams strongly prefer a longer line length. For code maintained exclusively or primarily by a team that can reach agreement on this issue, it is okay to increase the line length limit up to 99 characters, provided that comments and docstrings are still wrapped at 72 characters.

Within @ethereum:

The Snake Charmers use 100,
The execution-specs codebase is configured to 80.
The consensus-specs codebase is configured to use 120 (see e.g. blob/dev/tests/generators/kzg_4844/main.py).

I'd propose to increase line length as soon as the pytest refactor is complete, #116.

If there is some uncertainty whether to choose 100 or 120 we could format the codebase with both in order to make a better decision.

`--test-categories` option doesn't work

Executing the example: tf --output="fixtures" --test-categories vm/vm_arith/vm_add yields:

INFO:ethereum_test_filling_tool.filler:collected 0 fillers

test fixtures has wrong genesis RLP

On Shanghai, the genesis RLP is missing the withdrawals record (c0).
The genesis RLP is reported by the clients via eth_getBlockByNumber(0).
But since we no longer have the Besu RPC, it is just a cosmetic check for those test runners that compare their block 0 (genesis) RLP representation against the test.

'0xf9021cf90217a00000000000000000000000000000000000000000000000000000000000000000a01dcc4de8dec75d7aab85b567b6ccd41ad312451b948a7413f0a142fd40d49347940000000000000000000000000000000000000000a09f6753d92c23e2d853eb7fcba842c66bd55c15722f2649a416c512eb0c5d8825a056e81f171bcc55a6ff8345e692c0f86e5b48e01b996cadc001622fb5e363b421a056e81f171bcc55a6ff8345e692c0f86e5b48e01b996cadc001622fb5e363b421b9010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000808088016345785d8a0000808000a0000000000000000000000000000000000000000000000000000000000000000088000000000000000007a056e81f171bcc55a6ff8345e692c0f86e5b48e01b996cadc001622fb5e363b421c0c0' != 
'0xf9021df90217a00000000000000000000000000000000000000000000000000000000000000000a01dcc4de8dec75d7aab85b567b6ccd41ad312451b948a7413f0a142fd40d49347940000000000000000000000000000000000000000a09f6753d92c23e2d853eb7fcba842c66bd55c15722f2649a416c512eb0c5d8825a056e81f171bcc55a6ff8345e692c0f86e5b48e01b996cadc001622fb5e363b421a056e81f171bcc55a6ff8345e692c0f86e5b48e01b996cadc001622fb5e363b421b9010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000808088016345785d8a0000808000a0000000000000000000000000000000000000000000000000000000000000000088000000000000000007a056e81f171bcc55a6ff8345e692c0f86e5b48e01b996cadc001622fb5e363b421c0c0c0

(EIPTests/bc4895-withdrawals/000_shanghai, fork: Shanghai, block: 0)

Generated fixture not getting updated

I had some old tests, and I generated tests from a .py file over them. Somehow Python detected that the tests were already there and didn't update them.

Then I modified the exception field in the generated test, "expectException" : "TR_NoFunds",
to a correct one to pass the test and see what happens. The exception field was not overwritten when I refilled the test.

This is very dangerous, as it can leave us with old, outdated tests when we try to regenerate them.

RLP of Shanghai blocks should always have withdrawals list

Even when they don't contain withdrawals, RLP of such blocks should contain an empty list for withdrawals post-Shanghai. For example, block 2 in withdrawals_use_value_in_contract.json in Release v0.2.1 fixtures has RLP
0xf90282f9021aa08444423fbde249cfcd3c41bdf009368a8b77763ed9fe344db4a32b0ecdc8d7b3a01dcc4de8dec75d7aab85b567b6ccd41ad312451b948a7413f0a142fd40d49347942adc25665018aa1fe0e6bc666dac8fc2697ff9baa06ab526318fc1842c06d5440bf245b03112e15241ea5d1f9f013ff6cf81296376a08c500edaa09a8dce41c0dcce8ee2876f5f3150b8101d09faa9ba61db523507eaa0cda4364d9e43f15119746d96fb9d842997acbfa55f65b6d3f7a3b5f06cd250d1b9010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000800288016345785d8a000083012e731880a0000000000000000000000000000000000000000000000000000000000000000088000000000000000007a056e81f171bcc55a6ff8345e692c0f86e5b48e01b996cadc001622fb5e363b421f862f860010a830186a0940000000000000000000000000000000000000100808026a025b1502ca1862804cfb6dca49ffb3cca3d8225a66df185ca857aab6f8a1f97a3a05d9178ca2d7a23b22e07ff2fe9a01ceb23a4acf041a0fcdf3d31752de802b74ec0
but it should end with c0c0 (one empty list for uncles and one empty list for withdrawals).

Also blocks in initcode_limit_contract_creating_tx_gas_usage.json suffer from this issue.

evm version not passed to solc when compiling yul

Currently, several fillers fail (for forks older than Shanghai) if run with solc 0.8.20, which added Shanghai support.

This is due to the resulting bytecode containing an opcode that isn't recognized by the evm tool for older forks. For example, the bytecode from fillers/example/example_yul.py compiled with solc 0.8.20 contains PUSH0 which is not recognized by forks prior to Shanghai.

The contract code specified via the Yul helper class is compiled with solc using the flags --assemble - (the Yul code is provided on stdin). This can be fixed by specifying the EVM version to compile for via --evm-version=FORK:

  --evm-version version (=shanghai)
                       Select desired EVM version. Either homestead, 
                       tangerineWhistle, spuriousDragon, byzantium, 
                       constantinople, petersburg, istanbul, berlin, london, 
                       paris or shanghai.

This could be added as a parameter to Yul's assemble() method, but this call is tucked away. Might need a bit of clean-up to propagate the fork value to where it is needed.

To check: Whether some fillers call assemble() directly.
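
A sketch of how the fork could be turned into the solc command line. The mapping `SOLC_EVM_VERSIONS` and the function `yul_compile_command` are hypothetical and not the existing `Yul` helper; the mapping covers only a few forks and assumes the fork names used by the fillers.

```python
from typing import List

# Hypothetical mapping from fork names used by the fillers to the names
# accepted by solc's --evm-version option.
SOLC_EVM_VERSIONS = {
    "Berlin": "berlin",
    "London": "london",
    "Merge": "paris",
    "Shanghai": "shanghai",
}


def yul_compile_command(fork: str, solc: str = "solc") -> List[str]:
    """Build the solc command that assembles Yul from stdin for the given fork."""
    evm_version = SOLC_EVM_VERSIONS[fork]
    # "--assemble -" matches the existing invocation; the Yul source goes to stdin.
    return [solc, "--evm-version", evm_version, "--assemble", "-"]
```

The resulting list could then be passed to `subprocess.run(command, input=yul_source.encode(), capture_output=True)`, so that bytecode for a pre-Shanghai fork no longer contains PUSH0.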

Error reporting is not informative

I deliberately messed up the expect section in a .py filler and got this:

Traceback (most recent call last):
  File "/home/wins/Ethereum/execution-spec-tests/venv/bin/tf", line 8, in <module>
    sys.exit(main())
  File "/home/wins/Ethereum/execution-spec-tests/src/ethereum_test_filling_tool/main.py", line 104, in main
    filler.fill()
  File "/home/wins/Ethereum/execution-spec-tests/src/ethereum_test_filling_tool/filler.py", line 61, in fill
    future.result()
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 451, in result
    return self.__get_result()
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
  File "/usr/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/wins/Ethereum/execution-spec-tests/src/ethereum_test_filling_tool/filler.py", line 128, in fill_fixture
    fixture = filler(t8n, b11r, "NoProof")
  File "/home/wins/Ethereum/execution-spec-tests/src/ethereum_test_tools/filling/decorators.py", line 67, in inner
    return fill_test(
  File "/home/wins/Ethereum/execution-spec-tests/src/ethereum_test_tools/filling/fill.py", line 53, in fill_test
    raise e
  File "/home/wins/Ethereum/execution-spec-tests/src/ethereum_test_tools/filling/fill.py", line 43, in fill_test
    (blocks, head, alloc) = test.make_blocks(
  File "/home/wins/Ethereum/execution-spec-tests/src/ethereum_test_tools/spec/blockchain_test.py", line 269, in make_blocks
    raise e
  File "/home/wins/Ethereum/execution-spec-tests/src/ethereum_test_tools/spec/blockchain_test.py", line 266, in make_blocks
    verify_post_alloc(self.post, alloc)
  File "/home/wins/Ethereum/execution-spec-tests/src/ethereum_test_tools/spec/base_test.py", line 77, in verify_post_alloc
    account.check_alloc(address, got_alloc[address])
  File "/home/wins/Ethereum/execution-spec-tests/src/ethereum_test_tools/common/types.py", line 427, in check_alloc
    expected_storage.must_be_equal(actual_storage)
  File "/home/wins/Ethereum/execution-spec-tests/src/ethereum_test_tools/common/types.py", line 249, in must_be_equal
    raise Storage.KeyValueMismatch(
ethereum_test_tools.common.types.Storage.KeyValueMismatch: incorrect value for key 0x0000000000000000000000000000000000000000000000000000000000000002: want 0x00000000000000000000000000000000000000000000000000000000b2d05e00 (dec:3000000000), got 0x0000000000000000000000000000000000000000000000000000000077359400 (dec:2000000000)

The error does not say which test it occurred in (spoiler: test_balance_within_block).
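
One way to surface the test name would be to wrap each fill call and re-raise with context. This is a minimal sketch: `FillError`, `fill_with_context`, and `failing_fill` are hypothetical names for illustration and are not taken from the filler's code.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


class FillError(Exception):
    """Hypothetical wrapper exception that carries the failing test's name."""


def fill_with_context(test_name: str, fill_fn: Callable[[], T]) -> T:
    """Run fill_fn and, on failure, re-raise with the test name attached."""
    try:
        return fill_fn()
    except Exception as exc:
        # `from exc` keeps the original exception chained to the new one.
        raise FillError(f"error filling test '{test_name}': {exc}") from exc


def failing_fill() -> None:
    # Stand-in for a filler that hits a storage mismatch.
    raise ValueError("incorrect value for key 0x02")


try:
    fill_with_context("test_balance_within_block", failing_fill)
except FillError as err:
    message = str(err)
    print(message)
```

With a wrapper like this, the final line of the traceback would name the test as well as the mismatched key.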

Fixture Format: Add transactions/uncleHeaders/withdrawals for BlockchainTests

Add the following fields for BlockchainTests. Omit the withdrawals field unless the fork is Shanghai or later. WIP: PR #47

Unclear meaning of the 'protected' field exported in transactions

Error: BlockchainTestInFilled convertion error: BlockchainTestBlock convertion error: LegacyTransaction convertion error: Unexpected field 'protected' in config: TransactionLegacy

A Python-specific field is exported to the transaction section of the generated tests:

"protected" : false,

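A possible fix is to drop tooling-internal fields just before a transaction is serialized. The sketch below assumes that `protected` is the only such field; `INTERNAL_TX_FIELDS` and `strip_internal_fields` are hypothetical names, not part of the framework.

```python
from typing import Any, Dict

# Assumption: "protected" is the only Python-side field that leaks into the output.
INTERNAL_TX_FIELDS = {"protected"}


def strip_internal_fields(tx: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of tx without fields that are internal to the Python tooling."""
    return {key: value for key, value in tx.items() if key not in INTERNAL_TX_FIELDS}


exported = strip_internal_fields(
    {"nonce": "0x00", "gasLimit": "0x5208", "protected": False}
)
print(exported)
```
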
Introduce an assembler to avoid typing opcodes in hex

    We don't necessarily need to do this here, but I think we should really get an assembler in here soon to avoid hand-typing hex opcodes :)

Originally posted by @lightclient in #13 (comment)

Add Spec Version to the Produced Fixtures

At the moment it is impossible to know which commit (of either the EIPs repo or the Execution APIs repo) a written test is based on.

A great addition to the output fixture format would be the specific commit of these repos on which the tests were modeled.

For example:

or both.

This way everyone can easily tell when the tests are out of date.

Tox has a hidden failing testcase with no Exception Caught

Currently when running tox on main:

$ tox -e py3

there is one failing testcase without a caught exception. This potentially leads to tox passing successfully when it should not.

Exported test format not following the existing tests schema: Short Accounts

Issue 1
Short accounts:

Error: Expected field 'nonce' not found in config: Account 0x0000000000000000000000000000000000000100
"0x0000000000000000000000000000000000000100" : {
    "balance" : "0x0",
    "code" : "0x6000358031435550",
    "storage" : {
        "0x0000000000000000000000000000000000000000000000000000000000000001" : "0x000000000000000000000000000000000000000000000000000000003b9aca00",
        "0x0000000000000000000000000000000000000000000000000000000000000002" : "0x0000000000000000000000000000000000000000000000000000000077359400"
    }
} (EIPTests/bc4895-pyspecs020/000_shanghai, step: BlockchainTest)

In the existing tests, all account fields are always defined, so there is no ambiguity.
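
To illustrate, an exporter could fill in every missing account field before writing the fixture. This is only a sketch: the default values and the names `ACCOUNT_FIELDS` and `with_all_fields` are assumptions for illustration.

```python
from typing import Any, Dict

# Assumed field set and default values; illustrative only.
ACCOUNT_FIELDS = ("nonce", "balance", "code", "storage")


def with_all_fields(account: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the account in which every expected field is present."""
    # A fresh dict is built on each call so the empty storage is never shared.
    full: Dict[str, Any] = {"nonce": "0x00", "balance": "0x00", "code": "0x", "storage": {}}
    full.update(account)
    return full


short_account = {"balance": "0x0", "code": "0x6000358031435550"}
full_account = with_all_fields(short_account)
print(full_account)
```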

Don't install git submodule for execution specs dependency

Right now tox takes a very long time because of the ethereum/tests submodule in the execution specs. It would be great to avoid that if possible.

py3: 290 W install_package_deps> python -I -m pip install 'black==22.3.0; implementation_name == "cpython"' 'ethereum@ git+https://github.com/ethereum/execution-specs.git' 'flake8-docstrings<2,>=1.6' 'flake8-spellcheck<0.25,>=0.24' 'flake8<4,>=3.9' 'isort<6,>=5.8' 'mypy==0.982; implementation_name == "cpython"' 'pytest-cov<3,>=2.12' 'pytest-xdist<3,>=2.3.0' 'pytest>=7.2' setuptools==58.3.0 types-setuptools==57.4.4 [tox/tox_env/api.py:421]
Collecting ethereum@ git+https://github.com/ethereum/execution-specs.git
  Cloning https://github.com/ethereum/execution-specs.git to /private/var/folders/ty/5cq1d6311cb7ggy9b6l092mc0000gp/T/pip-install-fs175whv/ethereum_5c59a4a099f64df29efcfb5502fcb23a
  Running command git clone --filter=blob:none --quiet https://github.com/ethereum/execution-specs.git /private/var/folders/ty/5cq1d6311cb7ggy9b6l092mc0000gp/T/pip-install-fs175whv/ethereum_5c59a4a099f64df29efcfb5502fcb23a
  Resolved https://github.com/ethereum/execution-specs.git to commit 95ef6bf957a4c66993168384cec395ef1a030026
  Running command git submodule update --init --recursive -q

Exported test format not following the existing tests schema: Value format

Pyspecs export

"nonce": "0x0",

This is the t8n tool format. The tests use a different schema, which client teams requested a long time ago:

"0x00" for 0, "0x" for empty code,
"0x01" for 1, so the data is always padded to a whole number of bytes.

That discussion took place a long time ago, and I don't know whether it is still relevant to the testing team who initially asked for it, but we stick to this format:

"balance": "0x432ae"    ->                   "balance": "0x0432ae"

In storage, however, we save space:

                    "0x0000000000000000000000000000000000000000000000000000000000000001": "0x000000000000000000000000000000000000000000000000000000003b9aca00",
                    "0x0000000000000000000000000000000000000000000000000000000000000002": "0x0000000000000000000000000000000000000000000000000000000077359400"

would be

                    "0x01": "0x3b9aca00",
                    "0x02": "0x77359400"

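The convention above can be sketched as a small formatting helper. The names `to_byte_padded_hex` and `compact_storage` are hypothetical, and the sketch assumes non-negative integer values.

```python
from typing import Dict


def to_byte_padded_hex(value: int) -> str:
    """Format an integer as hex padded to a whole number of bytes ("0x00" for 0)."""
    if value < 0:
        raise ValueError("value must be non-negative")
    digits = format(value, "x")
    if len(digits) % 2:
        # An odd number of hex digits means a half-filled byte: pad on the left.
        digits = "0" + digits
    return "0x" + digits


def compact_storage(storage: Dict[str, str]) -> Dict[str, str]:
    """Rewrite 32-byte padded storage keys and values in the short form."""
    return {
        to_byte_padded_hex(int(key, 16)): to_byte_padded_hex(int(value, 16))
        for key, value in storage.items()
    }


padded = {
    "0x" + "00" * 31 + "01": "0x" + "00" * 28 + "3b9aca00",
    "0x" + "00" * 31 + "02": "0x" + "00" * 28 + "77359400",
}
compact = compact_storage(padded)
print(to_byte_padded_hex(0), to_byte_padded_hex(1), to_byte_padded_hex(0x432AE))
print(compact)
```
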
Release v0.1.0 chain format is wrong

Issue 1:

The test format is missing the genesisRLP field.

Issue 2:

The genesis block has difficulty != 0 but also defines withdrawalsRoot, which makes it a Merge block. A Merge block has difficulty = 0.
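
Both problems could be caught by a simple consistency check on the generated fixture. The sketch below assumes the genesis header is stored under a `genesisBlockHeader` key with hex-string values; `check_genesis` is a hypothetical helper, and the sample values are placeholders rather than data from the release.

```python
from typing import Any, Dict, List


def check_genesis(fixture: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in the genesis section of a fixture."""
    problems: List[str] = []
    if "genesisRLP" not in fixture:
        problems.append("missing genesisRLP field")
    # Assumption: the header lives under "genesisBlockHeader".
    header = fixture.get("genesisBlockHeader", {})
    difficulty = int(header.get("difficulty", "0x00"), 16)
    if "withdrawalsRoot" in header and difficulty != 0:
        problems.append("withdrawalsRoot is defined but difficulty != 0")
    return problems


# Placeholder fixture that reproduces both issues.
bad_fixture = {
    "genesisBlockHeader": {
        "difficulty": "0x020000",
        "withdrawalsRoot": "0x" + "00" * 32,
    }
}
problems = check_genesis(bad_fixture)
print(problems)
```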

BlockchainTests should only have one failing block at most

The tests should be more restrictive, so that every yielded test satisfies the following:

It contains at most 1 invalid block
The invalid block is always the last block

For example, on test:

execution-spec-tests/fillers/withdrawals/withdrawals.py

Lines 63 to 108 in 4de323c

    
    def test_withdrawals_use_value_in_tx(_):
        """
        Test sending a transaction from an address yet to receive a withdrawal
        """
        pre = {}

        tx = Transaction(
            # Transaction sent from the `TestAddress`, which has 0 balance at start
            nonce=0,
            gas_price=ONE_GWEI,
            gas_limit=21000,
            to=to_address(0x100),
            data="0x",
        )

        withdrawal = Withdrawal(
            index=0,
            validator=0,
            address=TestAddress,
            amount=tx.gas_limit + 1,
        )

        blocks = [
            Block(
                txs=[tx.with_error("intrinsic gas too low: have 0, want 21000")],
                withdrawals=[
                    withdrawal,
                ],
                exception="Transaction without funds",
            ),
            Block(
                txs=[],
                withdrawals=[
                    withdrawal,
                ],
            ),
            Block(
                txs=[tx],
                withdrawals=[],
            ),
        ]
        post = {
            TestAddress: Account(balance=ONE_GWEI),
        }

        yield BlockchainTest(pre=pre, post=post, blocks=blocks)

We produce three blocks: the first is invalid and the next two are valid. This test could stick to the restrictions above by yielding twice: once with only the invalid block, and then once with the two valid blocks.

Also, ideally, the filler should catch and restrict this behavior and throw an error when we try to fill a test that does not follow this convention.
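
Such a check could look roughly like the sketch below. `BlockStub` is a simplified stand-in for the framework's `Block` that keeps only the `exception` field, and `check_invalid_block_convention` is a hypothetical name; neither is the filler's actual code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BlockStub:
    """Simplified stand-in for a test block; only the exception field matters here."""

    exception: Optional[str] = None


def check_invalid_block_convention(blocks: List[BlockStub]) -> None:
    """Raise ValueError unless at most one block is invalid and it is the last one."""
    invalid = [i for i, block in enumerate(blocks) if block.exception is not None]
    if len(invalid) > 1:
        raise ValueError(f"expected at most 1 invalid block, found {len(invalid)}")
    if invalid and invalid[0] != len(blocks) - 1:
        raise ValueError(f"invalid block at index {invalid[0]} is not the last block")


# Accepted: a single invalid block in last position.
check_invalid_block_convention([BlockStub(), BlockStub("Transaction without funds")])

# Rejected: the layout of the test above (invalid block first, then two valid blocks).
try:
    check_invalid_block_convention(
        [BlockStub("Transaction without funds"), BlockStub(), BlockStub()]
    )
    rejected = False
except ValueError as err:
    rejected = True
    print(err)
```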

