gnosischain / consensus-deployment-ansible

This project is forked from parithosh/consensus-deployment-ansible

Deployment configuration for creating test environments.

Shell 7.94% Python 87.24% Jinja 3.33% HTML 0.10% Makefile 0.17% JavaScript 1.22%
ethereum merge testnet

consensus-deployment-ansible's People

Contributors

cbermudez97, coincashew, dadepo, dapplion, davidalbela, dv8silencer, garyschulte, giacomolicari, jflo, jgresham, jmederosalvarado, marekm25, parithosh, protolambda, riccardo-gnosis, samcm, skylenet, timjp87

consensus-deployment-ansible's Issues

gc-merge-devnet-1 TBD-issue Jul 3


Nethermind (user list) and Lighthouse (user list) devs can SSH into the hosts (IP list here) with ssh devops@<ip>, the same as in Pari's deployments.


  • gc-merge-devnet-1 was launched at Jun 30 2022 12:05:00 GMT+0000 (#7). It has 10 validating nodes with a diverse set of CL clients and Nethermind as the only EL. Each node holds 1000 keys, for a network of 10000 validators in total. The first two nodes are POSDAO validators.
  • TTD was reached at Jun 30 2022 13:56:30 GMT+0000.
  • At Jul 2 2022 20:00:00 GMT+0000 participation dropped to 90%, as lodestar-nethermind-1 started to experience issues.
  • At Jul 4 2022 15:00:00 GMT+0000 participation dropped to 60%. On some nodes the CL and/or EL stopped progressing and lost all peers.

Screenshots from 2022-07-05, 13:21:01 to 13:22:24 (five images)


NOTE: Currently digging through the logs; will post updates.

Assign role to network health monitor

Gnosis Chain must have a person or team fully committed to detecting network issues within a short timeframe. This infrastructure probably already exists, but client teams and users would benefit from an explicit description of who this team is and what methods they employ to achieve fast responses.

Nethermind also wants to contribute to that role, but should be treated as a backup resource only.

Setup denver testnet TODO

Details:

  • Deploy execution layer: #32
    • Genesis time: Jul 26 2022 18:54:55 GMT+0000
    • Genesis hash: 0x203e164cf3b6f6765abf2f0355a4d09e5eaf6a777243bc4407b569431cd95cb3
    • Initial AuRa validator set: 10

Outline of tasks to set up a fully featured testnet.

Execution

AuRa validator set should be of similar size to current GC mainnet

  • Distribute keys after genesis and expand set before TTD

The execution layer must include all AuRa-only components that affect consensus

  • Deploy native wrapped token xDAImerge in Goerli
  • Deploy ERC20 to native bridge (absolute minimum)
  • ERC20 to ERC20 bridge (recommended)
  • Test bridge uses, key rotation
  • Test remaining code paths of the BlockRewards contract

The execution layer must be tested against the most common mainnet use cases

  • AMMs: one of Uniswap, Sushiswap, or Honeyswap
  • POAP
  • DarkForest

Regarding infrastructure, deploy (besides explorer)

  • Faucet, to finance usage

Misc tests to do:

  • Spam the network post-merge
  • Run Nethermind in fast-sync, snap-sync, and archive modes through the merge and post-merge
  • Run POSDAO test suite post-merge

Consensus

We want to open up to external participants:

  • Distribute genesis keys to interested entities. WIP @dapplion
  • Deploy a token-gated deposit contract
  • Set up an accessible way to get tokens and make deposits

Entities interested in getting GBC testnet keys

  • Nethermind
  • Erigon
  • Existing mainnet validators, if they show proof of ownership
  • Large stakers: Stakewise, Kleros

Regarding infrastructure, deploying current scripted infra is sufficient (beaconchain explorer)

Misc tests to do:

Extra

  • Take over gbc-prysm image
  • Fork all consensus client repos plus Nethermind, and prepare to distribute binaries with updated TTD and Bellatrix epoch values

Distribute custom images

Consensus clients

All flags are taken from each client's group_vars file here: https://github.com/gnosischain/consensus-deployment-ansible/tree/master/denver/inventory/group_vars

lighthouse

source image: sigp/lighthouse. Reference https://github.com/dappnode/DAppNodePackage-lighthouse-gnosis/blob/master/beacon-chain/Dockerfile

beacon

lighthouse
--testnet-dir="/custom_config_data"
beacon
--boot-nodes="{{ bootnode_enrs | join(',') }}"

validator

lighthouse
--testnet-dir="/custom_config_data"
vc

teku

source image: consensys/teku. Reference https://github.com/dappnode/DAppNodePackage-teku-gnosis/blob/master/beacon-chain/Dockerfile

beacon

--network="/custom_config_data/config.yaml"
--initial-state="/custom_config_data/genesis.ssz"
--p2p-discovery-bootnodes="{{ bootnode_enrs | join(',') }}"

validator

--network="/custom_config_data/config.yaml"

nimbus

source image: statusim/nimbus-eth2. Reference https://github.com/dappnode/DAppNodePackage-nimbus-gnosis/blob/main/build/Dockerfile

beacon

beacon_node
--network="/custom_config_data"
--bootstrap-node="{{ bootnode_enrs | join(',') }}"

validator

Nimbus runs beacon and validator in the same image

prysm

Note that we already wrap this image, since it needs to be rebuilt.

beacon

--chain-config-file="/custom_config_data/config.yaml"
--genesis-state="/custom_config_data/genesis.ssz"
--bootstrap-node="{{ bootnode_enrs[0] }}"
--bootstrap-node="{{ bootnode_enrs[1] }}"

validator

--chain-config-file="/custom_config_data/config.yaml"

lodestar

source image: chainsafe/lodestar. No reference

beacon

beacon
--preset=gnosis
--paramsFile=/custom_config_data/config.yaml
--genesisStateFile=/custom_config_data/genesis.ssz
--network.discv5.bootEnrs="{{ bootnode_enrs[0] }}"

validator

validator
--preset=gnosis
--paramsFile=/custom_config_data/config.yaml
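
The bootnode flags listed above differ mainly in how each client receives the ENR list: Lighthouse, Teku and Nimbus get a comma-joined value, Prysm repeats the flag for the first two entries, and Lodestar gets only the first entry. A minimal JavaScript sketch of that templating follows. The helper `bootnodeFlags` and the placeholder ENR strings are hypothetical and are not part of the repository; the flag names are the ones quoted above.

```javascript
// Hypothetical helper mirroring the Jinja templates quoted above.
function bootnodeFlags(client, enrs) {
  switch (client) {
    case "lighthouse":
      return [`--boot-nodes="${enrs.join(",")}"`];
    case "teku":
      return [`--p2p-discovery-bootnodes="${enrs.join(",")}"`];
    case "nimbus":
      return [`--bootstrap-node="${enrs.join(",")}"`];
    case "prysm":
      // The template references bootnode_enrs[0] and bootnode_enrs[1] explicitly.
      return enrs.slice(0, 2).map((enr) => `--bootstrap-node="${enr}"`);
    case "lodestar":
      // The template references bootnode_enrs[0] only.
      return [`--network.discv5.bootEnrs="${enrs[0]}"`];
    default:
      throw new Error(`unknown client: ${client}`);
  }
}

// Placeholder ENR strings, for illustration only.
const enrs = ["enr:-AAA", "enr:-BBB", "enr:-CCC"];
for (const client of ["lighthouse", "teku", "nimbus", "prysm", "lodestar"]) {
  console.log(client, bootnodeFlags(client, enrs).join(" "));
}
```

With these templates, any bootnode beyond the second is not passed to Prysm, and any beyond the first is not passed to Lodestar.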

Execution client

nethermind

source image: nethermind/nethermind. Reference https://github.com/dappnode/DAppNodePackage-nethermind-xdai/blob/master/build/Dockerfile

--config=/networkdata/nethermind_config.cfg
--Init.ChainSpecPath="/networkdata/nethermind_genesis.json"

Add tx-fuzzer

We need some activity in the devnets to bring them closer to real networks.

BlockReward not working on Chiado

  • I ran the script below against the Denver network and it successfully minted 1 coin to the beneficiary
  • Then I ran the same script against the Chiado network and the balance of the beneficiary did not increase

Chiado and Denver have the exact same genesis except for the networkID, with the same initial AuRa contracts and addresses. To double check, I ran this test from posdao-test-setup against Chiado, and it also failed. https://github.com/NethermindEth/posdao-test-setup/blob/1617f815961d9d28bbb94c042363b2129f2e2dba/test/00_block_reward.js#L45

However, posdao-test-setup succeeds in CI. There may be some difference in the deployment that triggers this bug.

  • @jmederosalvarado please investigate
  • As another sanity check, I'm wondering if in shadow-forks the bridge transactions are replayed in the fork too. There we could double check if minting is happening post-merge or not with mainnet deployment.
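// Hardhat script; `ethers` comes from the Hardhat runtime.
const { ethers } = require("hardhat");

// Assumption: the BlockReward contract address is supplied via the environment.
const BLOCK_REWARD_CONTRACT = process.env.BLOCK_REWARD_CONTRACT;

// Assumed helper (not shown in the original): send a tx and wait until it is mined.
async function waitTx(txPromise, label) {
  const tx = await txPromise;
  console.log(`${label}: sent ${tx.hash}`);
  return tx.wait();
}

async function main() {
  const signers = await ethers.getSigners();
  const beneficiary = "0xaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa";
  const ownerSigner = signers[0];

  const blockReward = await ethers.getContractAt(
    "BlockRewardAuRaTest",
    BLOCK_REWARD_CONTRACT,
    ownerSigner
  );

  // Allow the owner to act as an ERC-to-native bridge.
  await waitTx(
    blockReward.setErcToNativeBridgesAllowed([ownerSigner.address]),
    "setErcToNativeBridgesAllowed"
  );

  // Queue a mint of 1 coin (1e18 wei) to the beneficiary.
  await waitTx(
    blockReward.addExtraReceiver(BigInt(1e18), beneficiary),
    "addExtraReceiver"
  );

  // Poll until interrupted; the balance should increase once the mint is applied.
  while (true) {
    const [beneficiaryBalance, mintedTotally, headBlockNumber] =
      await Promise.all([
        ethers.provider.getBalance(beneficiary),
        blockReward.mintedTotally(),
        ethers.provider.getBlockNumber(),
      ]);
    console.log({
      beneficiaryBalance,
      mintedTotally,
      headBlockNumber,
    });
    // Wait one second between polls to avoid flooding the RPC endpoint.
    await new Promise((resolve) => setTimeout(resolve, 1000));
  }
}

main().catch((error) => {
  console.error(error);
  process.exitCode = 1;
});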

gc-merge-devnet-0 deadlock Jun 26


Nethermind (user list) and Lighthouse (user list) devs can SSH into the hosts (IP list here) with ssh devops@<ip>, the same as in Pari's deployments.


  • gc-merge-devnet-0 was launched at Jun 25 2022 09:05:00 GMT+0000 (#2). It has 8 validating nodes, all lighthouse-nethermind. Each node holds 100 keys, for a network of 800 validators in total. The first two nodes are POSDAO validators.
  • TTD was reached at Jun 25 2022 10:22:10 GMT+0000 as expected, and the merge block finalized.
  • At Jun 26 2022 22:51:00 GMT+0000 the chain started failing to produce blocks.
  • A few more blocks were produced over the next 30 minutes, until the chain stalled indefinitely.

Some relevant metrics
Screenshots from 2022-06-27, 14:21:54 to 14:40:28 (four images)

Digging through the Lighthouse logs of merge-devnet-0-lighthouse-nethermind-3, I found:

Jun 26 22:51:13.349 DEBG Execution engine call failed            id: http://127.0.0.1:8560/, error: ServerMessage { code: -38002, message: "Inconsistent forkchoiceState - safe block hash. Request: ForkchoiceState: (HeadBlockHash: 0xa2e14ea188d4dc9b247b45367a4b45b45eabfe5ebbf0cd15eaba5ccc4902ff7b, SafeBlockHash: 0xa2e14ea188d4dc9b247b45367a4b45b45eabfe5ebbf0cd15eaba5ccc4902ff7b, FinalizedBlockHash: 0x528f0716d2431e2cf59c93ef2f9af7350dd409de4b40fdae617bef4b15f7d02d) PayloadAttributes: (Timestamp: 1656283875, PrevRandao: 0xb5aec702d8c74b392586b2c39acc51b5ebe0313e951429bd65f5fc1b1a1dda2a, SuggestedFeeRecipient: 0xf97e180c050e5ab072211ad2c213eb5aee4df134)" }, service: exec

Jun 26 22:51:13.353 WARN Error whilst processing payload status  error: Api { id: "http://127.0.0.1:8560/", error: ServerMessage { code: -38002, message: "Inconsistent forkchoiceState - safe block hash. Request: ForkchoiceState: (HeadBlockHash: 0xa2e14ea188d4dc9b247b45367a4b45b45eabfe5ebbf0cd15eaba5ccc4902ff7b, SafeBlockHash: 0xa2e14ea188d4dc9b247b45367a4b45b45eabfe5ebbf0cd15eaba5ccc4902ff7b, FinalizedBlockHash: 0x528f0716d2431e2cf59c93ef2f9af7350dd409de4b40fdae617bef4b15f7d02d) PayloadAttributes: (Timestamp: 1656283875, PrevRandao: 0xb5aec702d8c74b392586b2c39acc51b5ebe0313e951429bd65f5fc1b1a1dda2a, SuggestedFeeRecipient: 0xf97e180c050e5ab072211ad2c213eb5aee4df134)" } }, service: exec

It seems that Nethermind rejects forkchoice updates with error code -38002, "Inconsistent forkchoiceState - safe block hash".
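
To reproduce the rejection outside Lighthouse, the failing call can be rebuilt from the log line. The sketch below assembles a JSON-RPC body for what is presumably `engine_forkchoiceUpdatedV1` (the log does not name the method version, so that is an assumption) using the hashes and payload attributes copied from the log. It only builds and prints the request; actually sending it to the engine port would also need the JWT authentication that the Engine API requires, which is omitted here.

```javascript
// Values copied from the Lighthouse log line above.
const forkchoiceState = {
  headBlockHash: "0xa2e14ea188d4dc9b247b45367a4b45b45eabfe5ebbf0cd15eaba5ccc4902ff7b",
  safeBlockHash: "0xa2e14ea188d4dc9b247b45367a4b45b45eabfe5ebbf0cd15eaba5ccc4902ff7b",
  finalizedBlockHash: "0x528f0716d2431e2cf59c93ef2f9af7350dd409de4b40fdae617bef4b15f7d02d",
};

const TIMESTAMP = 1656283875;
const payloadAttributes = {
  // Engine API quantities are hex-encoded strings.
  timestamp: "0x" + TIMESTAMP.toString(16),
  prevRandao: "0xb5aec702d8c74b392586b2c39acc51b5ebe0313e951429bd65f5fc1b1a1dda2a",
  suggestedFeeRecipient: "0xf97e180c050e5ab072211ad2c213eb5aee4df134",
};

// Assumption: the V1 method, which was the current version in mid-2022.
function buildForkchoiceUpdated(state, attributes, id = 1) {
  return {
    jsonrpc: "2.0",
    id,
    method: "engine_forkchoiceUpdatedV1",
    params: [state, attributes],
  };
}

const request = buildForkchoiceUpdated(forkchoiceState, payloadAttributes);
console.log(JSON.stringify(request, null, 2));

// Observations that may help when reading the error message.
console.log("safe == head:", forkchoiceState.safeBlockHash === forkchoiceState.headBlockHash);
console.log("attributes time:", new Date(TIMESTAMP * 1000).toISOString());
```

In the logged request the safe block hash equals the head block hash, and the payload timestamp falls within seconds of the log entry, so the rejected call is the one issued at the moment block production began to fail.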

Setup xDAI shadow-forks

Prepare playbooks to run shadow-forks in xDAI.

  • Do a test with +2 nodes and ensure transactions are getting re-played in the forked devnet
  • Add instructions to README on how to execute

Test checkpoint sync, errors in denver

I tested checkpoint sync on the Denver testnet and got these errors. I had to move quickly to restore the network, so I could not test further. Someone must test checkpoint sync for networks that use the gnosis preset.

Prysm

with

--checkpoint-sync-url "http://164.92.135.98:4000"

got

time="2022-08-02 08:18:21" level=info msg="requesting http://164.92.135.98:4000/eth/v1/beacon/weak_subjectivity"
time="2022-08-02 08:18:21" level=info msg="falling back to generic checkpoint derivation, weak_subjectivity API not supported by server"
time="2022-08-02 08:18:21" level=info msg="requesting http://164.92.135.98:4000/eth/v1/node/version"
time="2022-08-02 08:18:21" level=info msg="requesting http://164.92.135.98:4000/eth/v2/debug/beacon/states/head"
time="2022-08-02 08:18:23" level=error msg="Error retrieving checkpoint origin state and block: error computing weak subjectivity epoch via head state inspection: error detecting chain config for beacon state: version=0x0200006f: version not found in fork version schedule for any known config" prefix=main
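
The Prysm error says that the fork version found in the downloaded state (0x0200006f) did not match any chain config the binary knew about. The lookup it describes can be sketched as below: scan a config's `*_FORK_VERSION` entries for the version. This is an illustration only, not Prysm's implementation. The `customConfig` values are assumptions chosen to match the version pattern in the error message, and `forkNameForVersion` is a hypothetical helper.

```javascript
// Hypothetical subset of a parsed config.yaml. The values are assumptions
// inferred from the version in the error message, not copied from the repo.
const customConfig = {
  GENESIS_FORK_VERSION: "0x0000006f",
  ALTAIR_FORK_VERSION: "0x0100006f",
  BELLATRIX_FORK_VERSION: "0x0200006f",
};

// Return the fork name whose version matches, or null if none does.
function forkNameForVersion(config, version) {
  const wanted = version.toLowerCase();
  for (const [key, value] of Object.entries(config)) {
    if (key.endsWith("_FORK_VERSION") && String(value).toLowerCase() === wanted) {
      return key.replace(/_FORK_VERSION$/, "").toLowerCase();
    }
  }
  return null;
}

// A config that contains the version resolves it to a fork name.
console.log(forkNameForVersion(customConfig, "0x0200006f"));

// A config without it yields null, which corresponds to the logged failure.
const otherConfig = { GENESIS_FORK_VERSION: "0x00000000" };
console.log(forkNameForVersion(otherConfig, "0x0200006f"));
```

If the failure works this way, the checkpoint-sync code path did not consult the custom chain config when it inspected the downloaded state, which would be the first thing to verify when retesting.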

Teku

with

--initial-state=http://164.92.135.98:4000/eth/v2/debug/beacon/states/finalized

got

08:30:09.218 INFO  - Initializing storage
08:30:09.222 INFO  - Storage initialization complete
08:30:09.223 INFO  - Loading initial state from http://164.92.135.98:4000/eth/v2/debug/beacon/states/finalized
Teku failed to start: tech.pegasys.teku.infrastructure.ssz.sos.SszDeserializeException: First variable element offset doesn't match the end of fixed part

gnosis-withdrawal-devnet-4 testing tracker

While Hive tests are being developed, @filoozom and I will do manual testing on devnet-4. The list of items is short, so if properly documented it should give client teams confidence in their readiness.

Withdrawals

  • Send BLS credential changes to some portion of genesis validators
  • Assert partial withdrawals happening correctly
  • Send VoluntaryExit for some portion of genesis validators with BLS credentials
  • Assert full withdrawals happening correctly
  • Drain deposit contract of all funds (with stealFrom)
  • Assert backlog of withdrawals accumulates
    • Check that SBCDepositContract.numberOfFailedWithdrawals() > 0
    • Query some indexes of SBCDepositContract.failedWithdrawals(i) and assert withdraw data matches that of blocks
  • Refund deposit contract
  • Assert partial withdrawals are happening correctly again
  • Assert failedWithdrawals queue is being cleared (check that failedWithdrawalsPointer > 0 and eventually numberOfFailedWithdrawals == failedWithdrawalsPointer)
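
The queue checks in the list above can be scripted roughly as follows. This is a sketch under assumptions: `contract` stands for any object exposing `numberOfFailedWithdrawals()` and `failedWithdrawalsPointer()` as async getters (for example an ethers Contract bound to SBCDepositContract with a suitable ABI), and the mock with made-up numbers only stands in for the on-chain contract so the snippet runs on its own. `failedWithdrawalsStatus` is a hypothetical helper name.

```javascript
// Summarise the failed-withdrawals queue from the two counters.
async function failedWithdrawalsStatus(contract) {
  const [total, pointer] = await Promise.all([
    contract.numberOfFailedWithdrawals(),
    contract.failedWithdrawalsPointer(),
  ]);
  const totalN = BigInt(total);
  const pointerN = BigInt(pointer);
  return {
    backlogAccumulated: totalN > 0n, // expected after draining the contract
    clearingStarted: pointerN > 0n, // expected after refunding the contract
    fullyCleared: totalN > 0n && pointerN === totalN,
    pending: totalN - pointerN,
  };
}

// Mock standing in for the deployed contract; the numbers are made up.
const mock = {
  numberOfFailedWithdrawals: async () => 12n,
  failedWithdrawalsPointer: async () => 5n,
};

const statusPromise = failedWithdrawalsStatus(mock);
statusPromise.then((status) => console.log(status));
```

Running the same check before the drain, after the drain, and after the refund would give one record per step of the checklist.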

EIP-3860
Limits the size of contract initcode and makes deployments a bit more expensive by charging gas per word of initcode

  • Attempt to deploy a contract above the limit and expect failure
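
For the deployment test, it helps to pin down the numbers. EIP-3860 caps initcode at 49152 bytes (twice the 24576-byte EIP-170 code size limit) and charges 2 gas per 32-byte word of initcode. The sketch below computes both and builds an oversized payload. The helper names and the all-zero payload are illustrative assumptions; a real test would submit the payload as the data of a contract creation transaction and expect it to be rejected.

```javascript
// Constants from EIP-3860.
const MAX_INITCODE_SIZE = 49152; // 2 * 24576, the EIP-170 code size limit
const INITCODE_WORD_COST = 2; // gas per 32-byte word of initcode

// Extra gas charged for initcode of the given length in bytes.
function initcodeGas(lengthInBytes) {
  return INITCODE_WORD_COST * Math.ceil(lengthInBytes / 32);
}

function exceedsInitcodeLimit(lengthInBytes) {
  return lengthInBytes > MAX_INITCODE_SIZE;
}

// Hypothetical payload: all-zero initcode, one byte over the limit.
const oversized = "0x" + "00".repeat(MAX_INITCODE_SIZE + 1);
const oversizedLength = (oversized.length - 2) / 2;

console.log({
  oversizedLength,
  exceedsLimit: exceedsInitcodeLimit(oversizedLength),
  gasAtLimit: initcodeGas(MAX_INITCODE_SIZE),
});
```

Testing one byte over and exactly at the limit covers both sides of the boundary.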

CC: @filoozom
