py-dht's Introduction

py-dht

A small and straightforward representation of how a Kademlia-based DHT could be integrated into Ethereum, particularly in the context of das-research.

What can this DHT do?

The current work on this simulation of a Kademlia-based DHT network offers the following (a short usage sketch follows the list):

  • DHTNetwork object that can: spawn DHTClients, serve as the main source to initialize the routing tables of the DHTClients, resolve connections between DHTClients, and handle and keep track of the interactions between them. The parameters to configure a DHTNetwork are:

    • networkid: in case we want to simulate several networks at the same time
    • errorrate: percentage of connections that will fail (the likelihood of an error, in %)
    • delayrange: range between the smallest and the largest possible delay; a random delay within the range is chosen every time a connection is established between two nodes (if no error is raised)

    The network offers the following functions:

    • parallel_clilist_initializer
    • init_with_random_peers initializes a network using a "blazingly fast" method, which can be sped up even further if a number of threads/processes is defined
    • add_new_node adds a new node to the local network
    • connect_to_node returns the Connection object between nodes A and B
    • bootstrap_node returns the "best" nodes/DHTClients to compose the routing table of the given node
    • summary returns a summary of the current status of the network (number of nodes, successful connections, failed ones, etc.); it will evolve over time
  • Connection interface that limits how two DHTClients interact with each other (as if it were a closed API/protocol). It allows client A (the client starting the connection) to query client B (the remote client), applying, if specified, the delay when establishing the connection and on each interaction:

    • get_closest_nodes_to(hash) returns the k closest nodes to the given hash that client B has in its routing table. Note: it will also return the value if it is stored locally :)

    • store_segment(segment) adds the hash and the segment as a key-value pair in B's local storage

    • retrieve_segment(hash) asks B to return the segment for the given hash, if it has it

  • DHTClient as the representation of a node in the simulated DHT network. The DHTClient can be created with the following parameters:

    • nodeid: ID of the node that hosts the DHTClient
    • network: reference to the network object the DHTClient participates in
    • kbucketsize: k value, number of nodes per k-bucket
    • a: alpha, number of concurrent node connections the client opens while looking for a given key
    • b: beta, target number of nodes returned when asking for a hash
    • steptostop: number of iterations without finding anyone closer before stopping the lookup operation

    The client serves a list of endpoints such as:

    • bootstrap uses the network reference to find the right peers for the routing table
    • lookup_for_hash tries to look up the value of the hash in the network, returning the closest nodes to it
    • get_closest_nodes_to returns the closest nodes to a hash from the local routing table
    • provide_block_segment looks up the closest nodes in the network and stores the segment on them
    • store_segment stores a segment value locally, using its hash as the key
    • retrieve_segment returns the value for a hash if it is stored locally; an exception is raised otherwise
  • RoutingTable and KBucket classes to store the local representation of the network for a given node

  • Hash and BitArray classes to represent a node ID / block segment / general object
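
For reference, here is a minimal usage sketch that ties the pieces above together. It only illustrates the intended flow: the import path, constructor signatures, and return shapes are assumptions based on the descriptions in this README, not the definitive API.

# minimal usage sketch; import path and signatures are assumptions
from dht import DHTNetwork, DHTClient, Hash

net = DHTNetwork(networkid=0, errorrate=0, delayrange=[10, 100])  # delays in ms (assumed units)

# spawn a couple of DHTClients, register them in the network, and bootstrap them
clients = []
for nodeid in range(2):
    cli = DHTClient(nodeid=nodeid, network=net, kbucketsize=20, a=3, b=20, steptostop=3)
    net.add_new_node(cli)
    clients.append(cli)
for cli in clients:
    cli.bootstrap()  # fills the routing table using the network reference

# provide a block segment from one client and look it up from another
segment = "dummy-block-segment"
key = Hash(segment)                                # hash of the value (assumed constructor)
clients[0].provide_block_segment(segment)          # stores the segment on the closest nodes
closest, value = clients[1].lookup_for_hash(key)   # return shape assumed

print(net.summary())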

Dependencies

The source code relies mostly on plain Python libraries. However, to speed up performance, the plain arrays and dicts were replaced with classes from collections. I therefore recommend using a dedicated virtual environment for this module.

To install the dependencies, do:

# python -m venv venv
# or
# python -m virtualenv venv
# source venv/bin/activate
(venv)$ pip install -r requirements.txt

Tests

The repo has a list of tests to ensure that no functionality is broken whenever a new feature is added. All the tests are triggered whenever GitHub records a Push or a Pull Request. However, they can also be run locally using the ./launch_tests.sh script:

# the script will try to activate any `venv` located at the root of the directory
py-dht$ bash launch_tests.sh

Benchmarks

Running benchmarks is a bit trickier. First, install the py-dht module as editable in the venv:

py-dht$ pip install -e ./

After that, feel free to change the parameters in benchmarks/launch_benchmarks.sh and run it like a test:

# the script will try to activate any `venv` located at the root of the directory
py-dht$ cd benchmarks
py-dht/benchmarks$ bash launch_benchmarks.sh

Numbers and recommendations

From the experience of running tests and benchmarks on the repo, I can say that the optimizations in #8 and #9 were more than necessary.

Recommendations:

  • To simulate a network, use the network.init_with_random_peers() function setting the processes parameter. The initialization of the network is by far the longest-running step, as it has to compute the best routing table for each of the spawned nodes, so please take advantage of the concurrency to reduce its duration (check this test as an example, and the sketch after this list).
  • At the moment, using 20 cores of a Ryzen 5900X, I'm able to initialize a network of 10k DHTClients with k=20 in 28 secs.
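
As a point of reference, a concurrent initialization along the lines of the 10k-node run mentioned above might look roughly like this (the exact signature of init_with_random_peers and the extra parameters are assumptions):

# concurrent network initialization sketch; the signature is an assumption
from dht import DHTNetwork

net = DHTNetwork(networkid=0, errorrate=0, delayrange=[10, 100])
net.init_with_random_peers(processes=20, nodes=10000, k=20, a=3, b=20, steptostop=3)
print(net.summary())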

Latest numbers of the network benchmark

# 03.08.2023 
py-dht$ python benchmarks/network.py -t opt-conc -o ./csvs -i 1 -k 20 -n 10000 --threads 20
-- benchmark: opt-conc_network_n_initialization --
rounds          : 1
failed rounds   : 0
prep time (s)   : 1.5974044799804688e-05
avg (s)         : 0.09420228004455566
median (s)      : 0.09420228004455566
p90_duration (s): 0.09420228004455566
-- benchmark: opt-conc_network_bootstrap_node --
rounds          : 1
failed rounds   : 0
prep time (s)   : 0.11602234840393066
avg (s)         : 0.6563236713409424
median (s)      : 0.6563236713409424
p90_duration (s): 0.6563236713409424
-- benchmark: opt-conc_network_fast_bootstrap_network --
rounds          : 1
failed rounds   : 0
prep time (s)   : 8.344650268554688e-06
avg (s)         : 174.6814157962799
median (s)      : 174.6814157962799
p90_duration (s): 174.6814157962799
-- benchmark: opt-conc_network_fast_threaded_bootstrap_network --
rounds          : 1
failed rounds   : 0
prep time (s)   : 0.0008330345153808594
avg (s)         : 28.239949226379395
median (s)      : 28.239949226379395
p90_duration (s): 28.239949226379395

At the moment (after applying the code optimizations in #8 and the concurrency in #9), the numbers look like this:

Number of nodes | Default implementation | 1st Optimization (monothread) | 2nd Optimization (8 processes)
500             | 10                     | 2.096                         | 0.489
1000            | 39                     | 5.39                          | 1.02
1200            | 56                     | 7.22                          | 1.48
2000            | 157.36                 | 14.49                         | 2.83
5000            | 995.73                 | 60.01                         | 11.15
10000           | (beyond 2 hours)       | 199.97                        | 39.15

NOTE: all the measurements displayed in the table are expressed in seconds


Maintainer

Mikel Cortes-Goicoechea - @cortze

Contributing

Feel free to dive in! Change proposals, issues, and PRs are more than welcome.

Support

  • The work has been supported by Codex.
  • Feel free to support this project through Buy Me a Coffee; help me get the shipment of caffeine that I need 😊.

License

MIT, see LICENSE file.

py-dht's Issues

Add get K closest nodes to Hash to the DHTNetwork

Description

As things get complicated when we start using process concurrency in the DHT, it would be really nice if the DHTNetwork already had a "fast" way of returning the closest nodes to a particular hash.

This would help the test cases and allow aggregated tests in foreign repos where individual operations might need to be done in parallel (i.e., someone who needs to do 200k lookups will definitely want to do them in parallel).
This function would allow those repos to check whether the result of a lookup is actually the correct one.
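
As an illustration of the idea (not the repo's implementation), such a helper could simply rank every node ID the network knows about by XOR distance to the target hash; the helper name below and the assumption that node IDs and hashes compare as integers are both hypothetical.

import heapq

def network_closest_nodes_to(node_ids, target_hash, k=20):
    # illustrative only: return the k node IDs closest (by XOR distance) to target_hash
    return heapq.nsmallest(k, node_ids, key=lambda node_id: node_id ^ target_hash)

# tiny example with made-up 4-bit IDs
print(network_closest_nodes_to([0b0001, 0b0110, 0b0111, 0b1100], target_hash=0b0101, k=2))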

Slightly large error rates force lookup to finish sooner than expected

Description

With the current implementation after #14, performing a DHT lookup with a low stepsToStop value and a relatively large failure rate makes the lookup finish earlier than desired.
The explanation for this (check the following image) is that, since failed connections do not report any closer nodes, the current lookup code still counts them towards the stepsToStop counter.

Of course, this is not the desired outcome, as a higher error rate can eventually generate a run of consecutive failed connections, leading the lookup to finish earlier than expected. The following picture shows the summary of some lookups which ended because the 3 concurrent dials all reported a failure:

(Figure: aggregated latency (ms) of lookup results with error rate = 30%)

This is a behavior that we should correct.
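
One way to correct this (a sketch of the intended behavior, not the repo's lookup code) is to advance the stepsToStop counter only when a connection succeeds but reports no node closer than the ones already seen, so failed dials never count towards the termination criterion:

# sketch of the intended termination criterion (illustrative, not the repo's code)
def run_lookup(dials, steps_to_stop=3):
    # dials: iterable of (ok, found_closer) tuples, one per dial
    steps_without_progress = 0
    for ok, found_closer in dials:
        if not ok:
            continue                      # connection error: ignore it for the counter
        if found_closer:
            steps_without_progress = 0    # progress: reset the counter
        else:
            steps_without_progress += 1   # successful dial, but nothing closer
        if steps_without_progress >= steps_to_stop:
            return "finished: no closer nodes found"
    return "finished: candidate list exhausted"

# three consecutive failures no longer end the lookup prematurely
print(run_lookup([(False, False)] * 3 + [(True, True), (True, False), (True, False), (True, False)]))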

Optimize the `BitArray` binary representation

Description

Due to the problems with the standard binary representation of integers in Python, I'm working around them by having BitArray use a string to represent a given number in binary (so the leading zeros are preserved). In Python, the binary representation starts from the first 1 in the number, dropping all the zeros before it.

Ideally, I should treat that string as an array of booleans, which should perform a bit better.
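
For illustration, the difference between the two representations can be sketched like this (assuming 256-bit keys; this is not the repo's BitArray code):

# comparing a zero-padded binary string against plain integer bit operations
KEY_BITS = 256  # assumed key size

def shared_prefix_bits_str(a: int, b: int) -> int:
    # string-based approach: compare zero-padded binary strings bit by bit
    sa, sb = format(a, f"0{KEY_BITS}b"), format(b, f"0{KEY_BITS}b")
    n = 0
    for ca, cb in zip(sa, sb):
        if ca != cb:
            break
        n += 1
    return n

def shared_prefix_bits_int(a: int, b: int) -> int:
    # bit-operation approach: derive the common prefix length from the XOR distance
    x = a ^ b
    return KEY_BITS if x == 0 else KEY_BITS - x.bit_length()

assert shared_prefix_bits_str(0b0011, 0b0010) == shared_prefix_bits_int(0b0011, 0b0010) == KEY_BITS - 1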

Aggregate different type of connection errors

Description

From my experience building the Armiarma Crawler and the CID-Hoarder, there is a wide range of connections that get rejected by remote nodes. However, the nature/cause of these errors is not always the same.

In the context of Libp2p, there are "fast" errors like connection refused or stream resets, and "slow" errors like context deadline exceeded or timeouts.

For this reason, I should add two types of connection error to the simulator, with different delays:

  • fast error (ms of latency, like the one we have at the moment)
  • slow error (current timeouts are between 5 secs and 20 secs)
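
A sketch of how the two error classes could be modeled (the constants, rates, and function below are hypothetical, not part of the current simulator):

import random

FAST_ERROR_DELAY_RANGE = (50, 200)        # ms, e.g. connection refused / stream reset
SLOW_ERROR_DELAY_RANGE = (5_000, 20_000)  # ms, e.g. context deadline exceeded / timeout

def connection_error_delay(slow_error_rate=0.5):
    # return the simulated delay of a failed connection, drawing from the fast
    # or the slow range depending on the (assumed) slow_error_rate
    low, high = SLOW_ERROR_DELAY_RANGE if random.random() < slow_error_rate else FAST_ERROR_DELAY_RANGE
    return random.uniform(low, high)

print(f"simulated failed-connection delay: {connection_error_delay():.0f} ms")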
