thegarii's Introduction

thegarii

The polling service for Arweave blocks

Getting Started

> cargo install thegarii
> thegarii -h
thegarii 0.0.3
[email protected]
env arguments for CLI

USAGE:
    thegarii [FLAGS] [OPTIONS]

FLAGS:
    -d, --debug      Activate debug mode
    -h, --help       Prints help information
    -V, --version    Prints version information

OPTIONS:
    -B, --batch-blocks <batch-blocks>    how many blocks polling at one time [default: 20]
    -b, --block-time <block-time>        time cost for producing a new block in arweave [default: 20000]
    -c, --confirms <confirms>            safe blocks against to reorg in polling [default: 20]
    -e, --endpoints <endpoints>...       client endpoints [default: https://arweave.net/]
    -p, --ptr-path <ptr-path>            block ptr file path
    -r, --retry <retry>                  retry times when failed on http requests [default: 10]
    -t, --timeout <timeout>              timeout of http requests [default: 120000]

Environments

KEY           DEFAULT_VALUE            DESCRIPTION
ENDPOINTS     "https://arweave.net"    for multiple endpoints, split them with ','
BATCH_BLOCKS  50                       how many blocks batch at one time
CONFIRMS      20                       irreversibility condition
PTR_PATH      $APP_DATA/thegarii/ptr   the file stores the block ptr for polling
retry         10                       retry times when failed on http requests
timeout       120_000                  timeout of http requests
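
As a rough illustration of how such environment settings could be read, here is a minimal sketch using only the standard library. The `Settings` struct and the helper names are hypothetical and are not taken from thegarii's source; the keys and defaults come from the table above.

```rust
use std::env;

/// Illustrative settings holder; the struct and field names are assumptions.
#[derive(Debug, PartialEq)]
pub struct Settings {
    pub endpoints: Vec<String>,
    pub batch_blocks: u64,
    pub confirms: u64,
}

/// Split a comma-separated ENDPOINTS value into trimmed, non-empty entries.
pub fn parse_endpoints(raw: &str) -> Vec<String> {
    raw.split(',')
        .map(str::trim)
        .filter(|s| !s.is_empty())
        .map(String::from)
        .collect()
}

/// Read an unsigned integer from the environment, falling back to `default`
/// when the variable is unset or is not a valid number.
pub fn env_u64(key: &str, default: u64) -> u64 {
    env::var(key)
        .ok()
        .and_then(|v| v.trim().parse::<u64>().ok())
        .unwrap_or(default)
}

/// Build the settings from the process environment, using the table defaults.
pub fn settings_from_env() -> Settings {
    let endpoints =
        env::var("ENDPOINTS").unwrap_or_else(|_| "https://arweave.net".to_string());
    Settings {
        endpoints: parse_endpoints(&endpoints),
        batch_blocks: env_u64("BATCH_BLOCKS", 50),
        confirms: env_u64("CONFIRMS", 20),
    }
}
```

Calling `settings_from_env()` at startup would then yield one entry in `endpoints` per comma-separated URL.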

Dev

Build the source code with cargo build --release.

To configure the nodes to pull blocks from, define the env variable ENDPOINTS, e.g. export ENDPOINTS=http://178.62.222.154:1984,http://localhost:1984. The default node is https://arweave.net/.

To start estimating the total ingestion time, use the following command:

./target/release/thegarii poll -h

To compile, set env variables and run in one go, you can use:

ENDPOINTS=http://178.62.222.154:1984,http://localhost:1984 cargo run --release -- poll -h

thegarii's People

Contributors

clearloop, ivanceras, chamorin, dependabot[bot], maoueh, willeslau, redoudou

thegarii's Issues

Command get

Also, I think it would be great if we had a command to ingest a specific block, for operational purposes I guess. Just in case we need to re-pump some historical blocks.

Originally posted by @willeslau in #31 (comment)

Create protobuf file

Create the Protobuf file that fits into this sample code

https://github.com/streamingfast/firehose-acme/blob/master/codec/consolereader.go

Add storage/cache for Arweave blocks

Issue summary

Cache our Arweave blocks in a local database for the firehose extractor.

  • Implement the Cache struct with RocksDB (or a better alternative?)
  • Write data from multiple threads using a file lock
  • Read data from multiple threads in read-only mode
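
The shape of such a cache can be sketched without committing to a storage engine. The example below is an in-memory stand-in, not a RocksDB integration: a `BTreeMap` behind an `RwLock` plays the role of the database, so concurrent readers share the lock while writers take it exclusively. The `Cache` methods and `fill_concurrently` are hypothetical names chosen for illustration.

```rust
use std::collections::BTreeMap;
use std::sync::{Arc, RwLock};
use std::thread;

/// In-memory stand-in for the block cache, keyed by block height.
#[derive(Clone, Default)]
pub struct Cache {
    inner: Arc<RwLock<BTreeMap<u64, Vec<u8>>>>,
}

impl Cache {
    /// Store a block; takes the write lock, so writers are serialized.
    pub fn put(&self, height: u64, block: Vec<u8>) {
        self.inner
            .write()
            .expect("cache lock poisoned")
            .insert(height, block);
    }

    /// Fetch a block; takes the read lock, so readers can run in parallel.
    pub fn get(&self, height: u64) -> Option<Vec<u8>> {
        self.inner
            .read()
            .expect("cache lock poisoned")
            .get(&height)
            .cloned()
    }

    /// Number of cached blocks.
    pub fn len(&self) -> usize {
        self.inner.read().expect("cache lock poisoned").len()
    }
}

/// Write placeholder blocks from `writers` threads, `per_writer` blocks each.
/// The block body is just the height encoded as bytes.
pub fn fill_concurrently(cache: &Cache, writers: u64, per_writer: u64) {
    let handles: Vec<_> = (0..writers)
        .map(|w| {
            let cache = cache.clone();
            thread::spawn(move || {
                for i in 0..per_writer {
                    let height = w * per_writer + i;
                    cache.put(height, height.to_be_bytes().to_vec());
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().expect("writer thread panicked");
    }
}
```

A real implementation would replace the map with the chosen database and decide separately how forked blocks are handled.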

Other information and links

what should we do with the forked blocks?

thegarii hits a field type error in blocks after running for several hours

Describe the bug

  • This log came up after 3 hours of running. It could simply be caused by a temporary loss of internet connection.
[2022-03-21T17:07:57Z WARN  thegarii::service] polling service is down error sending request for url (https://arweave.net/block/height/31700): operation timed out, restarting...
[2022-03-21T17:07:57Z INFO  thegarii::service::polling] fetching blocks 31700..31750/897031...
[2022-03-21T17:08:07Z WARN  thegarii::service] polling service is down error sending request for url (https://arweave.net/block/height/31700): error trying to connect: dns error: failed to lookup address information: Temporary failure in name resolution, restarting...

[2022-03-21T19:05:08Z INFO  thegarii::service::polling] fetching blocks 110300..110350/897031...
[2022-03-21T19:05:21Z WARN  thegarii::service] polling service is down error decoding response body: invalid type: integer `16996080`, expected a string at line 1 column 646, restarting...
[2022-03-21T19:05:21Z INFO  thegarii::service::polling] fetching blocks 110300..110350/897031...
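
The last warning suggests that a block field typed as a string was returned as a JSON integer for some blocks; the log does not say which field sits at column 646. One way to tolerate both encodings is to accept either form and normalize it. With serde, an untagged enum or a custom deserializer is a common route. The standard-library sketch below only shows the normalization step, and the `StrOrInt` type is a hypothetical name.

```rust
/// A scalar that may arrive either as a JSON string or as a JSON integer.
#[derive(Debug, PartialEq)]
pub enum StrOrInt {
    Str(String),
    Int(u64),
}

impl StrOrInt {
    /// Parse a raw JSON scalar token such as `"16996080"` or `16996080`.
    /// Escape sequences inside strings are not handled in this sketch.
    pub fn from_json_token(token: &str) -> Option<Self> {
        let t = token.trim();
        if t.len() >= 2 && t.starts_with('"') && t.ends_with('"') {
            return Some(StrOrInt::Str(t[1..t.len() - 1].to_string()));
        }
        t.parse::<u64>().ok().map(StrOrInt::Int)
    }

    /// Normalize both representations to the string form.
    pub fn into_string(self) -> String {
        match self {
            StrOrInt::Str(s) => s,
            StrOrInt::Int(n) => n.to_string(),
        }
    }
}
```

Both `16996080` and `"16996080"` then end up as the same `String`, so the rest of the block type can keep a single field type.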


Build the CLI tool

Issue summary

https://crates.io/crates/structopt

thegarii v0.1

COMMANDS

start - start services ( including polling, checking, gRPC service )
check - check for and fetch missed blocks
count - count the sync status
export - export data from rocksdb 
import - import data from rocksdb
...


FLAGS

--debug - enable debug mode
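
The issue points at structopt for the real CLI. To make the proposed command set concrete without depending on that crate, here is a standard-library sketch of the dispatch logic. The `Command`, `Invocation` and `parse_args` names are hypothetical, and only the commands and the flag listed above are modelled.

```rust
use std::str::FromStr;

/// The subcommands proposed in the issue.
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum Command {
    Start,
    Check,
    Count,
    Export,
    Import,
}

impl FromStr for Command {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "start" => Ok(Command::Start),
            "check" => Ok(Command::Check),
            "count" => Ok(Command::Count),
            "export" => Ok(Command::Export),
            "import" => Ok(Command::Import),
            other => Err(format!("unknown command: {}", other)),
        }
    }
}

/// A parsed invocation: one subcommand plus the global `--debug` flag.
#[derive(Debug, PartialEq)]
pub struct Invocation {
    pub command: Command,
    pub debug: bool,
}

/// Parse the arguments that follow the program name.
pub fn parse_args<'a, I>(args: I) -> Result<Invocation, String>
where
    I: IntoIterator<Item = &'a str>,
{
    let mut debug = false;
    let mut command = None;
    for arg in args {
        match arg {
            "--debug" => debug = true,
            // The first non-flag argument is taken as the subcommand.
            other if command.is_none() => command = Some(other.parse::<Command>()?),
            other => return Err(format!("unexpected argument: {}", other)),
        }
    }
    command
        .map(|command| Invocation { command, debug })
        .ok_or_else(|| "missing command".to_string())
}
```

For example, `parse_args(["--debug", "start"])` yields the `Start` command with `debug` set.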

host latest block in database

Issue summary

The Request of the firehose service has a field, irreversibility_condition, which potentially sets the limit for confirms. To support it, we need to keep track of the latest block and check against it for each request.
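
A minimal sketch of that check, assuming a block counts as irreversible once at least `confirms` blocks have been built on top of it. The `LatestBlock` type is a hypothetical name, and how `irreversibility_condition` maps to a number of confirms is left open here.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Shared holder for the highest block height seen so far.
pub struct LatestBlock(AtomicU64);

impl LatestBlock {
    pub fn new(height: u64) -> Self {
        LatestBlock(AtomicU64::new(height))
    }

    /// Record a newly observed height; older heights never lower the value.
    pub fn update(&self, height: u64) {
        self.0.fetch_max(height, Ordering::SeqCst);
    }

    pub fn height(&self) -> u64 {
        self.0.load(Ordering::SeqCst)
    }

    /// True when `height` has at least `confirms` blocks on top of it.
    pub fn is_irreversible(&self, height: u64, confirms: u64) -> bool {
        match self.height().checked_sub(confirms) {
            Some(safe) => height <= safe,
            // Fewer than `confirms` blocks exist, so nothing is safe yet.
            None => false,
        }
    }
}
```

Each request handler could share one `LatestBlock` and call `is_irreversible` with the confirms value derived from the request.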

Generate protobuf types in BUILD_DIR

Issue summary

We don't need to write the protobuf types to our work directory. Instead, we can output them to the BUILD_DIR and pull them in with the include! macro.

Dockerfile for arweave node has network issues.

Running the docker image

Protocol 'inet_tcp': register/listen error: econnrefused

Full log:

Launching Erlang Virtual Machine...
Exec: /application/arweave/erts-12.3/bin/erlexec -noinput +Bd -boot /application/arweave/releases/2.5.1.0/start -mode embedded -boot_var SYSTEM_LIB_DIR /application/arweave/lib -config /application/arweave/releases/2.5.1.0/sys.config -args_file /application/arweave/releases/2.5.1.0/vm.args -- foreground +Ktrue +A20 +SDio20 +sbwtvery_long +sbwtdcpuvery_long +sbwtdiovery_long +swtvery_low +swtdcpuvery_low +swtdiovery_low +Bi -run ar main mine mining_addr nKn0ZQET1VcpW6_OdpVOP-Pm6b-_BwagHTg3BtByVkA peer 188.166.200.45 peer 188.166.192.169 peer 163.47.11.64 peer 139.59.51.59 peer 138.197.232.192 peer 178.62.222.154 peer 51.75.206.225 peer 90.70.52.14
Root: /application/arweave
/application/arweave
Protocol 'inet_tcp': register/listen error: econnrefused
Arweave Heartbeat: The Arweave server has terminated. It will restart in 15 seconds.
Arweave Heartbeat: If you would like to avoid this, press control+c to kill the server.

Strangely, this only happens when running it inside of docker.
Running the same script inside of a real cloud instance has no problem.

Pulling from Arweave api with multiple threads

Issue summary

To build our block storage, we need to pull data from the Arweave API as fast as we can. The limit of this pulling solution may be the network speed (timeout for the requests) rather than the speed of writing data to the database.

We can optimize this pulling process with async Rust, but we need to pay attention to how many threads can work at the same time in our program.
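
The concurrency limit can be illustrated with plain OS threads instead of an async runtime, which keeps the sketch free of external crates. A fixed number of workers take the next height from a shared atomic counter until the range is exhausted. `pull_range` and its `fetch` parameter are hypothetical; `fetch` stands in for the HTTP request to the node.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Mutex;
use std::thread;

/// Fetch heights `start..end` using at most `workers` threads.
/// Returns the fetched blocks sorted by height.
pub fn pull_range<F>(start: u64, end: u64, workers: usize, fetch: F) -> Vec<(u64, String)>
where
    F: Fn(u64) -> String + Sync,
{
    // Shared pointer to the next height that no worker has claimed yet.
    let next = AtomicU64::new(start);
    let results = Mutex::new(Vec::new());

    thread::scope(|scope| {
        for _ in 0..workers.max(1) {
            scope.spawn(|| loop {
                let height = next.fetch_add(1, Ordering::Relaxed);
                if height >= end {
                    break;
                }
                let block = fetch(height);
                results
                    .lock()
                    .expect("results lock poisoned")
                    .push((height, block));
            });
        }
    });

    let mut blocks = results.into_inner().expect("results lock poisoned");
    blocks.sort_by_key(|(height, _)| *height);
    blocks
}
```

Because the worker count is fixed up front, the number of in-flight requests never exceeds `workers`, whatever the size of the range.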

Only trace current block in polling service

Issue summary

Since an Arweave node syncs from the latest block back to genesis, the current polling logic is not suitable for nodes that have not finished syncing. Besides, once thegarii is fully synced, we don't need to batch 50 blocks on each new block.

After this refactor, the polling service will trace the latest block and the checking service will sync blocks from latest to genesis.

Make install script for arweave node fail safe, and can be run multiple times.

  • Currently the setup for arweave node is done by bash script in the host cloud instance instead of docker (as it has issue with ports when used inside of docker).

Implementation details

  • The arweave compilation code should reside in a proper linux directory such as /usr/src.
  • The arweave executable should reside in a proper linux directory such as /opt.
  • The system configuration files that are modified should not gain a new entry each time the install script is run, whether intentionally or unintentionally.
    • /etc/sysctl.conf
    • /etc/systemd/user.conf
    • /etc/systemd/system.conf
      Use the sed command to replace the configuration if it already exists, and append only when there is no entry for that configuration yet.
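
The replace-or-append rule is what makes the script safe to re-run. The issue proposes sed for this; the sketch below expresses the same logic as a pure function so that its idempotence is easy to check. `upsert_setting` is a hypothetical helper, and `fs.file-max` in the usage note is only an illustrative key.

```rust
/// Return `contents` with `key` set to `value`.
/// An existing `key=...` or `key = ...` line is replaced in place and later
/// duplicates are dropped; if no such line exists, one is appended.
pub fn upsert_setting(contents: &str, key: &str, value: &str) -> String {
    let new_line = format!("{}={}", key, value);
    let mut found = false;
    let mut lines: Vec<String> = Vec::new();

    for line in contents.lines() {
        let is_match = line
            .split_once('=')
            .map_or(false, |(k, _)| k.trim() == key);
        if is_match {
            // Keep a single entry even if earlier runs left duplicates.
            if !found {
                lines.push(new_line.clone());
                found = true;
            }
        } else {
            lines.push(line.to_string());
        }
    }
    if !found {
        lines.push(new_line);
    }

    let mut out = lines.join("\n");
    out.push('\n');
    out
}
```

Applying `upsert_setting(&text, "fs.file-max", "100000")` to its own output returns the same text, which is the property the install script needs.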

Testing details

We can request the devops team to destroy and recreate the cloud instance.
Run the install script and the arweave node should be live right after.

Create services for fetching arweave blocks

Issue summary

  • pulling service
    • pull arweave block from 0 ~ latest
    • pull block by height with atomic pointer
    • batch write blocks into database
  • checking service
    • trigger checking service internally
    • re-pull missed blocks
  • grpc service
    • adapt to the graph's firehose API

Add command check

Issue summary

Add the command check, which will fetch the missing blocks into the database
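
The core of such a command is finding which heights are absent. A minimal sketch, assuming the stored heights are available as an ordered set: it reports the gaps as inclusive ranges so they can be re-fetched in batches. `missing_ranges` is a hypothetical helper, not part of thegarii.

```rust
use std::collections::BTreeSet;

/// Gaps in `0..=latest` that are absent from `stored`, as inclusive
/// `(from, to)` ranges in ascending order.
/// Heights are assumed to stay far below `u64::MAX`.
pub fn missing_ranges(stored: &BTreeSet<u64>, latest: u64) -> Vec<(u64, u64)> {
    let mut ranges = Vec::new();
    // First height that has not been accounted for yet.
    let mut next = 0u64;

    for &height in stored.range(..=latest) {
        if height > next {
            ranges.push((next, height - 1));
        }
        next = height + 1;
    }
    if next <= latest {
        ranges.push((next, latest));
    }
    ranges
}
```

Each returned range can then be handed to the same fetch routine the polling service uses.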

Command status

Issue summary

status command, show the status of the current database

$ thegarii status

current: <height>
syncing: <height>
...

Test scenario

  • What is the expected time to ingest the historical chain data from a synced node and what is the bottleneck?
  • What are the performance benefits to be gained by taking advantage of the Arweave node's multi-process web server?
  • Is a polling implementation fast enough to process a block and its transactions before the next block? Does this extend to processing data payloads as well?

Writing test scenario

Average time to retrieve a block and all of its transactions
Averaged across many blocks since the number of transactions varies greatly
Same as above but with calls parallelized and the average clock time recorded as a function of number of calling processes
The expectation is that the Erlang Arweave node web server is designed to handle many requests at once and overall block retrieval time could be reduced this way.
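
The first measurement can be sketched as a small timing helper. It calls a retrieval function once per height and reports the mean wall-clock time per call. `average_retrieval_time` and its `retrieve` parameter are hypothetical; `retrieve` stands in for fetching a block and all of its transactions.

```rust
use std::time::{Duration, Instant};

/// Call `retrieve` once per height, sequentially, and return the mean
/// wall-clock time per call. Returns `None` for an empty list of heights.
pub fn average_retrieval_time<F>(heights: &[u64], mut retrieve: F) -> Option<Duration>
where
    F: FnMut(u64),
{
    if heights.is_empty() {
        return None;
    }
    let started = Instant::now();
    for &height in heights {
        retrieve(height);
    }
    // Mean over all calls; the list is assumed to have fewer than 2^32 entries.
    Some(started.elapsed() / heights.len() as u32)
}
```

For the parallel variant, the same helper could wrap a call that fans the heights out over a chosen number of workers, and the result could be recorded per worker count.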

Prepare Test for presenting to client.

Implement extractor

Take all the Arweave data, convert it into Firehose blocks using protobuf, then encode it and send it to the firehose endpoint.

  • conversion from Arweave data type to Firehose block
  • integration of the block into the firehose API

We can explore this dummy blockchain that seems to use firehose
