Comments (9)

sharth-bft commented on August 21, 2024

I'm a college student studying SBFT, the protocol that concord-bft is based on. I hit a dead end trying to run this implementation on a cluster of machines, and I'm opening this issue hoping someone can tell me what I'm doing wrong.

I have 5 machines running a Linux distribution on an isolated network, with a link between every pair of machines. Concord-bft was built with the flag cmake -DBUILD_COMM_TCP_PLAIN=TRUE ..
I am currently testing the simpleTest program, and I used GenerateConcordKeys to initialize the replicas according to my needs:
./build/tools/GenerateConcordKeys -n 4 -f 1 -o private_replica_

On each machine I have the following configuration file in the default location (scripts/sample_config.txt):
`
replicas_config:
- 192.168.2.21:3410
- 192.168.2.22:3420
- 192.168.2.23:3430
- 192.168.2.24:3440
clients_config: 192.168.2.25:4444
    `
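
As a rough cross-check of a file like the one above, a minimal Python sketch could parse it and confirm that it lists four replica addresses and one client address. This is not part of concord-bft: `parse_config` is a hypothetical helper, and the layout it accepts (a `name:` header followed by `host:port` entries, either on following lines or inline) is only inferred from the file as posted, not from the real parser.

```python
import re

# Assumed layout, inferred from the posted file (not from concord-bft's parser):
# a "name:" header line, then "host:port" entries, optionally prefixed by "-".
ENTRY = re.compile(r"^-?\s*([\w.\-]+):(\d+)$")
HEADER = re.compile(r"^(\w+):\s*(.*)$")

SAMPLE = """\
replicas_config:
- 192.168.2.21:3410
- 192.168.2.22:3420
- 192.168.2.23:3430
- 192.168.2.24:3440
clients_config: 192.168.2.25:4444
"""


def parse_config(text):
    """Return {section_name: [(host, port), ...]} for the assumed layout."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        entry = ENTRY.match(line)
        if entry is None:
            # Not a bare host:port line, so it must be a section header.
            header = HEADER.match(line)
            if header is None:
                raise ValueError(f"unrecognised line: {raw!r}")
            current = header.group(1)
            sections.setdefault(current, [])
            rest = header.group(2)
            if not rest:
                continue
            # Header with an inline entry, e.g. "clients_config: host:port".
            entry = ENTRY.match(rest)
            if entry is None:
                raise ValueError(f"unrecognised entry: {raw!r}")
        if current is None:
            raise ValueError("entry found before any section header")
        sections[current].append((entry.group(1), int(entry.group(2))))
    return sections


if __name__ == "__main__":
    config = parse_config(SAMPLE)
    print(len(config["replicas_config"]), "replicas,",
          len(config["clients_config"]), "clients")
```

Running the same check on every machine would at least show whether all five copies of the file agree.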

This simulates a fast-path setup with 4 machines running the server and 1 machine running the client (n = 4, f = 1, c = 0).
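
For reference, the sizes implied by these parameters can be worked out from the formulas in the SBFT paper: n = 3f + 2c + 1 replicas, 3f + c + 1 replies for the fast path, and 2f + c + 1 for the slow path. The helper below is a hypothetical illustration of that arithmetic, not concord-bft code, and the implementation's actual thresholds may differ in detail.

```python
def sbft_sizes(f, c):
    """Replica count and quorum sizes from the SBFT paper's formulas.

    f: number of Byzantine replicas tolerated
    c: number of additional slow or crashed replicas tolerated on the fast path
    """
    return {
        "n": 3 * f + 2 * c + 1,
        "fast_path_quorum": 3 * f + c + 1,
        "slow_path_quorum": 2 * f + c + 1,
    }


if __name__ == "__main__":
    # The setup in this post: f = 1, c = 0.
    print(sbft_sizes(1, 0))
```

With f = 1 and c = 0 the fast-path quorum equals n, so under these formulas the fast path only completes if all 4 replicas answer in time.
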
The replicas are started with the following command from the base directory, using ids 0 through 3, one per machine:
./build/tests/simpleTest/server -id 0..N -cf scripts/sample_config.txt
And the client command is:
./build/tests/simpleTest/client -id 4 -i 1 -cf scripts/sample_config.txt
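
To spell out the `-id 0..N` shorthand, a small Python sketch can print the exact command for each machine. It only reuses the paths and flags quoted above; `launch_commands` is a hypothetical helper, and giving the client the id that follows the last replica is an assumption taken from this setup (replicas 0 to 3, client 4).

```python
def launch_commands(num_replicas, config_path="scripts/sample_config.txt"):
    """Build the per-machine command lines for simpleTest.

    Assumes the client id is the first id after the replica ids,
    as in this post (replicas 0..3, client 4).
    """
    servers = [
        f"./build/tests/simpleTest/server -id {replica_id} -cf {config_path}"
        for replica_id in range(num_replicas)
    ]
    client = (
        f"./build/tests/simpleTest/client -id {num_replicas} -i 1 -cf {config_path}"
    )
    return servers, client


if __name__ == "__main__":
    servers, client = launch_commands(4)
    for command in servers:
        print(command)
    print(client)
```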

I assume that all replicas can communicate with each other and are ready to receive client requests (the replicas exchange ReplicaStatusMsg messages), but when I start the client program the replicas produce the following output and the protocol cannot proceed.

Replica 0:

Node 0 received ClientRequestMsg (clientId=4 reqSeqNum=6613786833941692416, readOnly=0) from Node 4
Node 0 received ClientRequestMsg (clientId=4 reqSeqNum=6613786833941692416, readOnly=0) from Node 1
ClientRequestMsg is ignored because: request is old, OR primary is current working on a request from the same client, OR queue contains too many requests
Node 0 received ClientRequestMsg (clientId=4 reqSeqNum=6613786833941692416, readOnly=0) from Node 2
ClientRequestMsg is ignored because: request is old, OR primary is current working on a request from the same client, OR queue contains too many requests
Node 0 received ClientRequestMsg (clientId=4 reqSeqNum=6613786833941692416, readOnly=0) from Node 3
ClientRequestMsg is ignored because: request is old, OR primary is current working on a request from the same client, OR queue contains too many requests
Node 0 received ClientRequestMsg (clientId=4 reqSeqNum=6613786833941692416, readOnly=0) from Node 4
ClientRequestMsg is ignored because: request is old, OR primary is current working on a request from the same client, OR queue contains too many requests
Primary initiates slow path for seqNum=1 (currTime=1240247473 timeOfPartProof=1240017635
Node 0 received ClientRequestMsg (clientId=4 reqSeqNum=6613786833941692416, readOnly=0) from Node 4
ClientRequestMsg is ignored because: request is old, OR primary is current working on a request from the same client, OR queue contains too many requests
Node 0 received ClientRequestMsg (clientId=4 reqSeqNum=6613786833941692416, readOnly=0) from Node 4
ClientRequestMsg is ignored because: request is old, OR primary is current working on a request from the same client, OR queue contains too many requests
Node 0 tries to request missing data for seqNumber 1
Node 0 sends ReqMissingDataMsg to 1 - seqNumber 1 , flags=1B0
Node 0 sends ReqMissingDataMsg to 2 - seqNumber 1 , flags=1B0
Node 0 sends ReqMissingDataMsg to 3 - seqNumber 1 , flags=1B0
Node 0 received ReqMissingDataMsg message from Node 3 - seqNumber 1 , flags=1E0
Node 0 received ReqMissingDataMsg message from Node 2 - seqNumber 1 , flags=1E0
Node 0 received ReqMissingDataMsg message from Node 1 - seqNumber 1 , flags=1E0

Replica 1:

Node 1 received ClientRequestMsg (clientId=4 reqSeqNum=6613786833941692416, readOnly=0) from Node 4
Sending ClientRequestMsg to current primary
Node 1 received ClientRequestMsg (clientId=4 reqSeqNum=6613786833941692416, readOnly=0) from Node 4
ClientRequestMsg is ignored because request is old or replica has another pending request from the same client
Node 1 received StartSlowCommitMsg for seqNumber 1
Node 1 starts slow path for seqNumber 1
Node 1 received ClientRequestMsg (clientId=4 reqSeqNum=6613786833941692416, readOnly=0) from Node 4
ClientRequestMsg is ignored because request is old or replica has another pending request from the same client
Node 1 received ClientRequestMsg (clientId=4 reqSeqNum=6613786833941692416, readOnly=0) from Node 4
ClientRequestMsg is ignored because request is old or replica has another pending request from the same client
Node 1 received ReqMissingDataMsg message from Node 0 - seqNumber 1 , flags=1B0
Node 1 received ReqMissingDataMsg message from Node 3 - seqNumber 1 , flags=1E0
Node 1 received ReqMissingDataMsg message from Node 2 - seqNumber 1 , flags=1E0
Node 1 tries to request missing data for seqNumber 1
Node 1 sends ReqMissingDataMsg to 0 - seqNumber 1 , flags=1E0
Node 1 sends ReqMissingDataMsg to 2 - seqNumber 1 , flags=1E0
Node 1 sends ReqMissingDataMsg to 3 - seqNumber 1 , flags=1E0

Replica 2:

Node 2 received ClientRequestMsg (clientId=4 reqSeqNum=6613786833941692416, readOnly=0) from Node 4
Sending ClientRequestMsg to current primary
Failed to create FullProof for seqNumber 1
Node 2 received ClientRequestMsg (clientId=4 reqSeqNum=6613786833941692416, readOnly=0) from Node 4
ClientRequestMsg is ignored because request is old or replica has another pending request from the same client
Node 2 received StartSlowCommitMsg for seqNumber 1
Node 2 starts slow path for seqNumber 1
Node 2 received ClientRequestMsg (clientId=4 reqSeqNum=6613786833941692416, readOnly=0) from Node 4
ClientRequestMsg is ignored because request is old or replica has another pending request from the same client
Node 2 received ClientRequestMsg (clientId=4 reqSeqNum=6613786833941692416, readOnly=0) from Node 4
ClientRequestMsg is ignored because request is old or replica has another pending request from the same client
Node 2 received ReqMissingDataMsg message from Node 0 - seqNumber 1 , flags=1B0
Node 2 tries to request missing data for seqNumber 1
Node 2 sends ReqMissingDataMsg to 0 - seqNumber 1 , flags=1E0
Node 2 sends ReqMissingDataMsg to 1 - seqNumber 1 , flags=1E0
Node 2 sends ReqMissingDataMsg to 3 - seqNumber 1 , flags=1E0
Node 2 received ReqMissingDataMsg message from Node 3 - seqNumber 1 , flags=1E0
Node 2 received ReqMissingDataMsg message from Node 1 - seqNumber 1 , flags=1E0

Replica 3:

Node 3 received ClientRequestMsg (clientId=4 reqSeqNum=6613786833941692416, readOnly=0) from Node 4
Sending ClientRequestMsg to current primary
Node 3 received ClientRequestMsg (clientId=4 reqSeqNum=6613786833941692416, readOnly=0) from Node 4
ClientRequestMsg is ignored because request is old or replica has another pending request from the same client
Node 3 received StartSlowCommitMsg for seqNumber 1
Node 3 starts slow path for seqNumber 1
Node 3 received ClientRequestMsg (clientId=4 reqSeqNum=6613786833941692416, readOnly=0) from Node 4
ClientRequestMsg is ignored because request is old or replica has another pending request from the same client
Node 3 received ClientRequestMsg (clientId=4 reqSeqNum=6613786833941692416, readOnly=0) from Node 4
ClientRequestMsg is ignored because request is old or replica has another pending request from the same client
Node 3 received ReqMissingDataMsg message from Node 0 - seqNumber 1 , flags=1B0
Node 3 tries to request missing data for seqNumber 1
Node 3 sends ReqMissingDataMsg to 0 - seqNumber 1 , flags=1E0
Node 3 sends ReqMissingDataMsg to 1 - seqNumber 1 , flags=1E0
Node 3 sends ReqMissingDataMsg to 2 - seqNumber 1 , flags=1E0
Node 3 received ReqMissingDataMsg message from Node 2 - seqNumber 1 , flags=1E0
Node 3 received ReqMissingDataMsg message from Node 1 - seqNumber 1 , flags=1E0
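
To make the pattern in the logs above easier to see, a short Python sketch can count how many ReqMissingDataMsg messages each replica sends per sequence number. The regular expression is written against the log lines exactly as they appear in this post; `missing_data_requests` is a hypothetical helper, and other concord-bft versions may format these lines differently.

```python
import re
from collections import Counter

# Matches lines such as:
#   Node 0 sends ReqMissingDataMsg to 1 - seqNumber 1 , flags=1B0
SENT = re.compile(r"Node (\d+) sends ReqMissingDataMsg to (\d+) - seqNumber (\d+)")


def missing_data_requests(log_text):
    """Count ReqMissingDataMsg sends, keyed by (sender, seqNumber)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = SENT.search(line)
        if match:
            sender, _target, seq_number = (int(g) for g in match.groups())
            counts[(sender, seq_number)] += 1
    return counts


if __name__ == "__main__":
    excerpt = """\
Node 0 tries to request missing data for seqNumber 1
Node 0 sends ReqMissingDataMsg to 1 - seqNumber 1 , flags=1B0
Node 0 sends ReqMissingDataMsg to 2 - seqNumber 1 , flags=1B0
Node 0 sends ReqMissingDataMsg to 3 - seqNumber 1 , flags=1B0
Node 0 received ReqMissingDataMsg message from Node 3 - seqNumber 1 , flags=1E0
"""
    for (sender, seq_number), count in sorted(missing_data_requests(excerpt).items()):
        print(f"node {sender} seqNumber {seq_number}: {count} requests sent")
```

A count that keeps growing for the same sequence number across a longer log would suggest that the replicas are stuck re-requesting the same data rather than making progress.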

Thank you for your help.

from concord-bft.

sharth-bft commented on August 21, 2024

More detailed logs:
Logs.zip

salieri11 commented on August 21, 2024

Hi @sharth-bft, can you please build the library with UDP (by running cmake ..)? I think it's related to the communication module; let's try to isolate it.

sharth-bft commented on August 21, 2024

Hi @salieri11, thank you for your answer, and sorry for commenting from another GitHub account.
It gives the same result. From my understanding, the client's request may be incomplete in some way, so it cannot be processed on the fast path and the fallback mode is activated, but that mode cannot proceed either. The replicas then loop, asking each other whether they have the request with a specific id (in this case only one request is sent). I included a tcpdump capture from the client's machine to confirm that the request is sent to every replica, as described in the paper.
Log.zip

salieri11 commented on August 21, 2024

Are you running the latest master? Can you please verify?

sharth-bft commented on August 21, 2024

I was running a relatively recent version, but after switching to the latest one the problem remains the same. Sorry for the burden. Is there any other clue that could point me in the right direction?

salieri11 commented on August 21, 2024

Are you running a modified version of the code, with your own changes, or the original code?

sharth-bft commented on August 21, 2024

I am running the original version. The only change I made was to fill in the configuration file according to my needs (as specified in the first comment).

salieri11 commented on August 21, 2024

Let's chat on Slack; we have a dedicated channel for concord-bft. You can join here.
