Comments (9)
I'm a college student studying SBFT, the protocol that concord-bft is based on. I have hit a dead end trying to run this implementation on a cluster of machines, and I am opening this issue in the hope that someone can give me some insight into what I'm doing wrong.
I have 5 machines running a Linux distribution on an isolated network, with a link between every pair of machines. Concord-bft was built with the flag cmake -DBUILD_COMM_TCP_PLAIN=TRUE ..
I am currently testing the simpleTest program, and I used the GenerateConcordKeys tool to initialize the replicas according to my needs:
./build/tools/GenerateConcordKeys -n 4 -f 1 -o private_replica_
On each machine I have the following configuration file in the default location (scripts/sample_config.txt):
`
replicas_config:
- 192.168.2.21:3410
- 192.168.2.22:3420
- 192.168.2.23:3430
- 192.168.2.24:3440
clients_config: 192.168.2.25:4444
`
This simulates a FastPath setup with 4 machines running the server and 1 machine running the client (n = 4, f = 1, c = 0).
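As a sanity check on those parameters, here is a minimal sketch of the quorum arithmetic from the SBFT paper, where n = 3f + 2c + 1, the fast path needs 3f + c + 1 signature shares and the slow path needs 2f + c + 1. The names `SbftQuorums` and `sbft_quorums` are made up for illustration and are not part of concord-bft.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SbftQuorums:
    """Cluster size and quorum thresholds for given f and c (illustrative only)."""
    n: int
    fast_path: int
    slow_path: int


def sbft_quorums(f: int, c: int) -> SbftQuorums:
    """Compute SBFT sizes: f Byzantine replicas, c extra slow/crashed replicas."""
    if f < 0 or c < 0:
        raise ValueError("f and c must be non-negative")
    return SbftQuorums(
        n=3 * f + 2 * c + 1,        # total number of replicas
        fast_path=3 * f + c + 1,    # shares needed to commit on the fast path
        slow_path=2 * f + c + 1,    # shares needed to commit on the slow path
    )


if __name__ == "__main__":
    # The setup described above: f = 1, c = 0.
    print(sbft_quorums(f=1, c=0))
```

With f = 1 and c = 0 the fast-path quorum equals n, so under this arithmetic a single replica that fails to contribute its share is enough to push the protocol onto the slow path.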
The replicas are set up with the following command from the base directory:
./build/tests/simpleTest/server -id 0..N -cf scripts/sample_config.txt
And the client command is:
./build/tests/simpleTest/client -id 4 -i 1 -cf scripts/sample_config.txt
I assume that all replicas can communicate with each other and are ready to receive client requests (the replicas exchange ReplicaStatusMsg messages), but when I start the client program the replicas produce the following output and the protocol cannot proceed.
Replica 0:
Node 0 received ClientRequestMsg (clientId=4 reqSeqNum=6613786833941692416, readOnly=0) from Node 4
Node 0 received ClientRequestMsg (clientId=4 reqSeqNum=6613786833941692416, readOnly=0) from Node 1
ClientRequestMsg is ignored because: request is old, OR primary is current working on a request from the same client, OR queue contains too many requests
Node 0 received ClientRequestMsg (clientId=4 reqSeqNum=6613786833941692416, readOnly=0) from Node 2
ClientRequestMsg is ignored because: request is old, OR primary is current working on a request from the same client, OR queue contains too many requests
Node 0 received ClientRequestMsg (clientId=4 reqSeqNum=6613786833941692416, readOnly=0) from Node 3
ClientRequestMsg is ignored because: request is old, OR primary is current working on a request from the same client, OR queue contains too many requests
Node 0 received ClientRequestMsg (clientId=4 reqSeqNum=6613786833941692416, readOnly=0) from Node 4
ClientRequestMsg is ignored because: request is old, OR primary is current working on a request from the same client, OR queue contains too many requests
Primary initiates slow path for seqNum=1 (currTime=1240247473 timeOfPartProof=1240017635
Node 0 received ClientRequestMsg (clientId=4 reqSeqNum=6613786833941692416, readOnly=0) from Node 4
ClientRequestMsg is ignored because: request is old, OR primary is current working on a request from the same client, OR queue contains too many requests
Node 0 received ClientRequestMsg (clientId=4 reqSeqNum=6613786833941692416, readOnly=0) from Node 4
ClientRequestMsg is ignored because: request is old, OR primary is current working on a request from the same client, OR queue contains too many requests
Node 0 tries to request missing data for seqNumber 1
Node 0 sends ReqMissingDataMsg to 1 - seqNumber 1 , flags=1B0
Node 0 sends ReqMissingDataMsg to 2 - seqNumber 1 , flags=1B0
Node 0 sends ReqMissingDataMsg to 3 - seqNumber 1 , flags=1B0
Node 0 received ReqMissingDataMsg message from Node 3 - seqNumber 1 , flags=1E0
Node 0 received ReqMissingDataMsg message from Node 2 - seqNumber 1 , flags=1E0
Node 0 received ReqMissingDataMsg message from Node 1 - seqNumber 1 , flags=1E0
Replica 1:
Node 1 received ClientRequestMsg (clientId=4 reqSeqNum=6613786833941692416, readOnly=0) from Node 4
Sending ClientRequestMsg to current primary
Node 1 received ClientRequestMsg (clientId=4 reqSeqNum=6613786833941692416, readOnly=0) from Node 4
ClientRequestMsg is ignored because request is old or replica has another pending request from the same client
Node 1 received StartSlowCommitMsg for seqNumber 1
Node 1 starts slow path for seqNumber 1
Node 1 received ClientRequestMsg (clientId=4 reqSeqNum=6613786833941692416, readOnly=0) from Node 4
ClientRequestMsg is ignored because request is old or replica has another pending request from the same client
Node 1 received ClientRequestMsg (clientId=4 reqSeqNum=6613786833941692416, readOnly=0) from Node 4
ClientRequestMsg is ignored because request is old or replica has another pending request from the same client
Node 1 received ReqMissingDataMsg message from Node 0 - seqNumber 1 , flags=1B0
Node 1 received ReqMissingDataMsg message from Node 3 - seqNumber 1 , flags=1E0
Node 1 received ReqMissingDataMsg message from Node 2 - seqNumber 1 , flags=1E0
Node 1 tries to request missing data for seqNumber 1
Node 1 sends ReqMissingDataMsg to 0 - seqNumber 1 , flags=1E0
Node 1 sends ReqMissingDataMsg to 2 - seqNumber 1 , flags=1E0
Node 1 sends ReqMissingDataMsg to 3 - seqNumber 1 , flags=1E0
Replica 2:
Node 2 received ClientRequestMsg (clientId=4 reqSeqNum=6613786833941692416, readOnly=0) from Node 4
Sending ClientRequestMsg to current primary
Failed to create FullProof for seqNumber 1
Node 2 received ClientRequestMsg (clientId=4 reqSeqNum=6613786833941692416, readOnly=0) from Node 4
ClientRequestMsg is ignored because request is old or replica has another pending request from the same client
Node 2 received StartSlowCommitMsg for seqNumber 1
Node 2 starts slow path for seqNumber 1
Node 2 received ClientRequestMsg (clientId=4 reqSeqNum=6613786833941692416, readOnly=0) from Node 4
ClientRequestMsg is ignored because request is old or replica has another pending request from the same client
Node 2 received ClientRequestMsg (clientId=4 reqSeqNum=6613786833941692416, readOnly=0) from Node 4
ClientRequestMsg is ignored because request is old or replica has another pending request from the same client
Node 2 received ReqMissingDataMsg message from Node 0 - seqNumber 1 , flags=1B0
Node 2 tries to request missing data for seqNumber 1
Node 2 sends ReqMissingDataMsg to 0 - seqNumber 1 , flags=1E0
Node 2 sends ReqMissingDataMsg to 1 - seqNumber 1 , flags=1E0
Node 2 sends ReqMissingDataMsg to 3 - seqNumber 1 , flags=1E0
Node 2 received ReqMissingDataMsg message from Node 3 - seqNumber 1 , flags=1E0
Node 2 received ReqMissingDataMsg message from Node 1 - seqNumber 1 , flags=1E0
Replica 3:
Node 3 received ClientRequestMsg (clientId=4 reqSeqNum=6613786833941692416, readOnly=0) from Node 4
Sending ClientRequestMsg to current primary
Node 3 received ClientRequestMsg (clientId=4 reqSeqNum=6613786833941692416, readOnly=0) from Node 4
ClientRequestMsg is ignored because request is old or replica has another pending request from the same client
Node 3 received StartSlowCommitMsg for seqNumber 1
Node 3 starts slow path for seqNumber 1
Node 3 received ClientRequestMsg (clientId=4 reqSeqNum=6613786833941692416, readOnly=0) from Node 4
ClientRequestMsg is ignored because request is old or replica has another pending request from the same client
Node 3 received ClientRequestMsg (clientId=4 reqSeqNum=6613786833941692416, readOnly=0) from Node 4
ClientRequestMsg is ignored because request is old or replica has another pending request from the same client
Node 3 received ReqMissingDataMsg message from Node 0 - seqNumber 1 , flags=1B0
Node 3 tries to request missing data for seqNumber 1
Node 3 sends ReqMissingDataMsg to 0 - seqNumber 1 , flags=1E0
Node 3 sends ReqMissingDataMsg to 1 - seqNumber 1 , flags=1E0
Node 3 sends ReqMissingDataMsg to 2 - seqNumber 1 , flags=1E0
Node 3 received ReqMissingDataMsg message from Node 2 - seqNumber 1 , flags=1E0
Node 3 received ReqMissingDataMsg message from Node 1 - seqNumber 1 , flags=1E0
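To compare the four outputs more easily, a small script can count who received which message type from whom. This is only a sketch that assumes the "Node X received <Msg> ... from Node Y" line shape seen in the logs above. The helper name `summarize` is hypothetical, and lines without a "from Node" part (for example StartSlowCommitMsg) are simply skipped.

```python
import re
from collections import Counter

# Matches lines such as:
#   Node 0 received ClientRequestMsg (clientId=4 ...) from Node 4
#   Node 0 received ReqMissingDataMsg message from Node 3 - seqNumber 1 , flags=1E0
RECEIVED = re.compile(r"Node (\d+) received (\w+)\b.*?from Node (\d+)")


def summarize(lines):
    """Count (receiver, message type, sender) triples found in replica log lines."""
    counts = Counter()
    for line in lines:
        match = RECEIVED.search(line)
        if match:
            receiver, msg_type, sender = match.groups()
            counts[(int(receiver), msg_type, int(sender))] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python summarize_logs.py replica0.log replica1.log ...
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8", errors="replace") as handle:
            for key, count in sorted(summarize(handle).items()):
                print(path, key, count)
```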
Thank you for your help.
from concord-bft.
More detailed logs:
Logs.zip
from concord-bft.
Hi @sharth-bft, can you please build the library with UDP (by running cmake ..)? I think it's related to the communication module; let's try to isolate it.
from concord-bft.
Hi @salieri11, thank you for your answer, and sorry for commenting from another GitHub account.
It produces the same result. From my understanding, the client's request seems to be incomplete in some way, so it cannot be processed on the fast path and the fall-back mode is activated, but this mode cannot proceed either. The replicas then loop, asking each other whether they have the request with a specific ID (in this case only one request is sent). I included a tcpdump capture from the client's machine to confirm that the request is sent to every replica, as described in the paper.
Log.zip
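Separately from the tcpdump capture, basic reachability between the machines can be probed with a short script. This sketch applies only to the TCP build (BUILD_COMM_TCP_PLAIN) and assumes each replica listens on the port listed for it in the configuration file. `parse_endpoint` and `is_reachable` are hypothetical helper names, not concord-bft APIs.

```python
import socket


def parse_endpoint(entry):
    """Turn a config entry such as '- 192.168.2.21:3410' into (host, port)."""
    host, sep, port = entry.strip().lstrip("- ").rpartition(":")
    if not sep or not host:
        raise ValueError(f"not a host:port entry: {entry!r}")
    return host, int(port)


def is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Replica entries copied from the configuration file in the first comment.
    replicas = [
        "- 192.168.2.21:3410",
        "- 192.168.2.22:3420",
        "- 192.168.2.23:3430",
        "- 192.168.2.24:3440",
    ]
    for entry in replicas:
        host, port = parse_endpoint(entry)
        state = "reachable" if is_reachable(host, port) else "NOT reachable"
        print(f"{host}:{port} {state}")
```

Running it from each machine while the replicas are up shows whether the problem is at the connection level or higher up in the protocol. A successful connect says nothing about whether the messages themselves are well formed.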
from concord-bft.
Are you running the latest master? Can you please verify?
from concord-bft.
I was running a relatively recent version, but after switching to the latest master the problem remains the same. I'm sorry for the burden. Is there any other clue that could point me in the right direction?
from concord-bft.
Are you running a modified version of the code, with your own changes, or the original code?
from concord-bft.
I am running the original version. The only change I made was to fill in the configuration file according to my needs (as specified in the first comment of this issue).
from concord-bft.
Let's chat on Slack; we have a dedicated channel for concord-bft. You can join here
from concord-bft.