Comments (6)
Okay, it's decided then! I also like its simplicity and determinism. I am not fully convinced of 50-50. But in order to reason strongly about that, we would need to perform some experimentation. So, for now, I will implement this approach.
from casm.
@aratz-lasa Thanks for the detailed writeup and for the link to the paper!
General Remarks
Just to double-check my understanding, the proposed protocol corresponds to what the paper would call (rand, head, pushpull)
, right? Assuming so, what are your thoughts about this point in the general discussion?
The only scenario when head view selection is not desirable is temporary network partitioning. In that case, with head view selection all partitions will forget about each other very quickly and so quick self-repair becomes a disadvantage. In practical applications the slow and quick self-healing mechanisms should be combined.
I wholeheartedly agree that a head
-based view-selection is desirable for its rapid convergence, but I am somewhat concerned about its propensity to worsen partitions. Do you have any thoughts on what a mixed approach might look like? Intuitively, it seems like only a little bit of randomness would be needed, so perhaps we could define a view selection rule that selects the n-1
neighbors from the list head, and then appends an an n
th random neighbor?
Let's not worry about this immediately, but definitely make a note of this somewhere (ideally: a code comment + new issue) and return to it soon.
Incidentally, it would be nice to reproduce part of Jacobsen et al.'s analyses in Testground, so that we can select the optimal balance between rapid convergence and rapid self-healing empirically.
Answers to your Queries
I suggest creating a new struct for representing peers that contains two fields peer.AddrInfo and Hop. Then, these structs could be stored in one of the two following places: [...]
I'm in favor of Option 1, i.e. storing this struct in neighborhood.vtx
.
Note that peer.PeerRecord
has a Seq
fields that seems appropriate for this kind of use, so perhaps we can just use PeerRecord
directly? We're using it to transmit addressing information anyway, so we may as well (and I don't think the warning about Seq
actually affects us).
from casm.
Just to double-check my understanding, the proposed protocol corresponds to what the paper would call
(rand, head, pushpull)
, right? Assuming so, what are your thoughts about this point in the general discussion?
Yes, indeed. My thoughts are the following:
- Communication symmetry:
push
is unstable and converges slowly, whilepull
converges into a star topology.pushpull
is superior regarding convergence and stability. - Peer selection:
head
leads to severe clustering, whiletail
andrand
are pretty similar. Thetail
selection introduces more randomness in exchange for a slower convergence. Therefore, I suggest to userand
, as it provides a faster self-healing. - View selection:
tail
cannot handle dynamism when new nodes join.head
repairs the overlay much faster and the degree distribution is more stable thanrand
selection. However, therand
selection provides a lower clustering coefficient.
I wholeheartedly agree that a
head
-based view-selection is desirable for its rapid convergence, but I am somewhat concerned about its propensity to worsen partitions. Do you have any thoughts on what a mixed approach might look like? Intuitively, it seems like only a little bit of randomness would be needed, so perhaps we could define a view selection rule that selects then-1
neighbors from the list head, and then appends an ann
th random neighbor?
In the article they say that partitioning is not due to head
view selection, but because of push
communication pattern. This is what they say: "The partitioning of the push version of the protocols is due to the fact that it is only the first, central node that can distribute new links to all new members.". However, I do see benefits for using rand view selection: (1) provides lower clustering coefficient and (2) during temporary partitions rand
's lower convergence is desired, otherwise all partitions will forget about each other very quickly.
Note that
peer.PeerRecord
has aSeq
fields that seems appropriate for this kind of use, so perhaps we can just usePeerRecord
directly? We're using it to transmit addressing information anyway, so we may as well (and I don't think the warning aboutSeq
actually affects us).
That was my first thought. Initially, I have even implemented PeerRecord
in the PR draft #2 . However, when PeerRecord's source code, says Seq
is expected to be used as a timestamp (calling peer.TimestampSeq())? Maybe I am wrong and there is no any real strong desire to enforce a timestamping. In that case, it would be the ideal fit to use the Seq
field as the Hop
counter.
from casm.
In the article they say that partitioning is not due to head view selection, but because of push communication pattern. This is what they say: "The partitioning of the push version of the protocols is due to the fact that it is only the first, central node that can distribute new links to all new members.".
It seems like there are actually two separate issues.
- In View Propagation:
push
causes partitions (for the reason you mentioned). - In View Selection:
head
aggravates partitions because we "forget" peers from the other partition.
However, I do see benefits for using rand view selection: (1) provides lower clustering coefficient and (2) during temporary partitions rand's lower convergence is desired, otherwise all partitions will forget about each other very quickly.
My concern with rand
is this bit:
Also, head view selection repairs the overlay exponentially fast whereas random view selection can at best achieve linear speed, which can hardly be considered scalable
The paper suggests using some (undefined) mix of both.
I see several possiblilties:
- Merge the views using 50% head nodes and 50% random nodes.
- Have
/casm/net/<version>/<ns>/gossip/rand
and/casm/net/<version>/<ns>/gossip/head
subprotocols, and have the initiating node pick one of the two using some (weighted?) random function.
Though perhaps these options are all equivalent... 🤔
A third possibility would allow us to set the proportion of rand/head more or less continuously. As with option 2, we define subprotocols .../gossip/rand
and .../gossip/head
. During peer exchange, each peer performs xor(id(self), id(neighbor))
. If the result is lower than some arbitrary threshold, select rand
, else select head
.
However, when PeerRecord's source code, says Seq is expected to be used as a timestamp (calling peer.TimestampSeq())?
My reading is that you can use Seq
as a timestamp, but it's not required.
The issue they warn about pertains to adding peer addresses to the local Peerstore
via the CertifedAddrBook.ConsumePeerRecord
, which will nop if the sequence number is smaller than the one presently stored (as you said).
Our Join
method establishes a connection using peer.AddrInfo
, so this isn't a concern for us. I say go for it! 🙂
from casm.
A third possibility would allow us to set the proportion of rand/head more or less continuously. As with option 2, we define subprotocols
.../gossip/rand
and.../gossip/head
. During peer exchange, each peer performsxor(id(self), id(neighbor))
. If the result is lower than some arbitrary threshold, selectrand
, else selecthead
.
I think a dynamic solution is great. However, I don't see why we would use the xor distance. Unlike in Kademlia DHT, in unstructured gossiping peers are not connected based on their xor distance. So, xor distance would be a rather random method.
Our
Join
method establishes a connection usingpeer.AddrInfo
, so this isn't a concern for us. I say go for it!
Nice, I will work on that.
from casm.
I think a dynamic solution is great. However, I don't see why we would use the xor distance. Unlike in Kademlia DHT, in unstructured gossiping peers are not connected based on their xor distance. So, xor distance would be a rather random method.
The idea was simply to randomize the decision, but I suppose it's easier to just call rand()
😄
The more I think about this, the more I'm of the opinion that we should use an approach that mixes strategies 1&2 from my previous comment. Basically, we would have the /rand
and /head
subprotocols and a node would select a protocol by comparing its ID to its neighbors, as such:
if this.id < other.id:
do_rand_exchange()
else:
do_head_exchange()
Although we lose the ability to fine-tune the proportion of rand
to head
, we gain a couple of nice properties:
- It's simple.
- It guarantees a 50-50 split across the whole cluster, always.
- It makes the interaction between any pair of nodes deterministic, which is helpful in debugging and analysis.
- The implementation of sub-protocols makes the code more explicit and self-documenting.
Besides, I have a feeling that 50-50 is good enough. Intuitively, convergence should still be exponentially fast (though we should test this).
from casm.
Related Issues (20)
- Investigate error when exceeding 30 nodes in testground HOT 1
- Investigate traffic shaping in Testground HOT 1
- Verify context usage in fx.Lifecycle HOT 1
- Add network mapping primitives
- TestRoutingTable_concurrent fails intermittently HOT 1
- Consider bootstrap record with separate signature domain. HOT 2
- Data race in PeX HOT 1
- Silence EOF errors from boot socket when restarting. HOT 1
- Reduce allocations when updating routing table
- Block in Vat.Embargo until host unregisters all stream handlers
- Add support for LZ4-compressed transport to casm.Vat
- Suppress `context cancelled` debug logs in Crawler
- Panic in BenchmarkIterator HOT 1
- Remove go.uber.org/atomic HOT 1
- TestSampler/Abort is flaky
- Switch to prometheus-based metrics
- Update QUIC multiaddrs
- Fix ordering of records when using LowerBound HOT 1
- Set multicast interface based on GOOS
- Add listen addr to casm discover template
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from casm.