This is a high-level overview of how a network of Hubs is expected to work with a Simple Sync mechanism.
Simple Sync here refers to #106, i.e. a bulk download of all data from another Hub.
Anatomy of a Hub
A Hub is a collection of services that together synchronize and maintain Farcaster data.
For now, the services we're interested in are:
- Engine: Every Hub has an Engine that processes and stores FC messages
- JSON RPC Server: An RPC server that shares data about the Hub or what's stored in its Engine
- Libp2p Node: A Libp2p node that sends and receives Messages over the GossipSub network. Messages received from a client are sent to the network, while Messages received from the network are replayed into the Engine.
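The relationship between these services can be sketched as follows. This is a minimal illustration, not the actual Hub implementation: the `Message`, `Engine`, and `Hub` classes, their method names, and the `publish` callback (a stand-in for a GossipSub publish call) are all hypothetical. Whether a Hub also stores a client's Message in its own Engine before publishing is not specified here, so the sketch leaves that out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Message:
    """Hypothetical stand-in for a Farcaster Message."""
    id: str
    body: str


class Engine:
    """Processes and stores Messages. Assumed here to ignore duplicates."""

    def __init__(self) -> None:
        self._store: Dict[str, Message] = {}

    def merge(self, message: Message) -> bool:
        # Returns True only if the Message was not already stored.
        if message.id in self._store:
            return False
        self._store[message.id] = message
        return True

    def all_messages(self) -> List[Message]:
        return list(self._store.values())


class Hub:
    def __init__(self, publish: Callable[[Message], None]) -> None:
        self.engine = Engine()
        self._publish = publish  # stand-in for a GossipSub publish call

    def submit_from_client(self, message: Message) -> None:
        # Messages received from a client are sent to the network.
        self._publish(message)

    def on_network_message(self, message: Message) -> None:
        # Messages received from the network are replayed into the Engine.
        self.engine.merge(message)
```

In a test, two Hubs can be wired together by passing one Hub's `on_network_message` as the other's `publish` callback.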
Network Protocols and Transports
A quick overview of the network protocols available and what they'll be used for.
Libp2p & GossipSub
Hubs will use Libp2p to establish a pubsub mesh between each other.
This becomes the primary mechanism through which all Messages that originate from Hub clients are propagated between Hubs.
Example: a user creates a Cast in their FC client, which submits it to the user's configured Hub. That Hub then publishes the Message to the network, and Libp2p is responsible for making sure it is delivered to all Hubs.
JSON RPC
Hubs can also request data from each other explicitly over JSON RPC.
Simple Sync will be implemented using a series of JSON RPC calls.
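One way a "series of JSON RPC calls" could look is a paged bulk download, sketched below. The method name `getMessages` and its `offset`/`limit` parameters are assumptions made for illustration; the actual RPC surface is not defined in this document. The serving Hub is simulated in-process rather than over a real transport.

```python
import json
from typing import Any, Dict, List


def make_request(method: str, params: Dict[str, Any], request_id: int) -> str:
    """Build a JSON-RPC 2.0 request string."""
    return json.dumps(
        {"jsonrpc": "2.0", "method": method, "params": params, "id": request_id}
    )


def handle_request(raw: str, stored: List[Dict[str, Any]]) -> str:
    """Serving Hub: answer one paged download call from its stored messages."""
    req = json.loads(raw)
    if req["method"] != "getMessages":  # hypothetical method name
        error = {"code": -32601, "message": "Method not found"}
        return json.dumps({"jsonrpc": "2.0", "error": error, "id": req["id"]})
    offset = req["params"]["offset"]
    limit = req["params"]["limit"]
    page = stored[offset:offset + limit]
    return json.dumps({"jsonrpc": "2.0", "result": page, "id": req["id"]})


def simple_sync(stored_on_peer: List[Dict[str, Any]], page_size: int = 2) -> List[Dict[str, Any]]:
    """Requesting Hub: page through the peer until a short page arrives."""
    received: List[Dict[str, Any]] = []
    request_id = 0
    while True:
        request_id += 1
        raw = make_request(
            "getMessages", {"offset": len(received), "limit": page_size}, request_id
        )
        response = json.loads(handle_request(raw, stored_on_peer))
        page = response["result"]
        received.extend(page)
        if len(page) < page_size:
            return received
```

Paging is a design choice of this sketch; a single call returning everything would also match the description of Simple Sync as a bulk download.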
Running a Hub
Bootstrapping
Bootstrapping a Hub to an existing network requires 2 things.
- Simple Sync to set up the Engine from another Hub
a. This is the first thing a Hub does on boot up
b. The Engine is set up by replaying all the messages received from the other Hub
- A Libp2p node to receive new Messages from other Hubs on the network
a. While the Engine is being synced, Messages received are buffered
b. Once Simple Sync is complete, the buffered Messages are replayed into the Engine
From this point we expect the Hub to be in sync with the rest of the Hubs on the network.
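The bootstrapping sequence above can be sketched as a small state machine. The class and method names are hypothetical, the Engine is reduced to a dictionary, and the sketch assumes that merging a Message is idempotent, so a Message that arrives both in the Simple Sync snapshot and in the buffer does no harm.

```python
from typing import Dict, Iterable, List, Tuple


class BootstrappingHub:
    def __init__(self) -> None:
        self.engine: Dict[str, str] = {}  # message id -> body; stand-in for the Engine
        self.buffer: List[Tuple[str, str]] = []
        self.synced = False

    def _merge(self, msg_id: str, body: str) -> None:
        # Assumed idempotent: a Message already stored is left untouched.
        self.engine.setdefault(msg_id, body)

    def on_network_message(self, msg_id: str, body: str) -> None:
        if not self.synced:
            # While the Engine is being synced, received Messages are buffered.
            self.buffer.append((msg_id, body))
        else:
            self._merge(msg_id, body)

    def complete_sync(self, snapshot: Iterable[Tuple[str, str]]) -> None:
        # Replay everything downloaded from the other Hub first.
        for msg_id, body in snapshot:
            self._merge(msg_id, body)
        # Then replay whatever arrived over the network in the meantime.
        for msg_id, body in self.buffer:
            self._merge(msg_id, body)
        self.buffer.clear()
        self.synced = True
```

Without the idempotency assumption, the Hub would need to deduplicate the buffer against the snapshot before replaying it.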
Runtime
During normal operation, Hubs rely solely on Libp2p to deliver new messages and stay in sync.
Liveness
- If a Hub process is restarted, the whole Bootstrapping process must be rerun as well. This is a major limitation of this Sync mechanism.
- If a Hub loses network connectivity entirely for:
a. A short duration (5s): GossipSub will republish messages to the Hub. This duration is configurable.
b. A long duration: The Hub will need to rerun Bootstrapping.
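The liveness rules above reduce to a small decision function. The sketch below is illustrative only: the names are hypothetical, and the 5 second default simply mirrors the figure quoted above for the configurable GossipSub window.

```python
from enum import Enum


class Recovery(Enum):
    WAIT_FOR_GOSSIP = "wait_for_gossip"  # GossipSub republishes missed messages
    REBOOTSTRAP = "rebootstrap"          # rerun Simple Sync from another Hub


def recovery_action(
    process_restarted: bool,
    outage_seconds: float,
    republish_window_seconds: float = 5.0,
) -> Recovery:
    # A restarted process always has to bootstrap again.
    if process_restarted:
        return Recovery.REBOOTSTRAP
    # A short outage is covered by GossipSub republishing.
    if outage_seconds <= republish_window_seconds:
        return Recovery.WAIT_FOR_GOSSIP
    # Anything longer requires a fresh bootstrap.
    return Recovery.REBOOTSTRAP
```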
Gaps
Divergence in state due to Network loss or GossipSub failure
This is the primary limitation of the current proposal. Hubs have no way to know that they're drifting out of sync with each other.
Some things we could do in the near term:
- We could add to Gossip the total number of messages, or the total number of messages of a given type (e.g. Casts). This would give Hubs some indication of ongoing divergence of state. Once the delta passes some threshold, we can trigger a full re-sync.
- We could periodically download entire sets from another Hub. This is expensive.
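The count-based idea could be sketched as below. The function names, the per-type count dictionaries, and the `max_delta` threshold are all hypothetical. Counts are only a coarse signal: two Hubs can hold the same number of messages and still differ in content, so matching counts do not prove the Hubs are in sync.

```python
from typing import Dict


def divergence(local_counts: Dict[str, int], peer_counts: Dict[str, int]) -> int:
    """Sum of absolute per-type differences between two Hubs' message counts."""
    message_types = set(local_counts) | set(peer_counts)
    return sum(
        abs(local_counts.get(t, 0) - peer_counts.get(t, 0)) for t in message_types
    )


def should_resync(
    local_counts: Dict[str, int], peer_counts: Dict[str, int], max_delta: int
) -> bool:
    """Trigger a full re-sync once the delta exceeds the allowed threshold."""
    return divergence(local_counts, peer_counts) > max_delta
```

For example, a Hub holding 10 Casts that hears a peer report 12 has a delta of 2, which triggers a re-sync with `max_delta=1` but not with `max_delta=5`.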