gossip's Introduction

This project has moved to the Apache Software Foundation:

https://github.com/apache/incubator-gossip (Apache site: http://gossip.incubator.apache.org)

Gossip

A gossip protocol is a method by which a group of nodes discover one another and check the liveness of the members of a cluster. More information can be found at http://en.wikipedia.org/wiki/Gossip_protocol.
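
To make the idea concrete, here is a minimal sketch of the kind of merge step a gossip-based membership protocol typically performs: each node keeps the highest heartbeat it has seen per member and folds in the tables it receives from peers. The class and method names (MembershipTable, record, mergeFrom) are hypothetical and are not taken from this library.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical membership table: member id -> highest heartbeat seen so far. */
final class MembershipTable {
    private final Map<String, Long> heartbeats = new HashMap<>();

    /** Records a heartbeat for a member, keeping the larger of the old and new values. */
    void record(String memberId, long heartbeat) {
        heartbeats.merge(memberId, heartbeat, Math::max);
    }

    /** Merges a table received from a peer; for each member the higher heartbeat wins. */
    void mergeFrom(MembershipTable remote) {
        remote.heartbeats.forEach(this::record);
    }

    /** Returns the highest heartbeat seen for the member, or -1 if it is unknown. */
    long heartbeatOf(String memberId) {
        return heartbeats.getOrDefault(memberId, -1L);
    }

    int size() {
        return heartbeats.size();
    }

    public static void main(String[] args) {
        MembershipTable a = new MembershipTable();
        MembershipTable b = new MembershipTable();
        a.record("node-1", 3);
        b.record("node-1", 1);
        b.record("node-2", 7);
        a.mergeFrom(b); // a learns about node-2 and keeps its newer view of node-1
        System.out.println(a.size() + " members known");
    }
}
```

Repeating this exchange with randomly chosen peers is what lets every node eventually learn about every other node without a central registry.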

The original implementation was forked from https://code.google.com/p/java-gossip/. Several bug fixes and changes have already been added.

Usage

To gossip you need one or more seed nodes. The seed list is simply the set of members a node initially connects to.

  GossipSettings settings = new GossipSettings();
  int seedNodes = 3;
  List<GossipMember> startupMembers = new ArrayList<>();
  for (int i = 1; i < seedNodes+1; ++i) {
    startupMembers.add(new RemoteGossipMember("127.0.0." + i, 2000, i + ""));
  }

Here we start five gossip processes and check that they discover each other. (Normally these run on different hosts, but here we give each process a distinct local IP address.)

  List<GossipService> clients = new ArrayList<>();
  int clusterMembers = 5;
  for (int i = 1; i < clusterMembers+1; ++i) {
    GossipService gossipService = new GossipService("127.0.0." + i, 2000, i + "",
      LogLevel.DEBUG, startupMembers, settings, null);
    clients.add(gossipService);
    gossipService.start();
  }

Later we can check that the nodes have discovered each other. Each of the five nodes should see the other four:

  Thread.sleep(10000);
  for (int i = 0; i < clusterMembers; ++i) {
    Assert.assertEquals(4, clients.get(i).get_gossipManager().getMemberList().size());
  }

Usage with Settings File

For a very simple client setup with a settings file you first need a JSON file such as:

[{
  "id":"419af818-0114-4c7b-8fdb-952915335ce4",
  "port":50001,
  "gossip_interval":1000,
  "cleanup_interval":10000,
  "members":[
    {"host":"127.0.0.1", "port":50000}
  ]
}]

where:

  • id - is a unique id for this node (you can use any string, but above we use a UUID)
  • port - the port to use on the default adapter on the node's machine
  • gossip_interval - how often (in milliseconds) to gossip list of members to other node(s)
  • cleanup_interval - when to remove 'dead' nodes (in milliseconds)
  • members - initial seed nodes
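
To illustrate how gossip_interval and cleanup_interval relate, here is a minimal sketch that assumes cleanup_interval acts as a timeout measured from the last time a member was heard from. That assumption, and the names CleanupSketch, heardFrom and deadMembers, are illustrative only; this is not the library's implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical timeout-based failure detector; not the library's implementation. */
final class CleanupSketch {
    private final long cleanupIntervalMs;
    private final Map<String, Long> lastHeardMs = new HashMap<>();

    CleanupSketch(long cleanupIntervalMs) {
        this.cleanupIntervalMs = cleanupIntervalMs;
    }

    /** Call whenever gossip about a member arrives. */
    void heardFrom(String memberId, long nowMs) {
        lastHeardMs.put(memberId, nowMs);
    }

    /** Returns the members that have been silent for longer than the cleanup interval. */
    List<String> deadMembers(long nowMs) {
        List<String> dead = new ArrayList<>();
        for (Map.Entry<String, Long> entry : lastHeardMs.entrySet()) {
            if (nowMs - entry.getValue() > cleanupIntervalMs) {
                dead.add(entry.getKey());
            }
        }
        return dead;
    }
}
```

Under this assumption, the values in the JSON above (gossip every 1000 ms, clean up after 10000 ms) would mean a member has to stay silent for roughly ten gossip rounds before it is reported dead.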

Then starting a local node is as simple as:

GossipService gossipService = new GossipService(
  StartupSettings.fromJSONFile( "node_settings.json" )
);
gossipService.start();

When all is done, shut down with:

gossipService.shutdown();

Event Listener

The status can be polled using the getters that return immutable lists.

   public List<LocalGossipMember> getMemberList()
   public List<LocalGossipMember> getDeadList()

These can be accessed from the GossipManager on your GossipService, for example: gossipService.get_gossipManager().getMemberList();

Users can also attach an event listener:

  GossipService gossipService = new GossipService("127.0.0." + i, 2000, i + "", LogLevel.DEBUG,
          startupMembers, settings,
          new GossipListener(){
    @Override
    public void gossipEvent(GossipMember member, GossipState state) {
      System.out.println(member+" "+ state);
    }
  });

Maven

You can get this software from Maven Central.

  <dependency>
    <groupId>io.teknek</groupId>
    <artifactId>gossip</artifactId>
    <version>${pick_the_latest_version}</version>
  </dependency>

gossip's People

Contributors

edwardcapriolo, irstevenson, ptgoetz

gossip's Issues

Dead member detection

I am trying to track down an issue with this. When the services start, they are able to see the remote peers, but after a couple of seconds they stop seeing them. It seems that the main listener might be dead. Here is an example from my log. Any idea what might be causing this would help; I'll do some tracing as well. The consistent pattern is that it happens after a couple of seconds.

NOTE: I'm running this within Spring Boot.

We have 1 members that are live: [Member [address=02.edu:5555, id=-944842516, heartbeat=0]]
2015-12-06 17:58:29.258  INFO 21947 --- [        Cluster] o.m.cs565.dccs.cluster.ClusterManager    : We have 1 members that are live: [Member [address=02.edu:5555, id=-944842516, heartbeat=0]]
2015-12-06 17:58:29.294  INFO 21947 --- [        Timer-0] com.google.code.gossip.GossipService     : Dead member detected: Member [address=02.edu:5555, id=-944842516, heartbeat=0]
2015-12-06 17:58:29.295  INFO 21947 --- [        Timer-0] o.m.cs565.dccs.cluster.ClusterManager    : Gossip Event Member [address=02.edu:5555, id=-944842516, heartbeat=0], state [DOWN]

Dead members coming alive again are soon wrongly recognized as dead again

Hey, when I ran tests with multiple clients on my local machine I discovered this behavior:

  • I shut down a client and wait until every other client knows about it.
  • I start the client again (its heartbeat starts at 0).
  • The new client is discovered again, but after one or two iterations it is removed again for no obvious reason.

Has anyone observed similar behavior and found a fix for it?

Gossip custom message data

We can add a field to the message format that can be used to gossip custom data. We can have an extra field that can be a hashmap. Users can serialize anything into the map.
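
As a rough sketch of what this proposal could look like, the message below carries a string-to-string map next to the member id and heartbeat. Everything here is hypothetical: the class GossipMessageSketch, its fields, and the rule that custom data travels with the higher heartbeat are assumptions for illustration, not the library's message format.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical message shape; the field names are illustrative only. */
final class GossipMessageSketch {
    private final String memberId;
    private final long heartbeat;
    private final Map<String, String> customData;

    GossipMessageSketch(String memberId, long heartbeat, Map<String, String> customData) {
        this.memberId = memberId;
        this.heartbeat = heartbeat;
        // Defensive copy so later changes by the caller do not alter the message.
        this.customData = Collections.unmodifiableMap(new HashMap<>(customData));
    }

    String memberId() { return memberId; }

    long heartbeat() { return heartbeat; }

    Map<String, String> customData() { return customData; }

    /** Keeps whichever message carries the higher heartbeat, custom data included. */
    static GossipMessageSketch newer(GossipMessageSketch a, GossipMessageSketch b) {
        return b.heartbeat > a.heartbeat ? b : a;
    }
}
```

Users would serialize their own values into the map's strings, and receivers would keep the data attached to the most recent heartbeat for each member.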

Heartbeat is always =0

I have the following initialization code, where I specify a timeout of 20. However, when I look at the logs, the heartbeat always stays at 0 and it never continues to check whether the other nodes come back.

Here is the log:

 Started Application in 2.735 seconds (JVM running for 3.108)
2015-10-11 14:43:04.667  INFO 69599 --- [        Timer-0] com.google.code.gossip.GossipService     : Dead member detected: Member [address=10.0.1.156:2000, id=10.0.1.156-2000, heartbeat=0]
2015-10-11 14:43:04.667  INFO 69599 --- [        Timer-1] com.google.code.gossip.GossipService     : Dead member detected: Member [address=10.0.1.9:2000, id=10.0.1.9-2000, heartbeat=0]
2015-10-11 14:43:04.667  INFO 69599 --- [        Timer-0] p2.server.service.GossipProtocolService  : Member [address=10.0.1.156:2000, id=10.0.1.156-2000, heartbeat=0] DOWN
2015-10-11 14:43:04.667  INFO 69599 --- [        Timer-1] p2.server.service.GossipProtocolService  : Member [address=10.0.1.9:2000, id=10.0.1.9-2000, heartbeat=0] DOWN

And the initialization code:

    public void initMembers() {
        int heartbeat = 20;
        log.info("Initializing Gossip, with seeds: " + getSeeds().toString());

        for (String host : seeds) {
            GossipMember g = new RemoteGossipMember(host, this.port, host + "-" + this.port, heartbeat);
            startupMembers.add(g);
        }

        try {
            gossipService = new GossipService("127.0.0.1", port, "1-1", LogLevel.DEBUG, startupMembers, settings, this);
        } catch (UnknownHostException | InterruptedException e) {
            throw new RuntimeException(e);
        }
        gossipService.start();
    }

Convergence testing

We should have an IntegrationTest that does not run as part of the unit tests. The idea would be to launch clusters ranging in size from 2 to 100 nodes and measure convergence for:

  1. time to startup (all nodes see each other)
  2. Node down convergence
  3. Node back up convergence
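
One way such measurements could be taken is with a small polling helper that reports how long a condition took to become true. The sketch below is an assumption about how the test might be structured; ConvergenceTimer and awaitMillis are hypothetical names, not part of the library.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper for timing convergence in an integration test. */
final class ConvergenceTimer {
    /**
     * Polls the condition until it holds or the timeout elapses.
     * Returns the elapsed milliseconds on success, or -1 on timeout.
     */
    static long awaitMillis(BooleanSupplier converged, long timeoutMs, long pollMs)
            throws InterruptedException {
        long start = System.nanoTime();
        long deadline = start + timeoutMs * 1_000_000L;
        while (true) {
            if (converged.getAsBoolean()) {
                return (System.nanoTime() - start) / 1_000_000L;
            }
            // Compare via subtraction so the check stays correct if nanoTime wraps.
            if (System.nanoTime() - deadline >= 0) {
                return -1;
            }
            Thread.sleep(pollMs);
        }
    }
}
```

For the startup case, the condition could check that every node's getMemberList().size() equals the cluster size minus one, which would replace a fixed Thread.sleep with a measured time. The same helper could time node-down and node-back-up convergence by changing the condition.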

Dead Node

Hi,

I am using this library to integrate gossip into my application, and I have found an issue: when I shut down a node and start it again, the other nodes do not recognize it.
Example: I have two nodes, 1 and 2. First I start both nodes. Then I turn off node 2 and turn it on again, but node 2 does not know that node 1 is UP, and node 1 does not know that node 2 is UP.
Please help me resolve this issue.
Thank you for your support.
