Coder Social home page Coder Social logo

consistent-hashing's Introduction

Consistent-Hashing

Demo of consistent hashing.

Problem

A "sticky" or persistent load balancer aims to send a client (for example a person) to the same back-end server. This is typically done by hashing some client input and then doing a modulo operation on it with the number of back-end servers.

For example let's say "Jane" is trying to access a service with a sticky load balancer that is in front of 4 back-end servers. The client details (ex.: source IP, SSL Session Id, etc...) are hashed together and we perform a modulo operation on it. If we perform this with enough clients we'll get a roughly equal distribution. Perhaps like the following:

Hash("Jane") % 4 = Server2
Hash("Alan") % 4 = Server1
Hash("Lima") % 4 = Server3

What happens if a server goes down? That means we now do a modulo 3 operation. Since these are hashes, that means that just about all clients will now resolve to a different back-end server. We might end up with something like this:

Hash("Jane") % 3 = Server3
Hash("Alan") % 3 = Server2
Hash("Lima") % 3 = Server1

Goal

What we'd really like is that if a server goes down then only the clients that were resolving to that server get rebalanced across the remaining healthy servers. In the example above if Server3 goes down then we'd like Lima to go to a new server, but we'd still like Jane and Alan to go to the same servers that we had before.

Hash("Jane") % 4 = Server2
Hash("Alan") % 4 = Server1
Hash("Lima") % 4 = Server3 -> Server1

Solution

The way to achieve this is to imagine that we have a large circle (or a line that loops), with the servers placed more or less spread out evenly on the circle. In fact, for better distribution in case of fail-over we have multiple instances of each server located on this circle. If we envision this as a line that loops, with 2 "instances" of each server, it might look something like this:

 1 | 
 2 | Server1
 3 | 
 4 | Server2
 5 | 
 6 | Server3
 7 | 
 8 | Server1
 9 | 
10 | Server2
11 | 
12 | Server3

If a server goes down then we remove all instances of that server from our looped line... but the other server instances stay at their existing location. So for example if Server3 goes down then it might look like this:

 1 | 
 2 | Server1
 3 | 
 4 | Server2
 5 | 
 6 |
 7 | 
 8 | Server1
 9 | 
10 | Server2
11 | 
12 | 

If a server gets added then we find the biggest gap(s) in our looped line and add the server instances there. So for example if we add Server4 with 2 instances then it would be located as follows:

 1 | 
 2 | Server1
 3 | 
 4 | Server2
 5 | 
 6 | Server4 (new!)
 7 | 
 8 | Server1
 9 | 
10 | Server2
11 | 
12 | Server4 (new!)

When an incoming connection from a client comes in we map it to this circle (or line in our case) and then find the server instance that is next along the line. So for example if Jane wishes to connect she might result in a hash that gets placed at location 9 on our line:

Client | Index | Server
       |   1   | 
       |   2   | Server1
       |   3   | 
       |   4   | Server2
       |   5   | 
       |   6   | Server4
       |   7   | 
       |   8   | Server1
  Jane |   9   | 
       |  10   | Server2
       |  11   | 
       |  12   | Server4

The next server on our line would be Server 2.

Two things to note here:

  • In the future if Server1 or Server4 goes down, Jane will still go to Server2. That's exactly what we want!
  • In the future if Server2 goes down, then Jane will failover to the next server on the line, which would be Server4 in the example above.

In practice a server might have 100s of instances, as it more evenly distributes the affected clients across the remaining back-end servers.

As an additional benefit, if all of our back-end servers are not equal in terms of what load they can handle then we give them different weights (or instance counts).

Try it Yourself

Currently only a Java command-line interface (CLI) application has an implementation (my own) of consistent hashing so that you can try it yourself. Normally you'd try something like this:

  1. Optionally: Add some more clients (people) and servers
  2. Start removing servers, and confirm that most clients are unaffected.

consistent-hashing's People

Contributors

terrydu avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.