Add/remove node (dynomite issue, closed)

netflix commented on July 20, 2024
Add/remove node

from dynomite.

Comments (17)

mfouilleul commented on July 20, 2024

"With this advantage, one can simply add more nodes to a Dynomite cluster to meet traffic demands or loads."

I don't think I understand the "linearly scalable" concept of Dynomite; can you please give me some help on this part?

The list of seeds is declared before starting, and the tokens for each node are calculated for a given number of nodes and also set statically in the conf. When I want to add a node, should I update all my configuration files and restart every node (dynomite)? What about the data: how do I "rebalance" it for the new topology?

I love the concept of Dynomite, it is clearly sexy, and I know that you use it for real at Netflix, but I have too many open questions; it's frustrating!!

Thanks

Max

timiblossom commented on July 20, 2024

Hi Max,

Sorry for the late reply. For node replacement, we have an internal tool, similar to the Priam tool we built for Cassandra; we call it Florida. We will open-source it soon, once we have cleaned up its use of some internal libraries.

There is a seeds_provider attribute that you can use to load a new seed list automatically. However, currently we only have dyn_florida.c, and it is set at this line:

https://github.com/Netflix/dynomite/blob/master/src/dyn_gossip.c#L816

I think in your case you can write a new seeds_provider that simply reads a list of seeds from a local file, and you update the file manually for new entries. For a fancier approach, you can have the code read from a database, from another backend service, etc.

Please contribute this module if you are interested, as we don't have the time to do this now. Internally, we use florida_provider (dyn_florida.c) and it serves us well enough.

Then to warm up your new node, you can read this issue:

#51

Our internal tool, Florida, which I mentioned earlier, does all of this work for us. We will move this work into Dynomite itself so that Florida can stay simple. For now, you can write a Python/Ruby script to help out. If you are interested, I can assist you with this further.

For your 2nd question, to add new nodes into a running cluster, there are 2 ways:

  1. Add those nodes to the seed list (well, it is convoluted but it works)
  2. Turn on gossip. Let each of those nodes talk to at least one of the running nodes.
    Eventually (within a few seconds), all existing nodes will recognize the new nodes.

Since each node can warm up with some data, the new nodes will have some data. However, since our warm-up technique currently brings in extra data, we need to run a tool to purge the data a node does not own. We also need to run the same clean-up tool to purge data on its neighboring nodes.
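
To make the purge step more concrete, here is a minimal Python sketch of how a clean-up script might pick out the keys a node does not own. Everything in it is an assumption made for illustration: `key_hash` uses CRC32 as a stand-in rather than Dynomite's real hash, `owner_token` assumes a key belongs to the node with the smallest token greater than or equal to the key's hash (wrapping around), and all function names are hypothetical.

```python
import bisect
import zlib


def key_hash(key):
    """Stand-in 32-bit hash (CRC32). This is NOT Dynomite's hash function."""
    return zlib.crc32(key.encode("utf-8")) & 0xFFFFFFFF


def owner_token(tokens, hash_value):
    """Assumed rule: the first token >= hash_value owns it, wrapping around."""
    ordered = sorted(tokens)
    index = bisect.bisect_left(ordered, hash_value)
    return ordered[index % len(ordered)]


def unowned_keys(local_keys, local_token, tokens):
    """Keys stored on the node with local_token that map to another node."""
    return [k for k in local_keys
            if owner_token(tokens, key_hash(k)) != local_token]


if __name__ == "__main__":
    tokens = [0, 1431655765, 2863311530]
    keys = ["key%d" % i for i in range(1000)]
    for token in tokens:
        extra = unowned_keys(keys, token, tokens)
        print("token", token, "would purge", len(extra), "of", len(keys), "keys")
```

The keys returned by `unowned_keys` would be the candidates to delete from that node's local Redis; a real tool would have to use the cluster's actual hash and ownership rule.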

We are still working to make those steps as simple as possible.

I hope this answers all your questions.

Thanks.
Minh

timiblossom commented on July 20, 2024

Max,

One more thing: if you want to make your cluster bigger (scale out), I suggest you estimate your cluster capacity now to accommodate the growth for the next 6 months. This will give us enough time to push out all the code that makes these steps simpler.

In the worst case, if we don't have this code by then, you can use the "dual writes" technique. Your application writes to both Dynomite clusters (the old/small one and the new/bigger one). Then you switch your traffic completely to the new cluster after a few days.
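
A minimal Python sketch of the dual-writes idea follows. It is only an illustration under assumptions: `DualWriter` and `FakeCluster` are hypothetical names, and `FakeCluster` is an in-memory stand-in for whatever Redis-protocol client the application uses to talk to each Dynomite cluster.

```python
class FakeCluster:
    """In-memory stand-in for a client connected to one Dynomite cluster."""

    def __init__(self):
        self.data = {}

    def set(self, key, value):
        self.data[key] = value

    def get(self, key):
        return self.data.get(key)


class DualWriter:
    """Write to both clusters; read from the old one until the switch."""

    def __init__(self, old_cluster, new_cluster):
        self.old = old_cluster
        self.new = new_cluster
        self.read_from_new = False

    def set(self, key, value):
        # Every write goes to both clusters, so the new one fills up over time.
        self.old.set(key, value)
        self.new.set(key, value)

    def get(self, key):
        primary = self.new if self.read_from_new else self.old
        return primary.get(key)

    def switch_to_new(self):
        # Called once the new cluster has received enough traffic.
        self.read_from_new = True


if __name__ == "__main__":
    writer = DualWriter(FakeCluster(), FakeCluster())
    writer.set("user:1", "alice")
    writer.switch_to_new()
    print(writer.get("user:1"))  # prints "alice"
```

Keys that were written only before dual writes began would still be missing from the new cluster unless they are rewritten or copied, which this sketch does not handle.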

mfouilleul commented on July 20, 2024

Hi Minh,

Thanks for your answers. I'm currently testing several options for our
key/value needs. The main goal is to make Redis scalable: our
business is growing fast, so we need to scale our datastores safely.

I'm testing Dynomite; can you please tell me if I'm doing anything wrong?

192.168.33.20 : Node 1 : DC1, RC1
192.168.33.21 : Node 2 : DC1, RC1
192.168.33.22 : Node 3 : DC1, RC1
192.168.33.23 : Node 4 : Future Node
192.168.33.12 : HAProxy & Test Scripts

I built a simple Perl script to assign tokens according to the number of
nodes.

---

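use strict;
use warnings;

# Number of nodes in the rack, passed as the first argument.
my $nodes = $ARGV[0];

# Spread the tokens evenly over the 32-bit range [0, 4294967295].
for (my $i = 0; $i < $nodes; $i++) {
    my $t = int($i * 4294967295 / int($nodes));
    print "Node " . ($i + 1) . ": $t\n";
}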

---

perl token.pl 3
Node 1: 0
Node 2: 1431655765
Node 3: 2863311530
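
For comparison, the same even spacing can be sketched in Python. `evenly_spaced_tokens` is a hypothetical helper name, and integer division stands in for the Perl script's `int(...)` truncation.

```python
MAX_TOKEN = 4294967295  # 2**32 - 1, the same constant as in the Perl script


def evenly_spaced_tokens(node_count):
    """Return one token per node, spread evenly over [0, MAX_TOKEN]."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return [i * MAX_TOKEN // node_count for i in range(node_count)]


if __name__ == "__main__":
    for n in (3, 4):
        print(n, evenly_spaced_tokens(n))
    # 3 [0, 1431655765, 2863311530]
    # 4 [0, 1073741823, 2147483647, 3221225471]
```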

Configuration file Node 1:
bbc_dyno:
  # Node Position
  datacenter: dc1
  rack: rc1

  # Redis
  redis: true
  listen: 0.0.0.0:8102
  servers:
  - 127.0.0.1:6379:1

  # Cluster
  tokens: 0
  dyn_listen: 0.0.0.0:8101
  dyn_seed_provider: simple_provider
  dyn_seeds:
  - 192.168.33.21:8101:rc1:dc1:1431655765
  - 192.168.33.22:8101:rc1:dc1:2863311530

  # Liveness
  auto_eject_hosts: true
  server_retry_timeout: 30000
  server_failure_limit: 3
  timeout: 400

  # Security
  pem_key_file: /opt/dynomite/conf/dynomite.pem

Configuration file Node 2:
bbc_dyno:
  # Node Position
  datacenter: dc1
  rack: rc1

  # Redis
  redis: true
  listen: 0.0.0.0:8102
  servers:
  - 127.0.0.1:6379:1

  # Cluster
  tokens: 1431655765
  dyn_seed_provider: simple_provider
  dyn_listen: 0.0.0.0:8101
  dyn_seeds:
  - 192.168.33.20:8101:rc1:dc1:0
  - 192.168.33.22:8101:rc1:dc1:2863311530

  # Liveness
  auto_eject_hosts: true
  server_retry_timeout: 30000
  server_failure_limit: 3
  timeout: 400

  # Security
  pem_key_file: /opt/dynomite/conf/dynomite.pem

Configuration file Node 3:
bbc_dyno:
  # Node Position
  datacenter: dc1
  rack: rc1

  # Redis
  redis: true
  listen: 0.0.0.0:8102
  servers:
  - 127.0.0.1:6379:1

  # Cluster
  tokens: 2863311530
  dyn_seed_provider: simple_provider
  dyn_listen: 0.0.0.0:8101
  dyn_seeds:
  - 192.168.33.20:8101:rc1:dc1:0
  - 192.168.33.21:8101:rc1:dc1:1431655765

  # Liveness
  auto_eject_hosts: true
  server_retry_timeout: 30000
  server_failure_limit: 3
  timeout: 400

  # Security
  pem_key_file: /opt/dynomite/conf/dynomite.pem

Everything works well: when I set 1000 keys, I get 343 keys on Node 1,
343 keys on Node 2 and 314 on Node 3.

Here are my questions; please tell me if I'm wrong.

  • Should I declare all nodes in the "dyn_seeds" section?
  • Should I start all Dynomite nodes with gossip enabled (-g)?
  • Should I choose simple_provider or florida_provider for the moment?
  • Does turning gossip on need a restart (-g option)?
  • If I want to add a fourth node to my rack, the process is not fully
    supported yet by Dynomite, is that right?
    • Choose new tokens for a 4-node topology
    • Update dynomite.cnf with new values for "tokens" and "dyn_seeds"
    • Rebalance the dataset for the new topology (I have no idea how to
      do that)
    • Restart all dynomite nodes to apply the new topology.

Thanks again.

Max.

timiblossom commented on July 20, 2024

Max,

Here are my answers:

  1. Should I declare all nodes in the "dyn_seeds" section?
    Yes. For a small cluster you don't need gossip, so it is better to set all the nodes as seed nodes and avoid any potential gossip bugs (we are testing and resolving them, but there may still be some). Gossip is only there for nodes to discover each other; by declaring all of them in the config files, we don't need gossip.
  2. Should I start all Dynomite nodes with gossip enabled (-g)?
    "-g" enables gossip, so see answer #1.
  3. Should I choose simple_provider or florida_provider for the moment?
    You should pick simple_provider, as we haven't released Florida yet.
    simple_provider basically just reads the nodes from the config file and does nothing else.
    As mentioned earlier, you can provide a new "file_provider" method that reads the node list from a file, so that Dynomite periodically loads the new nodes from that file (we don't have this module yet).
  4. Does turning gossip on need a restart (-g option)?
    It is a command-line option, so yes. You can submit a patch if you would like to configure this while a node is running. There is an admin port that listens for commands sent to it, so in theory we could define a new command to change this variable without restarting the nodes.
  5. If I want to add a fourth node to my rack, the process is not fully
    supported yet by Dynomite, is that right?
    As I said, give us more time to make this transparent. Otherwise, use the "dual writes" technique to upgrade to a larger Dynomite cluster.
  6. Choose new tokens for a 4-node topology/Rebalance data
    Better to use dual writes at this time. Changing the tokens of running nodes requires purging and moving data. We are not there yet. At Netflix, when we operate Cassandra (yeah, Cassandra), we don't just add a single node to scale out an existing cluster. We always double the cluster, so data balancing is easier: for every newly added node we only have to touch 2 neighbor nodes, and we keep doing this until the cluster is fully doubled. Adding just one node to an existing cluster requires rebalancing data on all nodes, which is complicated and could hurt your overall latencies. So we might do something similar for Dynomite.
  7. Restart all dynomite nodes to apply the new topology?
    Yes, if you change the config file, you need a restart. So use dual writes and you don't have to do this, as the new topology is already defined in the new cluster before it takes any traffic.
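
The doubling argument in answer 6 can be illustrated with a small Python sketch. It assumes tokens are spread evenly with the same formula as the Perl script earlier in the thread; `tokens_for` and `kept_tokens` are hypothetical helper names, and the sketch only compares token layouts, it does not move any data.

```python
MAX_TOKEN = 4294967295  # 2**32 - 1, as in the Perl token script


def tokens_for(node_count):
    """Evenly spaced tokens for a rack of node_count nodes."""
    return [i * MAX_TOKEN // node_count for i in range(node_count)]


def kept_tokens(old_count, new_count):
    """Tokens of the old layout that still exist in the new layout."""
    new_layout = set(tokens_for(new_count))
    return [t for t in tokens_for(old_count) if t in new_layout]


if __name__ == "__main__":
    # Adding one node (3 -> 4): only token 0 survives, so the other
    # existing nodes would need new tokens and their data moved.
    print(kept_tokens(3, 4))  # [0]
    # Doubling (3 -> 6): every existing node keeps its token and the
    # new nodes slot in between the old ones.
    print(kept_tokens(3, 6))  # [0, 1431655765, 2863311530]
```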

mfouilleul commented on July 20, 2024

Hi,

Thanks for all your responses. So to be safe I will wait before enabling all the "auto" features. In its simple form (statically described in the conf file), is Dynomite production ready? Do you use it for real at Netflix?

Best,

Max.

timiblossom commented on July 20, 2024

@mfouilleul - sounds good. We are currently going through a series of load tests to prepare for a big deployment with several clusters, and we will deploy to prod very soon.

timiblossom commented on July 20, 2024

@mfouilleul btw, if you need to supply a list of peers that can change dynamically, you can set dyn_seed_provider: florida_provider in dynomite.yml.

Dynomite will then load dyn_florida.c, which keeps polling a local REST service for a list of peers.
You can read more about the local service in this file:
https://github.com/Netflix/dynomite/blob/master/src/seedsprovider/dyn_florida.c#L11

timiblossom commented on July 20, 2024

@mfouilleul just want to update you that we just rolled out 2 big clusters to Prod. Testing shows that a 3-region cluster (in AWS us-east-1, us-west-2 and eu-west-1) of about 40 nodes in each region can handle 500K read/write RPS with average latency under 1ms and 99th percentile latency around 2.5ms. We will post the graphs later for more details.

Btw, my mind has been stuck in coding, so I completely forgot to mention that we already had several small Dynomite clusters in Prod before New Year 2015, but they handle less traffic.

mfouilleul commented on July 20, 2024

Sounds really good! Thanks for all your updates.

Max.

mfouilleul commented on July 20, 2024

Hello,

I made a simple test:

I have two nodes:

  • 192.168.33.40
  • 192.168.33.41

On the 40, I start a simple dynomite with the conf:

dyn_o_mite:
  # Node Location
  env: network
  datacenter: dc1
  rack: rc1

  # Dynomite
  listen: 0.0.0.0:8102
  dyn_listen: 0.0.0.0:8101
  dyn_seed_provider: florida_provider
  preconnect: true

  # Node Info
  redis: true
  tokens: 0
  servers:
  - 127.0.0.1:6379:1

  # Time Management
  gos_interval: 10000
  timeout: 3000
  server_retry_timeout: 3000

  # Security
  secure_server_option: datacenter
  server_connections: 1
  auto_eject_hosts: true
  server_failure_limit: 3
  pem_key_file: /etc/dynomite/dynomite.pem

Now I start the 41 in rack rc2, with the 40 specified in its seed list:

dyn_o_mite:
  # Node Location
  env: network
  datacenter: dc1
  rack: rc2

  # Dynomite
  listen: 0.0.0.0:8102
  dyn_listen: 0.0.0.0:8101
  dyn_seed_provider: florida_provider
  preconnect: true

  # Seeds
  dyn_seeds:
  - 192.168.33.40:8101:rc1:dc1:0

  # Node Info
  redis: true
  tokens: 0
  servers:
  - 127.0.0.1:6379:1

  # Time Management
  gos_interval: 10000
  timeout: 3000
  server_retry_timeout: 3000

  # Security
  secure_server_option: datacenter
  server_connections: 1
  auto_eject_hosts: true
  server_failure_limit: 3
  pem_key_file: /etc/dynomite/dynomite.pem

When I write on the 41 (port 8102), the keys are replicated to the 40 as expected (because it knows the topology thanks to the seeds, I guess), but when I write on the 40, nothing happens.

Can you tell me if I misunderstand something? In my mind, gossiping plus the florida_provider serve to discover this kind of topology.

Note that I see some interesting debug messages in the log when I start my dynomites:

dyn_gossip.c:847 What?? No rack in Dict for rack         : 'dc1'
dyn_florida.c:98 Unable to connect the destination

Max.

timiblossom commented on July 20, 2024

For node 40, since you don't list a seed node and you ask it to use florida_provider without having a local service, it can't get a seed node.

There are two ways to obtain a seed node. One is from the yaml file and the other is from the local service. If you read the comment at the top of dyn_florida.c, you need to provide a local service: https://github.com/Netflix/dynomite/blob/master/src/seedsprovider/dyn_florida.c#L13

And in that local service, you can dynamically map the token list to the node list.

Here is a small Node.js script that can act as a local service for florida_provider:

var express = require('express');

// express.createServer() is gone from Express 4 and later; express() creates the app.
var app = express();

// Return a '|'-separated list of host:port:rack:dc:token entries.
app.get('/REST/v1/admin/get_seeds', function (req, res) {
    res.writeHead(200, {'Content-Type': 'application/json'});
    var content = '127.0.0.3:8101:rack:dc:12345678|127.0.0.2:8101:rack:dc:1383429731';
    res.end(content);
});

app.listen(8080);

mfouilleul commented on July 20, 2024

Thanks Minh.

Max

mfouilleul commented on July 20, 2024

I just extended the Node script to take a seeds file as an argument:

var http = require('http');
var url = require('url');
var fs = require('fs');

// Path of the seeds file: first command-line argument, else the default.
var seeds_file_path = process.argv[2] || '/etc/dynomite/seeds.list';

var server = http.createServer(function(req, res) {
  var path = url.parse(req.url).pathname;
  res.writeHead(200, {"Content-Type": "application/json"});
  if (path == '/REST/v1/admin/get_seeds') {
    // Re-read the file on every request so that manual edits are picked up,
    // then join the one-seed-per-line entries with '|'.
    var data = fs.readFileSync(seeds_file_path).toString();
    var data_oneline = data.trim().replace(/\n/g, '|');
    var jsonDate = new Date().toJSON();
    console.log(jsonDate + " - get_seeds [" + data_oneline + "]");
    res.write(data_oneline);
  }
  res.end();
});
server.listen(8080);

With the seed file for the node 192.168.33.40:

cat /etc/dynomite/seeds.list
192.168.33.41:8101:rc1:dc1:2147483647
192.168.33.42:8101:rc2:dc1:0
192.168.33.43:8101:rc2:dc1:2147483647

Start the node script as:

node get_seeds.js /etc/dynomite/seeds.list
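
Since the script above passes the file content straight through, a small check of the seeds file before serving it may help. The Python sketch below assumes each line has the five colon-separated fields seen in this thread (host, port, rack, datacenter, token), that hosts are IPv4 addresses or hostnames, and that entries are joined with `|`; `parse_seed` and `seeds_to_response` are hypothetical names, not part of Dynomite.

```python
from typing import NamedTuple


class Seed(NamedTuple):
    host: str
    port: int
    rack: str
    datacenter: str
    token: int


def parse_seed(line: str) -> Seed:
    """Parse one 'host:port:rack:dc:token' entry; raise ValueError if malformed."""
    fields = line.strip().split(":")
    if len(fields) != 5:
        raise ValueError("expected 5 ':'-separated fields, got %r" % line)
    host, port, rack, datacenter, token = fields
    # int() raises ValueError for a non-numeric port or token.
    return Seed(host, int(port), rack, datacenter, int(token))


def seeds_to_response(text: str) -> str:
    """Validate every non-empty line and join the entries with '|'."""
    seeds = [parse_seed(line) for line in text.splitlines() if line.strip()]
    return "|".join(
        "%s:%d:%s:%s:%d" % (s.host, s.port, s.rack, s.datacenter, s.token)
        for s in seeds
    )


if __name__ == "__main__":
    sample = (
        "192.168.33.41:8101:rc1:dc1:2147483647\n"
        "192.168.33.42:8101:rc2:dc1:0\n"
        "192.168.33.43:8101:rc2:dc1:2147483647\n"
    )
    print(seeds_to_response(sample))
```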

Max.

thegreathir commented on July 20, 2024

Hi,
As I understand it, there is a tool called Florida that rebalances the data after a node is added or removed.
I am planning to use Dynomite, and I cannot use the Dyno client because my clients are developed in C.
I would like to know whether you will open-source Florida and, if so, when.

shailesh33 commented on July 20, 2024

Florida was the previous name; it is now called dynomite-manager and it is already open source: https://github.com/Netflix/dynomite-manager/

smukil commented on July 20, 2024

Closing since it's dated.
