
Comments (6)

ryanlecompte commented on August 19, 2024

This is an interesting approach and also something that I've been thinking about. The current workflow is like this:

  1. When a master is promoted, we first delete the config znode.
  2. Clients should receive the znode-deleted event if they are in an active quorum (if not, they should be in a disconnected / session-expired state, and in that case we actively purge the client's list of redis clients since it can't be trusted while disconnected)
  3. Node Manager promotes a new master
  4. The config znode is created, so clients get a watcher event indicating that the znode can be read again, and the redis clients can be rebuilt.

I added a safety net in RedisFailover::Client so that if the last event heard from the NodeManager was more than 9 seconds ago (the normal session expiration time), the client shouldn't trust its list of redis clients. One brute-force way to ensure that all clients have the latest list would be to simply wait that long (9 seconds) in the Node Manager before promoting the new master. This would ensure that all clients either a) receive the watcher event and start using the new config, or b) remain dormant until they reconnect and hear from the Node Manager again.

Right now the Node Manager reports znode updates every 3 seconds, and thus the clients know that they should be hearing from the Node Manager via watcher events frequently.
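
A minimal sketch of that client-side behavior, assuming the zk gem's register/get/exists? API; the znode path, thresholds, and rebuild_redis_clients helper are made up for illustration and are not the actual RedisFailover::Client internals:

    require 'zk'
    require 'json'

    CONFIG_PATH = '/redis_failover/nodes'   # hypothetical znode path
    MAX_SILENCE = 9                         # seconds; the normal session expiration window

    zk = ZK.new('localhost:2181')
    last_heard = Time.now
    trusted = false

    # Watch the config znode: stop trusting the client list when it is deleted,
    # rebuild it when the znode reappears.
    zk.register(CONFIG_PATH) do |event|
      last_heard = Time.now
      if event.node_deleted?
        trusted = false                     # steps 1-2: config gone, purge the redis clients
      elsif zk.exists?(CONFIG_PATH, :watch => true)
        data, _stat = zk.get(CONFIG_PATH, :watch => true)
        config = JSON.parse(data)
        trusted = true                      # step 4: config recreated, rebuild from it
        # rebuild_redis_clients(config)     # hypothetical helper
      end
    end
    zk.stat(CONFIG_PATH, :watch => true)    # arm the initial watch

    # Safety net: if the Node Manager has been silent longer than the session
    # window, the client should not trust its current list of redis clients.
    stale = (Time.now - last_heard) > MAX_SILENCE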

I'll give some thought to your approach also. @slyphon, any thoughts?


slyphon commented on August 19, 2024

So, what's interesting is that this is what I was thinking when we discussed using the read-only slave option, which I think may have some advantages over this, mainly in terms of safety.

  • config is empty
  • node manager promotes server A master
  • node manager makes server B and C slave to A
  • node manager writes config
  • clients see the config creation, read it, and connect

So now the question is, what kind of failure scenario are you imagining?

In the case when the redis master dies, you have an unavoidable race: some portion of your clients will see the failure immediately (as they will error), and the other portion may be handled by the node manager.

  • node manager sees the master is down
  • node manager deletes the config
  • clients all see config missing, go into locked state
  • node manager points slave C to B (nothing is writable at this point)
  • node manager promotes B to master (making it writable)
  • node manager writes the config
  • clients see the config
  • clients connect to new master
  • (what happens to A when it comes back up?)
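
For concreteness, that sequence might look roughly like this with the redis and zk gems (hosts, ports, the znode path, and the config format are all placeholders, not the gem's actual layout):

    require 'redis'
    require 'zk'
    require 'json'

    CONFIG_PATH = '/redis_failover/nodes'            # hypothetical znode path

    zk = ZK.new('localhost:2181')
    b  = Redis.new(:host => 'b.example.com', :port => 6379)
    c  = Redis.new(:host => 'c.example.com', :port => 6379)

    zk.delete(CONFIG_PATH)                           # clients see the config missing and lock
    c.slaveof('b.example.com', 6379)                 # point slave C at B; nothing is writable yet
    b.slaveof('no', 'one')                           # promote B to master, making it writable
    new_config = { 'master' => 'b.example.com:6379',
                   'slaves' => ['c.example.com:6379'] }.to_json
    zk.create(CONFIG_PATH, new_config)               # clients see the config and connect to B
    # When A comes back up it must not accept writes; it should be re-slaved to B:
    # a.slaveof('b.example.com', 6379)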

I think the disadvantage to the barrier is that you're increasing your chances that the whole cluster will fail. If you require that all connected clients are in a certain state before everyone can act, then you increase your risk of failure for each client you add. In a cluster of 8 nodes, you only need one client deadlocked in some awesometastic way for the whole cluster to only see 7/8 clients ready to move...indefinitely. If you have 16 clients, you're 8x as likely to fail as with 8 (at least, I'm not sure, it may be exponential or something, but it's more). Then to mitigate that, you start getting into really application-specific decisions about how many clients need to be aware of the change before you have enough of a quorum to continue.

With a barrier in place, you're making the control more centralized instead of distributed (which adds complexity).

By embedding the logic in each client to just "do the right thing," you are more likely to succeed (especially if the user is somewhat smart in their design and over-provisioned), in that even if some portion of the clients hit the deadlock case, if you're lucky, the majority haven't and can continue.

An anecdote. We used to try to coordinate filesystem operations with mysql database operations. Someone had the bright idea that you should wrap it all in a transaction, that way if the file operation raised an exception, it rolls back the database transaction and everything is kosher. That filesystem was NFS in production, and the first time the NAS went down, mysql went into a death spiral (as all the connections had transactions that were open and would NEVER be released, and couldn't be killed). Needless to say: hilarity ensued. I likened this to someone saying, "When I go running, I like to tie my shoelaces to my scrotum, that way in case my shoe slips off my foot I don't lose it!"

Be careful when adding hard dependencies :)

If you make it so the clients can only write to one redis node in the cluster, I don't think you're adding safety with the barrier. The argument for a barrier would be an application specific requirement that all available nodes have acknowledged the change before continuing. I'd say that if you don't have this requirement, then the added complexity isn't worth it. From a redis_failover feature position, I'd say make it optional for people.

With the current system, and if you add the barrier, it occurred to me that you have the problem of a thundering herd: you wind up with all the clients connecting simultaneously to the new master and pounding it with requests. This could be mitigated by a small random sleep before connecting after seeing a new config, and might be a decent optional feature for people to enable.
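
That mitigation could be as small as this; the option name and bound are made up:

    MAX_RECONNECT_JITTER = 2.0   # seconds; hypothetical, would be user-configurable

    # Sleep a small random amount after seeing a new config and before reconnecting,
    # so all clients don't hit the new master at the same instant.
    def jitter_before_reconnect(max = MAX_RECONNECT_JITTER)
      sleep(rand * max)
    end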

On a related note (it just occurred to me), one additional feature to consider is flap detection: if there are more than N config changes in M seconds, the clients delay for X seconds before re-connecting.
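
A rough sketch of that flap detection, with arbitrary values standing in for N, M, and X:

    FLAP_THRESHOLD = 3     # N config changes...
    FLAP_WINDOW    = 60    # ...within M seconds...
    FLAP_DELAY     = 30    # ...means wait X seconds before re-connecting

    config_change_times = []

    def flapping?(times, now = Time.now)
      times.reject! { |t| now - t > FLAP_WINDOW }
      times.size >= FLAP_THRESHOLD
    end

    # on every config change event:
    config_change_times << Time.now
    sleep(FLAP_DELAY) if flapping?(config_change_times)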

Anyway, I have a feeling I may be not addressing the issue head-on, but that's what all this made me think about. :)


ryanlecompte commented on August 19, 2024

Thanks for sharing your thoughts, Jonathan. One change that I just made as a result of re-reading the original approach: all of the existing nodes are switched to be slaves of the candidate master first, and only then is the candidate promoted to master. That was just done here in head: 6acb6f4
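
In other words, something along these lines (placeholder hosts; a paraphrase of the ordering, not the commit itself):

    require 'redis'

    candidate = Redis.new(:host => 'candidate.example.com', :port => 6379)
    others    = [Redis.new(:host => 'c.example.com', :port => 6379),
                 Redis.new(:host => 'd.example.com', :port => 6379)]

    # re-point every remaining node at the candidate first...
    others.each { |node| node.slaveof('candidate.example.com', 6379) }
    # ...and only then make the candidate writable
    candidate.slaveof('no', 'one')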


eric commented on August 19, 2024

You're really right that adding this coordination doesn't make sense for the failure case.

I think I've been conflating a couple different scenarios. The situation I was trying to solve in the above proposal was the one of an intentional failover to do upgrades, etc.

One of my biggest frustrations is that redis does not have a simple way to mark a node as read-only. If that one building block were there, it would be trivial to ensure you didn't lose writes.

Until such a thing exists, we are forced to decide between different terrible solutions.

Maybe it's better to structure this as though we had the read-only flag and not add complication before then.

If that were the case, there shouldn't be a big reason to make the znode pointing to the active nodes ephemeral — I would want my clients to be able to access the redis server even if the node watcher weren't running for some reason.


ryanlecompte commented on August 19, 2024

As soon as Redis 2.6 is supported and out in the open, we could add support in redis_failover for setting the read_only flag:

"Since Redis 2.6 slaves support a read-only mode that is enabled by default. This behavior is controlled by the slave-read-only option in the redis.conf file, and can be enabled and disabled at runtime using CONFIG SET."

http://redis.io/topics/replication
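
With redis-rb, toggling that on a slave at runtime would look something like this (untested sketch; the option only exists on Redis 2.6+ and the host is a placeholder):

    require 'redis'

    slave = Redis.new(:host => 'slave.example.com', :port => 6379)

    # Redis 2.6+: slaves are read-only by default; the flag can be toggled at runtime.
    slave.config(:set, 'slave-read-only', 'yes')
    slave.config(:get, 'slave-read-only')   # check the current value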


ryanlecompte commented on August 19, 2024

No longer need to keep this issue open.

