doyoubi / undermoon
Modern Redis Cluster solution for easy operation.
License: Apache License 2.0
Add another parameter to specify the node ID in CLUSTER NODES instead of using domains, since the prefixes of different domains may be the same.
The current FutureGroupHandle implementation requires that it be the outermost future directly fed into tokio::spawn. To support nested groups, we need to send the signal in the drop function of FutureGroupHandle.
There are two drawbacks to the current migration process, which is based on triggering replication between the source Redis and the destination Redis. There are two solutions:
(1) Parse the RDB and mimic the replication ourselves.
(2) Use SCAN, DUMP, and RESTORE to mimic the replication.
I don't want to use the first solution since there is just too much work in the RDB parsing, and we would need to keep updating the code as the RDB format changes. The second one should be compatible with future versions of Redis.
The SCAN
command has a great property that it can guarantee that all the keys set before the first SCAN
command will finally be returned, for multiple times sometimes though. We can perform a 3 stage migration to mimic the replication.
SCAN
to peer Redis.But it also has some problems:
DUMP
again and again, which could result in a large amount of memory.Queue_block1
.SCAN
, DUMP
to get the data and add RESTORE
command to Queue1
. When SCAN
is done, mark the Queue1
as SENT_FINISHED
.SCAN
command gets the reply, release all the commands in Queue_block1
.Queue_block1
and the commands after the first SCAN
to another send function which for all the write requests, do the following with Lua script to support atomic:
DUMP
and forward the new data to Queue2
RESTORE
commands in Queue1
. Once the Queue1
is set SENT_FINISHED
and is empty, start to forward the RESTORE
commands in Queue2
.Queue1
is set SENT_FINISHED
and is empty, start to block the commands in Queue_block2
, wait for all the commands in Queue2
to be sent.Queue_block2
. Redirect all the keys inside migrated slots to destination proxy.INFOMGR
starts to return success.
We can replay the same command script against both undermoon and redis and check whether the results are the same.
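The drain-order constraint between the two queues above can be sketched with plain in-memory queues. This is a hypothetical illustration, not undermoon's actual code; only the names Queue1, Queue2, and SENT_FINISHED come from the notes above:

```python
from collections import deque

class MigrationQueues:
    def __init__(self):
        self.queue1 = deque()       # RESTORE commands produced by the SCAN/DUMP loop
        self.queue2 = deque()       # RESTORE commands produced by new write requests
        self.sent_finished = False  # set when the SCAN loop is done

    def drain_once(self):
        """Return the next RESTORE command to forward, or None.

        Queue2 must not be drained before Queue1 is both marked
        SENT_FINISHED and empty, so a newer value from a live write can
        never be overwritten by an older value from the scan."""
        if self.queue1:
            return self.queue1.popleft()
        if self.sent_finished and self.queue2:
            return self.queue2.popleft()
        return None

q = MigrationQueues()
q.queue1.extend(["RESTORE k1", "RESTORE k2"])
q.queue2.append("RESTORE k1-newer")
order = []
while True:
    cmd = q.drain_once()
    if cmd is None:
        if not q.sent_finished:
            q.sent_finished = True  # simulate the SCAN loop finishing
            continue
        break
    order.append(cmd)
# order == ["RESTORE k1", "RESTORE k2", "RESTORE k1-newer"]
```

Note that the scan-produced commands always drain first; the newer value for k1 is applied last, which is what keeps the destination consistent.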
Support multiple brokers for
Support dynamic configuration through Redis api
Since #81, complex commands are much easier to implement than before.
We can implement blocking commands by transforming them into non-blocking commands.
For example, BLPOP can be implemented by repeatedly calling LPOP until it no longer returns Nil.
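The BLPOP-to-LPOP transformation can be sketched as a polling loop. This is a hypothetical illustration with an in-memory stand-in for the Redis client, not undermoon's actual implementation:

```python
import time

class FakeRedis:
    """In-memory stand-in for a Redis client (assumption for illustration)."""
    def __init__(self):
        self.lists = {}
    def lpop(self, key):
        items = self.lists.get(key)
        return items.pop(0) if items else None  # None plays the role of Nil

def blpop(client, key, timeout, poll_interval=0.01):
    """Emulate BLPOP by calling LPOP until it stops returning Nil."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        value = client.lpop(key)
        if value is not None:
            return value
        time.sleep(poll_interval)
    return None  # timed out, like BLPOP returning Nil

r = FakeRedis()
r.lists["jobs"] = ["job-1"]
print(blpop(r, "jobs", timeout=0.1))    # job-1
print(blpop(r, "empty", timeout=0.05))  # None
```

A real proxy would rate-limit the polling or reuse the client's connection, but the control flow is the same.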
After migration, we might get this error:
undermoon::common::resp_execution] error reply: LOADING Redis is loading the dataset in memory
The metadata storage needs to know the real epoch for server proxy for the following reasons:
Now the replicator module just keeps sending the SLAVEOF command to the backend Redis, triggering the following log in Redis again and again:
REPLICAOF would result into synchronization with the master we are already connected with. No operation performed.
Maybe we need to check whether the role is incorrect first. But the address in ROLE is the replica-announce-ip, so we need to use CONFIG GET to get the replica-announce-ip from the peer master first.
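The role check described above might look like the following sketch. All names and data here are made up for illustration; the ROLE reply shape for a replica is ["slave", master_host, master_port, state, offset]:

```python
def master_of(role_reply):
    """Extract (host, port) of the current master from a ROLE reply."""
    if role_reply[0] in ("slave", "replica"):
        return role_reply[1], int(role_reply[2])
    return None  # the node is a master

def needs_slaveof(role_reply, intended_master, announce_map):
    """Return True only if replication is not already set up correctly.

    announce_map translates an announced address (replica-announce-ip,
    obtained beforehand via CONFIG GET from the peer) back to the real
    address, since ROLE reports the announced one."""
    current = master_of(role_reply)
    if current is None:
        return True  # still a master, SLAVEOF is needed
    current = announce_map.get(current, current)
    return current != intended_master

announce_map = {("10.0.0.9", 6379): ("redis-1.internal", 6379)}
role = ["slave", "10.0.0.9", 6379, "connected", 12345]
print(needs_slaveof(role, ("redis-1.internal", 6379), announce_map))  # False
```

With such a check in front, the replicator would send SLAVEOF only when the role is actually wrong, and the repeated "No operation performed" log would disappear.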
Only support one cluster in mem_broker.
Now SETREPL only sets up an asynchronous task to periodically send the SLAVEOF command to Redis.
This could result in a short period during which a promoted master is still a slave but needs to serve requests.
We need to trigger SLAVEOF once directly before SETREPL returns.
Move compression inside DatabaseMap to suppress the following warning for a replica server proxy:
failed to get config from {}. Use default config.
The original design of undermoon was to support multiple logical clusters in a single server-side proxy for multi-tenancy. Now it turns out to be a bad idea for the following reasons:
I'd better remove the support for multiple logical clusters.
Build a future wrapper to track spawned futures to detect future leak like goroutine leak in Golang.
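A rough analogue of such a tracking wrapper, sketched with Python's asyncio instead of tokio (all names here are made up; the idea is just to register each spawned task and drop it from the registry when it completes, so long-lived leftovers are visible):

```python
import asyncio

live_tasks = set()  # registry of spawned-but-unfinished tasks

def spawn_tracked(coro, name):
    """Spawn a coroutine and track it until it finishes."""
    task = asyncio.ensure_future(coro)
    task.set_name(name)
    live_tasks.add(task)
    task.add_done_callback(live_tasks.discard)  # auto-unregister on completion
    return task

async def main():
    async def quick():
        await asyncio.sleep(0)          # finishes almost immediately
    async def leaky():
        await asyncio.Event().wait()    # never finishes: a "future leak"
    spawn_tracked(quick(), "quick")
    spawn_tracked(leaky(), "leaky")
    await asyncio.sleep(0.01)
    # anything still registered at this point is a leak candidate
    return sorted(t.get_name() for t in live_tasks)

leaked = asyncio.run(main())
print(leaked)  # ['leaky']
```

Dumping the registry periodically (with names and spawn sites) gives the same kind of diagnostics as a goroutine dump in Go.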
The main change in v0.3 will be the broker API, which will break compatibility. It will be done with overmoon v0.2.
- Make SlotRange support an arbitrary range for better migration performance.
- Polish mem_broker and make it usable.
- The host field in ProxyResource of mem_broker.
- Store the mem_broker data in a file.
- Avoid having {:?} debug output exposed to the users. Should it be managed by the coordinator and server_proxy?
Now we use the lag field from INFO to determine whether the replication has finished, which is wrong. We should use master_repl_offset and the offset of each replica instead.
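The offset-based check could look like this sketch. The INFO text below is fabricated sample output following the standard "INFO replication" format (slaveN lines carry an offset= field, and the master reports master_repl_offset):

```python
def replication_finished(info_text, max_gap=0):
    """Compare master_repl_offset against each replica's offset."""
    master_offset = None
    replica_offsets = []
    for line in info_text.splitlines():
        line = line.strip()
        if line.startswith("master_repl_offset:"):
            master_offset = int(line.split(":", 1)[1])
        elif line.startswith("slave") and "offset=" in line:
            # e.g. slave0:ip=10.0.0.2,port=6379,state=online,offset=200,lag=0
            fields = dict(f.split("=", 1) for f in line.split(":", 1)[1].split(","))
            replica_offsets.append(int(fields["offset"]))
    if master_offset is None or not replica_offsets:
        return False
    return all(master_offset - off <= max_gap for off in replica_offsets)

info = """# Replication
role:master
connected_slaves:1
slave0:ip=10.0.0.2,port=6379,state=online,offset=200,lag=0
master_repl_offset:200
"""
print(replication_finished(info))  # True
```

Unlike lag, which is a coarse time-based value, the offsets compare exact byte positions in the replication stream.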
At first, I thought we would only deploy one proxy per host, so host and proxy meant the same thing and were used interchangeably in the code and the API.
Now, to support some clients and Redis cluster proxies which do not support the AUTH command for the backend clusters, we need to deploy multiple proxies on the same machine to support multiple tenants.
I have changed the API in #21.
Later we need to change host in the code to proxy.
Undermoon v0.2 focuses on supporting arbitrary slot migration, whose key functionality has been done in #65.
There are still two problems to solve:
The first one needs Redis pipelining to increase the throughput. The second one needs complex synchronization. Neither of them is easy to implement without async and await.
Thus, in v0.2 Undermoon will switch to futures-rs 0.3.
- Make CmdTask support multiple commands as a single request.
- Reduce the sendto syscalls.
- Make Resp objects store only the index into the Redis packet to reduce memory allocation.
- When a server_proxy deletes an old cluster and creates a new one, the existing connections are still tagged with the old cluster name, which results in a db not found error.
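The index-into-packet idea can be sketched as follows. This is a hypothetical illustration of the allocation-saving technique, not undermoon's Resp type; the class name and fields are made up:

```python
class RespSlice:
    """A parsed element that borrows from the raw packet instead of copying."""
    __slots__ = ("buf", "start", "end")

    def __init__(self, buf, start, end):
        self.buf = buf      # shared reference to the raw packet bytes
        self.start = start  # index range of this element within the packet
        self.end = end

    def bytes(self):
        return self.buf[self.start:self.end]  # materialize only on demand

# A RESP-encoded "GET foo" request; the slices point into it without copying.
packet = b"*2\r\n$3\r\nGET\r\n$3\r\nfoo\r\n"
cmd = RespSlice(packet, 8, 11)    # points at b"GET"
key = RespSlice(packet, 17, 20)   # points at b"foo"
print(cmd.bytes(), key.bytes())   # b'GET' b'foo'
```

Each parsed element costs only two integers plus a shared buffer reference, rather than a freshly allocated byte string per argument.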
Now, get_peer in the coordinator uses separate HTTP calls to get the peer server proxies.
This could lead to inconsistent data.
We should return the metadata of the peer proxies in /api/proxies/meta/<server_proxy_address> directly.
Maybe we can use etcd or zookeeper as external storage for the case where the stored data are not large.
When proxies are tagged as failed and later recover, the client pool in the coordinator might get a stale connection and fail to send PING, which causes spurious failure reports.
This is confusing but could be fine. Might be fixed later.
Since during the key migration a key could be written from the source shard to the destination shard multiple times, a key deleted by users could be recovered again.
The overall process is:
- The key is migrated to the destination shard by a RESTORE command.
- The user deletes the key.
- The key is written again by another RESTORE command, since SCAN could generate the key multiple times, or the first migration is triggered actively by the destination shard. Then the deleted key recovers.

Hi, I don't see any special use of this agent from the documentation. Can you explain it to me carefully?
email: [email protected]
When the metadata of a large cluster is synchronized from the HTTP broker to the coordinator, and from the coordinator to the server proxy, it may need to be compressed to reduce the data size.
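A minimal sketch of such compression, assuming the metadata travels as JSON (the layout below is made up; only the gzip-over-JSON idea is the point):

```python
import gzip
import json

# Fabricated cluster metadata: repetitive structures like slot ranges and
# proxy addresses compress very well.
metadata = {
    "cluster": "mycluster",
    "slots": [{"start": i * 1024, "end": (i + 1) * 1024 - 1} for i in range(16)],
    "proxies": [f"proxy-{i}:5299" for i in range(100)],
}

raw = json.dumps(metadata).encode()          # what would go over the wire today
packed = gzip.compress(raw)                  # compressed payload
restored = json.loads(gzip.decompress(packed))

assert restored == metadata                  # lossless round trip
print(len(raw), "->", len(packed))
```

Because the field names and address patterns repeat for every slot range and proxy, the compressed payload is a small fraction of the raw JSON.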
Rename:
- host → proxy
- database → cluster
- db_name → cluster_name
If the whole undermoon cluster has more than 100k server proxies, the coordinator might not be able to hold that many connections.
We need to divide the coordinators into different shards by clusters and server proxies.
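One simple way to divide the work, sketched here as an assumption rather than undermoon's chosen design, is to assign each server proxy (or cluster) to a coordinator shard by hashing its name:

```python
import hashlib

def shard_of(name, num_shards):
    """Deterministically map a proxy or cluster name to a coordinator shard."""
    digest = hashlib.sha1(name.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

NUM_SHARDS = 4
proxies = [f"proxy-{i}:5299" for i in range(12)]
assignment = {p: shard_of(p, NUM_SHARDS) for p in proxies}

# Every proxy lands in exactly one shard, and the mapping is stable,
# so each coordinator shard only connects to its own subset of proxies.
assert all(0 <= s < NUM_SHARDS for s in assignment.values())
assert shard_of("proxy-3:5299", NUM_SHARDS) == assignment["proxy-3:5299"]
```

The trade-off is that changing the shard count reshuffles assignments; consistent hashing would reduce that churn if shards are resized often.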
When scaling the proxies, we currently just migrate all the data from one proxy to another, leaving the two involved proxies holding data they don't own.
We need to delete this data after migration with the SCAN and DEL commands.
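The cleanup loop can be sketched as below. The slot function and in-memory store are illustrative stand-ins (real Redis Cluster uses CRC16 over the key), not undermoon's code:

```python
def slot_of(key, num_slots=16384):
    # stand-in for Redis CRC16 slot hashing, just for this sketch
    return sum(key.encode()) % num_slots

def delete_unowned(store, owned_slots, batch=2):
    # SCAN guarantees every key present at the start is eventually returned,
    # so iterate over a snapshot in SCAN-sized batches and DEL the strays.
    keys = list(store)
    for i in range(0, len(keys), batch):
        for key in keys[i:i + batch]:          # one SCAN batch
            if slot_of(key) not in owned_slots:
                del store[key]                 # DEL for keys outside owned slots

store = {f"k{i}": i for i in range(6)}
owned = {slot_of("k0"), slot_of("k1")}
delete_unowned(store, owned)
print(sorted(store))  # ['k0', 'k1']
```

Batching matters in practice: SCAN with a COUNT hint plus small DEL (or UNLINK) batches keeps the cleanup from blocking normal traffic.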
- Support the INFO command.
- Move UMCTL INFOREPL into the INFO command.
- Support a real SELECT command.
Support the official redis-cluster-proxy
Now we use the UMFORWARD command to carry additional attributes to implement max_redirections, which results in command wrapping and unwrapping and is not the best for performance.
Maybe we can implement RESP3 and use its attributes to optimize this.
After migration, a task for deleting keys is started, and currently it causes some problems:
(1) Data Inconsistency When Scaling Up and Down
Fixed by #158
If a cluster is scaling up and down frequently, a migration task could be running concurrently with a key-deletion task covering the same slots, which could result in losing some keys.
PR #158 fixes it by checking whether there's any key-deletion task before starting a migration in the API.
(2) High CPU Usage