redis-cluster-proxy's Introduction

Current status

This project is currently not actively maintained. It is alpha code that was intended to be evaluated by the community in order to gather suggestions and contributions. We discourage its use in any production environment.

Redis Cluster Proxy

Redis Cluster Proxy is a proxy for Redis Clusters. Redis can run in Cluster mode, where a set of Redis instances takes care of failover and partitioning. This special mode requires clients that understand the Cluster protocol: with this proxy in front, the cluster is abstracted away, and you can talk to a set of instances composing a Redis Cluster as if it were a single instance.

Redis Cluster Proxy is multi-threaded and, by default, uses a multiplexed communication model, so that every thread has its own connection to the cluster that is shared among all the clients belonging to that thread. In some special cases (e.g. MULTI transactions or blocking commands), however, multiplexing is disabled and the client gets its own private cluster connection. This way, clients that only send trivial commands such as GET and SET do not require a private set of connections to the Redis Cluster.

These are the main features of Redis Cluster Proxy:

  • Routing: every query is automatically routed to the correct node of the cluster
  • Multithreaded
  • Both multiplexing and private connection models supported
  • Query execution and reply order are guaranteed even in multiplexing contexts
  • Automatic update of the cluster's configuration after ASK|MOVED errors: when these errors occur in replies, the proxy automatically updates its internal representation of the cluster by fetching an updated configuration and remapping all the slots. All queries are re-executed after the update completes, so that, from the client's point of view, everything flows as normal (clients won't receive the ASK|MOVED error: they directly receive the expected replies once the cluster configuration has been updated).
  • Cross-slot/cross-node queries: many commands involving multiple keys belonging to different slots (or even to different cluster nodes) are supported. The proxy splits such queries into multiple queries that are routed to the relevant slots/nodes. Reply handling is command-specific: some commands, such as MGET, merge all the replies as if they were a single reply; others, such as MSET or DEL, sum the results of all the replies. Since these queries break the atomicity of the command, their usage is optional (disabled by default). See below for more info.
  • Commands with no specific node/slot, such as DBSIZE, are delivered to all the nodes, and the replies are map-reduced into a sum of all the values contained in them.
  • An additional PROXY command that can be used to perform some proxy-specific actions.
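The routing in the first bullet relies on Redis Cluster's key-to-slot mapping: the key (or its {hash tag}, if present) is hashed with CRC16/XMODEM and taken modulo 16384. A minimal, self-contained sketch of that mapping (an illustration of the published Redis Cluster algorithm, not code from this repository):

```python
def crc16_xmodem(data: bytes) -> int:
    """Bitwise CRC16/XMODEM (polynomial 0x1021), as used by Redis Cluster."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    return crc

def key_slot(key: str) -> int:
    """Map a key to one of the 16384 cluster slots, honoring {hash tags}."""
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end != -1 and end != start + 1:   # only a non-empty tag is used
            key = key[start + 1:end]
    return crc16_xmodem(key.encode()) % 16384

print(key_slot("foo"))           # same slot as...
print(key_slot("{foo}.suffix"))  # ...any key carrying the hash tag {foo}
```

Keys sharing a hash tag land in the same slot, which is why multi-key commands on tagged keys never need the cross-slot machinery described below.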

Build

Redis Cluster Proxy should run without issues on most POSIX systems (Linux, macOS/OSX, NetBSD, FreeBSD) and on the same platforms supported by Redis.

It requires C11 and its atomic variables (_Atomic), so please ensure that your compiler supports both. For GCC, these features are supported since version 4.9.

In order to build it, just type:

% make

If you need a 32-bit binary, use:

% make 32bit

If you need a verbose build, use the V option:

% make V=1

If you need to rebuild dependencies, use:

% make distclean

And, finally, if you want to launch tests, just type:

% make test

Note: by default, tests use the redis-server that is installed on your system (the one found in your $PATH). If you need to use another redis-server, set the REDIS_HOME environment variable, e.g.:

% REDIS_HOME=/path/to/my/redis/src make test

As you can see, the make syntax (and also the output style) is the same as in Redis, so it will be familiar to Redis users.

Install

In order to install Redis Cluster Proxy into /usr/local/bin just use:

% make install

You can use make PREFIX=/some/other/directory install if you wish to use a different destination.

Usage

Redis Cluster Proxy attaches itself to an already running Redis cluster. The binary will be compiled inside the src directory. The basic usage is:

./redis-cluster-proxy CLUSTER_ADDRESS

where CLUSTER_ADDRESS is the host address of any instance of the cluster (we call it the entry point); it can be expressed as IP:PORT for TCP connections, or as a UNIX socket by specifying the file name.

For example:

./redis-cluster-proxy 127.0.0.1:7000

./redis-cluster-proxy /path/to/entry-point.socket

It is also possible to specify multiple entry points. The proxy will use the first reachable entry point to connect to the cluster and fetch the cluster's configuration. This is useful because any single entry point could be down, so multiple addresses make the proxy more reliable.

Example:

./redis-cluster-proxy 127.0.0.1:7000 127.0.0.1:7001 127.0.0.1:7002
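The fallback behavior described above amounts to trying each address in order until one connects. A minimal sketch of that logic (the function names and the stub connect are hypothetical, not from the proxy's source):

```python
def first_reachable(entry_points, connect):
    """Return (address, connection) for the first entry point that connects."""
    for address in entry_points:
        try:
            return address, connect(address)
        except ConnectionError:
            continue                     # this entry point is down, try the next
    raise ConnectionError("no reachable entry point")

def fake_connect(address):               # stand-in for a real TCP connect
    if address == "127.0.0.1:7000":
        raise ConnectionError("down")
    return f"connection-to-{address}"

addr, conn = first_reachable(
    ["127.0.0.1:7000", "127.0.0.1:7001", "127.0.0.1:7002"], fake_connect)
print(addr)   # 127.0.0.1:7001
```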

If you need basic help, just run it with the canonical -h or --help option.

./redis-cluster-proxy -h

By default, Redis Cluster Proxy listens on port 7777, but you can change it with the -p or --port option. Furthermore, by default, Redis Cluster Proxy binds all available network interfaces to listen for incoming connections. You can bind specific interfaces with the --bind option; to bind multiple interfaces, use the --bind option more than once.

You can also tell Redis Cluster Proxy to listen on a UNIX socket, by using the --unixsocket option to specify the socket filename and, optionally, --unixsocketperm to set the socket file permissions.

If you want to only listen on the UNIX socket, set --port to 0, so that the proxy won't listen on TCP sockets at all.

Examples:

Listen on port 7888:

./redis-cluster-proxy --port 7888 127.0.0.1:7000

Listen on default port and bind only 127.0.0.1:

./redis-cluster-proxy --bind 127.0.0.1 127.0.0.1:7000

Listen on port 7888 and bind multiple interfaces:

./redis-cluster-proxy --port 7888 --bind 192.168.0.10 --bind 10.0.0.10 127.0.0.1:7000

Listen on UNIX socket and disable TCP connections:

./redis-cluster-proxy --unixsocket /path/to/proxy.socket --port 0 127.0.0.1:7000

You can change the number of threads using the --threads option.

You can also use a configuration file instead of passing arguments by using the -c option, e.g.:

redis-cluster-proxy -c /path/to/my/proxy.conf 127.0.0.1:7000

You can find an example proxy.conf file inside the main Redis Cluster Proxy's directory.
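As a rough illustration, a config file could look like the following sketch. The option names mirror the command-line options described in this README (without the leading --); the exact contents of the bundled proxy.conf may differ, so treat this as a hypothetical example:

```conf
# Hypothetical proxy.conf sketch; see the bundled proxy.conf for the real one.
port 7777
bind 127.0.0.1
threads 8
connections-pool-size 10
enable-cross-slot yes
```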

After launching it, you can connect to the proxy as if it were a normal Redis server (however make sure to understand the current limitations).

You can then connect to Redis Cluster Proxy using the client you prefer, e.g.:

redis-cli -p 7777

Private connections pool

Every thread has its own connection pool, containing ready-to-use private connections to the cluster whose sockets are pre-connected at the moment they are created. This allows clients requiring a private connection (e.g. after commands such as MULTI or blocking commands) to immediately use a connection that is likely already connected to the cluster, instead of reconnecting from scratch (a situation that could slow down the execution of the client's queries from the client's own point of view).

Every connection pool has a predefined size, and it is not allowed to create more connections than its size allows. The size of the connection pool can be configured via the --connections-pool-size option (by default it's 10). When the pool runs out of connections, every new client requiring a private connection creates a new one from scratch, and it has to connect to the cluster and wait for the connection to be established. In this case the connection model is "lazy": the sockets of the new connection connect to a particular node of the cluster only when a query requires a connection to that node.

Every thread re-populates its own pool after the number of connections drops below a specified minimum, which by default equals the size of the pool itself and can be configured via the --connections-pool-min-size option. The re-population rate and interval are defined by --connections-pool-spawn-every (interval in milliseconds) and --connections-pool-spawn-rate (number of new connections at every interval).

For example:

redis-cluster-proxy --connections-pool-size 20 --connections-pool-min-size 15 --connections-pool-spawn-rate 2 --connections-pool-spawn-every 500 127.0.0.1:7000

This means: "create a connection pool containing at most 20 connections, and re-populate it when the number of connections drops below 15, by creating 2 new connections every 500 milliseconds".

Remember that every pool is completely populated when the proxy starts. It's also important to note that when a client owning a private connection disconnects, its thread will try to recycle the connection and put it back into the pool, if the pool is not already full.
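The replenishment policy above can be modeled in a few lines. This is an illustrative simulation of the described behavior (class and method names are made up, not the proxy's C internals):

```python
class ConnectionPool:
    """Toy model of the per-thread pool: refill in batches of `spawn_rate`
    once the pool drops below `min_size`, never exceeding `size`."""

    def __init__(self, size=10, min_size=None, spawn_rate=2):
        self.size = size
        self.min_size = size if min_size is None else min_size
        self.spawn_rate = spawn_rate
        self.connections = size          # fully populated at startup

    def acquire(self):
        """Hand a private connection to a client, if one is ready."""
        if self.connections == 0:
            return False                 # caller connects lazily from scratch
        self.connections -= 1
        return True

    def release(self):
        """A disconnecting client's connection is recycled if there is room."""
        if self.connections < self.size:
            self.connections += 1

    def tick(self):
        """One --connections-pool-spawn-every interval elapses."""
        if self.connections < self.min_size:
            self.connections = min(self.size,
                                   self.connections + self.spawn_rate)

pool = ConnectionPool(size=20, min_size=15, spawn_rate=2)
for _ in range(8):                       # eight clients go private
    pool.acquire()
pool.tick()                              # 12 -> 14, still below the minimum
pool.tick()                              # 14 -> 16, above 15, refilling stops
```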

Password-protected clusters and Redis ACL

If your cluster nodes are protected with a password, you can use the -a, --auth command-line options or the auth option in a configuration file in order to specify an authentication password. Furthermore, if your cluster is using the new ACL implemented in Redis 6.0 and it has multiple users, you can even authenticate with a specific user by using the --auth-user command-line option (or auth-user in a config file) followed by the username. Examples:

redis-cluster-proxy -a MYPASSWORD 127.0.0.1:7000

redis-cluster-proxy --auth MYPASSWORD 127.0.0.1:7000

redis-cluster-proxy --auth-user MYUSER --auth MYPASSWORD 127.0.0.1:7000

The proxy uses these credentials to authenticate to the cluster and fetch the cluster's internal configuration, but it also automatically authenticates all clients with the provided credentials. So, all clients connecting to the proxy are automatically authenticated with the user specified by --auth-user, or with the default user if no user has been specified, without needing to call the AUTH command themselves. However, if any client wants to authenticate with a different user, it can always call the Redis AUTH command (documented here): in this case the client will use a private connection instead of the shared, multiplexed one, and will authenticate as the other user.

Enabling cross-slots queries

Cross-slots queries use multiple keys belonging to different slots or even different nodes. Since their execution is not guaranteed to be atomic (they can actually break the atomic design of many Redis commands), they are disabled by default. However, if you don't care about atomicity and you want this feature, you can enable it when you launch the proxy with the --enable-cross-slot option, or by setting enable-cross-slot yes in your config file. You can also activate this feature while the proxy is running by using the special PROXY command (see below).

Note: not all commands support cross-slots queries, even when the feature is enabled (e.g. you cannot use it with EVAL or ZUNIONSTORE and many other commands). In that case, you'll receive a specific error reply. You can fetch the list of commands that cannot be used in cross-slots queries with the PROXY command (see below).
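The split-and-merge behavior described for commands like MGET can be sketched as follows. This is a toy illustration with stub slot and backend functions (all names here are hypothetical, not the proxy's internals): keys are grouped by target slot, one sub-query is issued per group, and the replies are merged back into the order the client asked for.

```python
def fake_slot(key: str) -> int:
    return sum(key.encode()) % 4         # stand-in for CRC16 % 16384

def fake_node_mget(slot: int, keys: list, store: dict) -> list:
    return [store.get(k) for k in keys]  # stand-in for a per-node MGET

def proxy_mget(keys: list, store: dict) -> list:
    by_slot = {}
    for i, key in enumerate(keys):       # group keys by target slot
        by_slot.setdefault(fake_slot(key), []).append((i, key))
    replies = [None] * len(keys)
    for slot, group in by_slot.items():  # one sub-query per slot
        values = fake_node_mget(slot, [k for _, k in group], store)
        for (i, _), value in zip(group, values):
            replies[i] = value           # merge, preserving request order
    return replies

store = {"a": "1", "b": "2", "d": "4"}
print(proxy_mget(["a", "b", "c", "d"], store))  # ['1', '2', None, '4']
```

The sub-queries run independently, which is exactly why atomicity is lost: another client can observe the keyspace between two of the splits.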

The PROXY command

The PROXY command allows you to get specific info or perform actions that are specific to the proxy. The command has various subcommands; here's a short list:

  • PROXY CONFIG GET|SET option [value]

    It can be used to get or set a specific option of the proxy, where the options are the same as the command-line arguments (without the -- prefix) or those specified in the config file. Not all options can be changed: some of them (e.g. threads) are read-only.

    Examples:

    PROXY CONFIG GET threads
    PROXY CONFIG SET log-level debug
    PROXY CONFIG SET enable-cross-slot 1
    
  • PROXY MULTIPLEXING STATUS|OFF

    Get the status of the multiplexing connection model for the calling client, or disable multiplexing by activating a private connection for the calling client. Examples:

    -> PROXY MULTIPLEXING STATUS
    -> Reply: "on"
    -> PROXY MULTIPLEXING off
    
  • PROXY INFO

    Returns info specific to the cluster, similarly to the INFO command in Redis.

  • PROXY COMMAND [UNSUPPORTED|CROSSSLOTS-UNSUPPORTED]

    Returns a list of all the Redis commands handled (known) by Redis Cluster Proxy, in a similar fashion to the Redis COMMAND command. The reply is a nested array: every command is an item of the top-level array and is itself an array containing the following items: command name, arity, first key, last key, key step, supported. The last item ("supported") indicates whether the command is currently supported by the proxy.

    The optional third argument can be used as a filter, with the following options:

    • UNSUPPORTED: only lists unsupported commands
    • CROSSSLOTS-UNSUPPORTED: only lists commands that cannot be used with cross-slots queries, even if cross-slots queries have been enabled in the proxy's configuration.
  • PROXY CLIENT

    Perform client-specific actions, e.g.:

    • PROXY CLIENT ID: get the current client's internal ID

    • PROXY CLIENT THREAD: get the current client's thread

  • PROXY CLUSTER [subcmd]

    Perform actions related to the cluster associated with the calling client, e.g.:

    • PROXY CLUSTER or PROXY CLUSTER INFO: get info about the cluster. The info is an array whose elements are name/value pairs, where the names are specific features such as status, connection, and so on. You can also retrieve a single specific feature, e.g. by calling PROXY CLUSTER STATUS. Below is a list of common info that can be retrieved:

      • status: Current status of the cluster, can be updating, updated or broken
      • connection: Connection type, that can be shared if the client is working inside a multiplexing context (so the connection is shared with all the clients of the thread), or private if the client is using its own private connection.
      • nodes: A nested array containing the list of all the master nodes of the cluster. Every node is another nested array, containing name/value pairs.
    • PROXY CLUSTER UPDATE: request an update of the current cluster's configuration.

    Examples:

    -> PROXY CLUSTER
    
    1) status
    2) updated
    3) connection
    4) shared
    5) nodes
    6) 1)  1) name
           2) 8d829c8b66f67dd9c4adad16e5c0a4c82aadd810
           3) ip
           4) 127.0.0.1
           5) port
           6) (integer) 7002
           7) slots
           8) (integer) 5462
           9) replicas
          10) (integer) 1
          11) connected
          12) (integer) 1
      ...
    
    
  • PROXY LOG [level] MESSAGE

    Log MESSAGE to Proxy's log, for debugging purposes.

    The optional level can be used to define the log level:

    debug, info, success, warning, error (default is debug)

  • PROXY DEBUG

    Perform different actions for debugging purposes, where the subcommand can be:

    • SEGFAULT: crash the proxy with sigsegv

    • ASSERT: crash the proxy with an assertion failure

  • PROXY SHUTDOWN [ASAP]

    Shut down the proxy. The optional ASAP option makes the proxy exit immediately (dirty exit).

  • PROXY HELP

    Get help for the PROXY command

Commands that act differently from standard Redis commands or that have special behavior

  • PING: PONG is replied directly by the proxy
  • MULTI: disables multiplexing for the calling client by creating a private connection in the client itself. Note: since it's required to be atomic, cross-slots queries cannot work inside a multi transaction.
  • DBSIZE: sends the query to all nodes in the cluster and sums their replies, so that the result will be the total number of keys in the whole cluster.
  • SCAN: performs the scan on all the master nodes of the cluster. The cursor contained in the reply has a special four-digit suffix indicating the index of the node that has to be scanned next. Note: sometimes the cursor can be something like "00001", so you mustn't convert it to an integer when your client uses it to perform the next scan.
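The leading-zeros caveat for SCAN becomes clear with a small decoding sketch. The split below is an assumption inferred from the description above (it is not taken from the proxy's source): the last four digits carry the node index and the remaining prefix is the node-local cursor, so the cursor must always be handled as a string.

```python
def split_cursor(cursor: str):
    """Decode a proxy SCAN cursor into (node-local cursor, node index).

    Assumed format: <node cursor><4-digit node index>. Converting the whole
    cursor to an integer would destroy leading zeros like in "00001".
    """
    node_index = int(cursor[-4:])        # last four digits: node index
    node_cursor = cursor[:-4] or "0"     # the rest: node-local cursor
    return node_cursor, node_index

print(split_cursor("00001"))    # ('0', 1)
print(split_cursor("1230002"))  # ('123', 2)
```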

For a list of all known commands (both supported and unsupported) and their features, see COMMANDS.md

redis-cluster-proxy's People

Contributors

artix75, git-hulk, huangzhenliang, itamarhaber, oranagra, shooterit


redis-cluster-proxy's Issues

[Fa

Sorry for the mis-operation.
Please delete this issue.

Ability to specify several entrypoints to the cluster

Hi there, @artix75!

First of all, this project looks like it could eliminate many pain points with using a Redis cluster for us, super exciting! Thanks a lot for working on this.

I've taken it for a test drive with our Redis cluster deployment and noticed a few things that would make it easier to use. For context, we run our Redis clusters in a container scheduler (think Kubernetes, Hashicorp Nomad etc) which implies that individual nodes will move around a lot.

  1. Being able to specify more than a single entrypoint to the cluster would allow a two-tier deployment with individually scalable Redis & proxy groups. Additionally, the proxy would not fail to start if the one node it was assigned to just moved/crashed. As part of this, it would be nice to be able to read servers from the config file as opposed to a mandatory command-line argument, as that's a little easier to set up in our context (and for everybody else running in Docker, I imagine).
  2. Really cool would also be the ability to reload those entrypoints (maybe by listening for SIGHUP + re-reading the config file?). IMO the ideal behaviour would be:
  • If any nodes were added but I have healthy instances to talk to -> no action required
  • If any nodes were added and all of the nodes that I know about are dead/gone -> connect to these new nodes
    That'd allow us to not restart the proxy whenever one of the allocations moves.
  3. The last one is a potential bug which I'm working on a solid repro case for: as part of cluster bootstrapping, we do a CLUSTER RESET HARD. If the proxy connected to the node prior to doing this, it will not pick up the nodes that we start learning about afterwards - the cluster can be healthy but requests sent to the proxy will fail. Take this with a grain of salt - a few assumptions here, still investigating.

I'd be happy to send pull requests for any of the points above if they align with your vision for this project. Additionally, do you have a published roadmap/next planned work items so we could start contributing a bit? :)

segfault if "PROXY MULTIPLEXING" is issued without final arg

127.0.0.1:7777> PROXY MULTIPLEXING
Could not connect to Redis at 127.0.0.1:7777: Connection refused

Boom!

=== PROXY BUG REPORT START: Cut & paste starting from here ===
[2020-04-08 15:56:51.866/0] === ASSERTION FAILED ===
[2020-04-08 15:56:51.866/0] ==> proxy.c:969 'p < end' is not true
[2020-04-08 15:56:51.867/0] (forcing SIGSEGV to print the bug report.)
[2020-04-08 15:56:51.869/0] Thread 1 terminated
[2020-04-08 15:56:51.869/0] Thread 2 terminated
[2020-04-08 15:56:51.870/0] Thread 3 terminated
[2020-04-08 15:56:51.870/0] Thread 4 terminated
[2020-04-08 15:56:51.870/0] Thread 5 terminated
[2020-04-08 15:56:51.871/0] Thread 6 terminated
[2020-04-08 15:56:51.872/0] Thread 7 terminated
[2020-04-08 15:56:51.872/0] Redis Cluster Proxy 0.9.102 crashed by signal: 11
[2020-04-08 15:56:51.873/0] Crashed running the instruction at: 0x7f1f51010040
[2020-04-08 15:56:51.873/0] Accessing address: 0xffffffffffffffff
[2020-04-08 15:56:51.873/0] Handling crash on thread: 0
[2020-04-08 15:56:51.874/0] Failed assertion: p < end (proxy.c:969)

------ STACK TRACE ------
EIP:
./redis-cluster-proxy(_proxyAssert+0x70)[0x7f1f51010040]

Backtrace:
./redis-cluster-proxy(logStackTrace+0x44)[0x7f1f5100f5a4]
./redis-cluster-proxy(sigsegvHandler+0x1a0)[0x7f1f5100fd00]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x12890)[0x7f1f50442890]
./redis-cluster-proxy(_proxyAssert+0x70)[0x7f1f51010040]
./redis-cluster-proxy(proxyCommand+0x1972)[0x7f1f5101be52]
./redis-cluster-proxy(processRequest+0x347)[0x7f1f5101e1b7]
./redis-cluster-proxy(readQuery+0x21e)[0x7f1f5101f45e]
./redis-cluster-proxy(aeProcessEvents+0x14f)[0x7f1f5100aadf]
./redis-cluster-proxy(aeMain+0x2b)[0x7f1f5100aeeb]
./redis-cluster-proxy(+0x1395c)[0x7f1f5101395c]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76db)[0x7f1f504376db]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x3f)[0x7f1f5015188f]

------ INFO OUTPUT ------

Proxy

proxy_version:0.9.102
proxy_git_sha1:00000000
proxy_git_dirty:0
proxy_git_branch:
os:Linux 4.4.0-19041-Microsoft x86_64
arch_bits:64
multiplexing_api:epoll
gcc_version:7.3.0
process_id:344
threads:8
tcp_port:7777
uptime_in_seconds:5255
uptime_in_days:0
config_file:
acl_user:default

Memory

used_memory:8333440
used_memory_human:7.95M
total_system_memory:68650504192
total_system_memory_human:63.94G

Clients

connected_clients:1
max_clients:10000
thread_0_clinets:1
thread_1_clinets:0
thread_2_clinets:0
thread_3_clinets:0
thread_4_clinets:0
thread_5_clinets:0
thread_6_clinets:0
thread_7_clinets:0

Cluster

address:
entry_node::0

---- SIZEOF STRUCTS ----
clientRequest: 184
client: 224
redisClusterConnection: 48
clusterNode: 112
redisCluster: 104
list: 48
listNode: 24
rax: 24
raxNode: 4
raxIterator: 480
aeEventLoop: 88
aeFileEvent: 32
aeTimeEvent: 64

------ REGISTERS ------

RAX:0000000000000000 RBX:00000000000003c9
RCX:0000000000000b40 RDX:0000000000000000
RDI:00007f1f5041c760 RSI:00007f1f5041d8c0
RBP:00007f1f5102f518 RSP:00007f1f5001fc90
R8 :00007f1f5041d8c0 R9 :00007f1f50020700
R10:00000000ffffffba R11:0000000000000000
R12:00007f1f5102f9d8 R13:00007f1f4403dc80
R14:0000000000000001 R15:00007f1f4403dc80
RIP:00007f1f51010040 EFL:0000000000010202
CSGSFS:00000053002b0033
(00007f1f5001fc9f) -> 00007fffe5808090
(00007f1f5001fc9e) -> 0000000000000000
(00007f1f5001fc9d) -> 18298df602aa7900
(00007f1f5001fc9c) -> 0000000000050200
(00007f1f5001fc9b) -> 00007f1f5102698f
(00007f1f5001fc9a) -> 0000000000000007
(00007f1f5001fc99) -> 0000000000000005
(00007f1f5001fc98) -> 0000000000000000
(00007f1f5001fc97) -> 0000000000000007
(00007f1f5001fc96) -> 000000764400a7d3
(00007f1f5001fc95) -> 0000000000008006
(00007f1f5001fc94) -> 0000000000000019
(00007f1f5001fc93) -> 00007f1f5101be52
(00007f1f5001fc92) -> 00007f1f5102fadb
(00007f1f5001fc91) -> 00007f1f44071e47
(00007f1f5001fc90) -> 00007f1f4403d0f1

------ DUMPING CODE AROUND EIP ------
Symbol: _proxyAssert (base: 0x7f1f5100ffd0)
Module: ./redis-cluster-proxy (base 0x7f1f51000000)
$ xxd -r -p /tmp/dump.hex /tmp/dump.bin
$ objdump --adjust-vma=0x7f1f5100ffd0 -D -b binary -m i386:x86-64 /tmp/dump.bin

dump of function (hexdump of 240 bytes):
8b05cace220041544989fc554889f55389d385c07505e895f2ffff488d3581da0100bf0400000031c0e8e21a0000488d3587da01004d89e089d94889eabf0400000031c0e8c71a0000488d3590d70100bf0400000031c04c89256ace220048892d5bce2200891d4dcc2200e8a01a0000c60425ffffffff785b5d415cc30f1f0048b8feffffffffffff7f4154554839c65348bb0000000000000080771b4883fe04bb0400000076100f1f8400000000004801db4839de77f848395f18b80100000074394889fd488d3cdd000000004c8d63ffe80969010048837d1000742a4889453048895d3831c04c89654048c74548
Function at 0x7f1f51011ae0 is proxyLog
Function at 0x7f1f510269b0 is zcalloc

=== PROXY BUG REPORT END. Make sure to include from START to END. ===

Install a crash handler (SEGFAULT)

I'm getting a couple of restarts, and they don't seem to be related to high memory usage. Having a crash handler that displays something on the console if a crash happens could help. I believe that Redis has that.

redis-proxy-30-29wf5          1/1     Running       1          17m
redis-proxy-30-65zgp          1/1     Running       6          17m
redis-proxy-30-klq7z          1/1     Running       7          17m
redis-proxy-30-l42n6          1/1     Running       6          17m
redis-proxy-30-pfj8h          1/1     Running       5          17m
redis-proxy-30-q2bgk          1/1     Running       6          17m

Potential memory and connection leak in sendMessageToThread

For now, if the pipe is not yet writable, the function sendMessageToThread will register a write event and continue sending the message in the callback handlePendingAwakeMessages. handlePendingAwakeMessages will again call sendMessageToThread to do the work. This might cause some issues:

1. If an error occurs in sendMessageToThread called from handlePendingAwakeMessages, the client object being sent will not be freed; its memory and connection leak.

2. In handlePendingAwakeMessages:

    sds msg = ln->value;
    int sent = sendMessageToThread(thread, msg);
    if (sent == -1) continue;
    else {
        listDelNode(thread->pending_messages, ln);
        if (!sent) {
            proxyLogErr("Failed to send message to thread %d",
                        thread->thread_id);
        }
    }

   the msg is not deleted from the pending_messages list when sendMessageToThread returns -1. But in the -1 case, sendMessageToThread has already added the remaining buffer to the tail of the pending_messages list, so we end up with duplicate messages in the list pointing to the same content.

3. In some situations, multiple messages being sent by sendMessageToThread might get interleaved.

Crash in onClusterNodeDisconnection

The version is the release 1.0-beta2.

I will update once I find a way to reproduce it.
Hope the backtrace can help.

Crash log:

=== PROXY BUG REPORT START: Cut & paste starting from here ===
[2020-04-27 14:24:46.985/0] Redis Cluster Proxy 0.9.102 crashed by signal: 11
[2020-04-27 14:24:46.985/0] Crashed running the instruction at: 0x415756
[2020-04-27 14:24:46.985/0] Accessing address: (nil)
[2020-04-27 14:24:46.985/0] Handling crash on thread: 0


------ STACK TRACE ------
EIP:
./redis-cluster-proxy(onClusterNodeDisconnection+0x16)[0x415756]

Backtrace:
./redis-cluster-proxy(logStackTrace+0x2d)[0x40d1dd]
./redis-cluster-proxy(sigsegvHandler+0x186)[0x40d7f6]
/lib64/libpthread.so.0(+0xf5d0)[0x7f0fb9bc15d0]
./redis-cluster-proxy(onClusterNodeDisconnection+0x16)[0x415756]
./redis-cluster-proxy[0x40a5ab]
./redis-cluster-proxy[0x40a6a1]
./redis-cluster-proxy(resetCluster+0x3e)[0x40a7ce]
./redis-cluster-proxy(updateCluster+0x1d9)[0x40c4e9]
./redis-cluster-proxy[0x41a1f4]
./redis-cluster-proxy(aeProcessEvents+0x291)[0x408f21]
./redis-cluster-proxy(aeMain+0x2b)[0x40920b]
./redis-cluster-proxy[0x41107c]
/lib64/libpthread.so.0(+0x7dd5)[0x7f0fb9bb9dd5]
/lib64/libc.so.6(clone+0x6d)[0x7f0fb98e302d]


------ INFO OUTPUT ------
# Proxy
proxy_version:0.9.102
proxy_git_sha1:00000000
proxy_git_dirty:0
proxy_git_branch:
os:Linux 3.10.0-957.21.3.el7.x86_64 x86_64
arch_bits:64
multiplexing_api:epoll
gcc_version:7.3.1
process_id:30681
threads:1
tcp_port:7777
uptime_in_seconds:599
uptime_in_days:0
config_file:proxy.conf
acl_user:default

# Memory
used_memory:2177168
used_memory_human:2.08M
total_system_memory:16479350784
total_system_memory_human:15.35G

# Clients
connected_clients:50
max_clients:10000
thread_0_clinets:50

# Cluster
address:
entry_node::0


---- SIZEOF STRUCTS ----
clientRequest: 184
client: 224
redisClusterConnection: 48
clusterNode: 112
redisCluster: 104
list: 48
listNode: 24
rax: 24
raxNode: 4
raxIterator: 480
aeEventLoop: 88
aeFileEvent: 32
aeTimeEvent: 64


------ REGISTERS ------

RAX:00007f0fb4009340 RBX:00007f0fb40aa3c0
RCX:00007f0fb4008b00 RDX:0000000000000045
RDI:00007f0fb40aa3c0 RSI:00007f0fb97e3b90
RBP:0000000000000000 RSP:00007f0fb97e3b30
R8 :00007f0fb4009340 R9 :00007f0fb4008bc0
R10:00000000fffffc00 R11:00007f0fb9971f40
R12:00007f0fb4008c00 R13:7878787878787878
R14:00000000022d4850 R15:00007f0fb40090a0
RIP:0000000000415756 EFL:0000000000010206
CSGSFS:0000000000000033
(00007f0fb97e3b3f) -> 000000000040a7ce
(00007f0fb97e3b3e) -> 00000000022d4850
(00007f0fb97e3b3d) -> 0000000000000000
(00007f0fb97e3b3c) -> 0000000000000045
(00007f0fb97e3b3b) -> 000000000040a6a1
(00007f0fb97e3b3a) -> 00007f0fb4008c00
(00007f0fb97e3b39) -> 0000000000000000
(00007f0fb97e3b38) -> 00000000022d4850
(00007f0fb97e3b37) -> 000000000040a5ab
(00007f0fb97e3b36) -> 0000000000000001
(00007f0fb97e3b35) -> 00007f0fb4008c00
(00007f0fb97e3b34) -> 0000000000000000
(00007f0fb97e3b33) -> 00007f0fb40aa3c0
(00007f0fb97e3b32) -> 0000000000000001
(00007f0fb97e3b31) -> 00007f0fb4019930
(00007f0fb97e3b30) -> 00007f0fb40090a0


------ DUMPING CODE AROUND EIP ------
Symbol: onClusterNodeDisconnection (base: 0x415740)
Module: ./redis-cluster-proxy (base 0x400000)
$ xxd -r -p /tmp/dump.hex /tmp/dump.bin
$ objdump --adjust-vma=0x415740 -D -b binary -m i386:x86-64 /tmp/dump.bin
------
dump of function  (hexdump of 150 bytes):
4155415455534883ec184c8b2f4d85ed0f8490010000498b45004885c00f84830100008bb08400000085f60f8875010000488b4708488b151c4e22004889fd486308488b14ca4885d27445488b7a204885ff743cba03000000e8f232ffff488b450841c7451c000000004885c07529bfe2ba4200ba330c0000be60b74200e8cd82ffffbf01000000e89312ffff0f1f0041c7451c0000
Function at 0x408a90 is aeDeleteFileEvent
Function at 0x40da90 is _proxyAssert


=== PROXY BUG REPORT END. Make sure to include from START to END. ===

Some issues with parent requests might leave the client blocked

The proxy creates child requests for some commands. For now, there might be some issues in processing requests that have child requests.

1. If an error occurs while sending a request to a cluster node or receiving the response, the request is freed and an error message is appended to the client buffer. This is fine for a request without child requests, but not for parent requests: when a parent request is freed, all of its child requests are freed as well. The client will lose some responses, and min_reply_id will never be set to req->max_child_reply_id + 1. No matter what command the client sends afterwards, it will not get any response.

2. The proxy reprocesses requests when it gets a "MOVED" reply. It is not OK to reprocess a request whose command has a handleReply callback and is not a key command: it will call duplicateRequestForAllMasters again for every parent and child request.

redis-cluster-proxy crashing

The version is the release 1.0-beta2.
The crash happened when the master to which redis-cluster-proxy was connected and used as primary suddenly became unavailable.

crash log:
2020-05-12T09:01:41.809020738Z stdout F === PROXY BUG REPORT START: Cut & paste starting from here ===
2020-05-12T09:01:41.809037669Z stdout F [2020-05-12 09:01:41.808/0] Redis Cluster Proxy 0.9.102 crashed by signal: 11
2020-05-12T09:01:41.809054952Z stdout F [2020-05-12 09:01:41.808/0] Crashed running the instruction at: 0x408c84
2020-05-12T09:01:41.809076298Z stdout F [2020-05-12 09:01:41.808/0] Accessing address: 0x8
2020-05-12T09:01:41.809141611Z stdout F [2020-05-12 09:01:41.808/0] Handling crash on thread: 0
2020-05-12T09:01:41.809172261Z stdout F
2020-05-12T09:01:41.809186323Z stdout F
2020-05-12T09:01:41.809201293Z stdout F ------ STACK TRACE ------
2020-05-12T09:01:41.809246189Z stdout F EIP:
2020-05-12T09:01:41.809285105Z stdout F redis-cluster-proxy(listEmpty+0x24)[0x408c84]
2020-05-12T09:01:41.809299386Z stdout F
2020-05-12T09:01:41.80931411Z stdout F Backtrace:
2020-05-12T09:01:41.80932926Z stdout F redis-cluster-proxy(logStackTrace+0x2d)[0x40dc4d]
2020-05-12T09:01:41.809344458Z stdout F redis-cluster-proxy(sigsegvHandler+0x17a)[0x40e25a]
2020-05-12T09:01:41.80936098Z stdout F /lib64/libpthread.so.0(+0x132d0)[0x7f82fb59b2d0]
2020-05-12T09:01:41.809377083Z stdout F redis-cluster-proxy(listEmpty+0x24)[0x408c84]
2020-05-12T09:01:41.809399569Z stdout F redis-cluster-proxy(listRelease+0x9)[0x408cd9]
2020-05-12T09:01:41.809493698Z stdout F redis-cluster-proxy(resetCluster+0x36)[0x40b2b6]
2020-05-12T09:01:41.809521301Z stdout F redis-cluster-proxy(updateCluster+0x1ca)[0x40cf3a]
2020-05-12T09:01:41.809538809Z stdout F redis-cluster-proxy(proxyCommand+0x141f)[0x4193bf]
2020-05-12T09:01:41.809556442Z stdout F redis-cluster-proxy(processRequest+0x2d5)[0x41b7d5]
2020-05-12T09:01:41.809575586Z stdout F redis-cluster-proxy(readQuery+0x1ea)[0x41c96a]
2020-05-12T09:01:41.809610288Z stdout F redis-cluster-proxy(aeProcessEvents+0x101)[0x4098e1]
2020-05-12T09:01:41.809631791Z stdout F redis-cluster-proxy(aeMain+0x2b)[0x409cdb]
2020-05-12T09:01:41.809701201Z stdout F redis-cluster-proxy[0x411b8c]
2020-05-12T09:01:41.809738579Z stdout F /lib64/libpthread.so.0(+0x84f9)[0x7f82fb5904f9]
2020-05-12T09:01:41.809756341Z stdout F /lib64/libc.so.6(clone+0x3f)[0x7f82fb2c3f2f]
2020-05-12T09:01:41.8097703Z stdout F
2020-05-12T09:01:41.809782737Z stdout F
2020-05-12T09:01:41.809796559Z stdout F ------ INFO OUTPUT ------
2020-05-12T09:01:41.809846874Z stdout F # Proxy
2020-05-12T09:01:41.80988196Z stdout F proxy_version:0.9.102
2020-05-12T09:01:41.809896773Z stdout F proxy_git_sha1:00000000
2020-05-12T09:01:41.809910317Z stdout F proxy_git_dirty:0
2020-05-12T09:01:41.809927546Z stdout F proxy_git_branch:
2020-05-12T09:01:41.809947738Z stdout F os:Linux 4.15.0-99-generic x86_64
2020-05-12T09:01:41.809968636Z stdout F arch_bits:64
2020-05-12T09:01:41.810042283Z stdout F multiplexing_api:epoll
2020-05-12T09:01:41.810063659Z stdout F gcc_version:8.2.1
2020-05-12T09:01:41.810083387Z stdout F process_id:31
2020-05-12T09:01:41.810103715Z stdout F threads:8
2020-05-12T09:01:41.810122553Z stdout F tcp_port:7777
2020-05-12T09:01:41.810140832Z stdout F uptime_in_seconds:3780
2020-05-12T09:01:41.810176828Z stdout F uptime_in_days:0
2020-05-12T09:01:41.810192157Z stdout F config_file:
2020-05-12T09:01:41.810206659Z stdout F acl_user:default
2020-05-12T09:01:41.810219165Z stdout F
2020-05-12T09:01:41.810234292Z stdout F # Memory
2020-05-12T09:01:41.81024989Z stdout F used_memory:8189696
2020-05-12T09:01:41.810267234Z stdout F used_memory_human:7.81M
2020-05-12T09:01:41.810287134Z stdout F total_system_memory:33728958464
2020-05-12T09:01:41.810307505Z stdout F total_system_memory_human:31.41G
2020-05-12T09:01:41.81032145Z stdout F
2020-05-12T09:01:41.810338765Z stdout F # Clients
2020-05-12T09:01:41.810358186Z stdout F connected_clients:1
2020-05-12T09:01:41.810379253Z stdout F max_clients:10000
2020-05-12T09:01:41.810407401Z stdout F thread_0_clinets:1
2020-05-12T09:01:41.81042417Z stdout F thread_1_clinets:0
2020-05-12T09:01:41.810438885Z stdout F thread_2_clinets:0
2020-05-12T09:01:41.810452562Z stdout F thread_3_clinets:0
2020-05-12T09:01:41.810468214Z stdout F thread_4_clinets:0
2020-05-12T09:01:41.810492261Z stdout F thread_5_clinets:0
2020-05-12T09:01:41.810506579Z stdout F thread_6_clinets:0
2020-05-12T09:01:41.810521053Z stdout F thread_7_clinets:0
2020-05-12T09:01:41.810533371Z stdout F
2020-05-12T09:01:41.810547706Z stdout F # Cluster
2020-05-12T09:01:41.810563877Z stdout F address:
2020-05-12T09:01:41.810583278Z stdout F entry_node::0
2020-05-12T09:01:41.810610768Z stdout F
2020-05-12T09:01:41.810623832Z stdout F
2020-05-12T09:01:41.810639479Z stdout F ---- SIZEOF STRUCTS ----
2020-05-12T09:01:41.810653969Z stdout F clientRequest: 184
2020-05-12T09:01:41.810668392Z stdout F client: 224
2020-05-12T09:01:41.810691151Z stdout F redisClusterConnection: 48
2020-05-12T09:01:41.810715281Z stdout F clusterNode: 112
2020-05-12T09:01:41.810731337Z stdout F redisCluster: 104
2020-05-12T09:01:41.810744585Z stdout F list: 48
2020-05-12T09:01:41.81075805Z stdout F listNode: 24
2020-05-12T09:01:41.810780879Z stdout F rax: 24
2020-05-12T09:01:41.810797111Z stdout F raxNode: 4
2020-05-12T09:01:41.810811536Z stdout F raxIterator: 480
2020-05-12T09:01:41.810824488Z stdout F aeEventLoop: 88
2020-05-12T09:01:41.810838539Z stdout F aeFileEvent: 32
2020-05-12T09:01:41.810852474Z stdout F aeTimeEvent: 64
2020-05-12T09:01:41.810867864Z stdout F
2020-05-12T09:01:41.810886717Z stdout F
2020-05-12T09:01:41.810905355Z stdout F ------ REGISTERS ------
2020-05-12T09:01:41.810921757Z stdout F
2020-05-12T09:01:41.810941988Z stdout F RAX:0000000000000025 RBX:0000000000000000
2020-05-12T09:01:41.810960358Z stdout F RCX:0000000000000000 RDX:0000000002545550
2020-05-12T09:01:41.810994563Z stdout F RDI:00007f82f000bad0 RSI:00007f82fb57caa8
2020-05-12T09:01:41.811009414Z stdout F RBP:00007f82f00186ff RSP:00007f82fb1c6a80
2020-05-12T09:01:41.811025552Z stdout F R8 :00000000024e1d80 R9 :00007f82fb1c7700
2020-05-12T09:01:41.811039822Z stdout F R10:6e6972756769666e R11:0000000000000246
2020-05-12T09:01:41.81105407Z stdout F R12:0000000000000006 R13:00007f82f000bad0
2020-05-12T09:01:41.811071411Z stdout F R14:00007f82f0012763 R15:00007f82f0018d60
2020-05-12T09:01:41.811092855Z stdout F RIP:0000000000408c84 EFL:0000000000010206
2020-05-12T09:01:41.811109636Z stdout F CSGSFS:002b000000000033
2020-05-12T09:01:41.81112797Z stdout F (00007f82fb1c6a8f) -> 000000000041eaf8
2020-05-12T09:01:41.811147177Z stdout F (00007f82fb1c6a8e) -> 0000000000000000
2020-05-12T09:01:41.811239222Z stdout F (00007f82fb1c6a8d) -> 00007f8200000000
2020-05-12T09:01:41.811273634Z stdout F (00007f82fb1c6a8c) -> 0000000000000000
2020-05-12T09:01:41.811289104Z stdout F (00007f82fb1c6a8b) -> 00007f82f0015730
2020-05-12T09:01:41.811302989Z stdout F (00007f82fb1c6a8a) -> 00007f82000018eb
2020-05-12T09:01:41.811332337Z stdout F (00007f82fb1c6a89) -> 000000000040cf3a
2020-05-12T09:01:41.811350356Z stdout F (00007f82fb1c6a88) -> 00007f82f00157f0
2020-05-12T09:01:41.811364233Z stdout F (00007f82fb1c6a87) -> 000000000040b2b6
2020-05-12T09:01:41.811378488Z stdout F (00007f82fb1c6a86) -> 00000000024e39e0
2020-05-12T09:01:41.811427156Z stdout F (00007f82fb1c6a85) -> 0000000000408cd9
2020-05-12T09:01:41.811453613Z stdout F (00007f82fb1c6a84) -> 00000000024e39e0
2020-05-12T09:01:41.811468478Z stdout F (00007f82fb1c6a83) -> 0000000000000006
2020-05-12T09:01:41.81148219Z stdout F (00007f82fb1c6a82) -> 0000000000000000
2020-05-12T09:01:41.811496113Z stdout F (00007f82fb1c6a81) -> 00007f82f000bad0
2020-05-12T09:01:41.811509149Z stdout F (00007f82fb1c6a80) -> 00000000024e1d80
2020-05-12T09:01:41.811522359Z stdout F
2020-05-12T09:01:41.811537515Z stdout F
2020-05-12T09:01:41.811555272Z stdout F ------ DUMPING CODE AROUND EIP ------
2020-05-12T09:01:41.811575011Z stdout F Symbol: listEmpty (base: 0x408c60)
2020-05-12T09:01:41.811594552Z stdout F Module: redis-cluster-proxy (base 0x400000)
2020-05-12T09:01:41.811613902Z stdout F $ xxd -r -p /tmp/dump.hex /tmp/dump.bin
2020-05-12T09:01:41.811631723Z stdout F $ objdump --adjust-vma=0x408c60 -D -b binary -m i386:x86-64 /tmp/dump.bin
2020-05-12T09:01:41.811649746Z stdout F ------
2020-05-12T09:01:41.811668888Z stdout F dump of function (hexdump of 164 bytes):
2020-05-12T09:01:41.811701156Z stdout F 41554989fd415455534883ec08488b4728488b1f4885c0742f488d68ff0f1f00498b45184c8b63084885c07406488b7b10ffd04889df4883ed014c89e3e8deae01004883fdff75d849c745080000000049c745000000000049c74528000000004883c4085b5d415c415dc30f1f440000534889fbe887ffffff4889df5be99eae010066662e0f1f8400000000000f1f00554889f5534889fbbf180000004883ec08e86aad
2020-05-12T09:01:41.811753959Z stdout F Function at 0x423b80 is zfree
2020-05-12T09:01:41.811824796Z stdout F Function at 0x408c60 is listEmpty
2020-05-12T09:01:41.811838776Z stdout F
2020-05-12T09:01:41.81185604Z stdout F
2020-05-12T09:01:41.81187299Z stdout F === PROXY BUG REPORT END. Make sure to include from START to END. ===
2020-05-12T09:01:41.811891451Z stdout F
2020-05-12T09:01:41.811912467Z stdout F Please report the crash by opening an issue on github:
2020-05-12T09:01:41.81193608Z stdout F
2020-05-12T09:01:41.811960627Z stdout F https://github.com/artix75/redis-cluster-proxy/issues
2020-05-12T09:01:41.811978477Z stdout F
2020-05-12T09:01:42.291611403Z stderr F /usr/local/bin/start_redis_proxy.sh: line 130: 31 Segmentation fault (core dumped) redis-cluster-proxy

(minor / aesthetic) inline help displays misaligned text for the --port and --bind options

(venv) redis-cluster-proxy$ redis-cluster-proxy --help
Redis Cluster Proxy v999.999.999 (unstable)
Usage: redis-cluster-proxy [OPTIONS] [cluster_host:cluster_port]
  -c <file>            Configuration file
  -p, --port <port>    Port (default: 7777). Use 0 in order to disable                        TCP connections at all
  --max-clients <n>    Max clients (default: 10000)
  --threads <n>        Thread number (default: 8, max: 500)
  --tcpkeepalive       TCP Keep Alive (default: 300)
  --tcp-backlog        TCP Backlog (default: 511)
  --daemonize          Execute the proxy in background
  --unixsocket <sock_file>     UNIX socket path (empty by default)
  --unixsocketperm <mode>      UNIX socket permissions (default: 0)
  --bind <address>     Bind an interface (can be used multiple times                        to bind multiple interfaces)
  --disable-multiplexing <opt> When should multiplexing disabled
                               (never|auto|always) (default: auto)
  --enable-cross-slot  Enable cross-slot queries (warning: cross-slot
                       queries routed to multiple nodes cannot be atomic).
  -a, --auth <passw>   Authentication password
  --auth-user <name>   Authentication username
  --disable-colors     Disable colorized output
  --log-level <level>  Minimum log level: (default: info)
                       (debug|info|success|warning|error)
  --dump-queries       Dump query args (only for log-level 'debug') 
  --dump-buffer        Dump query buffer (only for log-level 'debug') 
  --dump-queues        Dump request queues (only for log-level 'debug') 
  -h, --help         Print this help

There's a bunch of extra space in the --bind and --port entries compared to options such as --log-level and --enable-cross-slot, which are tidier and more compact. This looks like a simple extra-space formatting issue in the usage string.
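A minimal sketch of the likely fix, assuming the help text is emitted with printf-style calls: pad every option name to a fixed column with a field-width specifier instead of hand-counted spaces (formatHelpLine and the width of 28 are arbitrary illustrative choices, not the proxy's code):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Pad the option name to a fixed column so every description starts at
 * the same offset, regardless of the option's length. The column width
 * (28) is an arbitrary illustrative choice. */
static int formatHelpLine(char *buf, size_t len,
                          const char *opt, const char *descr) {
    return snprintf(buf, len, "  %-28s %s", opt, descr);
}
```

With this, --port and --bind line up in the same column as --log-level and friends.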

Proxy does not detect failover if old master dies quickly

I managed to get the proxy to fail consistently with this repro case:

  • Send a CLUSTER FAILOVER command to a slave
  • After the failover succeeds, kill the old master we failed over away from
  • The proxy will continue to hammer the old master's address because it never got a MOVED response (which would trigger a reconfiguration). This state goes on forever.

I haven't tested the "master fails without a successful failover first" case, but it seems likely that it would cause the same behaviour.

A possible solution here might be to trigger a reconfiguration whenever we lose a connection to a master?
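That proposal can be sketched as follows (sketchNode, sketchCluster, and the update_required flag are hypothetical names, not the proxy's internals): treat a dropped master connection like a MOVED reply and schedule a configuration refresh.

```c
#include <assert.h>

/* Hypothetical sketch: losing the connection to a master schedules a
 * cluster-configuration refresh, just as a MOVED reply would. */
typedef struct { int is_master; } sketchNode;
typedef struct { int update_required; } sketchCluster;

static void sketchOnConnectionLost(sketchCluster *c, sketchNode *n) {
    if (n->is_master)
        c->update_required = 1;  /* refetch the slot map before the next query */
}
```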

Getting 'Could not create read handler' errors

My cluster seems OK, no restarts of any nodes in the cluster. I have enabled the 'multiple-endpoint' feature.

[2020-03-24 05:25:21.975/5] Populate connection pool: failed to install write handler for node 172.26.32.220:6379
[2020-03-24 05:25:21.976/5] Could not create read handler: No such file or directory
[2020-03-24 05:25:21.976/5] Populate connection pool: failed to install write handler for node 172.25.145.138:6379
[2020-03-24 05:25:21.976/5] Could not create read handler: No such file or directory
[2020-03-24 05:25:21.976/5] Populate connection pool: failed to install write handler for node 172.27.86.24:6379
[2020-03-24 05:25:21.976/5] Could not create read handler: No such file or directory
[2020-03-24 05:25:21.976/5] Populate connection pool: failed to install write handler for node 172.27.36.226:6379
[2020-03-24 05:25:22.001/5] Could not create read handler: No such file or directory
[2020-03-24 05:25:22.001/5] ERROR: Failed to create read query handler for client 5:762 from 172.26.199.114:57978
[2020-03-24 05:25:54.529/6] Could not create read handler: No such file or directory
[2020-03-24 05:25:54.529/6] ERROR: Failed to create read query handler for client 6:904 from 172.25.212.157:36710

I am at this commit:

commit 6751bf515fcef0a46c273f0199e49794592529ec (origin/unstable, origin/HEAD)
Author: artix <[email protected]>
Date:   Wed Mar 18 16:49:26 2020 +0100

    Use exit code 1 when test fails (Fix #46)

Running multiple instances of the proxy (question)

My thinking is that if one proxy dies, we want the Redis client to be able to talk to another instance, and so we run multiple proxies at the same time (trying to run 4 now).

Is it possible to do load balancing and have multiple redis-cluster-proxy instances running, or is it problematic (I hope it's not!)?

proxy.h:79:5: error: unknown type name '_Atomic'

GCC version is 6.1.0, on the unstable branch.
Running make, the following error occurred:

In file included from cluster.c:31:0:
proxy.h:79:5: error: unknown type name '_Atomic'
_Atomic uint64_t numclients;
^
proxy.h:79:22: error: expected ':', ',', ';', '}' or '__attribute__' before 'numclients'
_Atomic uint64_t numclients;
^
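_Atomic is a C11 keyword, so this error usually means the compiler actually being invoked predates C11 support (GCC gained _Atomic in 4.9) or is forced into an older -std mode. A quick probe, assuming a C11-capable compiler:

```c
/* Compiles only with a C11-capable compiler (e.g. gcc -std=c11):
 * the same kind of _Atomic declaration that proxy.h:79 uses. */
#include <assert.h>
#include <stdint.h>

_Atomic uint64_t numclients;

static uint64_t bumpClients(void) {
    return ++numclients;  /* atomic read-modify-write on a C11 atomic */
}
```

If this probe fails to build, check what `gcc --version` actually resolves to and whether the Makefile forces an older -std flag.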

Crash from `proxy cluster update`

To reproduce:

  1. Start 6 node cluster: 3 master, 3 slave.
  2. Create the cluster using redis-cli --cluster create. Default options.
  3. Start redis-cluster-proxy. Default options.
  4. Connect to proxy via redis-cli.
  5. Run the following series of commands:
  • proxy cluster info
  • proxy cluster update
  • proxy cluster info
  • proxy cluster update

This will crash the proxy, with the following error message:

redis-cluster-proxy(77503,0x700009d67000) malloc: *** error for object 0x7fa03fa0f920: pointer being freed was not allocated
redis-cluster-proxy(77503,0x700009d67000) malloc: *** set a breakpoint in malloc_error_break to debug

Note that the second proxy cluster info command does not return the expected results.

If you start fresh again and run the following in sequence:

  • proxy cluster info
  • proxy cluster update
  • proxy cluster update

You get a different, but seemingly related crash. The first crash does not produce a bug report, but here is the report for the second sequence:

=== PROXY BUG REPORT START: Cut & paste starting from here ===
[2020-06-02 14:18:48.539/0] Redis Cluster Proxy 999.999.999 crashed by signal: 11
[2020-06-02 14:18:48.539/0] Crashed running the instruction at: 0x10476f0b3
[2020-06-02 14:18:48.539/0] Accessing address: 0x0
[2020-06-02 14:18:48.539/0] Handling crash on thread: 0


------ STACK TRACE ------
EIP:
0   redis-cluster-proxy                 0x000000010476f0b3 listRelease + 35

Backtrace:
0   redis-cluster-proxy                 0x0000000104774c22 logStackTrace + 114
1   redis-cluster-proxy                 0x000000010477501f sigsegvHandler + 575
2   libsystem_platform.dylib            0x00007fff6ee215fd _sigtramp + 29
3   ???                                 0x0000000000000000 0x0 + 0
4   redis-cluster-proxy                 0x0000000104772320 resetCluster + 64
5   redis-cluster-proxy                 0x0000000104773a99 updateCluster + 697
6   redis-cluster-proxy                 0x000000010477c81a proxyCommand + 4122
7   redis-cluster-proxy                 0x0000000104780e5d processRequest + 989
8   redis-cluster-proxy                 0x0000000104782c3e readQuery + 494
9   redis-cluster-proxy                 0x00000001047700d8 aeProcessEvents + 728
10  redis-cluster-proxy                 0x000000010477041b aeMain + 43
11  redis-cluster-proxy                 0x00000001047865d4 execProxyThread + 52
12  libsystem_pthread.dylib             0x00007fff6ee2d109 _pthread_start + 148
13  libsystem_pthread.dylib             0x00007fff6ee28b8b thread_start + 15


------ INFO OUTPUT ------
# Proxy
proxy_version:999.999.999
proxy_git_sha1:ac83840d
proxy_git_dirty:0
proxy_git_branch:unstable
os:Darwin 19.4.0 x86_64
arch_bits:64
multiplexing_api:kqueue
gcc_version:4.2.1
process_id:77962
threads:8
tcp_port:7777
uptime_in_seconds:9
uptime_in_days:0
config_file:./proxy.conf
acl_user:default

# Memory
used_memory:9540368
used_memory_human:9.10M
total_system_memory:17179869184
total_system_memory_human:16.00G

# Clients
connected_clients:1
max_clients:10000
thread_0_clinets:1
thread_1_clinets:0
thread_2_clinets:0
thread_3_clinets:0
thread_4_clinets:0
thread_5_clinets:0
thread_6_clinets:0
thread_7_clinets:0

# Cluster
address:
entry_node::0


---- SIZEOF STRUCTS ----
clientRequest: 184
client: 224
redisClusterConnection: 48
clusterNode: 112
redisCluster: 104
list: 48
listNode: 24
rax: 24
raxNode: 4
raxIterator: 480
aeEventLoop: 88
aeFileEvent: 32
aeTimeEvent: 64


------ REGISTERS ------

RAX:0000000000000b03 RBX:3532313366012828
RCX:0000000000000000 RDX:00000000000ecbb0
RDI:00007fee3040f060 RSI:00007fee32900000
RBP:0000700001b72ae0 RSP:0000700001b72ac0
R8 :0000000000000005 R9 :0000000000000001
R10:00007fee32900000 R11:00007fee329043e0
R12:00007fee32913340 R13:0000000000000006
R14:00007fee3040f060 R15:0000000000636432
RIP:000000010476f0b3 EFL:0000000000010202
CS :000000000000002b FS:0000000000000000  GS:0000000000000000
(0000700001b72acf) -> 0000000000000000
(0000700001b72ace) -> 0000000000000000
(0000700001b72acd) -> 0000000000000000
(0000700001b72acc) -> 0000000000000000
(0000700001b72acb) -> 0000000104773a99
(0000700001b72aca) -> 0000700001b72d70
(0000700001b72ac9) -> 00007fee329043e3
(0000700001b72ac8) -> 0000000000000000
(0000700001b72ac7) -> 0000000000000006
(0000700001b72ac6) -> 00007fee32913340
(0000700001b72ac5) -> 0000000104772320
(0000700001b72ac4) -> 0000700001b72b10
(0000700001b72ac3) -> 00007fee30706bf0
(0000700001b72ac2) -> 00007fee30706bf0
(0000700001b72ac1) -> 00007fee32913340
(0000700001b72ac0) -> 0000000000000000


------ DUMPING CODE AROUND EIP ------
Symbol: listRelease (base: 0x10476f090)
Module: /usr/local/bin/redis-cluster-proxy (base 0x10476e000)
$ xxd -r -p /tmp/dump.hex /tmp/dump.bin
$ objdump --adjust-vma=0x10476f090 -D -b binary -m i386:x86-64 /tmp/dump.bin
------
dump of function  (hexdump of 163 bytes):
554889e5415741564154534989fe4c8b7f284d85ff742f498b1e660f1f44000049ffcf4c8b6308498b46184885c07406488b7b10ffd04889dfe8f2ea01004c89e34d85ff75da49c746280000000049c746080000000049c706000000004c89f75b415c415e415f5de9c3ea01000f1f00554889e54156534989f64889fbbf18000000e889e901004885c074234c897010488b4b284885c9741a48c70000000000488b13
Function at 0x10478dbc0 is zfree
Function at 0x10478daa0 is zmalloc


=== PROXY BUG REPORT END. Make sure to include from START to END. ===

Entry node is single point of failure

When the Redis entry node dies, the proxy will break on the next updateCluster invocation because it cannot talk to its entry node anymore. This causes cluster->broken to be set, which makes every new request be answered with an error. That disturbs consumer traffic when it isn't actually needed: the cluster can still be fully healthy if the entry node happened to be a slave, or a master from which all slots had been migrated away.

I'd like to propose that, on a failed configuration fetch from its entry node, the proxy try all (or a subset of?) the other nodes it was aware of. This can take a non-trivial amount of time, but it seems preferable to fully bricking the proxy.

A possible implementation might be to save the ip+ports before resetting the cluster and try them one by one until a valid configuration is fetched, here: https://github.com/artix75/redis-cluster-proxy/blob/unstable/src/cluster.c#L827

Happy to send a PR but would like to get feedback on the approach first.

Note: This might be similar to #8; maybe there's a solution that solves both cases?
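The proposed fallback could look roughly like this (sketchFetchFromAnyNode and demoFetch are hypothetical names; the real code would fetch the configuration per address, e.g. via CLUSTER SLOTS): iterate over the saved addresses and stop at the first node that yields a valid configuration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch of the fallback: try each previously known node
 * address until one returns a valid configuration; only give up (and
 * mark the cluster broken) after all of them fail. */
typedef int (*fetchConfigFn)(const char *addr);  /* returns 1 on success */

static const char *sketchFetchFromAnyNode(const char **addrs, size_t n,
                                          fetchConfigFn fetch) {
    for (size_t i = 0; i < n; i++)
        if (fetch(addrs[i])) return addrs[i];  /* first reachable node wins */
    return NULL;  /* all nodes failed: only now is the cluster broken */
}

/* Demo stub standing in for a real config fetch: only addresses
 * starting with 'B' respond. */
static int demoFetch(const char *addr) { return addr[0] == 'B'; }
```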

Proxy returns MOVED responses

I have noticed that the proxy appears to return MOVED responses, therefore not hiding slot movements properly. Looking at the code, this appears to be confined to transactions currently (https://github.com/artix75/redis-cluster-proxy/blob/unstable/src/proxy.c#L4225) but I may be missing other parts of it.

In our case, this led to our 'smart' client switching to cluster mode and trying to bypass the proxy. Solutions I can see:

  1. Retry transactions(?)
  2. Rewrite error messages so they don't follow the standard MOVED format, i.e. replace them with something like cancelled due to cluster topology change
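Option 2 could be as small as rewriting redirection errors before they reach the client (sketchRewriteError and the replacement wording are illustrative, not the proxy's actual API):

```c
#include <assert.h>
#include <string.h>

/* Replace leaking redirection errors with a neutral message so a
 * 'smart' client never sees a cluster-style MOVED/ASK reply through
 * the proxy. The replacement wording is illustrative. */
static const char *sketchRewriteError(const char *err) {
    if (strncmp(err, "MOVED ", 6) == 0 || strncmp(err, "ASK ", 4) == 0)
        return "TRYAGAIN command cancelled due to cluster topology change";
    return err;
}
```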

Q: Works for sentinel?

Hello,

Firstly, congrats on Redis 6 release👍

Today, my org is using Sentinel, and I was wondering whether this proxy works for Sentinel as well, or whether such a proxy exists?

The reason we use Sentinel is that our use case is HA/failover only; we have no need for horizontal sharding at this time.

Thoughts on a minimal "info" command

Some clients use the info command for discovery purposes, looking for things like the redis server version (to select strategies such as del vs unlink, or other changes between versions).

What is the chance of a minimal info [section] command, even if it only had something like:

# Server
redis_version:5.0.4
redis_mode:proxy

? This would allow such clients to continue to make those decisions appropriately based on the underlying Redis server (not the proxy) version. The reported version could be the minimum over the cluster, or it could just be the version from the primary node used in the configuration.

(side note: is there any reason time can't be implemented using the time at the proxy?)
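A minimal sketch of such a reply (sketchInfoServer is a hypothetical helper; the version string is assumed to come from the entry node or the cluster minimum, as suggested above):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build the minimal "# Server" section proposed above; redis_version
 * would come from the entry node (or the minimum across the cluster),
 * and redis_mode:proxy identifies the proxy itself. */
static int sketchInfoServer(char *buf, size_t len, const char *redis_version) {
    return snprintf(buf, len,
                    "# Server\r\nredis_version:%s\r\nredis_mode:proxy\r\n",
                    redis_version);
}
```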

Crash when using BLPOP and MGET

Version

unstable branch
commit 6751bf5

Steps to Reproduce

> redis-cli -p 7777
127.0.0.1:7777> PROXY CONFIG SET enable-cross-slot 1
OK
127.0.0.1:7777> blpop a b 0
(error) ERR Cross-slot queries are not supported for this command
127.0.0.1:7777> mget a b
^C
Waiting 5 seconds and then typing Ctrl-C to stop redis-cli can sometimes trigger a proxy crash.

Crash Log:

=== PROXY BUG REPORT START: Cut & paste starting from here ===
[2020-03-24 15:16:55.273/0] Redis Cluster Proxy 999.999.999 crashed by signal: 11
[2020-03-24 15:16:55.273/0] Crashed running the instruction at: 0x10fbd05c5
[2020-03-24 15:16:55.273/0] Accessing address: 0x0
[2020-03-24 15:16:55.273/0] Handling crash on thread: 0


------ STACK TRACE ------
EIP:
0   redis-cluster-proxy                 0x000000010fbd05c5 listNext + 21

Backtrace:
0   redis-cluster-proxy                 0x000000010fbd5fc2 logStackTrace + 114
1   redis-cluster-proxy                 0x000000010fbd63c1 sigsegvHandler + 577
2   libsystem_platform.dylib            0x00007fff74483b5d _sigtramp + 29
3   ???                                 0x000000000000ffff 0x0 + 65535
4   redis-cluster-proxy                 0x000000010fbe1618 freeClient + 472
5   redis-cluster-proxy                 0x000000010fbd1328 aeProcessEvents + 744
6   redis-cluster-proxy                 0x000000010fbd166b aeMain + 43
7   redis-cluster-proxy                 0x000000010fbe7914 execProxyThread + 52
8   libsystem_pthread.dylib             0x00007fff7448c2eb _pthread_body + 126
9   libsystem_pthread.dylib             0x00007fff7448f249 _pthread_start + 66
10  libsystem_pthread.dylib             0x00007fff7448b40d thread_start + 13


------ INFO OUTPUT ------
# Proxy
proxy_version:999.999.999
proxy_git_sha1:6751bf51
proxy_git_dirty:0
proxy_git_branch:unstable
os:Darwin 18.7.0 x86_64
arch_bits:64
multiplexing_api:kqueue
gcc_version:4.2.1
process_id:98168
threads:1
tcp_port:7777
uptime_in_seconds:10
uptime_in_days:0
config_file:
acl_user:default

# Memory
used_memory:899664
used_memory_human:878.58K
total_system_memory:8589934592
total_system_memory_human:8.00G

# Clients
connected_clients:0
max_clients:10000

# Cluster
address:
entry_node::0


---- SIZEOF STRUCTS ----
clientRequest: 184
client: 224
redisClusterConnection: 48
clusterNode: 112
redisCluster: 104
list: 48
listNode: 24
rax: 24
raxNode: 4
raxIterator: 480
aeEventLoop: 88
aeFileEvent: 32
aeTimeEvent: 64


------ REGISTERS ------

RAX:d000000000000000 RBX:00007fa5f0f01d20
RCX:0000000000000001 RDX:00000000000f66d0
RDI:000070000280fe20 RSI:00007fa5f0c00000
RBP:000070000280fe10 RSP:000070000280fe10
R8 :0000000000000003 R9 :0000000000000000
R10:0000000000000004 R11:0000000000000004
R12:00007fa5f0d00e60 R13:00007fa5f0f00c6b
R14:0000000000000001 R15:000070000280fe20
RIP:000000010fbd05c5 EFL:0000000000010246
CS :000000000000002b FS:0000000000000000  GS:0000000000000000
(000070000280fe1f) -> 0000000b00000001
(000070000280fe1e) -> 0000000000000014
(000070000280fe1d) -> 0000000000000001
(000070000280fe1c) -> 00007fa5f0f00270
(000070000280fe1b) -> 000000010fbd1328
(000070000280fe1a) -> 000070000280fed0
(000070000280fe19) -> 0000000000000000
(000070000280fe18) -> 0000000000000001
(000070000280fe17) -> 00007fa5f0f00270
(000070000280fe16) -> 0000000000000001
(000070000280fe15) -> 000000010fc4c000
(000070000280fe14) -> 00007fa5f0f00270
(000070000280fe13) -> 0000000000000000
(000070000280fe12) -> d000000000000000
(000070000280fe11) -> 000000010fbe1618
(000070000280fe10) -> 000070000280fe60


------ DUMPING CODE AROUND EIP ------
Symbol: listNext (base: 0x10fbd05b0)
Module: /xxxxxx/redis-cluster-proxy/./src/redis-cluster-proxy (base 0x10fbcf000)
$ xxd -r -p /tmp/dump.hex /tmp/dump.bin
$ objdump --adjust-vma=0x10fbd05b0 -D -b binary -m i386:x86-64 /tmp/dump.bin
------
dump of function  (hexdump of 149 bytes):
554889e5488b074885c0741031c9837f08000f94c1488b0cc848890f5dc36690554889e5415741564154534889fbbf30000000e838ea01004885c00f84840100004989c648c740280000000048c740200000000048c740180000000048c740100000000048c740080000000048c70000000000f30f6f4310f30f7f4010488b4320498946204c8b3b4d85ff0f843701000066480f7e
Function at 0x10fbef020 is zmalloc


=== PROXY BUG REPORT END. Make sure to include from START to END. ===

       Please report the crash by opening an issue on github:

           https://github.com/artix75/redis-cluster-proxy/issues

[1]    98168 segmentation fault  ./src/redis-cluster-proxy --threads 1 127.0.0.1:8899

Will proxy.conf support the requirepass parameter?

I have tried setting the auth parameter, and it works.
But I don't want anyone to use my Redis Cluster through redis-cluster-proxy without a password.
Will the requirepass parameter be supported in proxy.conf, so that I can set a password for redis-cluster-proxy itself?

ubuntu@km2:~$ redis-cli -h km1 -p 7777
km1:7777> set aa bb
OK
km1:7777> keys aa
1) "aa"
km1:7777> get aa
"bb"

Thoughts on supporting MONITOR command

That command is super useful as a low-key way to figure out what's going on in Redis. I wonder how hard it would be to support.

Maybe it could take a parameter to monitor a specific node, but without params it would just list every command received by the proxy. I'm not sure how to do that in a scalable way, as synchronization will certainly be required.

Add dockerfiles ?

Here's a Dockerfile to build this project. I don't know if you'd be interested in using it, or if you're familiar with these things; I find Docker very convenient.

I noticed that the Redis source code doesn't have one, so if you want to stick to Redis practices that probably makes sense; just sharing these bits in case they're helpful. They could go in a 'contrib' folder too, or go nowhere and stay on my branch :)

FROM alpine:3.11 as build

RUN apk add --no-cache gcc musl-dev linux-headers openssl-dev make

RUN addgroup -S app && adduser -S -G app app 
RUN chown -R app:app /opt
RUN chown -R app:app /usr/local

# There is a bug in CMake where we cannot build from the root top folder
# So we build from /opt
COPY --chown=app:app . /opt
WORKDIR /opt

USER app
RUN [ "make", "install" ]

FROM alpine:3.11 as runtime

RUN apk add --no-cache libstdc++
RUN apk add --no-cache strace
RUN apk add --no-cache python3
RUN apk add --no-cache redis

RUN addgroup -S app && adduser -S -G app app 
COPY --chown=app:app --from=build /usr/local/bin/redis-cluster-proxy /usr/local/bin/redis-cluster-proxy
RUN chmod +x /usr/local/bin/redis-cluster-proxy
RUN ldd /usr/local/bin/redis-cluster-proxy

# Copy source code for gcc
COPY --chown=app:app --from=build /opt /opt

# Now run in usermode
USER app
WORKDIR /home/app

ENTRYPOINT ["/usr/local/bin/redis-cluster-proxy"]
EXPOSE 7777
CMD ["redis-cluster-proxy"]

Separate makefile I use. It could be simplified and merged into the existing one.

.PHONY: docker

NAME   := ${DOCKER_REPO}/redis-cluster-proxy
TAG    := $(shell cat src/version.h | cut -d ' ' -f 3 | tr -d \")
IMG    := ${NAME}:${TAG}
LATEST := ${NAME}:latest
PROD   := ${NAME}:production
BUILD  := ${NAME}:build

docker_test:
	docker build -t ${BUILD} .

docker:
	git clean -dfx
	docker build -t ${IMG} .
	docker tag ${IMG} ${BUILD}

docker_push:
	docker tag ${IMG} ${PROD}
	docker push ${PROD}
	oc import-image redis-proxy:production # that thing is to trigger a deploy with openshift.

redis-cluster-proxy fails to AUTH when reopening a connection to the cluster

When redis-cluster-proxy (r-c-p for short) is connected to the cluster using authentication, it fails to AUTH to the cluster when the connections get recycled.

Case one (easy to reproduce): r-c-p uses AUTH, clients do not. Connect r-c-p to a simple 3-instance Redis cluster, all masters. When opening the initial connections to the cluster, r-c-p issues AUTH and everything works. Now restart ("service redis restart" or similar) any of the 3 Redis masters. R-c-p will properly reconnect to the cluster, but it will not issue AUTH, and all commands coming from r-c-p's clients will be unauthenticated and thus fail if ACLs are in use.

Case two (tedious to reproduce): r-c-p uses AUTH, clients do not. Connect r-c-p like before and simply wait: connections will eventually get recycled. When they do, r-c-p will not issue AUTH on the new connections, again failing its clients.

A similar but far worse bug exists when clients use AUTH themselves. I will open a separate bug report for that as it is probably going to be much more difficult to fix.

Restarting r-c-p fixes the issue, but obviously that approach has no chance of surviving in production.

On the network level, the problem reproduces both when r-c-p receives a FIN packet from the Redis server (after which they perform an orderly connection shutdown) and when it receives an RST packet.

Redis-cluster-proxy used in this case is built from git commit ac83840 on Ubuntu-20.04.
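The shape of the fix is probably to make AUTH part of the (re)connect path rather than a one-time action. A hypothetical model (sketchConn, sendAuthCommand, and the handlers are made-up names, not the proxy's code):

```c
#include <assert.h>
#include <stddef.h>

/* Model of the fix: authentication state lives and dies with the
 * socket, and every (re)connect replays AUTH before any client
 * command is forwarded. All names are hypothetical. */
typedef struct {
    const char *auth;   /* cluster password, or NULL if none */
    int connected;
    int authenticated;
} sketchConn;

static int sendAuthCommand(sketchConn *c) { (void)c; return 1; } /* stub: AUTH succeeded */

static void sketchHandleConnect(sketchConn *c) {
    c->connected = 1;
    c->authenticated = c->auth ? sendAuthCommand(c) : 1;
}

static void sketchHandleDisconnect(sketchConn *c) {
    c->connected = 0;
    c->authenticated = 0;  /* credentials do not survive the socket */
}
```

In this model both reported cases (redis restart, connection recycling) go through sketchHandleDisconnect followed by sketchHandleConnect, so AUTH is always replayed.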

Getting error 'Could not create read handler: No such file or directory'

I am trying to put the proxy in front of a bunch of traffic and I'm getting these errors:

[2020-01-27 22:21:15.736] Could not create read handler: No such file or directory
[2020-01-27 22:21:15.736] Failed to create write handler for request
[2020-01-27 22:21:15.738] Could not create read handler: No such file or directory
[2020-01-27 22:21:15.738] Failed to create write handler for request
[2020-01-27 22:21:15.739] Could not create read handler: No such file or directory

I am using the connection_pool branch, and I have merged all commits from the unstable branch as of Mon Jan 27 14:24:09 PST 2020. I use the default config, no special args.

Redis Module supported?

Hello,
I love the idea of your proxy, thanks!
Are Redis modules supported (e.g. RediSearch)?

Crash 12 threads with 9 masters and 9 replicas

=== PROXY BUG REPORT START: Cut & paste starting from here ===
[2020-02-03 08:37:36.040] Redis Cluster Proxy 999.999.999 crashed by signal: 11
[2020-02-03 08:37:36.040] Crashed running the instruction at: 0x55ad33061676
[2020-02-03 08:37:36.040] Accessing address: 0x7fee8f2115f0
[2020-02-03 08:37:36.040] Handling crash on thread: 2

------ STACK TRACE ------
EIP:
/usr/local/bin/redis-cluster-proxy(aeProcessEvents+0x156)[0x55ad33061676]

Backtrace:
/usr/local/bin/redis-cluster-proxy(logStackTrace+0x44)[0x55ad33065af4]
/usr/local/bin/redis-cluster-proxy(sigsegvHandler+0xed)[0x55ad3306619d]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x12890)[0x7fee8e868890]
/usr/local/bin/redis-cluster-proxy(aeProcessEvents+0x156)[0x55ad33061676]
/usr/local/bin/redis-cluster-proxy(aeMain+0x2b)[0x55ad33061a5b]
/usr/local/bin/redis-cluster-proxy(+0x12b1d)[0x55ad33069b1d]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76db)[0x7fee8e85d6db]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x3f)[0x7fee8e58688f]

------ INFO OUTPUT ------

Proxy

proxy_version:999.999.999
proxy_git_sha1:f8dc227a
proxy_git_dirty:0
os:Linux 4.15.0-34-generic x86_64
arch_bits:64
multiplexing_api:epoll
gcc_version:7.4.0
process_id:15625
threads:12
tcp_port:6363
uptime_in_seconds:13
uptime_in_days:0
config_file:
acl_user:default

Memory

used_memory:169477384
used_memory_human:161.63M
total_system_memory:67469664256
total_system_memory_human:62.84G

Clients

connected_clients:4059

Cluster

address:94.130.70.112
entry_node:94.130.70.112:6379

------ REGISTERS ------

RAX:0000000000000000 RBX:0000000000000001
RCX:000055ad33079c30 RDX:000055ad33085f60
RDI:000055ad33085f4c RSI:0000000000000008
RBP:00007fee8f2115f0 RSP:00007fee8d461e70
R8 :000055ad3472c603 R9 :000000000000aeb0
R10:00007fee8d460d8c R11:0000000000000246
R12:000055ad33b916d0 R13:0000000000000000
R14:0000000000000001 R15:000000000000002f
RIP:000055ad33061676 EFL:0000000000010246
CSGSFS:002b000000000033
(00007fee8d461e7f) -> 000055ad33069b1d
(00007fee8d461e7e) -> 000055ad33a4beb0
(00007fee8d461e7d) -> 000055ad33061a5b
(00007fee8d461e7c) -> 00007ffc995a9e40
(00007fee8d461e7b) -> 000055ad33a4beb0
(00007fee8d461e7a) -> 0000000000000000
(00007fee8d461e79) -> 00007fee8d461fc0
(00007fee8d461e78) -> 0000000000000000
(00007fee8d461e77) -> 000055ad33b916d0
(00007fee8d461e76) -> 0000000000000000
(00007fee8d461e75) -> a2087e4bebd11200
(00007fee8d461e74) -> 0000000000000000
(00007fee8d461e73) -> 000055ad33071a9e
(00007fee8d461e72) -> 00007ffc995a9e40
(00007fee8d461e71) -> 000055ad0000000b
(00007fee8d461e70) -> 000055ad33b916d0

------ FAST MEMORY TEST ------
*** Preparing to test memory region 55ad33290000 (4096 bytes)
*** Preparing to test memory region 55ad337c3000 (22859776 bytes)
*** Preparing to test memory region 7fee50000000 (12791808 bytes)
*** Preparing to test memory region 7fee58000000 (12701696 bytes)
*** Preparing to test memory region 7fee60000000 (12677120 bytes)
*** Preparing to test memory region 7fee64000000 (12787712 bytes)
*** Preparing to test memory region 7fee68000000 (12652544 bytes)
*** Preparing to test memory region 7fee6c000000 (12603392 bytes)
*** Preparing to test memory region 7fee70000000 (12726272 bytes)
*** Preparing to test memory region 7fee74000000 (12722176 bytes)
*** Preparing to test memory region 7fee78000000 (12648448 bytes)
*** Preparing to test memory region 7fee7c000000 (12709888 bytes)
*** Preparing to test memory region 7fee80000000 (12754944 bytes)
*** Preparing to test memory region 7fee84000000 (12640256 bytes)
*** Preparing to test memory region 7fee8845a000 (8388608 bytes)
*** Preparing to test memory region 7fee88c5b000 (8388608 bytes)
*** Preparing to test memory region 7fee8945c000 (8388608 bytes)
*** Preparing to test memory region 7fee89c5d000 (8388608 bytes)
*** Preparing to test memory region 7fee8a45e000 (8388608 bytes)
*** Preparing to test memory region 7fee8ac5f000 (8388608 bytes)
*** Preparing to test memory region 7fee8b460000 (8388608 bytes)
*** Preparing to test memory region 7fee8bc61000 (8388608 bytes)
*** Preparing to test memory region 7fee8c462000 (8388608 bytes)
*** Preparing to test memory region 7fee8cc63000 (8388608 bytes)
*** Preparing to test memory region 7fee8d464000 (8388608 bytes)
*** Preparing to test memory region 7fee8dc65000 (8388608 bytes)
*** Preparing to test memory region 7fee8e852000 (16384 bytes)
*** Preparing to test memory region 7fee8ea71000 (16384 bytes)
*** Preparing to test memory region 7fee8f1cc000 (143360 bytes)
*** Preparing to test memory region 7fee8f1ef000 (139264 bytes)
*** Preparing to test memory region 7fee8f233000 (16384 bytes)
*** Preparing to test memory region 7fee8f240000 (4096 bytes)
.O.Segmentation fault (core dumped)

Comparison with Envoy

Envoy also supports Redis cluster proxying (currently experimental), which IMHO might be a more elegant solution because of its lower latency, simpler server deployment, support for non-smart clients (such as hiredis), and better observability.

So I'm wondering whether we should use redis-cluster-proxy, switch to Envoy, or use a non-official client like hiredis-vip (our app is written in C).

When secondary nodes are down at startup, the proxy cannot start

When secondary nodes (not the first node) are down at startup, redis-cluster-proxy cannot start.
Is there any way to start the proxy anyway?

[2020-02-26 00:39:15.620/M] Redis Cluster Proxy v999.999.999 (unstable)
[2020-02-26 00:39:15.620/M] Commit: (4490fec3/0)
[2020-02-26 00:39:15.620/M] Git Branch: unstable
[2020-02-26 00:39:15.620/M] Cluster Address: redis-0-0:50080
[2020-02-26 00:39:15.620/M] PID: 1
[2020-02-26 00:39:15.620/M] OS: Linux 3.10.0-862.14.4.el7.x86_64 x86_64
[2020-02-26 00:39:15.620/M] Bits: 64
[2020-02-26 00:39:15.620/M] Log level: info
[2020-02-26 00:39:15.620/M] Listening on *:6379
[2020-02-26 00:39:15.620/M] Starting 8 threads...
[2020-02-26 00:39:15.621/M] Fetching cluster configuration...
Could not connect to Redis at 10.42.1.142:50081: No route to host
[2020-02-26 00:39:17.883/M] ERROR: Failed to fetch cluster configuration!
[2020-02-26 00:39:17.883/M] FATAL: failed to create thread 0.
In cluster.c, fetchClusterConfiguration():

while ((ln = listNext(&li))) {
    clusterNode *friend = ln->value;
    success = clusterNodeLoadInfo(cluster, friend, NULL, NULL); /* I guess the error occurs here */
    if (!success) {
        /* A single unreachable friend node aborts the whole configuration
         * fetch: cleanup frees everything and the proxy fails to start. */
        listDelNode(friends, ln);
        freeClusterNode(friend);
        goto cleanup;
    }
    clusterAddNode(cluster, friend);
}

Bug in the private cluster connection

connection-pool-size 1
port 8888
When I don't use a private connection:

(screenshot: connection list)

The first is the slave connection, the second is an ordinary connection, the third is the private connection pool, and the last is the local connection.

When I run:

127.0.0.1:8888> multi
OK
127.0.0.1:8888> set b 1
QUEUED
127.0.0.1:8888> exec
1) OK
client list
(screenshot: client list output)
and when I run it again:
(screenshot: client list output after the second run)
During this time, the connection to the proxy is not broken.

I found that the function disableMultiplexingForClient() has no check to determine whether the current connection is already a private connection.

Question about redis cluster proxy roadmap

hey, @artix75

Would you mind publishing the roadmap of this project? And are feature or bug-fix PRs welcome now? I would be happy to implement some of the TODO features, like query redirection, if it's OK to open a PR.

Thanks~

redis-cluster-proxy does not signal to clients AUTH-ing themselves that it lost the connection to the redis cluster

Probably caused by the same code as bug #71; however, I'm opening a separate bug report as this will probably be much more difficult to fix.

Basically, clients of redis-cluster-proxy (r-c-p for short) that need to track the state of the connection to the redis server do not get that information, leading to breakage.

For example: a client uses AUTH and r-c-p gives it a dedicated set of connections. But if those connections fail midway (for instance, if the server gets restarted; see bug report #71 for ways to reproduce this), the client does not get that information and is unable to correct its behaviour. That is, the TCP connection between the client and r-c-p shows no activity, while the one between r-c-p and Redis is clearly being torn down. On subsequent commands, r-c-p reopens (!) the connection to the Redis server, but without AUTH, and the client gets "-NOAUTH Authorization required"; not all clients are built to handle repeatedly re-authorizing themselves to the server.

Notice that, from the client's point of view, it performed a successful authorization, issued some commands that were executed successfully, and then out of the blue it starts getting NOAUTH in response to its commands. This is very tricky to handle in code. The rational conclusion from the perspective of the client code, assuming Redis 6, is that a human administrator changed the ACLs in the middle of the session. As a computer program (especially a library) is unable to cope with such events, the sanest thing to do is fail with loud errors, which is what clients generally seem to do. There is no rational explanation for why this would happen in Redis 5 and earlier if the client were talking directly to Redis.

Question about idempotency

I have read the source code of redis-cluster-proxy, and as far as I can tell, there may be situations where idempotency is not guaranteed.
The proxy uses a thread pool (default 8) to receive and dispatch commands, and since the thread is chosen round-robin, commands may not be executed in exactly the same order as the client sent them, due to thread connection errors or the like.

make test exits with 0 even when tests fail

When redis-cli was not present on the machine during testing, the ruby script reported 21 exception(s) occurred, yet make test still exited with code 0. That makes any CI/CD integration useless, as there is no way to find out that something went wrong.

Question: Support for multiple Redis upstreams?

It appears that a single redis-cluster-proxy instance can only connect to a single Redis upstream (a single cluster).

Is this project open to supporting multiple Redis upstreams in the future?

Proxy should not intercept commands during MULTI / EXEC

Examples include PING and PROXY *:

expected (how redis-server behaves):

MULTI
OK
PING
QUEUED
EXEC
1) PONG

actual:

MULTI
OK
PING
PONG
EXEC
(empty list or set)

similarly, this seems wrong:

MULTI
OK
PROXY MULTIPLEXING STATUS
off
EXEC
(empty list or set)

I would expect:

MULTI
OK
PROXY MULTIPLEXING STATUS
QUEUED
EXEC
1) off

Ability to bind to 0.0.0.0 instead of localhost

So I'm trying to set up redis-cluster-proxy in OpenShift, but to make the server reachable from a different machine I had to hard-code 0.0.0.0 in anetTcp6Server and anetTcpServer.

-------------------------------- src/proxy.c ---------------------------------
index b2df63e..cbcd6f7 100644
@@ -2174,14 +2174,14 @@ void onClusterNodeDisconnection(clusterNode *node) {
 static int listen(void) {
     int fd_idx = 0;
     /* Try to use both IPv6 and IPv4 */
-    proxy.fds[fd_idx] = anetTcp6Server(proxy.neterr, config.port, NULL,
+    proxy.fds[fd_idx] = anetTcp6Server(proxy.neterr, config.port, "0.0.0.0",
                                        proxy.tcp_backlog);
     if (proxy.fds[fd_idx] != ANET_ERR)
         anetNonBlock(NULL, proxy.fds[fd_idx++]);
     else if (errno == EAFNOSUPPORT)
         proxyLogWarn("Not listening to IPv6: unsupported\n");
 
-    proxy.fds[fd_idx] = anetTcpServer(proxy.neterr, config.port, NULL,
+    proxy.fds[fd_idx] = anetTcpServer(proxy.neterr, config.port, "0.0.0.0",
                                       proxy.tcp_backlog);
     if (proxy.fds[fd_idx] != ANET_ERR)
         anetNonBlock(NULL, proxy.fds[fd_idx++]);

Shouldn't this be a config option? I didn't see this in regular Redis, so now I'm all confused. All the servers I have run on a different box usually have an option to bind to a different address (usually 0.0.0.0).

PROXY commands are not recognized

> echo "PROXY SHUTDOWN" | nc 10.126.3.84 26288 -ERR Unsupported subcommand 'SHUTDOWN' for command PROXY

> echo "PROXY CLUSTER UPDATE" | nc 10.126.3.84 26288 -ERR Unsupported subcommand 'CLUSTER' for command PROXY

> echo "PROXY CLUSTER INFO" | nc 10.126.3.84 26288 -ERR Unsupported subcommand 'CLUSTER' for command PROXY

The one that I found which works:

echo "PROXY INFO" | nc 10.126.3.84 26288
$308
#Proxy
proxy_version:0.0.1
os:Linux 5.3.0-26-generic x86_64
gcc_version:7.4.0
process_id:1
threads:1
tcp_port:26288
uptime_in_seconds:18006
uptime_in_days:0
config_file:/usr/local/etc/redis/proxy.conf

#Clients
connected_clients:1

#Cluster
address:10.128.29.145
entry_node:10.128.29.145:20477

The README describes a family of commands under the PROXY top-level, not all of which appear to work with the latest unstable build. https://github.com/artix75/redis-cluster-proxy#the-proxy-command

Option to select read preference?

It would be great if we could select read preference, for example:

  • read only from replicas
  • read from replicas, but fallback to master if not found (lag mitigation?)
  • read only from master (current default?)

Or something similar?
Thanks!

Improper handling of ASK retries

Currently MOVED and ASK retries are handled exactly the same way, by attempting to update the hash slot map by fetching the config from a node. However, as per the spec (https://redis.io/topics/cluster-spec#cluster-live-reconfiguration), this is only correct for a MOVED response; an ASK response should instead be followed by redirecting the query to the address indicated in the response, without attempting to update the local hash slot table.

Currently, the proxy will enter an endless loop when receiving an ASK response, because it will attempt to fetch the config, see that the config hasn't changed (because the hash slot move isn't completed yet), receive another ASK response, and so on.

I've got a test suite uncovering this problem and a super hacky fix in #29. Feel free to take from that whatever you think makes sense; I'm happy to just send a PR with the tests or something like that. Filed this issue to track the behavior.

ERR Failed to write to cluster

When I shut down one master and the original slave is promoted to be the new master, the proxy doesn't notice the change and fails to write to the cluster.

Error from server: ERR Cluster node disconnected: .XX.XX.XX.XX:6379
Error from server: ERR Failed to write to cluster

Cluster auth fails via config file, works fine on CLI

Disclaimer - I might be missing something obvious so apologies in advance if I am.

I am experiencing authentication errors when specifying a node password in a configuration file, as opposed to passing the same password on the CLI.

Code for redis-cluster-proxy was compiled this morning (Feb 13) off of unstable. All tests passed.

Working params via shell

/data/redis/bin/redis-cluster-proxy --auth eatme 127.0.0.1:6381
[2020-02-13 17:27:51.631] Redis Cluster Proxy v999.999.999 (unstable)
[2020-02-13 17:27:51.631] Commit: (06c0f5ed/0)
[2020-02-13 17:27:51.631] Git Branch: unstable
[2020-02-13 17:27:51.631] Cluster Address: 127.0.0.1:6381
[2020-02-13 17:27:51.631] PID: 952
[2020-02-13 17:27:51.631] OS: Linux 3.10.0-1062.9.1.el7.x86_64 x86_64
[2020-02-13 17:27:51.631] Bits: 64
[2020-02-13 17:27:51.631] Log level: info
[2020-02-13 17:27:51.631] The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
[2020-02-13 17:27:51.631] Listening on *:7777
[2020-02-13 17:27:51.633] Starting 8 threads...
[2020-02-13 17:27:51.633] Fetching cluster configuration...
[2020-02-13 17:27:51.639] Cluster has 3 masters and 6 replica(s)
[2020-02-13 17:27:51.693] All thread(s) started!

Failure via config file

cat /data/redis/proxy.conf
auth eatme
cluster 127.0.0.1:6381

/data/redis/bin/redis-cluster-proxy -c /data/redis/proxy.conf
[2020-02-13 17:30:24.646] Redis Cluster Proxy v999.999.999 (unstable)
[2020-02-13 17:30:24.646] Commit: (06c0f5ed/0)
[2020-02-13 17:30:24.646] Git Branch: unstable
[2020-02-13 17:30:24.646] Cluster Address: 127.0.0.1:6381
[2020-02-13 17:30:24.646] PID: 2760
[2020-02-13 17:30:24.646] OS: Linux 3.10.0-1062.9.1.el7.x86_64 x86_64
[2020-02-13 17:30:24.646] Bits: 64
[2020-02-13 17:30:24.646] Log level: info
[2020-02-13 17:30:24.646] The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
[2020-02-13 17:30:24.646] Listening on *:7777
[2020-02-13 17:30:24.647] Starting 8 threads...
[2020-02-13 17:30:24.647] Fetching cluster configuration...
Failed to authenticate to node 127.0.0.1:6381: ERR invalid password
Failed to retrieve cluster configuration.
Cluster node 127.0.0.1:6381 replied with error:
NOAUTH Authentication required.
[2020-02-13 17:30:24.648] ERROR: Failed to fetch cluster configuration!
[2020-02-13 17:30:24.648] FATAL: failed to create thread 0.

System is RHEL7 with devtoolset-9 enabled (for GCC > 4.9).

Please let me know if I can provide further information or if I've missed something idiotic!
