
redis-cluster-proxy's People

Contributors

artix75, git-hulk, huangzhenliang, itamarhaber, oranagra, shooterit


redis-cluster-proxy's Issues

Will proxy.conf support the requirepass parameter?

I have tried setting the auth parameter, and it works.
But I don't want anyone to use my Redis cluster through redis-cluster-proxy without a password.
Will proxy.conf support a requirepass parameter so that I can set a password for redis-cluster-proxy itself?
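For context, a minimal proxy.conf sketch: the auth directive is what the reporter already uses (it also appears in the config-file auth issue further down this page), while the client-facing password shown commented out is exactly what this issue asks for and is hypothetical, not an existing option. The redis-cli session below shows that, today, any client can connect without a password:

# Authenticates the proxy to the cluster nodes (existing option)
auth cluster-password
cluster 127.0.0.1:6381

# Password that clients would have to send to the proxy itself
# (hypothetical option requested by this issue, not currently supported)
# requirepass proxy-client-password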

ubuntu@km2:~$ redis-cli -h km1 -p 7777
km1:7777> set aa bb
OK
km1:7777> keys aa
1) "aa"
km1:7777> get aa
"bb"

Question about idempotency

I have read the source code of redis-cluster-proxy, and as far as I can tell, there may be situations in which idempotency is not guaranteed.
The proxy uses a thread pool (default 8) to receive and dispatch commands, and since the thread is chosen by round robin, commands may not be executed in exactly the same order the client sent them, due to a thread connection error or something similar.

proxy.h:79:5: err:unknow type ‘_Atomic’

GCC version is 6.1.0, on unstable branch.
Running make, the following error occurred:

In file included from cluster.c:31:0:
proxy.h:79:5: err:unknow type ‘_Atomic’
_Atomic uint64_t numclients;
^
proxy.h:79:22: err:expected ‘:’, ‘,’, ‘;’, ‘}’ or ‘attribute’ before ‘numclients’
_Atomic uint64_t numclients;
^
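_Atomic is a C11 keyword, so a plausible cause (an assumption, not a confirmed diagnosis) is that make is picking up a compiler or standard mode without C11 support. A minimal check, independent of the proxy sources:

/* atomic_check.c - verifies that the compiler actually being invoked
 * supports the C11 _Atomic qualifier used in proxy.h */
#include <stdint.h>

_Atomic uint64_t numclients;

int main(void) { return 0; }

Compiling with gcc -std=c11 atomic_check.c should succeed on GCC 4.9 or later; if it fails, check gcc --version to confirm which compiler make is really using.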

Getting error 'Could not create read handler: No such file or directory'

I am trying to put the proxy in front of a large amount of traffic and I'm getting these errors:

[2020-01-27 22:21:15.736] Could not create read handler: No such file or directory
[2020-01-27 22:21:15.736] Failed to create write handler for request
[2020-01-27 22:21:15.738] Could not create read handler: No such file or directory
[2020-01-27 22:21:15.738] Failed to create write handler for request
[2020-01-27 22:21:15.739] Could not create read handler: No such file or directory

I am using the connection_pool branch, and I have merged all commits from the unstable branch as of Mon Jan 27 14:24:09 PST 2020. I use the default config, no special args.

Some issues with parent requests might leave the client blocked

The proxy creates child requests for some commands. Currently, there may be some issues in processing requests that have child requests.

1. If an error occurs while sending a request to a cluster node or receiving its response, the request is freed and an error message is appended to the client buffer. This is fine for requests without child requests, but not for parent requests: when a parent request is freed, all of its child requests are freed as well. The client then loses some responses, and min_reply_id is never advanced to req->max_child_reply_id + 1. No matter what commands the client sends afterwards, it will never get another response.

2. The proxy reprocesses requests when it gets a "MOVED" reply. It is not correct to reprocess a request whose command has a handleReply callback and is not a key command: it will run duplicateRequestForAllMasters again for every parent and child request.
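A hedged sketch of the behaviour point 1 seems to call for; the field names min_reply_id and max_child_reply_id come from the issue text, while everything else (types, helpers) is hypothetical and not the proxy's actual code:

/* When a parent request fails, reply with a single error but still advance
 * the client's reply cursor past all of its children, so that replies to
 * later requests are not blocked forever. */
static void failParentRequest(client *c, clientRequest *req) {
    addReplyError(c, "request failed");              /* hypothetical helper */
    if (c->min_reply_id <= req->max_child_reply_id)
        c->min_reply_id = req->max_child_reply_id + 1;
    freeRequest(req);                                /* also frees the children */
}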

redis-cluster-proxy does not signal to clients AUTH-ing themselves that it lost the connection to the redis cluster

Probably caused by the same code as bug #71; however, I'm making a separate bug report as this will be much harder to fix.

Basically, clients of redis-cluster-proxy (r-c-p for short) that need to track the state of the connection to the Redis server do not get that information, leading to breakage.

For example: a client uses AUTH, and r-c-p gives it a dedicated set of connections. But if those connections fail midway (for instance if the server gets restarted; see bug report #71 for reproduction examples), the client does not get this information and is unable to correct its behaviour. That is, the TCP connection between the client and r-c-p shows no activity, while the one between r-c-p and Redis is clearly being torn down. On subsequent commands, r-c-p reopens (!) the connection to the Redis server, but without AUTH, and the client gets "-NOAUTH Authorization required"; not all clients are built to handle repeatedly authorizing themselves to the server.

Notice that, from the client's point of view, it performed a successful authorization, issued some commands that were executed successfully, and then out of the blue it starts getting NOAUTH as a response to its commands. This is super tricky to handle in code. The rational conclusion from the perspective of the client code, assuming Redis 6, is that a human administrator changed the ACLs in the middle of the session. As a program (especially a library) is unable to cope with such events, the sanest thing to do is fail with massive errors, which is what clients generally seem to do. If the client were talking directly to Redis 5 or earlier, there would be no rational explanation for why this would happen at all.

Ability to specify several entrypoints to the cluster

Hi there, @artix75!

First of all, this project looks like it could eliminate many pain points with using a Redis cluster for us, super exciting! Thanks a lot for working on this.

I've taken it for a test drive with our Redis cluster deployment and noticed a few things that would make it easier to use. For context, we run our Redis clusters in a container scheduler (think Kubernetes, Hashicorp Nomad etc) which implies that individual nodes will move around a lot.

  1. Being able to specify more than a single entrypoint to the cluster would allow a two-tier deployment with individually scalable Redis & proxy groups. Additionally, the proxy would not fail to start if the one node it was assigned to had just moved/crashed. As part of this, it would be nice to be able to read the servers from the config file instead of a mandatory command line argument, as that's a little easier to set up in our context (and for everybody else running in Docker, I imagine) - see the config sketch after this list.
  2. Really cool would also be the ability to reload those entrypoints (maybe by listening for SIGHUP + rereading the config file?). IMO the ideal behaviour would be
  • If any nodes were added but I have healthy instances to talk to -> no action required
  • If any nodes were added and all of the nodes that I know about are dead/gone -> connect to these new nodes
    That'd allow us to not restart the proxy whenever one of the allocations moves.
  3. The last one is a potential bug which I'm working on a solid repro case for: as part of cluster bootstrapping, we do a CLUSTER RESET HARD. If the proxy connected to the node prior to doing this, it will not pick up the nodes that we start learning about afterwards - the cluster can be healthy but requests sent to the proxy will fail. Take this with a grain of salt - a few assumptions here, still investigating this.
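For illustration only: the existing config syntax (seen in the config-file auth issue further down this page) takes a single cluster address, and the multi-entrypoint form below is hypothetical, not a supported option.

# Existing form: one entry point
cluster 127.0.0.1:6381

# Requested form (hypothetical syntax, not currently supported):
# cluster redis-node-a:6379 redis-node-b:6379 redis-node-c:6379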

I'd be happy to send pull requests for any of the points above if they align with your vision for this project. Additionally, do you have a published roadmap/next planned work items so we could start contributing a bit? :)

[Fa

Sorry, this was opened by mistake.
Please delete this issue.

Entry node is single point of failure

When the Redis entry node dies, the proxy breaks on the next updateCluster invocation because it can no longer talk to its entry node. This causes cluster->broken to be set, which makes every new request be answered with an error. This disturbs consumer traffic when it isn't actually needed - the cluster can still be fully healthy if the entry node happened to be a slave, or a master from which all slots had been migrated away.

I'd like to propose that the proxy, on a failed configuration fetch from its entry node, try all (or a subset?) of the other nodes it was aware of. This can take a non-trivial time, but it seems preferable to fully bricking the proxy.

A possible implementation might be to save the ip+ports before resetting the cluster and try them one by one until a valid configuration is fetched, here: https://github.com/artix75/redis-cluster-proxy/blob/unstable/src/cluster.c#L827
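A hedged sketch of that proposal in C; the helpers marked hypothetical do not exist in the codebase, only the overall shape (remember known addresses, reset, retry each one) is meant:

/* Try every node address we knew about before the reset until one of them
 * serves a valid configuration; only give up (cluster->broken) if all fail. */
static int refetchClusterConfiguration(redisCluster *cluster) {
    list *known = saveKnownNodeAddresses(cluster);   /* hypothetical: copy of ip:port strings */
    resetCluster(cluster);
    int ok = 0;
    listIter li;
    listNode *ln;
    listRewind(known, &li);
    while (!ok && (ln = listNext(&li))) {
        sds addr = ln->value;
        ok = fetchClusterConfigurationFrom(cluster, addr);  /* hypothetical wrapper */
    }
    listRelease(known);
    return ok;
}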

Happy to send a PR but would like to get feedback on the approach first.

Note: Might be similar to #8, maybe there's a solution which solves both cases?

PROXY commands are not recognized

> echo "PROXY SHUTDOWN" | nc 10.126.3.84 26288
-ERR Unsupported subcommand 'SHUTDOWN' for command PROXY

> echo "PROXY CLUSTER UPDATE" | nc 10.126.3.84 26288
-ERR Unsupported subcommand 'CLUSTER' for command PROXY

> echo "PROXY CLUSTER INFO" | nc 10.126.3.84 26288
-ERR Unsupported subcommand 'CLUSTER' for command PROXY

The one that I found which works:

echo "PROXY INFO" | nc 10.126.3.84 26288
$308
#Proxy
proxy_version:0.0.1
os:Linux 5.3.0-26-generic x86_64
gcc_version:7.4.0
process_id:1
threads:1
tcp_port:26288
uptime_in_seconds:18006
uptime_in_days:0
config_file:/usr/local/etc/redis/proxy.conf

#Clients
connected_clients:1

#Cluster
address:10.128.29.145
entry_node:10.128.29.145:20477

The README describes a family of commands under the PROXY top-level, not all of which appear to work with the latest unstable build. https://github.com/artix75/redis-cluster-proxy#the-proxy-command

Option to select read preference?

It would be great if we could select read preference, for example:

  • read only from replicas
  • read from replicas, but fallback to master if not found (lag mitigation?)
  • read only from master (current default?)

Or something similar?
Thanks!

Proxy cannot start when secondary nodes are down at startup

When the proxy starts up while secondary nodes (i.e. not the first node) are down, redis-cluster-proxy cannot start.
Is there any way to start the proxy anyway?

[2020-02-26 00:39:15.620/M] Redis Cluster Proxy v999.999.999 (unstable)
[2020-02-26 00:39:15.620/M] Commit: (4490fec3/0)
[2020-02-26 00:39:15.620/M] Git Branch: unstable
[2020-02-26 00:39:15.620/M] Cluster Address: redis-0-0:50080
[2020-02-26 00:39:15.620/M] PID: 1
[2020-02-26 00:39:15.620/M] OS: Linux 3.10.0-862.14.4.el7.x86_64 x86_64
[2020-02-26 00:39:15.620/M] Bits: 64
[2020-02-26 00:39:15.620/M] Log level: info
[2020-02-26 00:39:15.620/M] Listening on *:6379
[2020-02-26 00:39:15.620/M] Starting 8 threads...
[2020-02-26 00:39:15.621/M] Fetching cluster configuration...
Could not connect to Redis at 10.42.1.142:50081: No route to host
[2020-02-26 00:39:17.883/M] ERROR: Failed to fetch cluster configuration!
[2020-02-26 00:39:17.883/M] FATAL: failed to create thread 0.
cluster.c/fetchClusterConfiguration

while ((ln = listNext(&li))) {
    clusterNode *friend = ln->value;
    success = clusterNodeLoadInfo(cluster, friend, NULL, NULL); // I guess error has occurred here
    if (!success) {
        listDelNode(friends, ln);
        freeClusterNode(friend);
        goto cleanup;
    }
    clusterAddNode(cluster, friend);
}
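A hedged sketch of the behaviour the reporter seems to want (not a tested patch): skip an unreachable friend instead of aborting the whole configuration fetch, so the proxy can still start from the nodes that are reachable.

while ((ln = listNext(&li))) {
    clusterNode *friend = ln->value;
    success = clusterNodeLoadInfo(cluster, friend, NULL, NULL);
    if (!success) {
        /* Drop the unreachable node and keep going with the others,
         * instead of goto cleanup, which currently aborts startup. */
        listDelNode(friends, ln);
        freeClusterNode(friend);
        success = 1;
        continue;
    }
    clusterAddNode(cluster, friend);
}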

Thoughts on supporting MONITOR command

That command is super useful as a low key way to figure out what's going on in redis. I wonder how hard it would be to support.

Maybe it could take a parameter to monitor a specific node, but without params it would just list every command received by the proxy. Not sure how to do that in a scalable way, as synchronization will certainly be required.

Bug of the private cluster connection

connection-pool-size 1 port 8888
When I don't use a private connection:

image

The first is a slave, the second is an ordinary connection, the third is the private connection pool, and the last is the local connection.

When I run:
127.0.0.1:8888> multi
OK
127.0.0.1:8888> set b 1
QUEUED
127.0.0.1:8888> exec
1) OK
client list
image
and run it again
image
During this time, the connection to the proxy is not broken.

I found that the function disableMultiplexingForClient() has no check to determine whether the current connection is already a private connection.

Question about redis cluster proxy roadmap

hey, @artix75

Would you mind publishing the roadmap for this project? Also, are feature or bug-fix PRs welcome at this point? I would be happy to implement some of the TODO features, such as query redirection, if it's OK to open PRs.

Thanks~

Crash 12 threads with 9 masters and 9 replicas

=== PROXY BUG REPORT START: Cut & paste starting from here ===
[2020-02-03 08:37:36.040] Redis Cluster Proxy 999.999.999 crashed by signal: 11
[2020-02-03 08:37:36.040] Crashed running the instruction at: 0x55ad33061676
[2020-02-03 08:37:36.040] Accessing address: 0x7fee8f2115f0
[2020-02-03 08:37:36.040] Handling crash on thread: 2

------ STACK TRACE ------
EIP:
/usr/local/bin/redis-cluster-proxy(aeProcessEvents+0x156)[0x55ad33061676]

Backtrace:
/usr/local/bin/redis-cluster-proxy(logStackTrace+0x44)[0x55ad33065af4]
/usr/local/bin/redis-cluster-proxy(sigsegvHandler+0xed)[0x55ad3306619d]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x12890)[0x7fee8e868890]
/usr/local/bin/redis-cluster-proxy(aeProcessEvents+0x156)[0x55ad33061676]
/usr/local/bin/redis-cluster-proxy(aeMain+0x2b)[0x55ad33061a5b]
/usr/local/bin/redis-cluster-proxy(+0x12b1d)[0x55ad33069b1d]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76db)[0x7fee8e85d6db]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x3f)[0x7fee8e58688f]

------ INFO OUTPUT ------

Proxy

proxy_version:999.999.999
proxy_git_sha1:f8dc227a
proxy_git_dirty:0
os:Linux 4.15.0-34-generic x86_64
arch_bits:64
multiplexing_api:epoll
gcc_version:7.4.0
process_id:15625
threads:12
tcp_port:6363
uptime_in_seconds:13
uptime_in_days:0
config_file:
acl_user:default

Memory

used_memory:169477384
used_memory_human:161.63M
total_system_memory:67469664256
total_system_memory_human:62.84G

Clients

connected_clients:4059

Cluster

address:94.130.70.112
entry_node:94.130.70.112:6379

------ REGISTERS ------

RAX:0000000000000000 RBX:0000000000000001
RCX:000055ad33079c30 RDX:000055ad33085f60
RDI:000055ad33085f4c RSI:0000000000000008
RBP:00007fee8f2115f0 RSP:00007fee8d461e70
R8 :000055ad3472c603 R9 :000000000000aeb0
R10:00007fee8d460d8c R11:0000000000000246
R12:000055ad33b916d0 R13:0000000000000000
R14:0000000000000001 R15:000000000000002f
RIP:000055ad33061676 EFL:0000000000010246
CSGSFS:002b000000000033
(00007fee8d461e7f) -> 000055ad33069b1d
(00007fee8d461e7e) -> 000055ad33a4beb0
(00007fee8d461e7d) -> 000055ad33061a5b
(00007fee8d461e7c) -> 00007ffc995a9e40
(00007fee8d461e7b) -> 000055ad33a4beb0
(00007fee8d461e7a) -> 0000000000000000
(00007fee8d461e79) -> 00007fee8d461fc0
(00007fee8d461e78) -> 0000000000000000
(00007fee8d461e77) -> 000055ad33b916d0
(00007fee8d461e76) -> 0000000000000000
(00007fee8d461e75) -> a2087e4bebd11200
(00007fee8d461e74) -> 0000000000000000
(00007fee8d461e73) -> 000055ad33071a9e
(00007fee8d461e72) -> 00007ffc995a9e40
(00007fee8d461e71) -> 000055ad0000000b
(00007fee8d461e70) -> 000055ad33b916d0

------ FAST MEMORY TEST ------
*** Preparing to test memory region 55ad33290000 (4096 bytes)
*** Preparing to test memory region 55ad337c3000 (22859776 bytes)
*** Preparing to test memory region 7fee50000000 (12791808 bytes)
*** Preparing to test memory region 7fee58000000 (12701696 bytes)
*** Preparing to test memory region 7fee60000000 (12677120 bytes)
*** Preparing to test memory region 7fee64000000 (12787712 bytes)
*** Preparing to test memory region 7fee68000000 (12652544 bytes)
*** Preparing to test memory region 7fee6c000000 (12603392 bytes)
*** Preparing to test memory region 7fee70000000 (12726272 bytes)
*** Preparing to test memory region 7fee74000000 (12722176 bytes)
*** Preparing to test memory region 7fee78000000 (12648448 bytes)
*** Preparing to test memory region 7fee7c000000 (12709888 bytes)
*** Preparing to test memory region 7fee80000000 (12754944 bytes)
*** Preparing to test memory region 7fee84000000 (12640256 bytes)
*** Preparing to test memory region 7fee8845a000 (8388608 bytes)
*** Preparing to test memory region 7fee88c5b000 (8388608 bytes)
*** Preparing to test memory region 7fee8945c000 (8388608 bytes)
*** Preparing to test memory region 7fee89c5d000 (8388608 bytes)
*** Preparing to test memory region 7fee8a45e000 (8388608 bytes)
*** Preparing to test memory region 7fee8ac5f000 (8388608 bytes)
*** Preparing to test memory region 7fee8b460000 (8388608 bytes)
*** Preparing to test memory region 7fee8bc61000 (8388608 bytes)
*** Preparing to test memory region 7fee8c462000 (8388608 bytes)
*** Preparing to test memory region 7fee8cc63000 (8388608 bytes)
*** Preparing to test memory region 7fee8d464000 (8388608 bytes)
*** Preparing to test memory region 7fee8dc65000 (8388608 bytes)
*** Preparing to test memory region 7fee8e852000 (16384 bytes)
*** Preparing to test memory region 7fee8ea71000 (16384 bytes)
*** Preparing to test memory region 7fee8f1cc000 (143360 bytes)
*** Preparing to test memory region 7fee8f1ef000 (139264 bytes)
*** Preparing to test memory region 7fee8f233000 (16384 bytes)
*** Preparing to test memory region 7fee8f240000 (4096 bytes)
.O.Segmentation fault (core dumped)

Proxy returns MOVED responses

I have noticed that the proxy appears to return MOVED responses, therefore not hiding slot movements properly. Looking at the code, this appears to be confined to transactions currently (https://github.com/artix75/redis-cluster-proxy/blob/unstable/src/proxy.c#L4225) but I may be missing other parts of it.

In our case, this led to our 'smart' client switching to cluster mode and trying to bypass the proxy. Solutions I can see:

  1. Retry transactions(?)
  2. Rewrite error messages so they do not follow the standard MOVED format, i.e. replace them with something like "cancelled due to cluster topology change" (see the sketch below)
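A hedged sketch of option 2; the reply-inspection hook and forwardReplyToClient are hypothetical, only the string rewriting itself is meant literally (REDIS_REPLY_ERROR is the standard hiredis error type):

if (reply->type == REDIS_REPLY_ERROR &&
    (strncmp(reply->str, "MOVED", 5) == 0 || strncmp(reply->str, "ASK", 3) == 0))
{
    /* Hide the redirection from the client so smart clients don't try to
     * bypass the proxy; they just see a retryable error. */
    addReplyError(client, "ERR transaction aborted: cluster topology changed, please retry");
} else {
    forwardReplyToClient(client, reply);   /* hypothetical passthrough */
}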

Q: Works for sentinel?

Hello,

Firstly, congrats on Redis 6 release👍

Today, my org is using Sentinel, and I was wondering whether this proxy works with Sentinel as well, or whether such a proxy exists.

The reason we use Sentinel is that our use case is HA/failover only; we have no need for horizontal sharding at this time.

Improper handling of ASK retries

Currently, MOVED and ASK retries are handled exactly the same way, by attempting to update the hash slot map by fetching the config from a node. However, as per the spec (https://redis.io/topics/cluster-spec#cluster-live-reconfiguration), this is only correct for a MOVED response; an ASK response should instead be followed up by redirecting the query to the address indicated in the response, without attempting to update the local hash slot table.

Currently, the proxy enters an endless loop when receiving an ASK response: it attempts to fetch the config, sees that the config hasn't changed (because the hash slot move isn't completed yet), receives another ASK response, and so on.
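A hedged sketch of spec-compliant handling, with hypothetical helper names; the key point from the cluster spec is that an ASK redirect is served by sending ASKING followed by the retried command to the indicated node, while only MOVED should trigger a slot-map refresh:

if (strncmp(err, "ASK", 3) == 0) {
    /* "ASK <slot> <ip:port>": redirect just this one query, leave the map alone */
    clusterNode *target = getNodeByAddress(cluster, parseRedirectAddress(err));
    sendCommandToNode(target, "ASKING");      /* one-shot permission on the target */
    resendRequestToNode(target, req);
} else if (strncmp(err, "MOVED", 5) == 0) {
    requestClusterUpdate(cluster);             /* the slot map really changed */
    resendRequest(req);
}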

I've got a test suite uncovering this problem and a super hacky fix in #29. Feel free to take from that whatever you think makes sense; I'm happy to just send a PR with the tests or something like that. Filed this issue to track the behaviour.

Potential memory and connection leak in sendMessageToThread

Currently, if the pipe is not yet writable, sendMessageToThread registers a write event and continues sending the message in the callback handlePendingAwakeMessages. handlePendingAwakeMessages then calls sendMessageToThread again to do the work. This might cause some issues.

1. If an error occurs in sendMessageToThread when called from handlePendingAwakeMessages, the client object being sent is not freed, so its memory and connection are leaked.

2. In handlePendingAwakeMessages:

sds msg = ln->value;
int sent = sendMessageToThread(thread, msg);
if (sent == -1) continue;
else {
    listDelNode(thread->pending_messages, ln);
    if (!sent) {
        proxyLogErr("Failed to send message to thread %d", thread->thread_id);
    }
}

The msg is not deleted from the pending_messages list when sendMessageToThread returns -1. But in that case sendMessageToThread has already added the remaining buffer to the tail of the pending_messages list, so we end up with duplicate entries in the list pointing to the same content.

3. In some situations, multiple messages sent by sendMessageToThread might get cross-sent (interleaved).

Getting 'Could not create read handler' errors

My cluster seems ok, no restarts of any nodes in the cluster. I have enabled the 'multiple-endpoint' features.

[2020-03-24 05:25:21.975/5] Populate connection pool: failed to install write handler for node 172.26.32.220:6379
[2020-03-24 05:25:21.976/5] Could not create read handler: No such file or directory
[2020-03-24 05:25:21.976/5] Populate connection pool: failed to install write handler for node 172.25.145.138:6379
[2020-03-24 05:25:21.976/5] Could not create read handler: No such file or directory
[2020-03-24 05:25:21.976/5] Populate connection pool: failed to install write handler for node 172.27.86.24:6379
[2020-03-24 05:25:21.976/5] Could not create read handler: No such file or directory
[2020-03-24 05:25:21.976/5] Populate connection pool: failed to install write handler for node 172.27.36.226:6379
[2020-03-24 05:25:22.001/5] Could not create read handler: No such file or directory
[2020-03-24 05:25:22.001/5] ERROR: Failed to create read query handler for client 5:762 from 172.26.199.114:57978
[2020-03-24 05:25:54.529/6] Could not create read handler: No such file or directory
[2020-03-24 05:25:54.529/6] ERROR: Failed to create read query handler for client 6:904 from 172.25.212.157:36710

I am at this commit:

commit 6751bf515fcef0a46c273f0199e49794592529ec (origin/unstable, origin/HEAD)
Author: artix <[email protected]>
Date:   Wed Mar 18 16:49:26 2020 +0100

    Use exit code 1 when test fails (Fix #46)

Crash from `proxy cluster update`

To reproduce:

  1. Start 6 node cluster: 3 master, 3 slave.
  2. Create the cluster using redis-cli --cluster create. Default options.
  3. Start redis-cluster-proxy. Default options.
  4. Connect to proxy via redis-cli.
  5. Run the following series of commands:
  • proxy cluster info
  • proxy cluster update
  • proxy cluster info
  • proxy cluster update.

This will crash the proxy, with the following error message:

redis-cluster-proxy(77503,0x700009d67000) malloc: *** error for object 0x7fa03fa0f920: pointer being freed was not allocated
redis-cluster-proxy(77503,0x700009d67000) malloc: *** set a breakpoint in malloc_error_break to debug

Note that the second proxy cluster info command does not return the expected results.

If you start fresh again and run the following in sequence:

  • proxy cluster info
  • proxy cluster update
  • proxy cluster update

You get a different, but seemingly related crash. The first crash does not produce a bug report, but here is the report for the second sequence:

=== PROXY BUG REPORT START: Cut & paste starting from here ===
[2020-06-02 14:18:48.539/0] Redis Cluster Proxy 999.999.999 crashed by signal: 11
[2020-06-02 14:18:48.539/0] Crashed running the instruction at: 0x10476f0b3
[2020-06-02 14:18:48.539/0] Accessing address: 0x0
[2020-06-02 14:18:48.539/0] Handling crash on thread: 0


------ STACK TRACE ------
EIP:
0   redis-cluster-proxy                 0x000000010476f0b3 listRelease + 35

Backtrace:
0   redis-cluster-proxy                 0x0000000104774c22 logStackTrace + 114
1   redis-cluster-proxy                 0x000000010477501f sigsegvHandler + 575
2   libsystem_platform.dylib            0x00007fff6ee215fd _sigtramp + 29
3   ???                                 0x0000000000000000 0x0 + 0
4   redis-cluster-proxy                 0x0000000104772320 resetCluster + 64
5   redis-cluster-proxy                 0x0000000104773a99 updateCluster + 697
6   redis-cluster-proxy                 0x000000010477c81a proxyCommand + 4122
7   redis-cluster-proxy                 0x0000000104780e5d processRequest + 989
8   redis-cluster-proxy                 0x0000000104782c3e readQuery + 494
9   redis-cluster-proxy                 0x00000001047700d8 aeProcessEvents + 728
10  redis-cluster-proxy                 0x000000010477041b aeMain + 43
11  redis-cluster-proxy                 0x00000001047865d4 execProxyThread + 52
12  libsystem_pthread.dylib             0x00007fff6ee2d109 _pthread_start + 148
13  libsystem_pthread.dylib             0x00007fff6ee28b8b thread_start + 15


------ INFO OUTPUT ------
# Proxy
proxy_version:999.999.999
proxy_git_sha1:ac83840d
proxy_git_dirty:0
proxy_git_branch:unstable
os:Darwin 19.4.0 x86_64
arch_bits:64
multiplexing_api:kqueue
gcc_version:4.2.1
process_id:77962
threads:8
tcp_port:7777
uptime_in_seconds:9
uptime_in_days:0
config_file:./proxy.conf
acl_user:default

# Memory
used_memory:9540368
used_memory_human:9.10M
total_system_memory:17179869184
total_system_memory_human:16.00G

# Clients
connected_clients:1
max_clients:10000
thread_0_clinets:1
thread_1_clinets:0
thread_2_clinets:0
thread_3_clinets:0
thread_4_clinets:0
thread_5_clinets:0
thread_6_clinets:0
thread_7_clinets:0

# Cluster
address:
entry_node::0


---- SIZEOF STRUCTS ----
clientRequest: 184
client: 224
redisClusterConnection: 48
clusterNode: 112
redisCluster: 104
list: 48
listNode: 24
rax: 24
raxNode: 4
raxIterator: 480
aeEventLoop: 88
aeFileEvent: 32
aeTimeEvent: 64


------ REGISTERS ------

RAX:0000000000000b03 RBX:3532313366012828
RCX:0000000000000000 RDX:00000000000ecbb0
RDI:00007fee3040f060 RSI:00007fee32900000
RBP:0000700001b72ae0 RSP:0000700001b72ac0
R8 :0000000000000005 R9 :0000000000000001
R10:00007fee32900000 R11:00007fee329043e0
R12:00007fee32913340 R13:0000000000000006
R14:00007fee3040f060 R15:0000000000636432
RIP:000000010476f0b3 EFL:0000000000010202
CS :000000000000002b FS:0000000000000000  GS:0000000000000000
(0000700001b72acf) -> 0000000000000000
(0000700001b72ace) -> 0000000000000000
(0000700001b72acd) -> 0000000000000000
(0000700001b72acc) -> 0000000000000000
(0000700001b72acb) -> 0000000104773a99
(0000700001b72aca) -> 0000700001b72d70
(0000700001b72ac9) -> 00007fee329043e3
(0000700001b72ac8) -> 0000000000000000
(0000700001b72ac7) -> 0000000000000006
(0000700001b72ac6) -> 00007fee32913340
(0000700001b72ac5) -> 0000000104772320
(0000700001b72ac4) -> 0000700001b72b10
(0000700001b72ac3) -> 00007fee30706bf0
(0000700001b72ac2) -> 00007fee30706bf0
(0000700001b72ac1) -> 00007fee32913340
(0000700001b72ac0) -> 0000000000000000


------ DUMPING CODE AROUND EIP ------
Symbol: listRelease (base: 0x10476f090)
Module: /usr/local/bin/redis-cluster-proxy (base 0x10476e000)
$ xxd -r -p /tmp/dump.hex /tmp/dump.bin
$ objdump --adjust-vma=0x10476f090 -D -b binary -m i386:x86-64 /tmp/dump.bin
------
dump of function  (hexdump of 163 bytes):
554889e5415741564154534989fe4c8b7f284d85ff742f498b1e660f1f44000049ffcf4c8b6308498b46184885c07406488b7b10ffd04889dfe8f2ea01004c89e34d85ff75da49c746280000000049c746080000000049c706000000004c89f75b415c415e415f5de9c3ea01000f1f00554889e54156534989f64889fbbf18000000e889e901004885c074234c897010488b4b284885c9741a48c70000000000488b13
Function at 0x10478dbc0 is zfree
Function at 0x10478daa0 is zmalloc


=== PROXY BUG REPORT END. Make sure to include from START to END. ===

Crash when using BLPOP and MGET

Version

unstable branch
commit 6751bf5

Steps to Reproduce

> redis-cli -p 7777
127.0.0.1:7777> PROXY CONFIG SET enable-cross-slot 1
OK
127.0.0.1:7777> blpop a b 0
(error) ERR Cross-slot queries are not supported for this command
127.0.0.1:7777> mget a b
^C
Waiting 5 seconds and then typing Ctrl-C to stop redis-cli can sometimes trigger a proxy crash.

Crash Log:

=== PROXY BUG REPORT START: Cut & paste starting from here ===
[2020-03-24 15:16:55.273/0] Redis Cluster Proxy 999.999.999 crashed by signal: 11
[2020-03-24 15:16:55.273/0] Crashed running the instruction at: 0x10fbd05c5
[2020-03-24 15:16:55.273/0] Accessing address: 0x0
[2020-03-24 15:16:55.273/0] Handling crash on thread: 0


------ STACK TRACE ------
EIP:
0   redis-cluster-proxy                 0x000000010fbd05c5 listNext + 21

Backtrace:
0   redis-cluster-proxy                 0x000000010fbd5fc2 logStackTrace + 114
1   redis-cluster-proxy                 0x000000010fbd63c1 sigsegvHandler + 577
2   libsystem_platform.dylib            0x00007fff74483b5d _sigtramp + 29
3   ???                                 0x000000000000ffff 0x0 + 65535
4   redis-cluster-proxy                 0x000000010fbe1618 freeClient + 472
5   redis-cluster-proxy                 0x000000010fbd1328 aeProcessEvents + 744
6   redis-cluster-proxy                 0x000000010fbd166b aeMain + 43
7   redis-cluster-proxy                 0x000000010fbe7914 execProxyThread + 52
8   libsystem_pthread.dylib             0x00007fff7448c2eb _pthread_body + 126
9   libsystem_pthread.dylib             0x00007fff7448f249 _pthread_start + 66
10  libsystem_pthread.dylib             0x00007fff7448b40d thread_start + 13


------ INFO OUTPUT ------
# Proxy
proxy_version:999.999.999
proxy_git_sha1:6751bf51
proxy_git_dirty:0
proxy_git_branch:unstable
os:Darwin 18.7.0 x86_64
arch_bits:64
multiplexing_api:kqueue
gcc_version:4.2.1
process_id:98168
threads:1
tcp_port:7777
uptime_in_seconds:10
uptime_in_days:0
config_file:
acl_user:default

# Memory
used_memory:899664
used_memory_human:878.58K
total_system_memory:8589934592
total_system_memory_human:8.00G

# Clients
connected_clients:0
max_clients:10000

# Cluster
address:
entry_node::0


---- SIZEOF STRUCTS ----
clientRequest: 184
client: 224
redisClusterConnection: 48
clusterNode: 112
redisCluster: 104
list: 48
listNode: 24
rax: 24
raxNode: 4
raxIterator: 480
aeEventLoop: 88
aeFileEvent: 32
aeTimeEvent: 64


------ REGISTERS ------

RAX:d000000000000000 RBX:00007fa5f0f01d20
RCX:0000000000000001 RDX:00000000000f66d0
RDI:000070000280fe20 RSI:00007fa5f0c00000
RBP:000070000280fe10 RSP:000070000280fe10
R8 :0000000000000003 R9 :0000000000000000
R10:0000000000000004 R11:0000000000000004
R12:00007fa5f0d00e60 R13:00007fa5f0f00c6b
R14:0000000000000001 R15:000070000280fe20
RIP:000000010fbd05c5 EFL:0000000000010246
CS :000000000000002b FS:0000000000000000  GS:0000000000000000
(000070000280fe1f) -> 0000000b00000001
(000070000280fe1e) -> 0000000000000014
(000070000280fe1d) -> 0000000000000001
(000070000280fe1c) -> 00007fa5f0f00270
(000070000280fe1b) -> 000000010fbd1328
(000070000280fe1a) -> 000070000280fed0
(000070000280fe19) -> 0000000000000000
(000070000280fe18) -> 0000000000000001
(000070000280fe17) -> 00007fa5f0f00270
(000070000280fe16) -> 0000000000000001
(000070000280fe15) -> 000000010fc4c000
(000070000280fe14) -> 00007fa5f0f00270
(000070000280fe13) -> 0000000000000000
(000070000280fe12) -> d000000000000000
(000070000280fe11) -> 000000010fbe1618
(000070000280fe10) -> 000070000280fe60


------ DUMPING CODE AROUND EIP ------
Symbol: listNext (base: 0x10fbd05b0)
Module: /xxxxxx/redis-cluster-proxy/./src/redis-cluster-proxy (base 0x10fbcf000)
$ xxd -r -p /tmp/dump.hex /tmp/dump.bin
$ objdump --adjust-vma=0x10fbd05b0 -D -b binary -m i386:x86-64 /tmp/dump.bin
------
dump of function  (hexdump of 149 bytes):
554889e5488b074885c0741031c9837f08000f94c1488b0cc848890f5dc36690554889e5415741564154534889fbbf30000000e838ea01004885c00f84840100004989c648c740280000000048c740200000000048c740180000000048c740100000000048c740080000000048c70000000000f30f6f4310f30f7f4010488b4320498946204c8b3b4d85ff0f843701000066480f7e
Function at 0x10fbef020 is zmalloc


=== PROXY BUG REPORT END. Make sure to include from START to END. ===

       Please report the crash by opening an issue on github:

           https://github.com/artix75/redis-cluster-proxy/issues

[1]    98168 segmentation fault  ./src/redis-cluster-proxy --threads 1 127.0.0.1:8899

Ability to bind to 0.0.0.0 instead of localhost

So I'm trying to set up redis-cluster-proxy in OpenShift, but to have the server reachable from a different machine I had to pass 0.0.0.0 to anetTcp6Server and anetTcpServer.

-------------------------------- src/proxy.c ---------------------------------
index b2df63e..cbcd6f7 100644
@@ -2174,14 +2174,14 @@ void onClusterNodeDisconnection(clusterNode *node) {
 static int listen(void) {
     int fd_idx = 0;
     /* Try to use both IPv6 and IPv4 */
-    proxy.fds[fd_idx] = anetTcp6Server(proxy.neterr, config.port, NULL,
+    proxy.fds[fd_idx] = anetTcp6Server(proxy.neterr, config.port, "0.0.0.0",
                                        proxy.tcp_backlog);
     if (proxy.fds[fd_idx] != ANET_ERR)
         anetNonBlock(NULL, proxy.fds[fd_idx++]);
     else if (errno == EAFNOSUPPORT)
         proxyLogWarn("Not listening to IPv6: unsupported\n");
 
-    proxy.fds[fd_idx] = anetTcpServer(proxy.neterr, config.port, NULL,
+    proxy.fds[fd_idx] = anetTcpServer(proxy.neterr, config.port, "0.0.0.0",
                                       proxy.tcp_backlog);
     if (proxy.fds[fd_idx] != ANET_ERR)
         anetNonBlock(NULL, proxy.fds[fd_idx++]);

Shouldn't this be a config option? I didn't see this in regular Redis, so now I'm a bit confused. All the servers I've run on a different box usually have an option to bind to a different address (usually 0.0.0.0).
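For comparison, the --help output quoted further down this page lists a --bind <address> option ("can be used multiple times to bind multiple interfaces"). Assuming it accepts a wildcard address (not verified here), something like this might avoid patching anetTcp*Server at all:

redis-cluster-proxy --bind 0.0.0.0 --port 7777 127.0.0.1:6379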

Install a crash handler (SEGFAULT)

I'm getting a couple of restarts, and they don't seem to be related to high memory usage. Having a crash handler that displays something on the console when a crash happens could help. I believe Redis has that.

redis-proxy-30-29wf5          1/1     Running       1          17m
redis-proxy-30-65zgp          1/1     Running       6          17m
redis-proxy-30-klq7z          1/1     Running       7          17m
redis-proxy-30-l42n6          1/1     Running       6          17m
redis-proxy-30-pfj8h          1/1     Running       5          17m
redis-proxy-30-q2bgk          1/1     Running       6          17m

redis-cluster-proxy crashing

The version is the release 1.0-beta2.
The crash happened when the master that redis-cluster-proxy was connected to and used as its primary suddenly became unavailable.

crash log:
2020-05-12T09:01:41.809020738Z stdout F === PROXY BUG REPORT START: Cut & paste starting from here ===
2020-05-12T09:01:41.809037669Z stdout F [2020-05-12 09:01:41.808/0] Redis Cluster Proxy 0.9.102 crashed by signal: 11
2020-05-12T09:01:41.809054952Z stdout F [2020-05-12 09:01:41.808/0] Crashed running the instruction at: 0x408c84
2020-05-12T09:01:41.809076298Z stdout F [2020-05-12 09:01:41.808/0] Accessing address: 0x8
2020-05-12T09:01:41.809141611Z stdout F [2020-05-12 09:01:41.808/0] Handling crash on thread: 0
2020-05-12T09:01:41.809172261Z stdout F
2020-05-12T09:01:41.809186323Z stdout F
2020-05-12T09:01:41.809201293Z stdout F ------ STACK TRACE ------
2020-05-12T09:01:41.809246189Z stdout F EIP:
2020-05-12T09:01:41.809285105Z stdout F redis-cluster-proxy(listEmpty+0x24)[0x408c84]
2020-05-12T09:01:41.809299386Z stdout F
2020-05-12T09:01:41.80931411Z stdout F Backtrace:
2020-05-12T09:01:41.80932926Z stdout F redis-cluster-proxy(logStackTrace+0x2d)[0x40dc4d]
2020-05-12T09:01:41.809344458Z stdout F redis-cluster-proxy(sigsegvHandler+0x17a)[0x40e25a]
2020-05-12T09:01:41.80936098Z stdout F /lib64/libpthread.so.0(+0x132d0)[0x7f82fb59b2d0]
2020-05-12T09:01:41.809377083Z stdout F redis-cluster-proxy(listEmpty+0x24)[0x408c84]
2020-05-12T09:01:41.809399569Z stdout F redis-cluster-proxy(listRelease+0x9)[0x408cd9]
2020-05-12T09:01:41.809493698Z stdout F redis-cluster-proxy(resetCluster+0x36)[0x40b2b6]
2020-05-12T09:01:41.809521301Z stdout F redis-cluster-proxy(updateCluster+0x1ca)[0x40cf3a]
2020-05-12T09:01:41.809538809Z stdout F redis-cluster-proxy(proxyCommand+0x141f)[0x4193bf]
2020-05-12T09:01:41.809556442Z stdout F redis-cluster-proxy(processRequest+0x2d5)[0x41b7d5]
2020-05-12T09:01:41.809575586Z stdout F redis-cluster-proxy(readQuery+0x1ea)[0x41c96a]
2020-05-12T09:01:41.809610288Z stdout F redis-cluster-proxy(aeProcessEvents+0x101)[0x4098e1]
2020-05-12T09:01:41.809631791Z stdout F redis-cluster-proxy(aeMain+0x2b)[0x409cdb]
2020-05-12T09:01:41.809701201Z stdout F redis-cluster-proxy[0x411b8c]
2020-05-12T09:01:41.809738579Z stdout F /lib64/libpthread.so.0(+0x84f9)[0x7f82fb5904f9]
2020-05-12T09:01:41.809756341Z stdout F /lib64/libc.so.6(clone+0x3f)[0x7f82fb2c3f2f]
2020-05-12T09:01:41.8097703Z stdout F
2020-05-12T09:01:41.809782737Z stdout F
2020-05-12T09:01:41.809796559Z stdout F ------ INFO OUTPUT ------
2020-05-12T09:01:41.809846874Z stdout F # Proxy
2020-05-12T09:01:41.80988196Z stdout F proxy_version:0.9.102
2020-05-12T09:01:41.809896773Z stdout F proxy_git_sha1:00000000
2020-05-12T09:01:41.809910317Z stdout F proxy_git_dirty:0
2020-05-12T09:01:41.809927546Z stdout F proxy_git_branch:
2020-05-12T09:01:41.809947738Z stdout F os:Linux 4.15.0-99-generic x86_64
2020-05-12T09:01:41.809968636Z stdout F arch_bits:64
2020-05-12T09:01:41.810042283Z stdout F multiplexing_api:epoll
2020-05-12T09:01:41.810063659Z stdout F gcc_version:8.2.1
2020-05-12T09:01:41.810083387Z stdout F process_id:31
2020-05-12T09:01:41.810103715Z stdout F threads:8
2020-05-12T09:01:41.810122553Z stdout F tcp_port:7777
2020-05-12T09:01:41.810140832Z stdout F uptime_in_seconds:3780
2020-05-12T09:01:41.810176828Z stdout F uptime_in_days:0
2020-05-12T09:01:41.810192157Z stdout F config_file:
2020-05-12T09:01:41.810206659Z stdout F acl_user:default
2020-05-12T09:01:41.810219165Z stdout F
2020-05-12T09:01:41.810234292Z stdout F # Memory
2020-05-12T09:01:41.81024989Z stdout F used_memory:8189696
2020-05-12T09:01:41.810267234Z stdout F used_memory_human:7.81M
2020-05-12T09:01:41.810287134Z stdout F total_system_memory:33728958464
2020-05-12T09:01:41.810307505Z stdout F total_system_memory_human:31.41G
2020-05-12T09:01:41.81032145Z stdout F
2020-05-12T09:01:41.810338765Z stdout F # Clients
2020-05-12T09:01:41.810358186Z stdout F connected_clients:1
2020-05-12T09:01:41.810379253Z stdout F max_clients:10000
2020-05-12T09:01:41.810407401Z stdout F thread_0_clinets:1
2020-05-12T09:01:41.81042417Z stdout F thread_1_clinets:0
2020-05-12T09:01:41.810438885Z stdout F thread_2_clinets:0
2020-05-12T09:01:41.810452562Z stdout F thread_3_clinets:0
2020-05-12T09:01:41.810468214Z stdout F thread_4_clinets:0
2020-05-12T09:01:41.810492261Z stdout F thread_5_clinets:0
2020-05-12T09:01:41.810506579Z stdout F thread_6_clinets:0
2020-05-12T09:01:41.810521053Z stdout F thread_7_clinets:0
2020-05-12T09:01:41.810533371Z stdout F
2020-05-12T09:01:41.810547706Z stdout F # Cluster
2020-05-12T09:01:41.810563877Z stdout F address:
2020-05-12T09:01:41.810583278Z stdout F entry_node::0
2020-05-12T09:01:41.810610768Z stdout F
2020-05-12T09:01:41.810623832Z stdout F
2020-05-12T09:01:41.810639479Z stdout F ---- SIZEOF STRUCTS ----
2020-05-12T09:01:41.810653969Z stdout F clientRequest: 184
2020-05-12T09:01:41.810668392Z stdout F client: 224
2020-05-12T09:01:41.810691151Z stdout F redisClusterConnection: 48
2020-05-12T09:01:41.810715281Z stdout F clusterNode: 112
2020-05-12T09:01:41.810731337Z stdout F redisCluster: 104
2020-05-12T09:01:41.810744585Z stdout F list: 48
2020-05-12T09:01:41.81075805Z stdout F listNode: 24
2020-05-12T09:01:41.810780879Z stdout F rax: 24
2020-05-12T09:01:41.810797111Z stdout F raxNode: 4
2020-05-12T09:01:41.810811536Z stdout F raxIterator: 480
2020-05-12T09:01:41.810824488Z stdout F aeEventLoop: 88
2020-05-12T09:01:41.810838539Z stdout F aeFileEvent: 32
2020-05-12T09:01:41.810852474Z stdout F aeTimeEvent: 64
2020-05-12T09:01:41.810867864Z stdout F
2020-05-12T09:01:41.810886717Z stdout F
2020-05-12T09:01:41.810905355Z stdout F ------ REGISTERS ------
2020-05-12T09:01:41.810921757Z stdout F
2020-05-12T09:01:41.810941988Z stdout F RAX:0000000000000025 RBX:0000000000000000
2020-05-12T09:01:41.810960358Z stdout F RCX:0000000000000000 RDX:0000000002545550
2020-05-12T09:01:41.810994563Z stdout F RDI:00007f82f000bad0 RSI:00007f82fb57caa8
2020-05-12T09:01:41.811009414Z stdout F RBP:00007f82f00186ff RSP:00007f82fb1c6a80
2020-05-12T09:01:41.811025552Z stdout F R8 :00000000024e1d80 R9 :00007f82fb1c7700
2020-05-12T09:01:41.811039822Z stdout F R10:6e6972756769666e R11:0000000000000246
2020-05-12T09:01:41.81105407Z stdout F R12:0000000000000006 R13:00007f82f000bad0
2020-05-12T09:01:41.811071411Z stdout F R14:00007f82f0012763 R15:00007f82f0018d60
2020-05-12T09:01:41.811092855Z stdout F RIP:0000000000408c84 EFL:0000000000010206
2020-05-12T09:01:41.811109636Z stdout F CSGSFS:002b000000000033
2020-05-12T09:01:41.81112797Z stdout F (00007f82fb1c6a8f) -> 000000000041eaf8
2020-05-12T09:01:41.811147177Z stdout F (00007f82fb1c6a8e) -> 0000000000000000
2020-05-12T09:01:41.811239222Z stdout F (00007f82fb1c6a8d) -> 00007f8200000000
2020-05-12T09:01:41.811273634Z stdout F (00007f82fb1c6a8c) -> 0000000000000000
2020-05-12T09:01:41.811289104Z stdout F (00007f82fb1c6a8b) -> 00007f82f0015730
2020-05-12T09:01:41.811302989Z stdout F (00007f82fb1c6a8a) -> 00007f82000018eb
2020-05-12T09:01:41.811332337Z stdout F (00007f82fb1c6a89) -> 000000000040cf3a
2020-05-12T09:01:41.811350356Z stdout F (00007f82fb1c6a88) -> 00007f82f00157f0
2020-05-12T09:01:41.811364233Z stdout F (00007f82fb1c6a87) -> 000000000040b2b6
2020-05-12T09:01:41.811378488Z stdout F (00007f82fb1c6a86) -> 00000000024e39e0
2020-05-12T09:01:41.811427156Z stdout F (00007f82fb1c6a85) -> 0000000000408cd9
2020-05-12T09:01:41.811453613Z stdout F (00007f82fb1c6a84) -> 00000000024e39e0
2020-05-12T09:01:41.811468478Z stdout F (00007f82fb1c6a83) -> 0000000000000006
2020-05-12T09:01:41.81148219Z stdout F (00007f82fb1c6a82) -> 0000000000000000
2020-05-12T09:01:41.811496113Z stdout F (00007f82fb1c6a81) -> 00007f82f000bad0
2020-05-12T09:01:41.811509149Z stdout F (00007f82fb1c6a80) -> 00000000024e1d80
2020-05-12T09:01:41.811522359Z stdout F
2020-05-12T09:01:41.811537515Z stdout F
2020-05-12T09:01:41.811555272Z stdout F ------ DUMPING CODE AROUND EIP ------
2020-05-12T09:01:41.811575011Z stdout F Symbol: listEmpty (base: 0x408c60)
2020-05-12T09:01:41.811594552Z stdout F Module: redis-cluster-proxy (base 0x400000)
2020-05-12T09:01:41.811613902Z stdout F $ xxd -r -p /tmp/dump.hex /tmp/dump.bin
2020-05-12T09:01:41.811631723Z stdout F $ objdump --adjust-vma=0x408c60 -D -b binary -m i386:x86-64 /tmp/dump.bin
2020-05-12T09:01:41.811649746Z stdout F ------
2020-05-12T09:01:41.811668888Z stdout F dump of function (hexdump of 164 bytes):
2020-05-12T09:01:41.811701156Z stdout F 41554989fd415455534883ec08488b4728488b1f4885c0742f488d68ff0f1f00498b45184c8b63084885c07406488b7b10ffd04889df4883ed014c89e3e8deae01004883fdff75d849c745080000000049c745000000000049c74528000000004883c4085b5d415c415dc30f1f440000534889fbe887ffffff4889df5be99eae010066662e0f1f8400000000000f1f00554889f5534889fbbf180000004883ec08e86aad
2020-05-12T09:01:41.811753959Z stdout F Function at 0x423b80 is zfree
2020-05-12T09:01:41.811824796Z stdout F Function at 0x408c60 is listEmpty
2020-05-12T09:01:41.811838776Z stdout F
2020-05-12T09:01:41.81185604Z stdout F
2020-05-12T09:01:41.81187299Z stdout F === PROXY BUG REPORT END. Make sure to include from START to END. ===
2020-05-12T09:01:41.811891451Z stdout F
2020-05-12T09:01:41.811912467Z stdout F Please report the crash by opening an issue on github:
2020-05-12T09:01:41.81193608Z stdout F
2020-05-12T09:01:41.811960627Z stdout F https://github.com/artix75/redis-cluster-proxy/issues
2020-05-12T09:01:41.811978477Z stdout F
2020-05-12T09:01:42.291611403Z stderr F /usr/local/bin/start_redis_proxy.sh: line 130: 31 Segmentation fault (core dumped) redis-cluster-proxy

make test exits with 0 even when tests fail

When testing, redis-cli was not present on the machine and the Ruby script reported that 21 exception(s) occurred, yet make test still exited with code 0. That makes it useless for any CI/CD, as there is no way to find out that something went wrong.

Cluster auth fails via config file, works fine on CLI

Disclaimer - I might be missing something obvious so apologies in advance if I am.

I am experiencing authentication errors when specifying a node password in a configuration file as opposed to the same password being passed on the cli.

Code for redis-cluster-proxy was compiled this morning (Feb 13) off of unstable. All tests passed.

Working params via shell

/data/redis/bin/redis-cluster-proxy --auth eatme 127.0.0.1:6381
[2020-02-13 17:27:51.631] Redis Cluster Proxy v999.999.999 (unstable)
[2020-02-13 17:27:51.631] Commit: (06c0f5ed/0)
[2020-02-13 17:27:51.631] Git Branch: unstable
[2020-02-13 17:27:51.631] Cluster Address: 127.0.0.1:6381
[2020-02-13 17:27:51.631] PID: 952
[2020-02-13 17:27:51.631] OS: Linux 3.10.0-1062.9.1.el7.x86_64 x86_64
[2020-02-13 17:27:51.631] Bits: 64
[2020-02-13 17:27:51.631] Log level: info
[2020-02-13 17:27:51.631] The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
[2020-02-13 17:27:51.631] Listening on *:7777
[2020-02-13 17:27:51.633] Starting 8 threads...
[2020-02-13 17:27:51.633] Fetching cluster configuration...
[2020-02-13 17:27:51.639] Cluster has 3 masters and 6 replica(s)
[2020-02-13 17:27:51.693] All thread(s) started!

Failure via config file

cat /data/redis/proxy.conf
auth eatme
cluster 127.0.0.1:6381

/data/redis/bin/redis-cluster-proxy -c /data/redis/proxy.conf
[2020-02-13 17:30:24.646] Redis Cluster Proxy v999.999.999 (unstable)
[2020-02-13 17:30:24.646] Commit: (06c0f5ed/0)
[2020-02-13 17:30:24.646] Git Branch: unstable
[2020-02-13 17:30:24.646] Cluster Address: 127.0.0.1:6381
[2020-02-13 17:30:24.646] PID: 2760
[2020-02-13 17:30:24.646] OS: Linux 3.10.0-1062.9.1.el7.x86_64 x86_64
[2020-02-13 17:30:24.646] Bits: 64
[2020-02-13 17:30:24.646] Log level: info
[2020-02-13 17:30:24.646] The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
[2020-02-13 17:30:24.646] Listening on *:7777
[2020-02-13 17:30:24.647] Starting 8 threads...
[2020-02-13 17:30:24.647] Fetching cluster configuration...
Failed to authenticate to node 127.0.0.1:6381: ERR invalid password
Failed to retrieve cluster configuration.
Cluster node 127.0.0.1:6381 replied with error:
NOAUTH Authentication required.
[2020-02-13 17:30:24.648] ERROR: Failed to fetch cluster configuration!
[2020-02-13 17:30:24.648] FATAL: failed to create thread 0.

System is RHEL7 with devtoolset-9 enabled (for GCC > 4.9).

Please let me know if I can provide further information or if I've missed something idiotic!

Proxy should not intercept commands during MULTI / EXEC

Examples include PING and PROXY *:

expected (how redis-server behaves):

MULTI
OK
PING
QUEUED
EXEC
1) PONG

actual:

MULTI
OK
PING
PONG
EXEC
(empty list or set)

similarly, this seems wrong:

MULTI
OK
PROXY MULTIPLEXING STATUS
off
EXEC
(empty list or set)

I would expect:

MULTI
OK
PROXY MULTIPLEXING STATUS
QUEUED
EXEC
1) off

Crash in onClusterNodeDisconnection

The version is the release 1.0-beta2.

I will update once I find the way to reproduce it.
Hope the traceback can help.

Crash log:

=== PROXY BUG REPORT START: Cut & paste starting from here ===
[2020-04-27 14:24:46.985/0] Redis Cluster Proxy 0.9.102 crashed by signal: 11
[2020-04-27 14:24:46.985/0] Crashed running the instruction at: 0x415756
[2020-04-27 14:24:46.985/0] Accessing address: (nil)
[2020-04-27 14:24:46.985/0] Handling crash on thread: 0


------ STACK TRACE ------
EIP:
./redis-cluster-proxy(onClusterNodeDisconnection+0x16)[0x415756]

Backtrace:
./redis-cluster-proxy(logStackTrace+0x2d)[0x40d1dd]
./redis-cluster-proxy(sigsegvHandler+0x186)[0x40d7f6]
/lib64/libpthread.so.0(+0xf5d0)[0x7f0fb9bc15d0]
./redis-cluster-proxy(onClusterNodeDisconnection+0x16)[0x415756]
./redis-cluster-proxy[0x40a5ab]
./redis-cluster-proxy[0x40a6a1]
./redis-cluster-proxy(resetCluster+0x3e)[0x40a7ce]
./redis-cluster-proxy(updateCluster+0x1d9)[0x40c4e9]
./redis-cluster-proxy[0x41a1f4]
./redis-cluster-proxy(aeProcessEvents+0x291)[0x408f21]
./redis-cluster-proxy(aeMain+0x2b)[0x40920b]
./redis-cluster-proxy[0x41107c]
/lib64/libpthread.so.0(+0x7dd5)[0x7f0fb9bb9dd5]
/lib64/libc.so.6(clone+0x6d)[0x7f0fb98e302d]


------ INFO OUTPUT ------
# Proxy
proxy_version:0.9.102
proxy_git_sha1:00000000
proxy_git_dirty:0
proxy_git_branch:
os:Linux 3.10.0-957.21.3.el7.x86_64 x86_64
arch_bits:64
multiplexing_api:epoll
gcc_version:7.3.1
process_id:30681
threads:1
tcp_port:7777
uptime_in_seconds:599
uptime_in_days:0
config_file:proxy.conf
acl_user:default

# Memory
used_memory:2177168
used_memory_human:2.08M
total_system_memory:16479350784
total_system_memory_human:15.35G

# Clients
connected_clients:50
max_clients:10000
thread_0_clinets:50

# Cluster
address:
entry_node::0


---- SIZEOF STRUCTS ----
clientRequest: 184
client: 224
redisClusterConnection: 48
clusterNode: 112
redisCluster: 104
list: 48
listNode: 24
rax: 24
raxNode: 4
raxIterator: 480
aeEventLoop: 88
aeFileEvent: 32
aeTimeEvent: 64


------ REGISTERS ------

RAX:00007f0fb4009340 RBX:00007f0fb40aa3c0
RCX:00007f0fb4008b00 RDX:0000000000000045
RDI:00007f0fb40aa3c0 RSI:00007f0fb97e3b90
RBP:0000000000000000 RSP:00007f0fb97e3b30
R8 :00007f0fb4009340 R9 :00007f0fb4008bc0
R10:00000000fffffc00 R11:00007f0fb9971f40
R12:00007f0fb4008c00 R13:7878787878787878
R14:00000000022d4850 R15:00007f0fb40090a0
RIP:0000000000415756 EFL:0000000000010206
CSGSFS:0000000000000033
(00007f0fb97e3b3f) -> 000000000040a7ce
(00007f0fb97e3b3e) -> 00000000022d4850
(00007f0fb97e3b3d) -> 0000000000000000
(00007f0fb97e3b3c) -> 0000000000000045
(00007f0fb97e3b3b) -> 000000000040a6a1
(00007f0fb97e3b3a) -> 00007f0fb4008c00
(00007f0fb97e3b39) -> 0000000000000000
(00007f0fb97e3b38) -> 00000000022d4850
(00007f0fb97e3b37) -> 000000000040a5ab
(00007f0fb97e3b36) -> 0000000000000001
(00007f0fb97e3b35) -> 00007f0fb4008c00
(00007f0fb97e3b34) -> 0000000000000000
(00007f0fb97e3b33) -> 00007f0fb40aa3c0
(00007f0fb97e3b32) -> 0000000000000001
(00007f0fb97e3b31) -> 00007f0fb4019930
(00007f0fb97e3b30) -> 00007f0fb40090a0


------ DUMPING CODE AROUND EIP ------
Symbol: onClusterNodeDisconnection (base: 0x415740)
Module: ./redis-cluster-proxy (base 0x400000)
$ xxd -r -p /tmp/dump.hex /tmp/dump.bin
$ objdump --adjust-vma=0x415740 -D -b binary -m i386:x86-64 /tmp/dump.bin
------
dump of function  (hexdump of 150 bytes):
4155415455534883ec184c8b2f4d85ed0f8490010000498b45004885c00f84830100008bb08400000085f60f8875010000488b4708488b151c4e22004889fd486308488b14ca4885d27445488b7a204885ff743cba03000000e8f232ffff488b450841c7451c000000004885c07529bfe2ba4200ba330c0000be60b74200e8cd82ffffbf01000000e89312ffff0f1f0041c7451c0000
Function at 0x408a90 is aeDeleteFileEvent
Function at 0x40da90 is _proxyAssert


=== PROXY BUG REPORT END. Make sure to include from START to END. ===

Redis Module supported?

Hello,
I love the idea of your proxy, thanks!
Are Redis modules supported? (e.g. RediSearch)

Thoughts on a minimal "info" command

Some clients use the info command for discovery purposes, looking for things like the redis server version (to select strategies such as del vs unlink, or other changes between versions).

What is the chance of a minimal info [section] command, even if it only had something like:

# Server
redis_version:5.0.4
redis_mode:proxy

This would allow such clients to continue to make those decisions appropriately based on the underlying Redis server (not the proxy) version. The reported version could be the minimum over the cluster, or could just be the version from the primary node used in the configuration.

(Side note: is there any reason TIME can't be implemented using the time at the proxy?)

Running multiple instances of the proxy (question)

My thinking is that if one proxy dies, we want the Redis client to be able to talk to another instance, and so we want to have multiple proxies running at the same time (trying to run 4 now).

Is it possible to do load balancing and have multiple redis-cluster-proxy instances running, or is that problematic (I hope it's not!)?

redis-cluster-proxy fails to AUTH when reopening a connection to the cluster

When redis-cluster-proxy (r-c-p for short) is connected to the cluster using authentication, it fails to AUTH to the cluster when the connections get recycled.

Instance one (easy to reproduce): r-c-p uses AUTH, clients do not. Connect r-c-p to a simple 3-instance Redis cluster, all masters. When opening the initial connections to the cluster, r-c-p issues AUTH and everything works. Now restart ("service redis restart" or similar) any of the 3 Redis masters. R-c-p will properly reconnect to the cluster; however, it will not issue AUTH, and all commands coming from r-c-p's clients will be unauthenticated and thus fail if ACLs are in use.

Instance two (tedious to reproduce): r-c-p uses AUTH, clients do not. Connect r-c-p like before. Simply wait as connections will get eventually recycled. When they do, r-c-p will not issue AUTH on the new connections, thus again failing its clients.
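For illustration, a minimal sketch (written against plain hiredis, not the proxy's internal connection code) of the behaviour the reporter expects: every time a connection to a cluster node is opened or reopened, AUTH has to be replayed before anything else.

#include <hiredis/hiredis.h>

static redisContext *connectAndAuth(const char *host, int port, const char *pass) {
    redisContext *ctx = redisConnect(host, port);
    if (ctx == NULL || ctx->err) return NULL;
    redisReply *r = redisCommand(ctx, "AUTH %s", pass);
    if (r == NULL || r->type == REDIS_REPLY_ERROR) {  /* auth rejected: don't use this connection */
        if (r) freeReplyObject(r);
        redisFree(ctx);
        return NULL;
    }
    freeReplyObject(r);
    return ctx;  /* every reconnect path should funnel through a helper like this */
}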

A similar but far worse bug exists when clients use AUTH themselves. I will open a separate bug report for that as it is probably going to be much more difficult to fix.

Restarting r-c-p fixes the issue, but it is obvious that idea has no chance to survive in production.

On the network level, the problem is reproduced both when r-c-p receives a FIN packet from the redis server (and then they perform an orderly connection shutdown) and when it receives a RST packet.

Redis-cluster-proxy used in this case is built from git commit ac83840 on Ubuntu-20.04.

(minor / aesthetic) inline help displays oddly spaced text for the --port and --bind options

(venv) redis-cluster-proxy$ redis-cluster-proxy --help
Redis Cluster Proxy v999.999.999 (unstable)
Usage: redis-cluster-proxy [OPTIONS] [cluster_host:cluster_port]
  -c <file>            Configuration file
  -p, --port <port>    Port (default: 7777). Use 0 in order to disable                        TCP connections at all
  --max-clients <n>    Max clients (default: 10000)
  --threads <n>        Thread number (default: 8, max: 500)
  --tcpkeepalive       TCP Keep Alive (default: 300)
  --tcp-backlog        TCP Backlog (default: 511)
  --daemonize          Execute the proxy in background
  --unixsocket <sock_file>     UNIX socket path (empty by default)
  --unixsocketperm <mode>      UNIX socket permissions (default: 0)
  --bind <address>     Bind an interface (can be used multiple times                        to bind multiple interfaces)
  --disable-multiplexing <opt> When should multiplexing disabled
                               (never|auto|always) (default: auto)
  --enable-cross-slot  Enable cross-slot queries (warning: cross-slot
                       queries routed to multiple nodes cannot be atomic).
  -a, --auth <passw>   Authentication password
  --auth-user <name>   Authentication username
  --disable-colors     Disable colorized output
  --log-level <level>  Minimum log level: (default: info)
                       (debug|info|success|warning|error)
  --dump-queries       Dump query args (only for log-level 'debug') 
  --dump-buffer        Dump query buffer (only for log-level 'debug') 
  --dump-queues        Dump request queues (only for log-level 'debug') 
  -h, --help         Print this help

There's a bunch of extra space in the --bind and --port entries compared to the --log-level and --enable-cross-slot options, which are tidier / more compact. This is probably just a simple extra-space formatting issue.

Proxy does not detect failover if old master dies quickly

I managed to get the proxy to fail consistently with this repro case:

  • Send a CLUSTER FAILOVER command to a slave
  • Kill the old master that we fell away from after a successful failover
  • Proxy will continue to hammer the address of the old master because it did not get a MOVED response (which would cause a reconfiguration). This state goes on forever.

I haven't tested the "master fails w/o a successful failover before" case but it seems likely that that'd cause the same behaviour.

A possible solution here might be to trigger a reconfiguration whenever we lose a connection to a master?
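A hedged sketch of that idea; a function named onClusterNodeDisconnection does appear elsewhere on this page, but the fields and the update flag below are hypothetical:

void onClusterNodeDisconnection(clusterNode *node) {
    if (node == NULL) return;
    if (!node->is_replica) {
        /* Lost a master: schedule a configuration refresh (similar in spirit
         * to PROXY CLUSTER UPDATE) instead of waiting for a MOVED reply
         * that may never come. */
        node->cluster->update_required = 1;
    }
}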

ERR Failed to write to cluster

When I shut down one master, the original slave is promoted to be the new master,
but the proxy doesn't notice the change and fails to write to the cluster.

Error from server: ERR Cluster node disconnected: .XX.XX.XX.XX:6379
Error from server: ERR Failed to write to cluster

Question: Support for multiple Redis upstreams?

It appears that a single redis-cluster-proxy instance can only connect to a single Redis upstream (or single cluster).

Is this project open to supporting multiple Redis upstreams in the future?

Add dockerfiles ?

Here's a dockerfile to build this project. I don't know if you'd be interested in using it, or if you're familiar with those things. I find docker very convenient.

I noticed that the Redis source code doesn't have one, so if you want to stick to Redis practices, skipping it probably makes sense; I'm just sharing these bits in case they're helpful. They could go in a 'contrib' folder too, or go nowhere and stay on my branch :)

FROM alpine:3.11 as build

RUN apk add --no-cache gcc musl-dev linux-headers openssl-dev make

RUN addgroup -S app && adduser -S -G app app 
RUN chown -R app:app /opt
RUN chown -R app:app /usr/local

# There is a bug in CMake where we cannot build from the root top folder
# So we build from /opt
COPY --chown=app:app . /opt
WORKDIR /opt

USER app
RUN [ "make", "install" ]

# Runtime stage: copy only the built binary plus a few debugging
# conveniences (strace, python3, redis tools) on top of a clean Alpine image.
FROM alpine:3.11 as runtime

RUN apk add --no-cache libstdc++
RUN apk add --no-cache strace
RUN apk add --no-cache python3
RUN apk add --no-cache redis

RUN addgroup -S app && adduser -S -G app app 
COPY --chown=app:app --from=build /usr/local/bin/redis-cluster-proxy /usr/local/bin/redis-cluster-proxy
RUN chmod +x /usr/local/bin/redis-cluster-proxy
RUN ldd /usr/local/bin/redis-cluster-proxy

# Copy source code for gcc
COPY --chown=app:app --from=build /opt /opt

# Now run in usermode
USER app
WORKDIR /home/app

ENTRYPOINT ["/usr/local/bin/redis-cluster-proxy"]
EXPOSE 7777
CMD ["redis-cluster-proxy"]

A separate makefile I use; it could be simplified and merged into the existing one.

.PHONY: docker

# DOCKER_REPO is expected to be set in the environment (or on the make
# command line); the image tag is taken from the version string in src/version.h.
NAME   := ${DOCKER_REPO}/redis-cluster-proxy
TAG    := $(shell cat src/version.h | cut -d ' ' -f 3 | tr -d \")
IMG    := ${NAME}:${TAG}
LATEST := ${NAME}:latest
PROD   := ${NAME}:production
BUILD  := ${NAME}:build

docker_test:
	docker build -t ${BUILD} .

docker:
	git clean -dfx
	docker build -t ${IMG} .
	docker tag ${IMG} ${BUILD}

docker_push:
	docker tag ${IMG} ${PROD}
	docker push ${PROD}
	oc import-image redis-proxy:production # that thing is to trigger a deploy with openshift.

comparison with envoy

Envoy also supports Redis cluster proxying (currently experimental), which IMHO might be a more elegant solution because of shorter latency, simpler server deployment, support for non-smart clients (such as hiredis), and more observability.

So I'm wondering whether we should use redis-cluster-proxy, switch to Envoy, or use a non-official client like hiredis-vip (our app is written in C).

segfault if "PROXY MULTIPLEXING" is issued without final arg

127.0.0.1:7777> PROXY MULTIPLEXING
Could not connect to Redis at 127.0.0.1:7777: Connection refused

Boom!

=== PROXY BUG REPORT START: Cut & paste starting from here ===
[2020-04-08 15:56:51.866/0] === ASSERTION FAILED ===
[2020-04-08 15:56:51.866/0] ==> proxy.c:969 'p < end' is not true
[2020-04-08 15:56:51.867/0] (forcing SIGSEGV to print the bug report.)
[2020-04-08 15:56:51.869/0] Thread 1 terminated
[2020-04-08 15:56:51.869/0] Thread 2 terminated
[2020-04-08 15:56:51.870/0] Thread 3 terminated
[2020-04-08 15:56:51.870/0] Thread 4 terminated
[2020-04-08 15:56:51.870/0] Thread 5 terminated
[2020-04-08 15:56:51.871/0] Thread 6 terminated
[2020-04-08 15:56:51.872/0] Thread 7 terminated
[2020-04-08 15:56:51.872/0] Redis Cluster Proxy 0.9.102 crashed by signal: 11
[2020-04-08 15:56:51.873/0] Crashed running the instruction at: 0x7f1f51010040
[2020-04-08 15:56:51.873/0] Accessing address: 0xffffffffffffffff
[2020-04-08 15:56:51.873/0] Handling crash on thread: 0
[2020-04-08 15:56:51.874/0] Failed assertion: p < end (proxy.c:969)

------ STACK TRACE ------
EIP:
./redis-cluster-proxy(_proxyAssert+0x70)[0x7f1f51010040]

Backtrace:
./redis-cluster-proxy(logStackTrace+0x44)[0x7f1f5100f5a4]
./redis-cluster-proxy(sigsegvHandler+0x1a0)[0x7f1f5100fd00]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x12890)[0x7f1f50442890]
./redis-cluster-proxy(_proxyAssert+0x70)[0x7f1f51010040]
./redis-cluster-proxy(proxyCommand+0x1972)[0x7f1f5101be52]
./redis-cluster-proxy(processRequest+0x347)[0x7f1f5101e1b7]
./redis-cluster-proxy(readQuery+0x21e)[0x7f1f5101f45e]
./redis-cluster-proxy(aeProcessEvents+0x14f)[0x7f1f5100aadf]
./redis-cluster-proxy(aeMain+0x2b)[0x7f1f5100aeeb]
./redis-cluster-proxy(+0x1395c)[0x7f1f5101395c]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76db)[0x7f1f504376db]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x3f)[0x7f1f5015188f]

------ INFO OUTPUT ------

Proxy

proxy_version:0.9.102
proxy_git_sha1:00000000
proxy_git_dirty:0
proxy_git_branch:
os:Linux 4.4.0-19041-Microsoft x86_64
arch_bits:64
multiplexing_api:epoll
gcc_version:7.3.0
process_id:344
threads:8
tcp_port:7777
uptime_in_seconds:5255
uptime_in_days:0
config_file:
acl_user:default

Memory

used_memory:8333440
used_memory_human:7.95M
total_system_memory:68650504192
total_system_memory_human:63.94G

Clients

connected_clients:1
max_clients:10000
thread_0_clinets:1
thread_1_clinets:0
thread_2_clinets:0
thread_3_clinets:0
thread_4_clinets:0
thread_5_clinets:0
thread_6_clinets:0
thread_7_clinets:0

Cluster

address:
entry_node::0

---- SIZEOF STRUCTS ----
clientRequest: 184
client: 224
redisClusterConnection: 48
clusterNode: 112
redisCluster: 104
list: 48
listNode: 24
rax: 24
raxNode: 4
raxIterator: 480
aeEventLoop: 88
aeFileEvent: 32
aeTimeEvent: 64

------ REGISTERS ------

RAX:0000000000000000 RBX:00000000000003c9
RCX:0000000000000b40 RDX:0000000000000000
RDI:00007f1f5041c760 RSI:00007f1f5041d8c0
RBP:00007f1f5102f518 RSP:00007f1f5001fc90
R8 :00007f1f5041d8c0 R9 :00007f1f50020700
R10:00000000ffffffba R11:0000000000000000
R12:00007f1f5102f9d8 R13:00007f1f4403dc80
R14:0000000000000001 R15:00007f1f4403dc80
RIP:00007f1f51010040 EFL:0000000000010202
CSGSFS:00000053002b0033
(00007f1f5001fc9f) -> 00007fffe5808090
(00007f1f5001fc9e) -> 0000000000000000
(00007f1f5001fc9d) -> 18298df602aa7900
(00007f1f5001fc9c) -> 0000000000050200
(00007f1f5001fc9b) -> 00007f1f5102698f
(00007f1f5001fc9a) -> 0000000000000007
(00007f1f5001fc99) -> 0000000000000005
(00007f1f5001fc98) -> 0000000000000000
(00007f1f5001fc97) -> 0000000000000007
(00007f1f5001fc96) -> 000000764400a7d3
(00007f1f5001fc95) -> 0000000000008006
(00007f1f5001fc94) -> 0000000000000019
(00007f1f5001fc93) -> 00007f1f5101be52
(00007f1f5001fc92) -> 00007f1f5102fadb
(00007f1f5001fc91) -> 00007f1f44071e47
(00007f1f5001fc90) -> 00007f1f4403d0f1

------ DUMPING CODE AROUND EIP ------
Symbol: _proxyAssert (base: 0x7f1f5100ffd0)
Module: ./redis-cluster-proxy (base 0x7f1f51000000)
$ xxd -r -p /tmp/dump.hex /tmp/dump.bin
$ objdump --adjust-vma=0x7f1f5100ffd0 -D -b binary -m i386:x86-64 /tmp/dump.bin

dump of function (hexdump of 240 bytes):
8b05cace220041544989fc554889f55389d385c07505e895f2ffff488d3581da0100bf0400000031c0e8e21a0000488d3587da01004d89e089d94889eabf0400000031c0e8c71a0000488d3590d70100bf0400000031c04c89256ace220048892d5bce2200891d4dcc2200e8a01a0000c60425ffffffff785b5d415cc30f1f0048b8feffffffffffff7f4154554839c65348bb0000000000000080771b4883fe04bb0400000076100f1f8400000000004801db4839de77f848395f18b80100000074394889fd488d3cdd000000004c8d63ffe80969010048837d1000742a4889453048895d3831c04c89654048c74548
Function at 0x7f1f51011ae0 is proxyLog
Function at 0x7f1f510269b0 is zcalloc

=== PROXY BUG REPORT END. Make sure to include from START to END. ===
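
Since the failed assertion is in proxyCommand, this looks like a missing arity check before the MULTIPLEXING value is read. A rough sketch of the kind of guard I mean (the types are minimal stand-ins and the request fields and reply helper are guesses at the proxy internals, not the real code):

/* Stand-in declarations so the sketch is self-contained; the real
 * clientRequest/client structures and reply helpers in the proxy differ. */
typedef struct client client;
typedef struct clientRequest {
    client *c;                 /* owning client */
    unsigned long long id;     /* request id */
    int argc;                  /* number of parsed arguments */
    char **argv;               /* parsed arguments */
} clientRequest;

static void addReplyError(client *c, const char *err, unsigned long long req_id) {
    (void)c; (void)err; (void)req_id; /* stub for the sketch */
}

static int handleProxyMultiplexing(clientRequest *req) {
    /* Guard: "PROXY MULTIPLEXING" needs a third argument; without it,
     * reply with an error instead of reading past the parsed arguments
     * and tripping the 'p < end' assertion. */
    if (req->argc < 3) {
        addReplyError(req->c,
                      "wrong number of arguments for 'PROXY MULTIPLEXING'",
                      req->id);
        return 0;
    }
    /* ... handle req->argv[2] as before ... */
    return 1;
}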
