grokzen / redis-py-cluster

Python cluster client for the official redis cluster. Redis 3.0+.

Home Page: https://redis-py-cluster.readthedocs.io/

License: MIT License

Makefile 2.51% Python 87.22% Ruby 10.27%
redis redis-cluster redis-cluster-client redis-client python python3

redis-py-cluster's Introduction

redis-py-cluster EOL

In the upstream package redis-py, which this library extends, the code from this code base was ported into the main branch as of version 4.1.0 (Dec 26, 2021). That release ends the need for this package: if you are using any version after that release, cluster support is available there natively. If you are upgrading your redis-py version, you should plan time to migrate from this package to theirs. The move to the first released version should be seamless, with very few and small changes required. This means that the 2.1.x release is the very last major release of this package. There may still be a small support release if one is needed to sort out some critical issue here, but this is not expected, as development time spent on this package in the last few years has been very low. This repo will not be put into GitHub's actual archive mode, but it should be considered to be in an archived state.

I want to give a few big thanks to some of the people who have provided many contributions, work, time, and effort to make this project into what it is today. First, one of the main contributors, 72squared, and his team, who helped build many of the core features, tried out new and untested code, and provided many optimizations. The team over at AWS, for putting in the time, effort, and skill to port this over to redis-py. The team at RedisLabs, for all of their support and time in creating a fantastic Redis community over the last few years. Antirez, for making the reference client that this repo was written and based on, and for making one of my favorite databases in the ecosystem. And last, the entire community, for all of the contributions to and use of this repo.

redis-py-cluster

This library provides a Python client for Redis Cluster, the clustering support added in Redis 3.0.

This project is a port of redis-rb-cluster by antirez, with a lot of added functionality. The original source can be found at https://github.com/antirez/redis-rb-cluster


The master branch will always contain the latest unstable/development code that has been merged from pull requests. Use the latest commit from the master branch at your own risk; there are no guarantees of compatibility or stability for non-tagged commits on the master branch. Only tagged releases on the master branch are considered stable for use.

Python 2 Compatibility Note

This library follows the change announced by our upstream package, redis-py, and will follow the same Python 2.7 deprecation timeline as stated there.

redis-py-cluster 2.1.x will be the last major version release that supports Python 2.7. The 2.1.x line will continue to get bug fixes and security patches that support Python 2 until August 1, 2020. redis-py-cluster 3.0.x will be the next major version and will require Python 3.5+.

Documentation

All documentation can be found at https://redis-py-cluster.readthedocs.io/en/master

This README contains a reduced version of the full documentation.

Upgrading instructions between each released version can be found here

Changelog for next release and all older releases can be found here

Installation

Latest stable release from pypi

$ pip install redis-py-cluster

This major version of redis-py-cluster supports redis-py >=3.0.0, <4.0.0.

Usage example

A small sample script that shows how to get started with RedisCluster. It can also be found in examples/basic.py

>>> from rediscluster import RedisCluster

>>> # Requires at least one node for cluster discovery. Multiple nodes are recommended.
>>> startup_nodes = [{"host": "127.0.0.1", "port": "7000"}, {"host": "127.0.0.1", "port": "7001"}]
>>> rc = RedisCluster(startup_nodes=startup_nodes, decode_responses=True)

>>> # Or you can use the simpler format of providing one node, the same way as with a Redis() instance
>>> rc = RedisCluster(host="127.0.0.1", port=7000, decode_responses=True)

>>> rc.set("foo", "bar")
True
>>> print(rc.get("foo"))
'bar'

License & Authors

Copyright (c) 2013-2021 Johan Andersson

MIT (See docs/License.txt file)

The license should be the same as redis-py (https://github.com/andymccurdy/redis-py)

redis-py-cluster's People

Contributors

72squared, akrylysov, alan-yilun-li, alisaifee, angusp, artiom, astrohsy, awestendorf, dan-blanchard, davidjfelix, diogodafiti, dkent, dobrite, eshyong, etng, evanpurkhiser, ewdurbin, grokzen, imnotjames, jeffwidman, jkklee, khersey, klaussfreire, mattrobenolt, mc3ander, monklof, mumumu, pcmanticore, svrana, vascovisser


redis-py-cluster's Issues

Drop support for python 3.2

In release 1.2.0 (the next release after the one currently in unstable, which will be 1.1.0), Python 3.2 support will be dropped.

The reason for this is that Python 3 got really good at Python 3.3, and Python 3.2 blocks and causes a lot of problems when trying to support py2 & py3 at the same time. One good example is the unicode literal syntax u"string", which was reintroduced in Python 3.3.

Implement Redis() protocol

The current client is based on StrictRedis and its protocol.

In this task, the other client class, Redis, should be implemented in this cluster lib with the same protocol setup/support as the normal Redis client provides.

Hash key containing ':' throws error

If a hash key contains the character ':' then the following error is thrown under Python 2.7.9:

Traceback (most recent call last):
  File "redis-cluster-bug.py", line 21, in <module>
    rc.hmset(key, value_dict)
  File "/Users/sebastian/.virtualenvs/sea/lib/python2.7/site-packages/redis/client.py", line 1872, in hmset
    return self.execute_command('HMSET', name, *items)
  File "/Users/sebastian/.virtualenvs/sea/lib/python2.7/site-packages/rediscluster/utils.py", line 82, in inner
    return func(*args, **kwargs)
  File "/Users/sebastian/.virtualenvs/sea/lib/python2.7/site-packages/rediscluster/client.py", line 301, in execute_command
    action = self.handle_cluster_command_exception(e)
  File "/Users/sebastian/.virtualenvs/sea/lib/python2.7/site-packages/rediscluster/client.py", line 154, in handle_cluster_command_exception
    raise e
redis.exceptions.ResponseError: WRONGTYPE Operation against a key holding the wrong kind of value

The same call actually works when I use the redis-py library. I also checked by running the call directly in redis-cli, and it works there, too. It seems you split the response somewhere using ':' as a separator or something, but I could not find it in the codebase.

And here is sample code which shows the issue (one key works, one does not):

#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
shows a bug in redis-py-cluster which happens
if a hash name contains the ':' character
"""
from rediscluster import RedisCluster

cluster_nodes = [{"host": "127.0.0.1", "port": "30001"}]

rc = RedisCluster(startup_nodes=cluster_nodes, decode_responses=True)

value_dict = {'name': "Peter Meyer", "email": "[email protected]", "age": 35}

for i in xrange(1000):
    key = "test:foo:%d" % i  # breaks the hmset call
    # key = "testfoo%d" % i  # works

    rc.hmset(key, value_dict)

    resp = rc.hgetall(key)
    print "%s = %s " % (key, resp)

Is it easy to fix?

Thanks for bringing Python 3 Cluster support to Python!

Best regards,

Sebastian
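For context, the ':' character has no special meaning in Redis Cluster key hashing; only hash tags in braces do. The slot mapping can be sketched in plain Python following the CRC16 (XMODEM) algorithm described in the Redis Cluster specification. The helper names `crc16` and `keyslot` below are illustrative, not part of this library's API:

```python
def crc16(data):
    """CRC16-CCITT (XMODEM variant), as used by Redis Cluster."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    return crc

def keyslot(key):
    """Map a key to one of the 16384 cluster hash slots.

    Only the content of the first non-empty {...} hash tag is hashed,
    so a ':' in a key name does not affect slot placement at all.
    """
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        if end > start + 1:          # only a non-empty tag counts
            key = key[start + 1:end]
    return crc16(key) % 16384

print(keyslot(b"foo"))                               # 12182, matching CLUSTER KEYSLOT foo
print(keyslot(b"{user}:a") == keyslot(b"{user}:b"))  # True: same hash tag, same slot
```

So a WRONGTYPE error on keys containing ':' would not come from slot hashing; keys sharing a `{tag}` do, however, land on the same slot, which is also the usual way to keep multi-key operations on one node.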

ipython not found StrictRedisCluster

I installed with pip install; that version could not import StrictRedisCluster.

But after a git clone of redis-py-cluster and python setup.py install, the problem is fixed.

Please update redis-py-cluster on PyPI.

pipe.mget(...) calls behave differently from redis-py

Consider the following code sample:

r['a'] = 1
r['b'] = 2
pipe = r.pipeline(transaction=False)
pipe.mget(['a', 'b'])
print pipe.execute()

Here's the result from redis-py:

[['1', '2']]

In contrast, here is what we get with redis-py-cluster:

['1', '2']

This is because the mget command in the pipe calls execute_command in a loop, queueing up the commands.

After studying the problem for a bit, I'm wondering if we should start by simply disabling all multi-key non-atomic calls in the pipeline. The pipeline object in redis-py-cluster already behaves somewhat differently from native redis-py, and if you use the pipeline object you are going for performance, so a little refactoring isn't hard to do. All the multi-command operations can easily be rewritten with no unexpected results. This example will work the same in redis-py and redis-py-cluster:

pipe = r.pipeline(transaction=False)
pipe.get('a')
pipe.get('b')
print pipe.execute()

The example above will return the same result in both redis-py and redis-py-cluster:

['1', '2']

It will also make it much easier to write parallel execution of packed commands when we get to that step. (I've decided to finish writing up unit tests for pipeline before beginning the refactor).

If you agree, I can pretty easily create a patch to disable all the special multi-key commands in redis-py-cluster when used in pipeline.

Tests break if the test cluster have been resharded

When investigating the ASK errors and running all the resharding operations back and forth, it was discovered that some tests are very dependent on the Redis cluster being in a certain state. By this I mean that each slot has to be on the correct node, because the tests assume that when hashing a key it will be put in a certain slot that is on a certain node.

A solution that might be worth investigating is to create a mock client that can be configured so that the real cluster can be ignored and tests will always pass no matter what state the cluster is in.

If you ever get broken tests of this kind, the simplest fix for now is to restart/reset your cluster and reset all slots back to the initial 3-node configuration.

How to use pipeline when all commands are guaranteed to be in 1 slot/server ?

Hi,

I think sharding has to be considered really carefully, which I have done, and my pipeline commands are guaranteed to go to one server/slot. Is there a way I can create a pipeline where I guarantee that all commands will go to one server, with no logic executed that groups the commands by node? I looked through the docs and code but couldn't find anything.

Thank you

ImportError: cannot import name dictkeys on Python 2.7.3

using python 2.7.3
using redis-py-cluster 0.2.0

when trying to do a:
from rediscluster import RedisCluster

or

from rediscluster import StrictRedisCluster

receiving:

from rediscluster import RedisCluster
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/rediscluster-0.5.3-py2.7.egg/rediscluster/__init__.py", line 12, in <module>
    from rediscluster.cluster_client import StrictRedisCluster
  File "/usr/local/lib/python2.7/dist-packages/rediscluster-0.5.3-py2.7.egg/rediscluster/cluster_client.py", line 5, in <module>
    from redis._compat import (
ImportError: cannot import name dictkeys

StrictRedisCluster().info() not returning all nodes

I'm using INFO to get usage stats for each of the nodes in my Redis Cluster, but it seems that .info() always fails to return details for a specific node (a slave node).

Using RedisClusterMgt().info() returns the details of all six nodes in the cluster and redis-cli reports the same data. Running INFO on the node which is always excluded from StrictRedisCluster().info() returns the expected stats: running as a slave in the correct cluster.

The missing node is not included in my startup_nodes (only the 3 masters).

I'm guessing this could be a bug in NodeManager, but I'd appreciate a pointer or two before I dive into trying to debug the problem.... Thanks!

refactor pipeline send_cluster_commands

This method is a huge unwieldy monster. If we get rid of threads it might shrink a bit but even still it'll be nasty. I'd like to take a stab at moving this logic into sub-tasks or possibly separate classes to ease understanding and breaking up the logic into more manageable chunks. However, I don't want to make any sacrifices on performance or break expectations on behavior so it'll need to be heavily tested.

Why is `publish` blocked in pipeline mode?

Hi, I appreciate the work on this great Redis Cluster client library.

I read your doc about the idiosyncrasies of pubsub on Redis Cluster, but I don't understand why publish is blocked on StrictRedisPipeline. Could you provide some insight?

Thanks!

RedisCluster ERROR with authentication

I use Redis 3.0.5 in cluster mode.
Here is my code:

from rediscluster import RedisCluster
startup_nodes = [{"host": "192.168.1.2", "port": 7000}]
r = RedisCluster(startup_nodes=startup_nodes, password='py02', decode_responses=True)
print r.exists("key")

When I run this code, it throws an exception:
rediscluster.exceptions.RedisClusterException: ERROR sending 'cluster slots' command to redis server: {'host': '192.168.1.2', 'port': 7000}

Finally, I traced it to the get_redis_link() method in rediscluster/nodemanager.py.

Why can't I use a password? I hope you can explain.

Thank you very much!

Create more benchmarks

The only benchmark that exists right now is not that good and needs some work.

There should be benchmarks for some basic commands and for pipelines, and they should include the exact same test using redis-py so that a good comparison can be done.

Too many Cluster redirections

When I tested on 127.0.0.1, everything was OK.
But when I tested from a remote machine, I got this error every time.

Traceback (most recent call last):
  File "test_redis.py", line 22, in <module>
    main()
  File "test_redis.py", line 18, in main
    print rediscluster.set('foo', 'a')
  File "/usr/local/lib/python2.7/dist-packages/redis/client.py", line 1055, in set
    return self.execute_command('SET', *pieces)
  File "/usr/local/lib/python2.7/dist-packages/rediscluster/utils.py", line 82, in inner
    return func(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/rediscluster/client.py", line 307, in execute_command
    raise RedisClusterException("Too many Cluster redirections")
rediscluster.exceptions.RedisClusterException: Too many Cluster redirections

my test code:

# -*- coding: utf-8 -*
import os,sys,time,traceback

import redis
from rediscluster import RedisCluster

def main():
    serverip='10.5.79.60'
    startup_nodes=[{"host": serverip,"port": i} for i in xrange(7000, 7006)]
    try:
        rediscluster = RedisCluster(startup_nodes=startup_nodes)
    except Exception, err:
        print 'failed to connect cluster'
        sys.exit(0)

    for i in xrange(1000):
        print rediscluster.set('foo', 'a')

if __name__=='__main__':
    main()

I used pip install, that version could not import "StrictRedisCluster"
from rediscluster import StrictRedisCluster

How to use the benchmark script "simple.py"? Thanks

This doesn't work. I don't know why.

# ./simple.py --host 10.11.80.117 --port 7000 --timeit --pipeline
Usage:
  simple [--host IP] [--port PORT] [--nocluster] [--timeit] [--pipeline] [--resetlastkey] [-h] [--version]
Options:
  --nocluster        If flag is set then StrictRedis will be used instead of cluster lib
  --host IP          Redis server to test against [default: 127.0.0.1]
  --port PORT        Port on redis server [default: 7000]
  --timeit           run a mini benchmark to test performance
  --pipeline         Only usable with --timeit flag. Runs SET/GET inside pipelines.
  --resetlastkey     reset __last__ key
  -h --help          show this help and exit
  -v --version       show version and exit

Object encoding fails with ClusterError: TTL exhausted error

Since version 1.1.0, my test script fails from time to time with a ClusterError: TTL exhausted error, with code like this:

key_with_types = []
for key in ret:
    obj_encoding = self.redis.object(infotype='ENCODING', key=key)
    obj_type = self.redis.type(key)
    key_with_types.append({'type': obj_type, 'encoding': obj_encoding})

This happens with any value of reinitialize_steps. From the library source I can see that whenever I get this error, the execution in client.py always goes through this case:

  1. Send the OBJECT ENCODING command
  2. Catch MovedError
  3. Select redirect_addr
  4. Try again

Actually, redirect_addr will never be used in this while loop. The AskError will never be caught, because the asking variable will never be True in this loop.

The dirty fix (and I do not know if this is correct) is to set asking=True when refresh_table_asap happens.

refactor pipeline threads to experiment with parallel write/read pattern

We might not really need threads in pipelines if we first write to all the sockets for each node, sending the commands and wait to read the responses until all the writes have been sent first. This isn't perfect non-blocking i/o in the sense that you wouldn't be reading partial responses in parallel from the sockets, but in our case it may be good enough and in most cases might be even more performant than running a bunch of parallel threads, especially when the pipeline request maps to many many different nodes. Worth doing a prototype and building some performance benchmark comparisons.
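The write-all-then-read-all pattern can be sketched without threads. This is an illustrative mock, not the library's implementation; `MockConnection` and `execute_two_phase` are made-up names standing in for per-node sockets:

```python
class MockConnection:
    """Stands in for one node's socket: buffers writes, replies on demand."""
    def __init__(self):
        self._pending = []

    def send_command(self, cmd):
        # Phase 1: write only; do not block waiting for the reply.
        self._pending.append(cmd)

    def read_responses(self):
        # Phase 2: drain the replies in the order the commands were written.
        replies, self._pending = ["reply:" + c for c in self._pending], []
        return replies


def execute_two_phase(commands_by_node):
    """First write every command to every node, then read everything back."""
    conns = {node: MockConnection() for node in commands_by_node}
    for node, cmds in commands_by_node.items():          # write phase
        for cmd in cmds:
            conns[node].send_command(cmd)
    return {node: conn.read_responses() for node, conn in conns.items()}  # read phase


result = execute_two_phase({"node1": ["GET a"], "node2": ["GET b", "GET c"]})
print(result)  # {'node1': ['reply:GET a'], 'node2': ['reply:GET b', 'reply:GET c']}
```

With real sockets the write phase would fill each node's send buffer while the kernel handles transmission, so the read phase overlaps the nodes' processing without any thread-per-node overhead.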

subscribe fails with ConnectionError

Subscribe consistently fails with ConnectionError. Example code snippet:

def mycallback(msg):
    pass

rc = RedisCluster(startup_nodes=[{'host': '10.0.0.1', 'port': '7001'}])
pubsub = rc.pubsub()
pubsub.subscribe(**{'mychannel': mycallback})

The problem is that rc.pubsub() returns a PubSub object (from StrictRedis) which is not aware of the cluster node addresses. Thus on pubsub.subscribe() it tries to connect to the default Redis endpoint on localhost:6379, which fails with the following error:

redis.exceptions.ConnectionError: Error 111 connecting to localhost:6379. Connection refused.

RedisCluster Too Many Redirects error.

Hello @Grokzen,

We have an issue dealing with docker-redis-cluster setup using redis-py-cluster library.

We are using the following code to connect to the redis setup and set a key:

startup_nodes = [{"host": "172.22.1.96", "port": "7000"}]
rc = RedisCluster(startup_nodes=startup_nodes, decode_responses=True)
rc.set("a", 3)
print rc.get("a")

When executing this code, it throws an error on (rc.set("a",3)) line of code, the error is as following:

    File "/home/masmar/ws/lpcon/tests/redis_test.py", line 12, in <module>
      rc.set("a", 3)
    File "/usr/local/lib/python2.7/dist-packages/redis/client.py", line 1055, in set
     return self.execute_command('SET', *pieces)
    File "/usr/local/lib/python2.7/dist-packages/rediscluster/rediscluster.py", line 400, in   execute_command
    return self.send_cluster_command(*args, **kwargs)
    File "/usr/local/lib/python2.7/dist-packages/rediscluster/rediscluster.py", line 321, in      send_cluster_command
    raise Exception("To many Cluster redirections?")
   Exception: To many Cluster redirections?

When debugging this, we found that it makes a connection to each node in the cluster sequentially, trying to set the key on each node, but it fails every time (because of the (error) MOVED <slot#> <IP_Address> reply) and eventually returns the error above.

Please note that we are running our python test app from a remote server, but when we use redis-cli from the redis local machine's console to set keys it works perfectly.

Can you please help on this? Are we using the library in a wrong way? Please let me know if you need more details.

Regards,

Refactor Nodemanager and Node handling

It needs some love, and some quirks need to be fixed.

For example the ._node attribute on connection objects should be looked at.

The name key in node object should not be needed.

Possibly make a Node class to use instead of tracking the node via a dict.

Client fails during live resharding

It seems that resharding is handled incorrectly up to v1.0.0. The relevant source code is the execute_command function in client.py.

The problem is the following: if a redis-py-cluster client is querying a slot while it is under migration from node A to node B, then the client will be ping-ponging between A and B until RedisClusterRequestTTL is exhausted and an exception is thrown. (Node A will repeatedly redirect the client to B with an ASK reply, while B will repeatedly redirect to A with a MOVED reply.)

The solution for this situation is the ASKING cluster command, which is completely missing from redis-py-cluster, if I'm right. Some explanations on this mechanism:

http://redis.io/commands/cluster-setslot
http://grokbase.com/p/gg/redis-db/142wrajgdq/cluster-questions

In case of an ASK redirection the client should not blindly issue the command to the new target node, but first send an ASKING command, notifying node B that it was already redirected from the authoritative slot owner, A.

This issue is related to issue #67.
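For illustration, the redirect handling described above can be simulated without a real cluster. This is a hedged sketch, not the library's actual code; `MovedError`, `AskError`, `execute_with_redirects`, and the mock nodes are all hypothetical names. On MOVED the client simply retries the new owner, while on ASK it must send ASKING first so the importing node accepts the one-shot redirect:

```python
class MovedError(Exception):
    """Permanent redirect: the slot now lives on another node."""
    def __init__(self, addr):
        self.addr = addr

class AskError(Exception):
    """One-shot redirect: the slot is being migrated to another node."""
    def __init__(self, addr):
        self.addr = addr

def execute_with_redirects(nodes, addr, command, ttl=16):
    """Follow MOVED/ASK redirects, sending ASKING before an ASK retry."""
    asking = False
    while ttl > 0:
        ttl -= 1
        node = nodes[addr]
        try:
            if asking:
                node("ASKING")   # tell the importing node we were redirected
                asking = False
            return node(command)
        except MovedError as e:
            addr = e.addr        # slot ownership changed; retry the new owner
        except AskError as e:
            addr = e.addr
            asking = True        # must send ASKING before the retried command
    raise RuntimeError("Too many Cluster redirections")

# Mock a slot migrating from node A to node B: A answers every query with
# ASK, and B serves the key only right after it has received ASKING.
class ImportingNode:
    def __init__(self):
        self.asking = False
    def __call__(self, cmd):
        if cmd == "ASKING":
            self.asking = True
            return "OK"
        if self.asking:
            self.asking = False
            return "bar"
        raise MovedError("A")    # without ASKING, B bounces the client back

def migrating_node(cmd):
    raise AskError("B")

nodes = {"A": migrating_node, "B": ImportingNode()}
print(execute_with_redirects(nodes, "A", "GET foo"))  # bar
```

Without the `asking` branch, this mock ping-pongs between A and B until the TTL is exhausted, which is exactly the failure mode the report describes.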

Implement all basic clustermgt commands in client

Currently there is a ClusterMgt class that implements some nice functions for dealing with a cluster. However, the basic commands are not implemented in the client class. All cluster management methods should be implemented in the base client, and the ClusterMgt class can then provide a better interface for dealing with the cluster.

Make the slots cache validation less strict.

@72squared pointed out that the current slots cache validation is too strict and has a high chance of failing if a client tries to build the cache during a resharding operation.

while True:
    time.sleep(0.1)
    r.get('__test{%s}' % uuid.uuid4())
    sys.stdout.write('.')
    sys.stdout.flush()

Traceback (most recent call last):
  File "test_redis_health.py", line 16, in <module>
    r.get('__test{%s}' % uuid.uuid4())
  File "/usr/local/lib/python2.7/dist-packages/redis/client.py", line 863, in get
    return self.execute_command('GET', name)
  File "/usr/local/lib/python2.7/dist-packages/rediscluster/utils.py", line 82, in inner
    return func(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/rediscluster/client.py", line 293, in execute_command
    self.connection_pool.nodes.initialize()
  File "/usr/local/lib/python2.7/dist-packages/rediscluster/nodemanager.py", line 122, in initialize
    raise RedisClusterException("startup_nodes could not agree on a valid slots cache. {} vs {} on slo: {}".format(self.slots[i], master_addr, i))
rediscluster.exceptions.RedisClusterException: startup_nodes could not agree on a valid slots cache. {'host': u'127.0.0.1', 'server_type': 'master', 'port': 7004L, 'name': '127.0.0.1:7004'} vs {'host': u'', 'server_type': 'master', 'port': 7004L, 'name': ':7004'} on slo: 5461

A good implementation to follow is how Jedis does it: https://github.com/xetorthio/jedis/blob/3495402b44dfee4c4625b99d9d81cf63e0b0df96/src/main/java/redis/clients/jedis/JedisClusterCommand.java#L84

Question

Is this project still active? Are you accepting pull requests / additional contributors? Are you intending to re-implement all of the functionality in redis-trib.rb? For example, add_node, remove_node, and reshard?

Improve pipelines

There is a lot of work that needs to be done to make pipelines work at the same level as the normal client.

  • Unblock the basic commands that are currently blocked (mget, mset, etc.)
  • Take another look at NYI commands like immediate_execute_command & _execute_transaction to see if they can be implemented.
  • Split send_cluster_commands into smaller parts that can be reused more easily.

Release 0.2.0

This ticket should list and track all changes that are targeted for the next release.

  • New pubsub implementation #38
  • Major refactoring of RedisCluster class to make it work more like the connection pool works in redis-py
  • Create a better Node manager class
  • New pipeline implementation that should support parallel execution [@72squared]
  • Host/port kwarg for RedisCluster should be usable

Random unicode errors seen on TravisCI

The following errors have been observed on TravisCI when running tests. They are random and do not happen every run.

=================================== FAILURES ===================================
_________________ TestPubSubAutoDecoding.test_pattern_publish __________________
self = <tests.test_pubsub.TestPubSubAutoDecoding object at 0x2b3b24247690>
r = RedisCluster<127.0.0.1:7000,127.0.0.1:7001,127.0.0.1:7002,127.0.0.1:7003,127.0.0.1:7004,127.0.0.1:7005>
    def test_pattern_publish(self, r):
        p = r.pubsub(ignore_subscribe_messages=True)
        p.psubscribe(self.pattern)
        r.publish(self.channel, self.data)
>       assert wait_for_message(p) == self.make_message('pmessage',
                                                        self.channel,
                                                        self.data,
                                                        pattern=self.pattern)
E       assert None == {'channel': 'uniᅨcode', 'data': 'abcᅪ123', 'pattern': 'uniᅨ*', 'type': 'pmessage'}
E        +  where None = wait_for_message(<rediscluster.rediscluster.ClusterPubSub object at 0x2b3b24247cd0>)
E        +  and   {'channel': 'uniᅨcode', 'data': 'abcᅪ123', 'pattern': 'uniᅨ*', 'type': 'pmessage'} = <bound method TestPubSubAutoDecoding.make_message of <tests.test_pubsub.TestPubSubAutoDecoding object at 0x2b3b24247690>>('pmessage', 'uniᅨcode', 'abcᅪ123', pattern='uniᅨ*')
E        +    where <bound method TestPubSubAutoDecoding.make_message of <tests.test_pubsub.TestPubSubAutoDecoding object at 0x2b3b24247690>> = <tests.test_pubsub.TestPubSubAutoDecoding object at 0x2b3b24247690>.make_message
E        +    and   'uniᅨcode' = <tests.test_pubsub.TestPubSubAutoDecoding object at 0x2b3b24247690>.channel
E        +    and   'abcᅪ123' = <tests.test_pubsub.TestPubSubAutoDecoding object at 0x2b3b24247690>.data
E        +    and   'uniᅨ*' = <tests.test_pubsub.TestPubSubAutoDecoding object at 0x2b3b24247690>.pattern
tests/test_pubsub.py:350: AssertionError
----------------------------- Captured stdout call -----------------------------
StrictRedis<ConnectionPool<Connection<host=127.0.0.1,port=7003,db=0>>>
_____________ TestPubSubAutoDecoding.test_channel_message_handler ______________
self = <tests.test_pubsub.TestPubSubAutoDecoding object at 0x2b3b24256b10>
r = RedisCluster<127.0.0.1:7000,127.0.0.1:7001,127.0.0.1:7002,127.0.0.1:7003,127.0.0.1:7004,127.0.0.1:7005>
    def test_channel_message_handler(self, r):
        p = r.pubsub(ignore_subscribe_messages=True)
        p.subscribe(**{self.channel: self.message_handler})
        r.publish(self.channel, self.data)
        assert wait_for_message(p) is None
>       assert self.message == self.make_message('message', self.channel,
                                                 self.data)
E       assert None == {'channel': 'uniᅨcode', 'data': 'abcᅪ123', 'pattern': None, 'type': 'message'}
E        +  where None = <tests.test_pubsub.TestPubSubAutoDecoding object at 0x2b3b24256b10>.message
E        +  and   {'channel': 'uniᅨcode', 'data': 'abcᅪ123', 'pattern': None, 'type': 'message'} = <bound method TestPubSubAutoDecoding.make_message of <tests.test_pubsub.TestPubSubAutoDecoding object at 0x2b3b24256b10>>('message', 'uniᅨcode', 'abcᅪ123')
E        +    where <bound method TestPubSubAutoDecoding.make_message of <tests.test_pubsub.TestPubSubAutoDecoding object at 0x2b3b24256b10>> = <tests.test_pubsub.TestPubSubAutoDecoding object at 0x2b3b24256b10>.make_message
E        +    and   'uniᅨcode' = <tests.test_pubsub.TestPubSubAutoDecoding object at 0x2b3b24256b10>.channel
E        +    and   'abcᅪ123' = <tests.test_pubsub.TestPubSubAutoDecoding object at 0x2b3b24256b10>.data
tests/test_pubsub.py:360: AssertionError
----------------------------- Captured stdout call -----------------------------
StrictRedis<ConnectionPool<Connection<host=127.0.0.1,port=7004,db=0>>>
========= 2 failed, 217 passed, 17 xfailed, 2 xpassed in 20.64 seconds =========
ERROR: InvocationError: '/home/travis/build/Grokzen/redis-py-cluster/.tox/hi27/bin/python /home/travis/build/Grokzen/redis-py-cluster/.tox/hi27/bin/coverage run --source rediscluster -p -m py.test'
hi32 create: /home/travis/build/Grokzen/redis-py-cluster/.tox/hi32


=================================== FAILURES ===================================
_____________ TestPubSubAutoDecoding.test_pattern_message_handler ______________
self = <tests.test_pubsub.TestPubSubAutoDecoding object at 0x2ba8551d27b8>
r = RedisCluster<127.0.0.1:7000,127.0.0.1:7001,127.0.0.1:7002,127.0.0.1:7003,127.0.0.1:7004,127.0.0.1:7005>
    def test_pattern_message_handler(self, r):
        p = r.pubsub(ignore_subscribe_messages=True)
        p.psubscribe(**{self.pattern: self.message_handler})
        r.publish(self.channel, self.data)
        assert wait_for_message(p) is None
>       assert self.message == self.make_message('pmessage', self.channel,
                                                 self.data,
                                                 pattern=self.pattern)
E       assert None == {'channel': 'uniᅨcode', 'data': 'abcᅪ123', 'pattern': 'uniᅨ*', 'type': 'pmessage'}
E        +  where None = <tests.test_pubsub.TestPubSubAutoDecoding object at 0x2ba8551d27b8>.message
E        +  and   {'channel': 'uniᅨcode', 'data': 'abcᅪ123', 'pattern': 'uniᅨ*', 'type': 'pmessage'} = <bound method TestPubSubAutoDecoding.make_message of <tests.test_pubsub.TestPubSubAutoDecoding object at 0x2ba8551d27b8>>('pmessage', 'uniᅨcode', 'abcᅪ123', pattern='uniᅨ*')
E        +    where <bound method TestPubSubAutoDecoding.make_message of <tests.test_pubsub.TestPubSubAutoDecoding object at 0x2ba8551d27b8>> = <tests.test_pubsub.TestPubSubAutoDecoding object at 0x2ba8551d27b8>.make_message
E        +    and   'uniᅨcode' = <tests.test_pubsub.TestPubSubAutoDecoding object at 0x2ba8551d27b8>.channel
E        +    and   'abcᅪ123' = <tests.test_pubsub.TestPubSubAutoDecoding object at 0x2ba8551d27b8>.data
E        +    and   'uniᅨ*' = <tests.test_pubsub.TestPubSubAutoDecoding object at 0x2ba8551d27b8>.pattern
tests/test_pubsub.py:377: AssertionError
----------------------------- Captured stdout call -----------------------------
StrictRedis<ConnectionPool<Connection<host=127.0.0.1,port=7005,db=0>>>
========= 1 failed, 218 passed, 16 xfailed, 3 xpassed in 23.65 seconds =========
ERROR: InvocationError: '/home/travis/build/Grokzen/redis-py-cluster/.tox/py34/bin/python /home/travis/build/Grokzen/redis-py-cluster/.tox/py34/bin/coverage run --source rediscluster -p -m py.test'

scan_iter returns duplicates

I am having 1M+ entries in Redis-Cluster and using following command:

for key in redis_conn.scan_iter(match='XYZ*', count=1000):

This is returning duplicates.
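Worth noting: SCAN (and therefore scan_iter) only guarantees that each key present for the whole scan is returned at least once, so duplicates can occur by design, especially while the server's hash table is being rehashed. A client-side set is the usual workaround; here is a minimal sketch, where the `keys` list simulates scan_iter output containing a duplicate:

```python
def dedupe(iterable):
    """Yield each item only the first time it appears."""
    seen = set()
    for item in iterable:
        if item not in seen:
            seen.add(item)
            yield item

# Simulated scan_iter output containing a duplicate, as can happen when
# the server's hash table is rehashed mid-scan.
keys = ['XYZ:1', 'XYZ:2', 'XYZ:1', 'XYZ:3']
unique = list(dedupe(keys))
```

The trade-off is memory: the `seen` set grows with the number of distinct matched keys, which matters with 1M+ entries.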

After Publish connections are not returned to the pool

When I use the pub/sub mechanism in Redis, after a message gets published on a channel the connections tend to pile up on one particular node in the cluster, eventually giving me this error:
Error 24 connecting to 127.0.0.1:7006. Too many open files.
My cluster has 3 masters and 3 slaves. I'm attaching a small example script I used to test this behavior [example.py]. I checked the clients with "redis-cli -c -p 7006 client list | wc -l".
Another peculiar thing I noticed: when I run
for f in $(seq 1 1000000); do echo $f; redis-cli -c -p 7006 publish channel $f; done
the connections are returned and no connection hogging is seen (example.txt).

`StrictClusterPipeline` `raise_on_error` does not raise anything besides `ResponseError`.

With regard to v1.1.0, I noticed some strange behavior while working on a client reconnection scheme. Whereas I fully expected any ConnectionError to be raised from a pipeline execution, I instead quietly received an array containing exactly as many ConnectionError instances as pipeline commands, with no exception raised. There is a flag, raise_on_error, defaulting to True in StrictClusterPipeline.execute(); it is passed along to .send_cluster_commands() (which also defaults the flag to True), and is finally used to determine whether or not to raise an exception before returning.

An exception is actually raised by .raise_first_error(), which has the form:

def raise_first_error(self, commands, response):
    for i, r in enumerate(response):
        if isinstance(r, ResponseError):
            self.annotate_exception(r, i + 1, commands[i][0])
            raise r

It seems to me that an exception ultimately needs to be raised here if any of the response elements are instances of an appropriate child of Exception, whether the response element is a ResponseError or not. For example, redis.ConnectionError is not raised.
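A minimal sketch of the broadened check suggested here, with `FakeConnectionError` standing in for redis.exceptions.ConnectionError and the annotate_exception() call for ResponseError instances elided (this is not the library's actual code):

```python
class FakeConnectionError(Exception):
    """Stand-in for redis.exceptions.ConnectionError."""

def raise_first_error(commands, response):
    """Raise the first response element that is any Exception instance.

    Broadened from the isinstance(r, ResponseError) check so that
    ConnectionError and other exception types propagate too; the real
    method would also keep annotate_exception() for ResponseError.
    """
    for i, r in enumerate(response):
        if isinstance(r, Exception):
            raise r

# With the broadened check, a ConnectionError in the response array is
# raised instead of being silently returned to the caller.
try:
    raise_first_error([('GET', 'key')], [FakeConnectionError('pool down')])
    caught = None
except FakeConnectionError as exc:
    caught = exc
```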

Thanks!

Remove redis-trib.rb

When redis-trib.rb inside the redis repo has support for password-protected connections, there will be no need to keep the patched version inside this repo.

inaccurate example info in readme.md

As of redis-py-cluster 0.2.0, there is no StrictRedisCluster class; it is RedisCluster instead. So the Usage example section in readme.md should be updated.

I think we should also state the version of redis-py-cluster the example is illustrated with.

>>> from rediscluster import StrictRedisCluster
>>> startup_nodes = [{"host": "127.0.0.1", "port": "7000"}]
>>> rc = StrictRedisCluster(startup_nodes=startup_nodes, decode_responses=True)
>>> rc.set("foo", "bar")
True
>>> rc.get("foo")
'bar'

updated to

>>> from rediscluster import RedisCluster
>>> startup_nodes = [{"host": "127.0.0.1", "port": "7000"}]
>>> rc = RedisCluster(startup_nodes=startup_nodes, decode_responses=True)
>>> rc.set("foo", "bar")
True
>>> rc.get("foo")
'bar'

Implement thread based parallel pipeline execution

We currently have a gevent implementation of a parallel pipeline runner, but it has some limitations. The biggest one is that it only works on Python 2.x.

This issue should result in a PR that implements a more general way to do parallel pipeline execution, following these requirements.

  • MUST work on 2.7 and 3.2-3.4; 3.1 is optional and not required.
  • Probably thread based
  • Should live in its own function
  • Control of which parallel runner to use should be specified in this order:
    • kwarg to RedisCluster() that should be used as the default
    • kwarg to RedisCluster.pipeline() that, if set, should override the above default
    • kwarg to StrictClusterPipeline()
    • Not sure about this one, but maybe as an optional kwarg to the StrictClusterPipeline.execute_... methods
  • It should validate that the chosen pipeline method is available in the current running environment. That means it should fail with a raised Exception if, for example, gevent is used on Python 3.x.
  • Extract the gevent code into its own function and use the same interface as this new implementation.
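A thread-based runner along these lines could be sketched with concurrent.futures (stdlib on 3.2+, available on 2.7 via the `futures` backport). `send_batch` is a hypothetical callable standing in for whatever executes one packed batch against one node:

```python
from concurrent.futures import ThreadPoolExecutor

def run_batches_threaded(node_batches, send_batch, max_workers=8):
    """Run one packed command batch per node in parallel threads.

    node_batches maps a node name to its list of commands; send_batch is
    a hypothetical callable that executes one batch against one node and
    returns its responses.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {node: pool.submit(send_batch, node, cmds)
                   for node, cmds in node_batches.items()}
        # .result() re-raises any exception from the worker thread.
        return {node: f.result() for node, f in futures.items()}

# Toy send_batch that just echoes commands upper-cased, standing in for
# a real per-node pipeline execution.
result = run_batches_threaded(
    {'node-a': ['get a'], 'node-b': ['get b']},
    lambda node, cmds: [c.upper() for c in cmds],
)
```

Because `.result()` re-raises worker exceptions, this shape also plays well with the raise_on_error semantics discussed elsewhere in this tracker.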

Library doesn't handle non-contiguous key sets

When a redis cluster node is configured to handle non-contiguous key sets, the slot blocks in the answer contain entries the library does not expect, like the following:
[2394-<-a34b3670....], [2395-<-a34b3670....], [2396-<-a34b3670....], ....
At line 121 the code does range_.split("-"), expecting a string like u'0-5460', but that is not what it gets.

An exception is thrown:
ValueError: too many values to unpack
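For illustration, a tolerant parser for one slots field would need to handle at least three shapes: a plain range, a single slot, and a bracketed migrating/importing marker like the ones above. A sketch (not the library's actual parsing code; the `None` return for migration markers is an assumption about how a caller might want to skip them):

```python
def parse_slot_field(field):
    """Parse one slots field from cluster node output.

    Handles a plain range like '0-5460', a single slot like '2394', and
    a migrating/importing marker like '[2394-<-a34b3670...]' (returned
    as None so the caller can skip or handle it separately).
    """
    if field.startswith('['):
        return None  # slot being migrated or imported
    parts = field.split('-')
    if len(parts) == 1:
        slot = int(parts[0])
        return (slot, slot)
    start, end = parts  # exactly two parts for a normal range
    return (int(start), int(end))
```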

Parallel execution of packed pipelined commands against multiple nodes

I am just starting to get a better understanding of the code, but so far it seems this client's implementation of pipelining doesn't pack the commands for a node into a single request the way redis-py does here:

https://github.com/andymccurdy/redis-py/blob/master/redis/client.py#L2445-L2447

It seems like you should be able to pack all the commands and send them to a specific node, as long as each queued command routes to a key, or, in the case of an MGET, is rewritten into batches of MGET calls that each route to the correct node. In addition, instead of a simple for loop over the commands, it seems like you could use a thread pool to execute the batched commands against each node in parallel, or at least up to a certain threshold of parallel execution.

I realize this significantly increases the complexity of the code but it seems like it could greatly improve performance of the redis-py-cluster library, bringing it back up to par with redis-py.

If this seems like a reasonable direction to go, I could look into taking a first stab at implementing it as a proof of concept.
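The grouping step described above can be sketched independently of the transport. `node_for_key` is a hypothetical callable standing in for keyslot() plus the slot-to-node table:

```python
def pack_commands_by_node(commands, node_for_key):
    """Group queued pipeline commands by the node their key routes to.

    commands is a list of (command_name, key) tuples; node_for_key is a
    hypothetical callable standing in for keyslot() plus the slot table.
    Each resulting per-node list can then be sent as one packed request,
    mirroring what redis-py does for a single server.
    """
    batches = {}
    for cmd in commands:
        batches.setdefault(node_for_key(cmd[1]), []).append(cmd)
    return batches

# Toy routing: even-length keys to node-a, odd-length keys to node-b.
batches = pack_commands_by_node(
    [('GET', 'ab'), ('GET', 'abc'), ('SET', 'cd')],
    lambda key: 'node-a' if len(key) % 2 == 0 else 'node-b',
)
```

Note that command order is preserved within each node's batch, but cross-node ordering is lost, which is one of the semantic trade-offs this proposal would need to document.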

socket_connect_timeout not respected when init_slot_cache=True (default)

Using :

  • redis 2.10.3
  • redis-py-cluster 1.1.0

I'm opening a connection to my Redis cluster using :
rediscluster.StrictRedisCluster(startup_nodes=mynodes, decode_responses=True, socket_connect_timeout=0.1)

When reaching ClusterConnectionPool.__init__(), since init_slot_cache is True by default, a connection is attempted to every node in the cluster, bypassing the socket_connect_timeout or socket_timeout passed through connection_kwargs.

I ran into this bug while attempting to test whether a connection to Redis could be set up, in a "status" API call of my application. When none of the servers defined in my mynodes configuration could be reached, it took quite a long time to report that no connection could be established.

Moreover, when using init_slot_cache=False, there's another bug in ClusterConnectionPool.get_master_node_by_slot(): when no server can be reached, self.nodes.slots is an empty list.

Fix pubsub & tests

Currently all pubsub tests are disabled because the current pubsub implementation is broken.

The two major reasons it is broken is:

  • The PUBLISH command not returning the correct number of clients a message was read by
  • Clients are not disconnected properly, because commands are sent to a random server, which causes lingering connections during tests.

PLEASE NOTE that pubsub is still usable but some things might not behave as they should.

This ticket should do the following

  • Create a better implementation where all clients talking to a cluster use the same redis node, to fix the problem with non-disconnected clients and the inconsistency with the PUBLISH command.
  • Fix the tests to make them pass w/o any major modifications.

Connection Pool Bug

I have a six node cluster (3 masters, 3 slaves), all of them configured with maxclients 100. After publishing approximately 100 messages, the client application dies:

File "/usr/local/lib/python3.4/dist-packages/redis/client.py", line 1888, in publish
  return self.execute_command('PUBLISH', channel, message)
File "/usr/local/lib/python3.4/dist-packages/rediscluster/utils.py", line 93, in inner
  return func(*args, **kwargs)
File "/usr/local/lib/python3.4/dist-packages/rediscluster/client.py", line 230, in execute_command
  return self._execute_command_on_nodes(self.nodes_callbacks[command](self, command), *args, **kwargs)
File "/usr/local/lib/python3.4/dist-packages/rediscluster/client.py", line 317, in _execute_command_on_nodes
  res[node["name"]] = self.parse_response(connection, command, **kwargs)
File "/usr/local/lib/python3.4/dist-packages/redis/client.py", line 577, in parse_response
  response = connection.read_response()
File "/usr/local/lib/python3.4/dist-packages/redis/connection.py", line 574, in read_response
  raise response
redis.exceptions.ResponseError: max number of clients reached

This looks like a connection pool bug:

$ netstat --tcp -a -n | grep TIME_WAIT | wc -l
100

This can be reproduced with this sample code or with this project.

Python 2.7.6 or Python 3.4.3, redis 2.10.3, rediscluster 1.1.0

The use of connections

Currently the cluster client maintains a batch of Redis.Connections in a Redis.ConnectionPool.

Is there a reason not to use the Redis object directly? It maintains a pool of connections itself.

With the current design, there is actually a bug when a connection is gone due to network issues: the connection cannot self-repair in such circumstances. I have a fix in my branch, which is to catch the Timeout error and reset the connection pool. It's a dirty fix, but it works.

A better way to maintain the connections is probably something like: connections = {node: Redis(host, port)}.
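The suggested shape could look roughly like this. `client_factory` stands in for redis.StrictRedis; each per-node client would then own its connection pool and recover broken connections itself. A sketch only, using a dummy factory so it runs standalone:

```python
class NodeClients(object):
    """One client object per node, created lazily and reused.

    client_factory stands in for redis.StrictRedis; each client manages
    its own connection pool, so broken connections are repaired by the
    client rather than by cluster-level pool bookkeeping.
    """
    def __init__(self, client_factory):
        self._factory = client_factory
        self._clients = {}

    def get(self, host, port):
        key = (host, port)
        if key not in self._clients:
            self._clients[key] = self._factory(host=host, port=port)
        return self._clients[key]

# Dummy factory that records how often it is called.
created = []
def factory(host, port):
    created.append((host, port))
    return object()

clients = NodeClients(factory)
a = clients.get('127.0.0.1', 7000)
b = clients.get('127.0.0.1', 7000)  # reuses the same client object
```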

Write porting doc

Write documentation on how to best port existing code to use this cluster lib
