Coder Social home page Coder Social logo

Comments (13)

Grokzen avatar Grokzen commented on May 25, 2024

@milyenpabo Have you looked at the unstable branch and the code in there if the logic is the same?

from redis-py-cluster.

milyenpabo avatar milyenpabo commented on May 25, 2024

@Grokzen As far as I can tell from the source, it has the same issue. I also ran a search in the repo for the ASKING command, but did not find anything relevant. Strangely, exceptions.py on the unstable branch has a new AskError class that describes the correct mechanism in a comment, but I found no other clues.

Do I miss something?

from redis-py-cluster.

Grokzen avatar Grokzen commented on May 25, 2024

@milyenpabo I think the client can handle ASK errors now after i rewrote the code to use a better parser https://github.com/Grokzen/redis-py-cluster/blob/unstable/rediscluster/client.py#L260 but it could be that ASKING command is not sent when that happens.

from redis-py-cluster.

milyenpabo avatar milyenpabo commented on May 25, 2024

@Grokzen That's right, the exception is caught but ASKING is not sent, so in effect the ping-pong issue is still there. The difference in the unstable version is that it throws a different type of exception than in v1.0.0:

Traceback (most recent call last):
  File "redis-writer.py", line 33, in <module>
    r.set(d, d)
  File "build/bdist.linux-x86_64/egg/redis/client.py", line 1055, in set
  File "build/bdist.linux-x86_64/egg/rediscluster/utils.py", line 87, in inner
  File "build/bdist.linux-x86_64/egg/rediscluster/client.py", line 266, in execute_command
rediscluster.exceptions.ClusterError: TTL exhausted.

from redis-py-cluster.

Grokzen avatar Grokzen commented on May 25, 2024

@milyenpabo I will dig some during the weekend and if all goes well i might have a patch that adds this feature if it needs to be there.

The only thing is that i have never seen commands or client that breaks when it is running during a live reshard. Both me and @72squared have very recently had a test script (with threads) that performs operations during a reshard and after our fixes (thread related) we have not yet seen any problems.

Have you seen a acctual problem/exception somehow when you have been running a reshard that we can use to reproduce the problem, or this more a "it should comply to the docs" issue?

from redis-py-cluster.

Grokzen avatar Grokzen commented on May 25, 2024

About the change of error, see https://github.com/Grokzen/redis-py-cluster/blob/unstable/docs/Upgrading.md#100----next-release where it states that the error was changed due to the old one was not so descriptive.

from redis-py-cluster.

Grokzen avatar Grokzen commented on May 25, 2024

Can you share the test script you have been running in a gist that i can access? Also a short description about the redis-cluster you are running, number of nodes, locally/cloud/docker? How much data do you have in the server and how many nodes are you moving around? With that i might be able to reproduce the error.

from redis-py-cluster.

milyenpabo avatar milyenpabo commented on May 25, 2024

@Grokzen I think the current implementation does not comply with the redis cluster specification. But anyhow, I can provide a test case. I have a redis-writer.py script that writes keys to a redis cluster (v3.0.4) with 3 masters and 1 replica per master. (See gist.)

from rediscluster import RedisCluster

startup_nodes = [{"host": "127.0.0.1", "port": "27002"}]
r = RedisCluster(startup_nodes=startup_nodes, max_connections=32, decode_responses=True)

N = 1000000
p = 0
pdiff = 1

print "Writing %d keys to redis" % N

for i in xrange(N):
    d = str(i)
    r.set(d, d)
    progress = 100.0*i/N
    if progress >= p:
        print "%.0f%%" % progress,
        p += pdiff
if progress < 100:
    print "100%"

First I run the script to populate redis:

python -u redis-writer.py

Then I re-run the script while resharding redis with the following command:

./redis-trib.rb reshard --from 412072cda05b27485348f3b83a5b5b09ba01b1ce --to 2f51725948ca46649048a05f4c4ac0bfa605bad5 --slots 2000 --yes 127.0.0.1:27002

If the number of slots to migrate (in my case 2000) and the number of keys/slot (in my case ~60) is large enough, the script will fail with a probability close to 100%. The above mentioned exception is thrown when the writer script hits a key that was already migrated, but the slot containing the key is still under migration. (For explanation see the issue-opening post.)

Please note that the error is not triggered deterministically. You probably did not see this happening because in your tests you did not hit an already moved key in a slot just under migration. So depending on your setup, you might need to re-run the test several times or tweak the parameters (number of slots to reshard and/or the number of keys/slot, i.e. the N variable in the script).

from redis-py-cluster.

Grokzen avatar Grokzen commented on May 25, 2024

very nice. I will play with it some during the weekend and see what i can come up with.

from redis-py-cluster.

Grokzen avatar Grokzen commented on May 25, 2024

@milyenpabo Do you think that only this will be enough? 4bf8e51

I am not really sure how to verify it works as intended more then that it do not break anything and your script still works as expected during the reshard operation. I do not however see the exception you got after that fix has been applied.

It feels stable enough for me to merge this.

from redis-py-cluster.

Grokzen avatar Grokzen commented on May 25, 2024

@milyenpabo I think the fix i pushed to unstable branch have fixed the problem. I compared to jedis so the logic would look the same and it does. If you think there is still problems with the implementation, you can reopen this issue and i will take another look at the problem.

from redis-py-cluster.

milyenpabo avatar milyenpabo commented on May 25, 2024

@Grokzen I think that the fix is OK. I also made some tests, and now redirection seems to work fine. Thanks!

from redis-py-cluster.

Grokzen avatar Grokzen commented on May 25, 2024

@milyenpabo Another fix was just commited because the original implementation was not enough. It was found that it was still failing on some ASK errors. See commit 199ee0b

from redis-py-cluster.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.