The problem/use-case that the feature addresses Imp

I think also adding <div class="snippet-clipboard-content notranslate position-rel

Ok, so maybe something that would be make me "happyish": <div class="snippet-clipb

[NEW] Redis consistent cluster SCAN about valkey HOT 13 OPEN

madolson commented on August 15, 2024

[NEW] Redis consistent cluster SCAN

from valkey.

Comments (13)

NikolaBorisov commented on August 15, 2024

I think also adding

SCAN <cluster cursor> [SLOT X] [MATCH pattern] [COUNT count] [TYPE type]

is important because it lets you scan the cluster in parallel. CSCAN can not be parallelized. My use case for this was when I have a large cluster and I want to iterate over all the keys and change them somewhat, but I want to do this in parallel. The cluster could be so large that using CSCAN could take super long time. I think the right abstraction would be just to allow user to specify which slot they want to scan. It is very easy to build something that scans the whole cluster reliably if you have that.

from valkey.

madolson commented on August 15, 2024

Ok, so maybe something that would be make me "happyish":

CSCAN <cluster cursor> [SLOT X] [MATCH pattern] [COUNT count] [TYPE type]

Which just scans the given slot if it's provided. We can still have the be marked as a key, so that your client will route it for you. If you're really smart, you could reverse engineer a cursor that hits the right node and we could make some way to define that, but I think it makes sense to just have it be 0 or empty string.

There is a problem with my proposal. On the other thread, #4, means that it's not safe to do cross cluster scans without restarting the slot on moved, since the hashes in each databases aren't stable.

from valkey.

nihohit commented on August 15, 2024

does CSCAN <cluster cursor> [SLOT X] [MATCH pattern] [COUNT count] [TYPE type] mean that the cursor is only good for iterations with the same slot? what would be the cursor semantics be?

from valkey.

madolson commented on August 15, 2024

If SLOT is provided, it would only be valid for the SLOT specified. If it's omitted, it would do a scan across all slots in the cluster.

from valkey.

nihohit commented on August 15, 2024

If SLOT is provided, it would only be valid for the SLOT specified. If it's omitted, it would do a scan across all slots in the cluster.

I wonder whether a cursor might become "accidentally" usable between regular SCAN, CSAN, and CSCAN+slot calls, simply because its computed in the same way for each call.

The cursor would contain a component that includes a hashtag, to represent the slot it's currently scanning.

So, this means that a CSCAN might return a MOVED error if there was slot migration? If so, I think that it solves the issue well, but this requires a lot of heavy lifting from the cursor. For example, assuming CSCAN goes by order of slots, if a node contains slot 1 & 3 but not 2, CSCAN without slots will need to return the keys from slot 1, even if they're below COUNT, and then answer the next command with a MOVED to the node with slot 2, which will in turn respond with the keys in slot 2 and a MOVED back to the first node.
This allows the user to scan across the cluster, but it's not a great experience.

IMO this can be combined with a command that quickly (unlike CLUSTER NODES/SLOTS/SHARDS, which can be very slow in large, fragmented clusters) returns the slots available on the current node. Let's call it CLUSTER SLOTSLOCAL, and it only returns the slots available on the current node - no data on other nodes in the cluster.
That way the can pipeline CSCAN calls with CLUSTER SLOTSLOCAL without a significant perf penalty, and quickly know whether there was slot migration. Once the CSCAN calls on this node complete, the user knows exactly which slots were covered (if there wasn't any change), or retry with CSCAN SLOT slot_id for slots that were added during the process.

from valkey.

madolson commented on August 15, 2024

I wonder whether a cursor might become "accidentally" usable between regular SCAN, CSAN, and CSCAN+slot calls, simply because its computed in the same way for each call.

That's possible, and we could do that. The only concern is that with SCAN, the client doesn't expect it to need routing. We introduced the concept of "NOT A KEY" in Redis 7, that still requires routing.

This allows the user to scan across the cluster, but it's not a great experience.

Not quite. Let's simplify, there are 2 nodes (A and B) with 3 slots (A and 1 and 3, B has 2). All slots have 100 keys. The command ordering would look like:

B> CSCAN 0 COUNT 50 -> {slot:0}-0 and []
A> SCAN {slot:0}-0 COUNT 50-> {slot:0}-50 and [50 keys]
A> SCAN {slot:0}-50 COUNT 50-> {slot:1}-0 and [50 keys] // Notice how the slot was updated, we returned the remaining keys and the slot is empty.
B> SCAN {slot:1}-0 COUNT 50-> {slot:1}-50 and [50 keys]
B> SCAN {slot:1}-50 COUNT 50-> {slot:2}-0}and [50 keys]
A> SCAN {slot:2}-0 COUNT 50-> {slot:2}-50 and [50 keys]
A> SCAN {slot:2}-50 COUNT 50-> 0 and [50 keys] // We got a zero back, we're done!

At no point are we getting a moved, since we're routing based on the slot information, and the client knows that. You're right that if there are few overall keys, we might not have a very high density. We could optimize that by also including data from the next slot if the node has it though.

That way the can pipeline CSCAN calls with CLUSTER SLOTSLOCAL without a significant perf penalty, and quickly know whether there was slot migration. Once the CSCAN calls on this node complete, the user knows exactly which slots were covered (if there wasn't any change), or retry with CSCAN SLOT slot_id for slots that were added during the process.

The ask has just been to parallelize it, which you could still do. If you have like a million keys, we're over indexing on performance, since it'll finish fast. If you have 10 billion keys (~1 million keys per slot), then parallelization makes sense.

from valkey.

nihohit commented on August 15, 2024

Let's simplify, there are 2 nodes (A and B) with 3 slots (A and 1 and 3, B has 2)

This scenario works for a cluster that is stable, but what happens during slot migrations, or scale in/out?
For simplicity's sake, what happens if the client isn't aware of slot 2 moving from A to B? what would happen on a call A> SCAN {slot:1}-0 COUNT 50?
It seems like the correct response is either

{slot:2}-50 and [50 keys], since these are the next values on node A, or
MOVED B, since the next slot for the scan is in B

what would happen on a A> SCAN {slot:1}-0 COUNT 200 - where the count is larger than the number of entries in the slot? should A return {slot:3}-0 and [200 keys from slots 0 & 2, and implicitly skip slot 1? should it return {slot:1}-0 and [100 keys from slot 0], and under-provide on the COUNT in order to correctly reflect the missing slot?

Notice that in these examples the calls are without the [SLOT 1] argument - there's nothing explicitly requiring the queried node to contain slot 2.

Let's take a scenario in which the client only calls CSCAN, and doesn't perform any other operations - how would such a client correctly scan through a cluster undegoing changes?

from valkey.

madolson commented on August 15, 2024

if the client isn't aware of slot 2 moving from A to B

Then it'll get a MOVED message and try again. This is the normal behavior for misunderstanding the topology.

what would happen on a A> SCAN {slot:1}-0 COUNT 200 - where the count is larger than the number of entries in the slot? should A return {slot:3}-0 and [200 keys from slots 0 & 2, and implicitly skip slot 1

The current implementation only returns data from one slot at a time, which is our atomic unit, we would need to restart at the next slot. I mentioned an optimization, but I think you would probably want to opt in to it.

from valkey.

nihohit commented on August 15, 2024

The current implementation only returns data from one slot at a time

Oh, excellent. I didn't notice that this in the documentation. This solves my issues :)

from valkey.

avifenesh commented on August 15, 2024

Hope Im not repeating something, couldn't find some mention of it.
What about clusters with small amount of keys, lets say 200 heavy keys distributed to different slots, would i need to run 16384 scans to get all the keys in my cluster? The default count is 10 in the classic implementation so it would be around 20 calls, here its seems to be almost x1000 more calls.
The classic implementation is based on the node dictht which its size and the inside key distribution is kind of equivalent to the amount of keys (worse case amount of keys X 2), which promise some efficiency in this kind of scenarios.
With the new impl' offered, if the amount of keys i have is smaller than the 163840 i still need to add my own impl' for scanning efficiently.

from valkey.

ranshid commented on August 15, 2024

What about clusters with small amount of keys, lets say 200 heavy keys distributed to different slots, would i need to run 16384 scans to get all the keys in my cluster? The default count is 10 in the classic implementation so it would be around 20 calls, here its seems to be almost x1000 more calls.

I think this is a valid point, which will be SOMEWHAT handled in case we will continue to consume continuance slot ranges (which I support).
This would leave the issue as more impactful for fragmented slot ranges, but even in case of fragmented slot ranges we expect to have some continuation. and there is always the ability of the client to fanout on all slots in case the application understand it's workload.

I would like to ask 2 other questions:

How do we plan to make these commands available via scripts/transactions? I would imagine CSCAN will probably be non-script, but SCAN with specific slot will be available right?
one thing that I think is missing from current scan, is filter by TTL. can we consider adding such an option?

from valkey.

CharlesChen888 commented on August 15, 2024

Note that [pattern] may imply a specific slot, and this is useful when slot is not provided, or when the provided slot is different from the slot pattern implies.

from valkey.

madolson commented on August 15, 2024

How do we plan to make these commands available via scripts/transactions? I would imagine CSCAN will probably be non-script, but SCAN with specific slot will be available right?

Why would CSCAN be non-script? I don't see anything that would strictly break from it.

one thing that I think is missing from current scan, is filter by TTL. can we consider adding such an option?

Sounds like a separate issue? What is the use case to filter by TTL?

from valkey.

[NEW] Redis consistent cluster SCAN about valkey HOT 13 OPEN

Comments (13)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent