Currently Horde.Supervisor only supports permanent pr

Hi <a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="

I have a little WIP branch up for you to look at <a class="issue-link js-issue-link" d

Implemented in <a class="issue-link js-issue-link" data-error-text="Failed to load tit

Horde.Supervisors supports transient and temporary processes about horde HOT 10 CLOSED

derekkraan commented on July 3, 2024 2

Horde.Supervisors supports transient and temporary processes

from horde.

Comments (10)

derekkraan commented on July 3, 2024

Hi @QuantamHD,

This is correct. We use DynamicSupervisor for doing the actual supervision, and the challenge with temporary and transient processes is reflecting this information in the δ-CRDT. In the case of permanent processes, the state is leading, but in the case of temporary or transient processes, this is reversed.

For an overview of the behaviour of temporary and transient, see https://hexdocs.pm/elixir/Supervisor.Spec.html#module-restart-values-restart

I have shied away from doing "real" supervision in Horde.Supervisor, preferring to delegate this to DynamicSupervisor. This won't be possible anymore if we support transient and temporary children.

If you have any other questions, don't hesitate to ask!

from horde.

QuantamHD commented on July 3, 2024

In terms of "leading" state do you mean that the state can be set before process creation and not modified afterward for permanent strategies? While for the temporary and transient cases require CRDT modification at process creation time as well as process exits?

Could we achieve this feature with monitors on the processes managed by the DynamicSupervisor or do you think we would need a fully implemented supervisor behavior?

from horde.

derekkraan commented on July 3, 2024

What I mean is, with permanent restart, a process is only removed from the supervisor by us, by calling DynamicSupervisor.terminate_child. With transient and temporary restart, the process is removed by the supervisor itself when the process exits (depending on the exit reason and the strategy).

We could achieve this with monitors on the processes that are managed by the DynamicSupervisor, but the complexity of our "supervision by proxy" approach is increasing every time we extend the approach, and I think we should consider (and I have been considering) whether Horde could benefit from fully implementing its own supervisor instead of relying on DynamicSupervisor.

Having our own supervisor implementation in Horde would also simplify some other parts of the code, like the graceful shutdown bits, which is already a bit of a hack.

Do you have any thoughts on this?

from horde.

QuantamHD commented on July 3, 2024

I would say that makes sense to me. We could also base the implementation off of https://github.com/elixir-lang/elixir/blob/master/lib/elixir/lib/dynamic_supervisor.ex given that Horde currently tries to maintain API parity with it.

from horde.

QuantamHD commented on July 3, 2024

One question I do have is can a temporary process actually be restarted in Horde given the strict definition provided in the docs? It sounds like a node down should kill it.

from horde.

derekkraan commented on July 3, 2024

We could also base the implementation off Elixir's DynamicSupervisor

This sounds like a very good option. If our changes are limited, then we will also be able to port any changes / bugfixes etc from the stdlib DynamicSupervisor back to Horde's forked version and keep them mostly in sync.

One question I do have is can a temporary process actually be restarted in Horde given the strict definition provided in the docs? It sounds like a node down should kill it.

I hadn't thought about this. We are extending Supervisor into places it has not been before, so some interpretation is required. In this case I think never restarting is the correct behaviour as you say. For processes that should always be restarted until they have a normal exit code, we have :transient.

from horde.

QuantamHD commented on July 3, 2024

Keeping the bug fixes of stdlib DynamicSupervisor sounds great! It looks like we would need to insert CRDT modification code here. Could you explain to me the current "supervision by proxy" strategy employed by Horde? I see that you select nodes bases a consistent hash scheme, and it looks like you forward child specs to that node based on that. If we employed a custom Supervisor how might that strategy change?

from horde.

derekkraan commented on July 3, 2024

The current strategy is to let DynamicSupervisor do all our supervision for us. So we proxy all calls either down to the DynamicSupervisor or to the DeltaCRDT (depending on what information is desired).

There is already a use case that I have had to work around (when the CRDT is not "leading"), and that's during a shutdown. During a shutdown, what we want is to update the CRDT state for each process as they shut down, and not at the end, after all processes have shut down. This is because there could be a big difference in shutdown time per process, and we don't want to have a scenario where a process shuts down and then has to wait for all other processes on the node to shut down before it is started on another node. So this is already a chink in the armour of our "supervision by proxy" approach.

And the way this works in practice is that there is a "graceful shutdown" process that recognizes when a shutdown has been initiated and proxies terminate messages from the ProcessCanary processes when there is a shutdown ongoing.

So this "graceful shutdown" behaviour is a candidate for improvement if we fork DynamicSupervisor.

I don't think the hash scheme ("distribution strategy") details will change if we fork DynamicSupervisor. This is purely for choosing which node a process should start on, and doesn't have anything to do with supervision of the process once it has actually started.

from horde.

QuantamHD commented on July 3, 2024

I have a little WIP branch up for you to look at #62 I added and modified the dynamic supervisor so that child_ids aren't thrown away and on process deletion, the supervisor will also remove the child_id from the CRDT. I have one failing test related to deduping I was hoping you could give me a hand with

from horde.

derekkraan commented on July 3, 2024

Implemented in #70

from horde.

Horde.Supervisors supports transient and temporary processes about horde HOT 10 CLOSED

Comments (10)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent