Comments (3)

famarting commented on June 11, 2024

If you look at the Dapr sidecar logs, you will see entries like:

WARN[0513] Workflow actor '108adc75-08df-494b-99ec-65735f690802': execution timed-out and will be retried later: 'context deadline exceeded'  app_id=wfapp instance=MacBook-Pro-de-Fabian.local scope=dapr.wfengine.backend.actors type=log ver=1.13.2
WARN[0573] Workflow actor '108adc75-08df-494b-99ec-65735f690802': execution timed-out and will be retried later: 'context deadline exceeded'  app_id=wfapp instance=MacBook-Pro-de-Fabian.local scope=dapr.wfengine.backend.actors type=log ver=1.13.2
WARN[0613] Activity actor '108adc75-08df-494b-99ec-65735f690802::1::1': 'run-activity' is still running - will keep waiting until '2024-04-25 11:33:18.632479 +0200 CEST m=+3613.908154293'  app_id=wfapp instance=MacBook-Pro-de-Fabian.local scope=dapr.wfengine.backend.actors type=log ver=1.13.2

What is happening with this test is that it starts the worker, which connects to the Dapr sidecar via gRPC, and then schedules the workflow so that it starts running. As soon as the workflow starts running, your test exits, which also stops the worker and closes the gRPC connection to the sidecar.
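One way to avoid the early exit described above is to have the test wait until the workflow reaches a terminal status before it stops the worker. The following is a minimal sketch, not Dapr SDK code: `get_status` is a hypothetical callable that returns the workflow's runtime status as a string, and the set of terminal status names is an assumption (only RUNNING and TERMINATED are named in this thread).

```python
import time
from typing import Callable

# Assumed terminal status names for this sketch; adjust to what your SDK reports.
TERMINAL_STATUSES = {"COMPLETED", "FAILED", "TERMINATED"}


def wait_for_terminal_status(
    get_status: Callable[[], str],  # hypothetical status lookup, supplied by the test
    timeout_s: float = 30.0,
    poll_interval_s: float = 0.1,
) -> str:
    """Poll get_status until it reports a terminal status or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"workflow still {status!r} after {timeout_s}s")
        time.sleep(poll_interval_s)


# Usage with a fake status source standing in for a real status query.
_fake_statuses = iter(["RUNNING", "RUNNING", "TERMINATED"])
final_status = wait_for_terminal_status(
    lambda: next(_fake_statuses), timeout_s=1.0, poll_interval_s=0.01
)
```

In a real test, the worker would be shut down only after this call returns, so the connection to the sidecar stays open while the workflow is still in progress.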

To my understanding, the workflow engine cannot advance the event log for this workflow because it cannot send commands to the application. If it cannot advance the event log, it cannot process the workflow terminate command, and the workflow gets stuck retrying the previous command (most likely an activity execution in this case).

I don't know the workflow engine well enough to propose a solution, but to me it looks like there is a bit of a disconnect between the workflow actor and the engine: the actor tries to send work to the engine, so it sends it to the app, but if the engine is not connected to the app, nothing works. Maybe there should be some optimization or logic that breaks this kind of retry loop if a terminate command is detected. IDK if the client side MUST receive the terminate workflow command to safely terminate the workflow or if it's safe to terminate the workflow from the backend POV if the connection to the application is absent.

from dapr.

olitomlinson commented on June 11, 2024

cc @cgillum

cgillum commented on June 11, 2024

Yes, I believe @famarting is correct here. If the worker has disconnected from the sidecar, then it will be unable to receive and process the terminate message, leaving the workflow stuck in the RUNNING state.

Termination works by sending a message to a workflow. When the workflow receives the terminate message, it transitions itself into a completed state with the TERMINATED runtime status. The terminate logic is not implemented at the sidecar/engine/actor layer. If you reconnect your worker app to the Dapr sidecar, then the terminate message should get handled and the workflow will terminate.
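The message-based behavior described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyWorkflow`, `request_terminate`, and `worker_poll` are hypothetical names invented for illustration and are not part of Dapr or any SDK. It shows only that a terminate request is a queued message which changes the status once a connected worker processes it.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ToyWorkflow:
    """Toy stand-in for a workflow instance; not a Dapr type."""
    status: str = "RUNNING"
    inbox: deque = field(default_factory=deque)  # messages awaiting the worker


def request_terminate(wf: ToyWorkflow) -> None:
    # The caller only enqueues a message; the status is not changed here.
    wf.inbox.append("terminate")


def worker_poll(wf: ToyWorkflow, connected: bool) -> None:
    # Messages are handled only while the worker is connected.
    if not connected:
        return  # nothing is delivered; the message stays queued
    while wf.inbox:
        if wf.inbox.popleft() == "terminate":
            # The workflow transitions itself once it sees the message.
            wf.status = "TERMINATED"


wf = ToyWorkflow()
request_terminate(wf)

worker_poll(wf, connected=False)
status_while_disconnected = wf.status  # unchanged: no worker handled the message

worker_poll(wf, connected=True)
status_after_reconnect = wf.status  # the queued message has now been processed
```

The model deliberately leaves out retries, timeouts, and the event log; it is meant only to show why the status cannot change while no worker is connected.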

> Maybe there should be some optimization or logic that breaks this kind of retry loop if a terminate command is detected. IDK if the client side MUST receive the terminate workflow command to safely terminate the workflow or if it's safe to terminate the workflow from the backend POV if the connection to the application is absent.

I think this can be considered as an optimization, but it would need to be implemented carefully to ensure that the workflow state is correctly updated in the same way as when a workflow transitions itself into a terminated state, and that the OTel spans are properly emitted.
