
Comments (12)

tgross commented on July 26, 2024

Should we support a more flexible duration syntax, for example ISO 8601 durations? For the metrics use case, the interval between events will be very short. Backups could run at longer intervals (hours). I certainly don't expect multi-day pauses between executions.

The architecture we have for Containerbuddy hasn't been particularly optimized for high performance, and forking off lots of processes at subsecond intervals might prove costly. If we want to allow subsecond frequencies, I suggest we performance-test that design first.

Should we build in exponential back-off to deal with back-pressure? If a command exits with a specific non-zero exit code, we reduce the frequency to give the target system time to recover. Note: this may be useless or harmful for long intervals (hours), so we might also need to let users specify the retry frequency separately.

This also brings up the question of semantics of the frequency. Under the current design of poll the polling goroutine is blocked by the execution of the pollingFunc we give it. This is by design because we don't want (for example) multiple health checks running simultaneously -- if a health check can't return within the TTL then the node should be marked unhealthy anyways.

For tasks that run long is this still the correct behavior?

from containerpilot.

justenwalker commented on July 26, 2024

> Under the current design of poll the polling goroutine is blocked by the execution of the pollingFunc we give it. This is by design because we don't want (for example) multiple health checks running simultaneously -- if a health check can't return within the TTL then the node should be marked unhealthy anyways.
>
> For tasks that run long is this still the correct behavior?

I think so. It usually doesn't make sense for periodic tasks to overlap; if they do, it is probably a mistake. That leaves two options:

  1. The new task blocks until the old one has finished
  2. The new task kills the old one before running

Perhaps the semantics could be configurable.

Consider the backup use case. If the first backup didn't finish, should we wait for it to complete, or kill it and start another? Probably the former.

However, in the case of pushing metrics, if the last run didn't complete, we should probably just kill it, since its data is stale by now anyway.

We don't want to fork too many processes, so I don't think we should support scheduled tasks that overlap and keep spawning more and more processes that don't exit. Perhaps both behaviors can be accomplished with a timeout on the scheduled task (which may default to the frequency?)

{
  "onScheduled": [
    { "frequency": "1s", "command": [ "/bin/push_metrics.sh" ] },
    { "frequency": "10s", "command": [ "/bin/push_other_metrics.sh" ], "timeout": "5s" }
  ]
}


tgross commented on July 26, 2024

> Perhaps both can be accomplished with a timeout on the scheduled task (which may default to the frequency?)

I like this idea. To support this we may need to update (or replace) executeAndWait to support a communications channel for stopping the process.

@justenwalker I'm planning on tackling #27 as my next major task for Containerbuddy. Do you want to take ownership of this project?


justenwalker commented on July 26, 2024

Sure 😋 🍪


tgross commented on July 26, 2024

When we do this, let's try and split out the functionality into its own library that main calls (per #83). This will reduce the scope of refactoring when we do #83 and give us some guidance on how to do it.


justenwalker commented on July 26, 2024

Just an update, @tgross: I started a WIP branch if you want to follow it. It's not ready for a PR yet, but perhaps we can discuss the implementation I'm going with.

Also I didn't split out the module yet, but I'll work on that too.


tgross commented on July 26, 2024

Cool, I've got https://github.com/tgross/containerbuddy/tree/gh27_metrics in the works myself and I started with splitting the module out just as a "let's make sure that I can call it correctly."

I realize you're early in the process but you may find what you've done with ScheduledTaskConfig tricky in terms of avoiding a circular import with config.go. I suspect you'll want to move that into the containerbuddy (and, later, config) package, but overall this is looking like a good approach.


tgross commented on July 26, 2024

More thoughts on module split-up here: #83 (comment)


tgross commented on July 26, 2024

@justenwalker because the timing of that first-stage refactor is inconvenient, I'm going to try to push it out early Monday morning so that we can base our new packages on it. That way it doesn't get delayed, leaving us both to rework sections of metrics and tasks to suit. And, as noted in #83, it'll give us a chance to make sure it's the right abstraction before refactoring the rest of the modules.


tgross commented on July 26, 2024

Actually that turned out to be a much smaller intervention than I'd thought so I've opened #118.


tgross commented on July 26, 2024

Once we get a green build on master back from TravisCI, I'll cut release 2.1.0 with this in it.


tgross commented on July 26, 2024

Released in https://github.com/joyent/containerpilot/releases/tag/2.1.0

