Comments (12)
Should we be supporting a more flexible duration syntax? For example: ISO 8601 durations. For the metrics use case, the interval between events will be very short. Backups could run at longer intervals (hours). I certainly don't think there would be multi-day pauses between executions.
The architecture we have for Containerbuddy hasn't been particularly optimized for high performance, and forking off lots of processes at subsecond intervals might prove costly. I would suggest that if we want to allow subsecond intervals, we do some performance testing of that design.
Should we build in the idea of exponential back-off to deal with back-pressure? If a command results in a specific non-zero exit code, we reduce the frequency to give the target system time to recover. Note: This may be useless or harmful for longer intervals (hours), so we might also need to specify the retry frequency separately to fix that.
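On the syntax question, here is a minimal sketch of how far Go's standard `time.ParseDuration` gets us. `parseFrequency` is a hypothetical helper, not an existing Containerbuddy function. It covers subsecond and multi-hour values such as `500ms` and `2h30m`, but it rejects ISO 8601 forms such as `PT5M`, which would need a separate parser.

```go
package main

import (
	"fmt"
	"time"
)

// parseFrequency is a hypothetical helper (not part of Containerbuddy).
// It accepts Go duration strings and rejects zero or negative values,
// since a scheduled task needs a positive interval.
func parseFrequency(s string) (time.Duration, error) {
	d, err := time.ParseDuration(s)
	if err != nil {
		return 0, fmt.Errorf("invalid frequency %q: %v", s, err)
	}
	if d <= 0 {
		return 0, fmt.Errorf("frequency %q must be positive", s)
	}
	return d, nil
}

func main() {
	// "PT5M" is an ISO 8601 duration; time.ParseDuration does not accept it.
	for _, s := range []string{"500ms", "10s", "2h30m", "PT5M"} {
		d, err := parseFrequency(s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(s, "=>", d)
	}
}
```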
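To make the back-off idea concrete, a rough sketch under stated assumptions: `backoffExitCode` and `nextInterval` are hypothetical names, and the exit code value is arbitrary. The interval doubles each time the command signals back-pressure, is capped at a ceiling, and resets to the configured base on any other exit code.

```go
package main

import (
	"fmt"
	"time"
)

// backoffExitCode is a hypothetical exit code a command could return to
// signal back-pressure. The value is an assumption for illustration only.
const backoffExitCode = 75

// nextInterval returns the wait before the next run. It doubles the current
// interval (capped at maxInterval) when the command asked for back-off, and
// otherwise resets to the configured base interval.
func nextInterval(current, base, maxInterval time.Duration, exitCode int) time.Duration {
	if exitCode != backoffExitCode {
		return base
	}
	next := current * 2
	if next > maxInterval {
		next = maxInterval
	}
	return next
}

func main() {
	base, ceiling := time.Second, 8*time.Second
	interval := base
	// Four back-pressure signals followed by a normal exit.
	for _, code := range []int{75, 75, 75, 75, 0} {
		interval = nextInterval(interval, base, ceiling, code)
		fmt.Println("exit", code, "-> next run in", interval)
	}
}
```

With a base interval measured in hours, doubling quickly becomes impractical, which is why a separate retry frequency might be the better knob there.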
This also brings up the question of semantics of the frequency. Under the current design of poll the polling goroutine is blocked by the execution of the pollingFunc we give it. This is by design because we don't want (for example) multiple health checks running simultaneously -- if a health check can't return within the TTL then the node should be marked unhealthy anyways.
For tasks that run long is this still the correct behavior?
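For anyone following along, a simplified sketch of the blocking behavior in question. This is a stand-in written for illustration, not the actual poll implementation, and `runDemo` is a hypothetical helper. Because pollingFunc runs on the polling goroutine itself, a second run can't start until the first returns.

```go
package main

import (
	"fmt"
	"time"
)

// poll is a simplified stand-in for the polling loop described above, not
// the actual Containerbuddy code. pollingFunc runs on the polling goroutine,
// so the next run cannot start until the current one returns.
func poll(interval time.Duration, quit <-chan struct{}, pollingFunc func()) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			pollingFunc() // blocks the loop until the task finishes
		case <-quit:
			return
		}
	}
}

// runDemo polls every 10ms with a task that takes 25ms. It reports how many
// runs completed and the highest number of runs in flight at once.
func runDemo() (runs, maxInFlight int) {
	inFlight := 0
	quit := make(chan struct{})
	done := make(chan struct{})
	go func() {
		defer close(done)
		poll(10*time.Millisecond, quit, func() {
			inFlight++
			if inFlight > maxInFlight {
				maxInFlight = inFlight
			}
			time.Sleep(25 * time.Millisecond)
			inFlight--
			runs++
		})
	}()
	time.Sleep(100 * time.Millisecond)
	close(quit)
	<-done // wait for the polling goroutine before reading the counters
	return runs, maxInFlight
}

func main() {
	runs, maxInFlight := runDemo()
	fmt.Println("runs:", runs, "max in flight:", maxInFlight)
}
```

In this sketch `time.Ticker` drops ticks for a slow receiver rather than queueing them, so a long-running task delays later runs instead of stacking them up.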
from containerpilot.
Under the current design of poll the polling goroutine is blocked by the execution of the pollingFunc we give it. This is by design because we don't want (for example) multiple health checks running simultaneously -- if a health check can't return within the TTL then the node should be marked unhealthy anyways.
For tasks that run long is this still the correct behavior?
I think so. It usually doesn't make sense for periodic tasks to overlap. If they do, it is probably a mistake. Either:
- The new task will block until the old one is finished
- The new task will kill the old one before running
Perhaps the semantics could be configurable.
Consider the use case of the backup. If the first backup didn't finish, should we wait for it to complete? Or kill it and start another? Probably the former?
However, in the case of pushing metrics, if the last one didn't complete, we should probably just kill it since it is stale now anyway.
We don't want to fork too many processes, so I don't think we should support scheduled tasks which overlap and continue to spawn more and more processes that don't exit. Perhaps both can be accomplished with a timeout on the scheduled task (which may default to the frequency?)
{
  "onScheduled": [
    { "frequency": "1s", "command": [ "/bin/push_metrics.sh" ] },
    { "frequency": "10s", "command": [ "/bin/push_other_metrics.sh" ], "timeout": "5s" }
  ]
}
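A minimal sketch of how that config could be read, with the timeout falling back to the frequency when it is omitted. The type and function names here (`scheduledTask`, `config`, `durations`, `parseConfig`) are hypothetical and only mirror the JSON keys above; they are not the project's actual types.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// scheduledTask mirrors one entry of the proposed "onScheduled" list.
// The type is hypothetical; only the JSON keys come from the proposal.
type scheduledTask struct {
	Frequency string   `json:"frequency"`
	Command   []string `json:"command"`
	Timeout   string   `json:"timeout"`
}

type config struct {
	OnScheduled []scheduledTask `json:"onScheduled"`
}

// durations parses the task's frequency and timeout. When no timeout is
// given, it defaults to the frequency, as suggested in the comment above.
func (t scheduledTask) durations() (freq, timeout time.Duration, err error) {
	freq, err = time.ParseDuration(t.Frequency)
	if err != nil {
		return 0, 0, err
	}
	if t.Timeout == "" {
		return freq, freq, nil
	}
	timeout, err = time.ParseDuration(t.Timeout)
	if err != nil {
		return 0, 0, err
	}
	return freq, timeout, nil
}

func parseConfig(data []byte) (config, error) {
	var cfg config
	err := json.Unmarshal(data, &cfg)
	return cfg, err
}

func main() {
	raw := `{"onScheduled": [
		{"frequency": "1s", "command": ["/bin/push_metrics.sh"]},
		{"frequency": "10s", "command": ["/bin/push_other_metrics.sh"], "timeout": "5s"}
	]}`
	cfg, err := parseConfig([]byte(raw))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, task := range cfg.OnScheduled {
		freq, timeout, err := task.durations()
		fmt.Println(task.Command, freq, timeout, err)
	}
}
```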
Perhaps both can be accomplished with a timeout on the scheduled task (which may default to the frequency?)
I like this idea. To support this we may need to update (or replace) executeAndWait to support a communications channel for stopping the process.
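One possible shape for that, sketched with the standard library rather than the existing executeAndWait code: `runWithTimeout` is a hypothetical name, and it swaps in a `context` deadline as the stop signal instead of a hand-rolled channel. The example assumes a `sleep` binary is on the PATH.

```go
package main

import (
	"context"
	"fmt"
	"os/exec"
	"time"
)

// runWithTimeout is a hypothetical replacement sketch, not the existing
// executeAndWait. It runs the command and kills it if it has not exited
// before the timeout. It reports whether the deadline was hit.
func runWithTimeout(timeout time.Duration, name string, args ...string) (bool, error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	// CommandContext kills the process when ctx is done.
	err := exec.CommandContext(ctx, name, args...).Run()
	if ctx.Err() == context.DeadlineExceeded {
		return true, err
	}
	return false, err
}

func main() {
	// Assumes a `sleep` binary is available on the PATH.
	timedOut, err := runWithTimeout(100*time.Millisecond, "sleep", "5")
	fmt.Println("timed out:", timedOut, "err:", err)
}
```

Note that `exec.CommandContext` only kills the process it started, so a script that forks children of its own may need extra handling.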
@justenwalker I'm planning on tackling #27 as my next major task for Containerbuddy. Do you want to take ownership of this project?
Sure 😋 🍪
When we do this, let's try and split out the functionality into its own library that main calls (per #83). This will reduce the scope of refactoring when we do #83 and give us some guidance on how to do it.
Just an update, @tgross: I started a WIP branch if you want to follow it. It's not ready for a PR yet, but perhaps we can discuss the implementation I'm going with.
Also, I didn't split out the module yet, but I'll work on that too.
Cool, I've got https://github.com/tgross/containerbuddy/tree/gh27_metrics in the works myself and I started with splitting the module out just as a "let's make sure that I can call it correctly."
I realize you're early in the process but you may find what you've done with ScheduledTaskConfig tricky in terms of avoiding a circular import with config.go. I suspect you'll want to move that into the containerbuddy (and, later, config) package, but overall this is looking like a good approach.
More thoughts on module split-up here: #83 (comment)
@justenwalker just because the timing is inconvenient on that first-stage refactor, I'm going to try to push it out early Monday morning so that we can use it to base our new packages on. This way it's not getting delayed and we don't both end up having to rework sections of metrics and tasks to suit. And, as noted in #83, it'll give us a chance to make sure it's the right abstraction before refactoring the rest of the modules.
Actually that turned out to be a much smaller intervention than I'd thought so I've opened #118.
Once we get a green build on master back from TravisCI, I'll cut release 2.1.0 with this in it.
Released in https://github.com/joyent/containerpilot/releases/tag/2.1.0