gomjabbar's Introduction

GomJabbar - Chaos Monkey for your private cloud

What is GomJabbar?

GomJabbar is a service inspired by Netflix's ChaosMonkey, but unlike ChaosMonkey, it was designed to work with your private cloud infrastructure (i.e. your own data centers).

The service exposes endpoints that allow you to randomly select targets, trigger a selected fault, and revert when needed.
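As a rough sketch of how a drill could be driven over HTTP: the endpoint names (`selectTarget`, `trigger`, `revert`), their parameter names, and port 8080 are the ones the service reports when it starts up. How the parameters are actually passed is an assumption here (query strings), and the helper names are hypothetical, so check the user guide for the real request format.

```python
from urllib.parse import urlencode

# Assumption: the service runs locally on its default port.
BASE_URL = "http://localhost:8080/gj/api"


def endpoint_url(name, **params):
    """Build the URL for a GomJabbar endpoint (query-string parameters are an assumption)."""
    url = f"{BASE_URL}/{name}"
    return f"{url}?{urlencode(params)}" if params else url


def drill_urls(target_token, fault_id):
    """Return the two calls of a minimal drill in order: trigger the fault, then revert it."""
    return [
        endpoint_url("trigger", targetToken=target_token, faultId=fault_id),
        endpoint_url("revert", faultId=fault_id),
    ]
```

A drill would start with a call to `endpoint_url("selectTarget")` to obtain a target, and then issue the two URLs from `drill_urls` with whatever HTTP client you prefer.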

Why should I run GomJabbar?

You can find the Netflix explanation here. No point in copying that over ;)

The main idea is to reduce our fear of production (fear is the mind-killer, remember?). If you want to learn how to improve your code, monitoring, and alerting, and how to deal with production issues while you're awake and ready, this is the tool for you.

After running several chaos drills at Outbrain, I can assure you that doing this on a regular basis is extremely valuable. During a midnight page, most people will not fix anything or investigate very far, and the incident will usually end with a service restart. During a chaos drill, we look deeper into the root causes and try to learn what we need to fix and where we need to improve. After every drill we conduct a quick take-in and implement the fixes as soon as possible.

Running GomJabbar helps us validate our assumptions, our infrastructure, our resilience, and our fixes.

Supported faults

GomJabbar supports an extensible fault injection mechanism, along with configuration-based commands and scripts for triggering faults. The example config file contains examples ranging from harmless failures to graceful / graceless shutdowns and traffic control (network issue emulation).
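To make the idea of a configured fault concrete, here is a minimal sketch of one possible shape: an identifier, a description, a command that injects the fault, and a command that reverts it. The class, field names, and entries are hypothetical and are not taken from the project's example config file. The `tc ... netem` commands are standard Linux traffic control invocations, with `eth0` assumed as the interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FaultCommand:
    """Hypothetical shape of a configured fault: how to inject it and how to undo it."""
    fault_id: str
    description: str
    fail: str    # command that injects the fault
    revert: str  # command that undoes it


# Illustrative entries only, from harmless to network emulation.
FAULTS = [
    FaultCommand("harmless", "No-op sanity check",
                 "echo 'fault injected'", "echo 'fault reverted'"),
    FaultCommand("network_latency", "Add 100ms latency on eth0 using tc/netem",
                 "sudo tc qdisc add dev eth0 root netem delay 100ms",
                 "sudo tc qdisc del dev eth0 root netem"),
]


def fault_options(faults):
    """Map each fault id to its description, e.g. for listing the available faults."""
    return {fault.fault_id: fault.description for fault in faults}
```

Pairing every fault with an explicit revert command keeps a drill reversible, which matters most for the traffic control faults.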

Integration

Service Discovery

We currently integrate with Consul out of the box and provide configuration-based filtering of targets. Future versions will integrate with other service discovery methods; the tool was designed to make this easy to support.
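The filtering step might look something like the sketch below. It assumes the input has the shape returned by Consul's `/v1/catalog/services` endpoint (a map of service name to a list of tags). The function and its `include_tags` / `exclude_services` parameters are hypothetical and do not mirror GomJabbar's actual configuration keys.

```python
def filter_services(services, include_tags=(), exclude_services=()):
    """Keep services that are not excluded and carry at least one included tag.

    An empty include_tags means "no tag restriction".
    """
    wanted = set(include_tags)
    selected = {}
    for name, tags in services.items():
        if name in exclude_services:
            continue
        if wanted and not wanted & set(tags):
            continue
        selected[name] = tags
    return selected


# Sample data in the shape of a /v1/catalog/services response.
catalog = {"consul": [], "web": ["prod", "v1"], "db": ["prod"], "batch": ["staging"]}
targets = filter_services(catalog, include_tags=["prod"], exclude_services=["db"])
```

With this sample data, only `web` is left as a drill target: `db` is excluded by name, and `consul` and `batch` carry no `prod` tag.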

Fault Automation

GomJabbar now integrates with Rundeck and Ansible. Future versions may support other automation tools, or built-in SSH capabilities / agents.

User Guide

gomjabbar's People

Contributors

dependabot[bot], eranharel

gomjabbar's Issues

Implement a scheduling mechanism to allow this service to run automatically

Scheduling must be configurable to allow time intervals for periodic drills, and overrides for times when we want to skip drills.

An automatically scheduled drill will randomly select targets and fault injectors, run them at a random time during the specified interval, and optionally auto-revert at the end of the interval.
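The time-selection part of this request could look roughly like the following. This is only a sketch of the idea, not an existing GomJabbar feature: the function name, the representation of skip windows as `(start, end)` pairs, and the retry limit are all assumptions.

```python
import random
from datetime import datetime, timedelta


def pick_drill_time(start, end, skip_windows=(), rng=random, max_attempts=100):
    """Pick a random moment between start and end that falls outside every skip window.

    Returns None if no allowed moment was found within max_attempts tries.
    """
    span_seconds = (end - start).total_seconds()
    for _ in range(max_attempts):
        candidate = start + timedelta(seconds=rng.uniform(0, span_seconds))
        if not any(lo <= candidate < hi for lo, hi in skip_windows):
            return candidate
    return None


# Example: a drill somewhere in the working day, but never during lunch.
day_start = datetime(2024, 1, 1, 9, 0)
day_end = datetime(2024, 1, 1, 17, 0)
lunch = (datetime(2024, 1, 1, 12, 0), datetime(2024, 1, 1, 13, 0))
drill_time = pick_drill_time(day_start, day_end, [lunch], rng=random.Random(42))
```

Passing the random generator in as `rng` keeps the selection reproducible in tests while remaining random in production.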

Improve audit

The current audit implementation is file-based, and revert commands must be idempotent.

Implement a persistent shared audit log, and at least some sort of marking for done / reverted fault executions.
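One way to picture the requested marking is a small table with a status column per fault execution. The sketch below uses SQLite from the Python standard library purely for illustration. The table layout, the status values, and the function names are assumptions, and a truly shared log would need a store that several instances can reach.

```python
import sqlite3


def open_audit(path=":memory:"):
    """Open (and create if needed) the audit store; pass a file path for persistence."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS audit ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "target TEXT NOT NULL, "
        "fault_id TEXT NOT NULL, "
        "status TEXT NOT NULL DEFAULT 'TRIGGERED')"
    )
    return conn


def record_trigger(conn, target, fault_id):
    """Record a triggered fault and return its execution id."""
    cur = conn.execute("INSERT INTO audit (target, fault_id) VALUES (?, ?)", (target, fault_id))
    conn.commit()
    return cur.lastrowid


def mark_reverted(conn, execution_id):
    """Mark an execution as reverted; returns False if it is unknown or already reverted."""
    cur = conn.execute(
        "UPDATE audit SET status = 'REVERTED' WHERE id = ? AND status = 'TRIGGERED'",
        (execution_id,),
    )
    conn.commit()
    return cur.rowcount == 1


def pending(conn):
    """List execution ids that were triggered but not yet reverted."""
    rows = conn.execute("SELECT id FROM audit WHERE status = 'TRIGGERED' ORDER BY id")
    return [row[0] for row in rows]
```

Because `mark_reverted` only succeeds once per execution, a revert-all operation can skip executions that are already done instead of relying on every revert command being idempotent.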

method POST not allowed with consul 1.4.0

I'm using the latest version of GJ with Consul 1.4.0 on Red Hat 7.5, Maven 3.0.5 and JDK 1.8.0. It looks like since Consul 1.0 they've started enforcing HTTP verbs on many endpoints where previously anything was allowed. So now I'm getting this error:
```
./gomjabbar.sh
[INFO] Scanning for projects...
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] Building GomJabbar 0.1-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[INFO]
[INFO] --- exec-maven-plugin:1.6.0:java (default-cli) @ GomJabbar ---
11:23:51,238 INFO ConsulTargetsCache:190 - Reloading cache...
Audit Log will be written to /tmp/GomJabbar_audit2098517009179225973
11:23:51,361 ERROR ConsulTargetsCache:96 - Failed to fetch targets
java.io.IOException: Call failed for url: http://localhost:8500/v1/catalog/datacenters, status code: 405.
method POST not allowed
    at com.outbrain.ob1k.http.marshalling.JacksonMarshallingStrategy.unmarshall(JacksonMarshallingStrategy.java:46)
    at com.outbrain.ob1k.common.marshalling.JsonRequestMarshaller.unmarshallResponse(JsonRequestMarshaller.java:141)
    at com.outbrain.ob1k.client.endpoints.AsyncClientEndpoint$1.unmarshall(AsyncClientEndpoint.java:37)
    at com.outbrain.ob1k.http.ning.NingResponse.getTypedBody(NingResponse.java:78)
    at com.outbrain.ob1k.http.ning.NingRequestBuilder.lambda$asValue$2(NingRequestBuilder.java:328)
    at com.outbrain.ob1k.concurrent.eager.EagerComposableFuture.lambda$flatMap$8(EagerComposableFuture.java:210)
    at com.outbrain.ob1k.concurrent.eager.EagerComposableFuture$ConsumerAction.run(EagerComposableFuture.java:426)
    at com.outbrain.ob1k.concurrent.eager.HandlersList.execute(HandlersList.java:55)
    at com.outbrain.ob1k.concurrent.eager.EagerComposableFuture.done(EagerComposableFuture.java:181)
    at com.outbrain.ob1k.concurrent.eager.EagerComposableFuture.set(EagerComposableFuture.java:164)
    at com.outbrain.ob1k.concurrent.eager.EagerComposableFuture.setTry(EagerComposableFuture.java:155)
    at com.outbrain.ob1k.concurrent.eager.EagerComposableFuture$ConsumerAction.run(EagerComposableFuture.java:426)
    at com.outbrain.ob1k.concurrent.eager.HandlersList.addHandler(HandlersList.java:27)
    at com.outbrain.ob1k.concurrent.eager.EagerComposableFuture.consume(EagerComposableFuture.java:309)
    at com.outbrain.ob1k.concurrent.eager.EagerComposableFuture.consumeFrom(EagerComposableFuture.java:346)
    at com.outbrain.ob1k.concurrent.eager.EagerComposableFuture.lambda$flatMap$8(EagerComposableFuture.java:211)
    at com.outbrain.ob1k.concurrent.eager.EagerComposableFuture$ConsumerAction.run(EagerComposableFuture.java:426)
    at com.outbrain.ob1k.concurrent.eager.HandlersList.execute(HandlersList.java:55)
    at com.outbrain.ob1k.concurrent.eager.EagerComposableFuture.done(EagerComposableFuture.java:181)
    at com.outbrain.ob1k.concurrent.eager.EagerComposableFuture.set(EagerComposableFuture.java:164)
    at com.outbrain.ob1k.concurrent.eager.EagerComposableFuture.lambda$build$0(EagerComposableFuture.java:73)
    at com.outbrain.ob1k.http.utils.ComposableFutureAdapter.lambda$null$0(ComposableFutureAdapter.java:24)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
11:23:51,364 INFO ConsulTargetsCache:199 - Scheduling the next cache reload in 5 minutes
11:23:51,405 INFO ServiceRegistry:313 - Registered endpoint [/gj/api/faultOptions ==> com.outbrain.gomjabbar.GomJabbarServiceImpl.faultOptions(), via method: ANY]
11:23:51,405 INFO ServiceRegistry:313 - Registered endpoint [/gj/api/log ==> com.outbrain.gomjabbar.GomJabbarServiceImpl.log(), via method: ANY]
11:23:51,405 INFO ServiceRegistry:313 - Registered endpoint [/gj/api/revert ==> com.outbrain.gomjabbar.GomJabbarServiceImpl.revert(faultId), via method: ANY]
11:23:51,405 INFO ServiceRegistry:313 - Registered endpoint [/gj/api/revertAll ==> com.outbrain.gomjabbar.GomJabbarServiceImpl.revertAll(), via method: ANY]
11:23:51,405 INFO ServiceRegistry:313 - Registered endpoint [/gj/api/selectTarget ==> com.outbrain.gomjabbar.GomJabbarServiceImpl.selectTarget(), via method: ANY]
11:23:51,405 INFO ServiceRegistry:313 - Registered endpoint [/gj/api/trigger ==> com.outbrain.gomjabbar.GomJabbarServiceImpl.trigger(targetToken, faultId), via method: ANY]
11:23:51,405 INFO ServiceRegistry:313 - Registered endpoint [/gj/endpoints ==> com.outbrain.ob1k.server.endpoints.EndpointMappingService.handle(), via method: ANY]
11:23:51,406 INFO ServiceRegistry:313 - Registered endpoint [/gj/endpoints/handle ==> com.outbrain.ob1k.server.endpoints.EndpointMappingService.handle(), via method: ANY]
11:23:51,406 INFO NettyServer:83 - ################## Starting OB1K server for module 'gj' ##################
11:23:51,462 INFO NettyServer:181 - **************** Module 'gj' Started ****************
11:23:51,463 INFO NettyServer:103 - server is up and bounded on address: /0:0:0:0:0:0:0:0:8080

|===+==================================================================|
| I must not fear. |
| Fear is the mind-killer. |
| Fear is the little-death that brings total obliteration. |
| I will face my fear. |
| I will permit it to pass over me and through me. |
| And when it has gone past I will turn the inner eye to see its path. |
| Where the fear has gone there will be nothing. |
| Only I will remain. |
| |
| (Litany Against Fear - Frank Herbert - Dune) |
=======================================================================|

11:23:51,463 INFO GomJabbarServer:46 - ## GomJabbarServer is started on port: 8080 ##
^C11:25:48,457 INFO NettyServer:153 - ################## Stopping OB1K server for module 'gj' ##################
11:25:48,458 INFO NettyServer:162 - ################## Closing OB1K server socket for module 'gj' ##################
11:25:48,459 INFO NettyServer:158 - ################## Closing OB1K server threads for module 'gj' ##################
```
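The 405 above comes from Consul rejecting a POST on `/v1/catalog/datacenters`, which is a read endpoint served with GET. The snippet below only illustrates the request shape that Consul accepts. It is a Python sketch with hypothetical helper names, not the fix for GomJabbar's Java client.

```python
import json
import urllib.request


def consul_get(path, base_url="http://localhost:8500"):
    """Build an explicit GET request for a Consul HTTP API path."""
    return urllib.request.Request(
        f"{base_url}{path}",
        method="GET",
        headers={"Accept": "application/json"},
    )


def fetch_datacenters(opener=urllib.request.urlopen):
    """Fetch the list of known datacenters from a local Consul agent."""
    with opener(consul_get("/v1/catalog/datacenters")) as response:
        return json.load(response)
```

Calling `fetch_datacenters()` needs a Consul agent listening on `localhost:8500`; the `opener` parameter exists so the network call can be replaced in tests.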

Support ssh based CommandExecutor

Current implementations require an Ansible or Rundeck installation.
An SSH-based solution would make this project more accessible to most companies.
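A very small sketch of what an SSH-based executor could do is shown below. It shells out to the standard `ssh` client, and the function names, the timeout, and the choice of `BatchMode` are assumptions rather than a description of GomJabbar's `CommandExecutor` interface.

```python
import subprocess


def ssh_argv(host, command, user=None):
    """Build the argument list for running a command on a remote host over ssh."""
    destination = f"{user}@{host}" if user else host
    # BatchMode=yes makes ssh fail instead of prompting for a password.
    return ["ssh", "-o", "BatchMode=yes", destination, command]


def run_remote(host, command, user=None, timeout=30):
    """Run the command over ssh and return (exit code, stdout, stderr)."""
    result = subprocess.run(
        ssh_argv(host, command, user),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.returncode, result.stdout, result.stderr
```

For example, `run_remote("web1", "uptime", user="ops")` would run `uptime` on `web1`, provided key-based authentication is already set up for that user.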
