habitat-sh / habitat-operator

A Kubernetes operator for Habitat services

License: Apache License 2.0

Go 90.03% Makefile 2.97% Shell 6.27% Dockerfile 0.09% Mustache 0.64%
kubernetes habitat operator kubernetes-cluster

habitat-operator's Introduction

Build Status Go Report Card

habitat-operator

This project is currently unstable - breaking changes may still land in the future.

Overview

The Habitat operator is a Kubernetes controller designed to run and automatically manage Habitat services on Kubernetes. It does this by making use of Custom Resource Definitions (CRDs).

To learn more about Habitat, please visit the Habitat website.

For a more detailed description of the Habitat type, have a look here.
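
To give a rough idea of what the custom resource looks like, here is a minimal, illustrative sketch in Go. The authoritative definition lives in pkg/apis/habitat/v1beta1/types.go; the field names below are assumptions for illustration, not the exact schema.

package v1beta1

import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

// Habitat is an illustrative sketch of the custom resource the operator manages.
type Habitat struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec HabitatSpec `json:"spec"`
}

// HabitatSpec captures the user-supplied parameters; field names are assumptions.
type HabitatSpec struct {
	// Image is the Docker image of the Habitat service to run.
	Image string `json:"image"`
	// Count is the desired number of instances.
	Count int `json:"count"`
	// Service holds Habitat-specific options such as the topology and group.
	Service Service `json:"service"`
}

type Service struct {
	Topology string `json:"topology"`
	Group    string `json:"group,omitempty"`
}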

Prerequisites

  • Habitat >= 0.52.0
  • Kubernetes cluster with version 1.9.x, 1.10.x or 1.11.x
  • Kubectl version 1.10.x or 1.11.x

Installing

Make sure you have the Go compiler installed. Follow the installation instructions on the download page to learn about GOPATH. Then run the following command:

go get -u github.com/habitat-sh/habitat-operator/cmd/habitat-operator

This will put the built binary in $GOPATH/bin. Make sure this directory is in your PATH so you can run the binary from anywhere.

Building manually from source directory

Clone the code locally:

go get -u github.com/habitat-sh/habitat-operator/cmd/habitat-operator
cd ${GOPATH:-$HOME/go}/src/github.com/habitat-sh/habitat-operator

Then build it:

make build

This command will create a habitat-operator binary in the source directory. Copy this file somewhere in your PATH.

Usage

Running outside of a Kubernetes cluster

Start the Habitat operator by running:

habitat-operator --kubeconfig ~/.kube/config

Running inside a Kubernetes cluster

Building image from source

First build the image:

make image

This will produce a habitat/habitat-operator image, which can then be deployed to your cluster.

The name of the generated Docker image can be changed with the IMAGE variable, for example make image IMAGE=mycorp/my-habitat-operator. If the habitat-operator name is fine, the REPO variable can be used instead, e.g. make image REPO=mycorp, to generate the mycorp/habitat-operator image. Use the TAG variable to change the tag (the default value is taken from git describe --tags --always) and the HUB variable to push somewhere other than the default Docker Hub.

Using release image

Habitat operator images are located here; they are tagged with the release version.

Deploying Habitat operator

Cluster with RBAC enabled

Make sure to give the Habitat operator the correct permissions, so it is able to create and monitor the resources it needs. To do so, use the manifest files located under the examples directory:

kubectl create -f examples/rbac

For more information, see the README file in the RBAC example.

Cluster with RBAC disabled

To deploy the operator inside the Kubernetes cluster, use the Deployment manifest file located under the examples directory:

kubectl create -f examples/habitat-operator-deployment.yml

Deploying an example

To create an example service run:

kubectl create -f examples/standalone/habitat.yml

This will create a single-pod deployment of an nginx Habitat service.

More examples are located in the examples directory.

Contributing

Dependency management

This project uses go dep >= v0.4.1 for dependency management.

If you add, remove or change an import, run:

dep ensure

Testing

To run unit tests locally, run:

make test

Clean up after the tests with:

make clean-test

Our current setup does not allow e2e tests to run locally; they are best run on a CI setup with Google Cloud.

Code generation

If you change one of the types in pkg/apis/habitat/v1beta1/types.go, run the code generation script with:

make codegen

habitat-operator's People

Contributors

asymmetric, defilan, iaguis, indradhanush, kosyfrances, krnowak, lilic, surajssd, zeenix


habitat-operator's Issues

Topology should default to `none`

Currently the topology options in the operator are standalone and leader/follower; we need to add an option for no topology. The default option in Habitat is none, and the operator should default to that as well, which means:

The difference between standalone and none is that a service with the none topology will never update itself.

See the further discussion here.

Switch to pflag in test

I tried using pflag, but it did not work because flag registration/parsing seems to get overridden. Maybe it's because other parts of the code or its dependencies parse flags through init methods, but it needs some further investigation.

My attempt:

type testFlag struct {
	image      string
	kubeconfig string
	externalIP string
}

// ... (tf is a *testFlag populated from the flags below)

flags := flag.NewFlagSet(os.Args[0], flag.ContinueOnError)

flags.StringVar(&tf.image, "image", "", "habitat operator image, e.g. 'kinvolk/habitat-operator'")
flags.StringVar(&tf.kubeconfig, "kubeconfig", "", "path to the kubeconfig file")
flags.StringVar(&tf.externalIP, "ip", "", "external IP, e.g. minikube ip")

flags.Parse(os.Args[2:]) // Parsing starts at os.Args[2:] because the preceding arguments are test-related flags.

Fix hard dependency to minikube in Makefile

If minikube is not present/running, building the project results in an error being displayed:

❯ make
E0830 11:51:01.433090   24624 ip.go:48] Error getting IP:  Host is not running
go build -i github.com/kinvolk/habitat-operator/cmd/operator

Add support for --peer-watch-file

The supervisor should be started with the --peer-watch-file flag if the CRD has the key topology: leader.

The flag should be passed as an argument to the container.
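
A minimal sketch of how this could look when building the container spec; the "leader" value check and the /habitat/peers path are illustrative assumptions, not the operator's actual implementation:

package controller

import corev1 "k8s.io/api/core/v1"

// addPeerWatchFileArg appends the --peer-watch-file flag to the supervisor
// container when the requested topology is leader. The path /habitat/peers is
// an illustrative assumption, not the operator's actual mount point.
func addPeerWatchFileArg(container *corev1.Container, topology string) {
	if topology != "leader" {
		return
	}
	container.Args = append(container.Args, "--peer-watch-file", "/habitat/peers")
}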

ConfigMap not found error

After deleting the SG with kubectl, the operator still receives events on the Pod handler.

In those handlers, we expect the ConfigMaps to be there, but they aren't (as they are deleted in the onDelete), so we get errors like

level=error component=controller msg="configmaps \"example-encrypted-service-group\" not found"

Increase number of workers

We can add a parameter to control how many workers are started. These workers will pop jobs from the workqueue in parallel, improving performance.

See this for an example implementation.
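
A minimal sketch, assuming the client-go workqueue package and a simplified controller, of how a configurable number of workers could drain the queue in parallel; names such as runWorkers, processNextItem and sync are illustrative:

package controller

import (
	"time"

	"k8s.io/apimachinery/pkg/util/wait"
	"k8s.io/client-go/util/workqueue"
)

// Controller is a trimmed-down, illustrative version of the operator's controller.
type Controller struct {
	queue workqueue.RateLimitingInterface
}

// runWorkers starts numWorkers goroutines that process items from the workqueue in parallel.
func (c *Controller) runWorkers(numWorkers int, stopCh <-chan struct{}) {
	for i := 0; i < numWorkers; i++ {
		// wait.Until restarts the worker loop every second until stopCh is closed.
		go wait.Until(c.worker, time.Second, stopCh)
	}
}

func (c *Controller) worker() {
	for c.processNextItem() {
	}
}

func (c *Controller) processNextItem() bool {
	key, quit := c.queue.Get()
	if quit {
		return false
	}
	defer c.queue.Done(key)

	if err := c.sync(key.(string)); err != nil {
		// Requeue with rate limiting so transient failures are retried.
		c.queue.AddRateLimited(key)
		return true
	}
	// On success, reset the item's rate-limiting history.
	c.queue.Forget(key)
	return true
}

// sync is where the reconciliation for a single object key would happen.
func (c *Controller) sync(key string) error { return nil }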

Add support for --group flag

The habitat client supports the --group flag, which allows users to start the supervisors in a specific group.

In order to support this, we need to:

  • Add a group key to the CRD
  • Pass the --group flag as an argument to the containers

Introduce workqueue

To make sure individual events don't interfere with each other, the upstream Kubernetes suggestion is for controllers to implement a workqueue.

More info can be found here.
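
A minimal sketch, assuming the client-go informer and workqueue packages, of event handlers that only enqueue object keys and leave the actual work to the worker loop; the function and variable names are illustrative:

package controller

import (
	"k8s.io/client-go/tools/cache"
	"k8s.io/client-go/util/workqueue"
)

// newQueueingHandlers returns event handlers that only enqueue "namespace/name"
// keys; the actual processing happens later, one item at a time, in the workers.
func newQueueingHandlers(queue workqueue.RateLimitingInterface) cache.ResourceEventHandlerFuncs {
	enqueue := func(obj interface{}) {
		key, err := cache.MetaNamespaceKeyFunc(obj)
		if err != nil {
			return
		}
		queue.Add(key)
	}

	return cache.ResourceEventHandlerFuncs{
		AddFunc:    enqueue,
		UpdateFunc: func(oldObj, newObj interface{}) { enqueue(newObj) },
		DeleteFunc: func(obj interface{}) {
			// Deletions may deliver a tombstone instead of the object itself.
			key, err := cache.DeletionHandlingMetaNamespaceKeyFunc(obj)
			if err != nil {
				return
			}
			queue.Add(key)
		},
	}
}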

Remove dependency on minikube

The e2e target in the Makefile calls minikube ip.

It would be good not to expect a minikube installation to be present, and get the cluster IP some other way, e.g. by parsing ~/.kube/config.

RBAC rules for Operator

Create role based access control rules for the Habitat operator.

Kubernetes RBAC was promoted to v1 in Kubernetes 1.8, and major Kubernetes distributions turn it on by default, which means that the Kubernetes apiserver denies all access to its APIs by default. RBAC is there to enable access to those APIs.

The Habitat operator makes heavy use of the Kubernetes APIs, therefore we need to document the required RBAC roles in order for users to run the Habitat operator in a secure manner.

Decrease e2e tests running time

End-to-end tests currently run for 10+ minutes.

Ideas:

  • Disable Travis' PR test
  • Find a way to only run certain tests some of the time

Errors when deleting Habitat resource

When deleting a Habitat, the following errors are displayed:

ts=2017-11-08T12:02:17.98051371+01:00 level=info component=controller msg="deleted deployment" name=example-leader-follower-habitat
ts=2017-11-08T12:02:17.983844937+01:00 level=error component=controller msg="deployments.apps \"example-leader-follower-habitat\" not found"
ts=2017-11-08T12:02:17.98428139+01:00 level=error component=controller msg="Habitat could not be synced, requeueing" msg="deployments.apps \"example-leader-follower-habitat\" not found"

The actual Habitat is removed.

This seems like it could have been introduced in #113.

Configuration flags from `glog` leak into binary

Some of the flags returned by --help seem to come from the glog library, and have no effect on our own logging (e.g. -v value).

❯ ./operator --help                                                                            
Usage of ./operator:                                                                           
  -alsologtostderr                                                                             
        log to standard error as well as files                                                 
  -kubeconfig string                                                                           
        Path to a kubeconfig. Only required if out-of-cluster.                                 
  -log_backtrace_at value                                                                      
        when logging hits line file:N, emit a stack trace                                      
  -log_dir string                                                                              
        If non-empty, write log files in this directory                                        
  -logtostderr                                                                                 
        log to standard error instead of files                                                 
  -stderrthreshold value                                                                       
        logs at or above this threshold go to stderr                                           
  -v value                                                                                     
        log level for V logs                                                                   
  -vmodule value                                                                               
        comma-separated list of pattern=N settings for file-filtered logging

Create unit tests

The unit tests should include testing individual functions in the main code base.

Use cache instead of making API calls

Whenever possible, we should use the cache returned by the cache.NewInformer function to retrieve objects, instead of making calls to the API server.

Depends on #63.
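
A minimal sketch of looking an object up in the store returned by cache.NewInformer instead of calling the apiserver; the helper name and the choice of the Deployment type are illustrative:

package controller

import (
	"fmt"

	appsv1 "k8s.io/api/apps/v1"
	"k8s.io/client-go/tools/cache"
)

// getDeploymentFromCache looks a Deployment up in the informer's local store
// instead of issuing a GET against the apiserver.
func getDeploymentFromCache(store cache.Store, namespace, name string) (*appsv1.Deployment, error) {
	obj, exists, err := store.GetByKey(namespace + "/" + name)
	if err != nil {
		return nil, err
	}
	if !exists {
		return nil, fmt.Errorf("deployment %s/%s not found in cache", namespace, name)
	}

	d, ok := obj.(*appsv1.Deployment)
	if !ok {
		return nil, fmt.Errorf("unexpected object type %T in cache", obj)
	}
	return d, nil
}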

Run E2E tests automatically

This issue is to explore ways in which we can make running our E2E tests part of the CI process.

  • Parallelizing Travis jobs using the build matrix
  • Using the -j flag in make
  • Using a cron job to periodically run the E2E tests on master
  • Using a custom bash script that only runs the E2E tests on master
  • Only running the E2E tests in the job that tests the PR merge commit (e.g. checking $TRAVIS_PULL_REQUEST in a bash script)

Add Deployment watcher

We should have watchers for all resources that affect Habitats.

We have a pod watcher, but we still need to add a Deployment one.

Depends on #63.

Release process

There should be a documented release process, i.e. the steps we need to take when doing a new release of the Habitat operator. Here are a few steps that come to mind right now:

  • Build the image with make linux
  • Tag the image with the release version, following the versioning strategy vx.x.x
  • Push the image under the version tag as well as latest to hub.docker.com
  • Bump the tag version in the Habitat operator deployment manifest file examples/habitat-operator-deployment.yml
  • Update CHANGELOG.md with release notes

Update deployment when Habitat object is updated

Currently, if the Habitat object is updated (e.g. the image name or the number of replicas is changed), the deployment is not updated. Our reconciler only handles creating the deployment if it doesn't exist yet. In order to do upgrades of a Habitat application on Kubernetes, we need to handle image name updates.

Add support for custom namespace

We should allow the user to create the Custom Object in a namespace of their choosing.

If none is specified, it will be created in the default namespace.

Create deployment `onAdd`

Once the operator has received a new CR, it needs to create a Deployment using the CR's parameters.
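
A minimal sketch of what building that Deployment could look like; the parameters (name, namespace, image, count) stand in for fields of the CR and are assumptions, not the operator's actual implementation:

package controller

import (
	appsv1 "k8s.io/api/apps/v1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// newDeploymentForHabitat builds a Deployment from the custom resource's
// parameters. The argument names mirror likely CR fields and are illustrative.
func newDeploymentForHabitat(name, namespace, image string, count int32) *appsv1.Deployment {
	labels := map[string]string{"habitat-name": name}

	return &appsv1.Deployment{
		ObjectMeta: metav1.ObjectMeta{
			Name:      name,
			Namespace: namespace,
		},
		Spec: appsv1.DeploymentSpec{
			Replicas: &count,
			Selector: &metav1.LabelSelector{MatchLabels: labels},
			Template: corev1.PodTemplateSpec{
				ObjectMeta: metav1.ObjectMeta{Labels: labels},
				Spec: corev1.PodSpec{
					Containers: []corev1.Container{{
						Name:  "habitat-service",
						Image: image,
					}},
				},
			},
		},
	}
}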

Switch to a different logging library?

log-kit has the disadvantage that the logger instance has to be (or should be) passed around for logging to be possible.

Other libraries, like glog, don't have this UX issue.

Should we switch? Some people also like log15.

Running a hab Service Group inside of k8s

This is about running a Service Group as a collection of manually created pods, and confirming that the supervisors in the SG are able to talk to one another, and, for example, elect a leader.

Two types of pods will have to be created, since we're still relying on the current --peer mechanism, and therefore we run supervisors with different flags.

Checklist

  • Pods can fetch from the internet
  • Leader election succeeds

Use StatefulSets instead of Deployment

Currently we are using Deployments to deploy our Habitat service, but we do not know what we are deploying or what type of service it is; it could be anything from a DB to a simple Rails application. We should not just assume our Habitat service will be stateless.

A couple of advantages of StatefulSets:

  • Graceful deployment and scaling
  • Stable network identity
  • Graceful deletion and termination
  • Stable, persistent storage

These would be very useful, especially if our service is, for example, a DB.

Handle Ring Key

The operator should auto-generate a ring key and/or accept one provided by the user. This way all the containers are secured at the gossip layer.

More info here.

Use ownerReferences with CRD

Using OwnerReferences allows us to define relationships between Resources, so that deleting an owner can automatically delete owned resources.

Currently, we use OwnerReferences to associate a ConfigMap with a Deployment.

It could be useful to make all Resources we create dependent on the CustomResource, but there might currently be problems with that.
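
As an illustration of the pattern, a minimal sketch of attaching an OwnerReference to a ConfigMap so it is garbage-collected together with its owning Deployment; the helper name is hypothetical, not the operator's actual code:

package controller

import (
	appsv1 "k8s.io/api/apps/v1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// setConfigMapOwner marks the ConfigMap as owned by the Deployment, so deleting
// the Deployment lets Kubernetes garbage-collect the ConfigMap as well.
func setConfigMapOwner(cm *corev1.ConfigMap, d *appsv1.Deployment) {
	isController := true
	cm.OwnerReferences = []metav1.OwnerReference{{
		APIVersion: "apps/v1",
		Kind:       "Deployment",
		Name:       d.Name,
		UID:        d.UID,
		Controller: &isController,
	}}
}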

Demo: Bind plus initial configuration

Create a demo for the operator. The demo will showcase the following features:

  • One Service group bound to a database.
  • We override the port on which the database listens and display that port information in the first service.
  • Similar to the bind demo, but also displaying how different fields in the manifest file (Habitat features) can be used together (configuration and bind feature in this case).

Rename CRD ServiceGroup

The current name, ServiceGroup, conflicts with the concept of a group in Habitat, as well as with the concept of a Service in Kubernetes; we need to come up with a better name for the CRD.

List of ideas:
