fairwindsops / polaris
Validation of best practices in your Kubernetes clusters
Home Page: https://www.fairwinds.com/polaris
License: Apache License 2.0
I am trying Polaris in a cluster with 11 namespaces, but only kube-system and polaris are shown in the report.
Any clue?
Thanks
There are some containers/pods/controllers that need access to a feature that might typically be disallowed by a Polaris configuration. For instance, a particular container might need runAsNonRoot=false to function properly.
In these cases, we could disable that particular check for that particular resource, e.g. by using an annotation on the resource. This would cut down on noise in the report, and allow every team to strive for a score of 100.
Of course, we'll want to discourage folks from adding exceptions when they're unnecessary just to bypass Polaris, so this will take some thinking in terms of UX.
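To make the UX concrete, an exemption could look something like this (the annotation key is hypothetical, purely for illustration):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  annotations:
    # hypothetical key: exempt this resource from a single named check
    polaris.reactiveops.com/exempt.runAsNonRoot: "true"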
If a fix is merged into the master branch, but not the polaris-latest branch, it won't make it into Polaris releases.
Would David have an asset here? We could also just crop the existing logo.
The new releases have changelogs listed, and the project appears to be on version 0.1.3 now, but https://github.com/reactiveops/polaris/blob/master/CHANGELOG.md has not been updated with those changes for each release.
Currently pod names are left blank. Looks like it's not super trivial to get these plumbed through.
If you run Polaris in many clusters without creating an ingress (custom DNS per cluster), it would be nice if the header area also showed the cluster name (perhaps read from the kubeconfig?).
Thoughts?
An updated configuration schema should be designed that supports all of the existing validations along with all planned validations before v1, including:
- SYS_ADMIN (warning)
- /var/run/docker.sock
As is likely obvious here, this will also need some way to differentiate between errors and warnings.
We should have a list of patterns that an image attribute must match one of, or a list of patterns that an image attribute should not match any of. Generally these would never be set at the same time, but I can't think of any reason we need to add logic to ensure that they're never both set. I think the most sensible approach here is to ensure that container.image matches at least one pattern defined in the whitelist, or does not match any patterns defined in the blacklist. We do not want to do full regex matching here, but supporting a * wildcard character would simplify things. Ideally we could create a small method that determines whether a blacklist or whitelist item matches a string, and then write a bunch of unit tests to cover use cases like:
quay.io/reactiveops*
- quay.io/reactiveops (valid)
- quay.io/reactiveops/ (valid)
- quay.io/unreactiveops (invalid)
- quay.io/reactiveops/rbac-manager (valid)

quay.io/*ops*
- quay.io/reactiveops (valid)
- quay.io/reactiveops/ (valid)
- quay.io/unreactiveops (valid)
- quay.io/reactiveops/rbac-manager (valid)

quay.io/reactiveops
- quay.io/reactiveops (valid)
- quay.io/reactiveops/ (invalid)
- quay.io/unreactiveops (invalid)
- quay.io/reactiveops/rbac-manager (invalid)
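To make this concrete, here's a minimal sketch of what that matcher method could look like (illustrative only; the package name, function name, and exact wildcard semantics are assumptions, not the actual implementation):

package validator // package name assumed

import "strings"

// patternMatches reports whether image satisfies a whitelist/blacklist
// pattern that may contain "*" wildcards, e.g. "quay.io/reactiveops*".
func patternMatches(pattern, image string) bool {
	segments := strings.Split(pattern, "*")
	if len(segments) == 1 {
		// No wildcard: require an exact match.
		return pattern == image
	}
	// The first segment must anchor at the start of the image.
	if !strings.HasPrefix(image, segments[0]) {
		return false
	}
	image = image[len(segments[0]):]
	// The last segment must anchor at the end.
	last := segments[len(segments)-1]
	if !strings.HasSuffix(image, last) {
		return false
	}
	image = image[:len(image)-len(last)]
	// Any middle segments must appear, in order, in what remains.
	for _, seg := range segments[1 : len(segments)-1] {
		idx := strings.Index(image, seg)
		if idx < 0 {
			return false
		}
		image = image[idx+len(seg):]
	}
	return true
}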
Is it possible to add custom checks?
This should verify that container.pullPolicy is not Always. A variety of similar validations are available in the codebase.
Hello,
I think it would be helpful to add a search input field that filters the statistics down to one or more specific namespaces, for when you only want to focus on those.
Using kind:
kind create cluster --config=config.yaml && export KUBECONFIG=~/.kube/kind-config-kind
kubectl apply -f https://raw.githubusercontent.com/reactiveops/polaris/master/deploy/webhook.yaml
kubectl -n polaris describe pod polaris
Expected: the webhook pod should be in a good state and running.
Actual: the pod goes into CrashLoopBackOff.
The kind config (config.yaml) used:
kind: Cluster
apiVersion: kind.sigs.k8s.io/v1alpha3
nodes:
- role: control-plane
- role: worker
We would like to get the Polaris dashboard accessible from a subpath of our domain (e.g. https://example.com/polaris). The issue is that static content is not loaded, because the deployment expects to serve it without the /polaris path prefix.
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: example.com-polaris
  namespace: polaris
  annotations:
    kubernetes.io/ingress.provider: "nginx"
    kubernetes.io/ingress.class: "nginx-ks"
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/secure-backends: "false"
    nginx.ingress.kubernetes.io/rewrite-target: "/$1"
    nginx.ingress.kubernetes.io/configuration-snippet: |
      rewrite ^(/polaris)$ $1/ permanent;
spec:
  tls:
    - hosts:
        - example.com
      secretName: star-example-com
  rules:
    - host: cluster.notifai.io
      http:
        paths:
          - path: /polaris/?(.*)
            backend:
              serviceName: polaris-dashboard
              servicePort: 8080
Ideally this should include both some basic documentation and potentially cleaning up code to make the process simpler.
So I deployed a new version, 0.2, of Polaris and it resulted in the error below. I was able to solve it by changing the ClusterRole permissions for polaris-dashboard. The error is included below for reference.
time="2019-06-21T21:12:50Z" level=info msg="Starting Polaris dashboard server on port 8080"
time="2019-06-21T21:13:19Z" level=error msg="Error fetching Nodes nodes is forbidden: User \"system:serviceaccount:polaris:polaris-dashboard\" cannot list resource \"nodes\" in API group \"\" at the cluster scope"
time="2019-06-21T21:13:19Z" level=error msg="Error fetching Kubernetes resources nodes is forbidden: User \"system:serviceaccount:polaris:polaris-dashboard\" cannot list resource \"nodes\" in API group \"\" at the cluster scope"
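For anyone hitting the same error, a sketch of the extra rule the dashboard's ClusterRole needs (the ClusterRole name here is an assumption; adjust it to match your release):

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  # name assumed; use the ClusterRole created by your Polaris install
  name: polaris-dashboard
rules:
  # grants the dashboard's ServiceAccount cluster-scoped read access to nodes
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get", "list"]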
The Kubernetes documentation states that readiness probes are not supported on initContainers:
https://kubernetes.io/docs/concepts/workloads/pods/init-containers/
Also, Init Containers do not support readiness probes because they must run to completion before the Pod can be ready.
Based on this, I think that the lack of a readiness probe should be ignored for initContainers. This would bring my cluster's score up a bit.
When running the binary release, it will break when there isn't a config.yaml file present. We should find a way to include default config directly in the binary distribution.
The example configs don't contain an example of setting the display name in YAML configs. Also, --display-name is missing from the README.md docs for flags.
Would you accept a PR that added Polaris as a Terraform module?
We are using a LimitRange for some namespaces, but Polaris still shows the "CPU requests should be set" warning.
apiVersion: v1
kind: LimitRange
metadata:
  name: limits
spec:
  limits:
    - defaultRequest:
        cpu: 50m
        memory: 256Mi
      default:
        cpu: 200m
        memory: 256Mi
      type: Container
After running polaris --dashboard I am getting:
Error creating Kubernetes client No Auth Provider found for name
Error creating Kubernetes resources No Auth Provider found for name
PS. I am using dexter - a tool for creating and authenticating kubectl users via Google's OpenID Connect.
Not entirely sure that your check for runAsNonRoot is working, or we misunderstand exactly what it's checking. We have a pod running which is set at the pod level as below, yet Polaris is still saying it shouldn't run as root... which it isn't. Since the securityContext at the container level is only for overriding what is set at the pod level, I hope that setting it at the pod level is enough. Any ideas?
securityContext:
  runAsNonRoot: true
  runAsUser: 5000
The idea here is to be able to whitelist or blacklist securityContext.capabilities. This is going to be a bit more difficult to implement, as we'll need our own concept of the defaults, since I don't think they will actually show up in the spec. Alternatively, it might be more straightforward to have separate whitelist and blacklist support for capabilities that have been added and dropped. This one will take some time to figure out, and will likely involve some changes to the config syntax.
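One hypothetical config shape, purely to illustrate the idea (none of these keys exist today):

security:
  capabilities:
    # fail if any of these capabilities are added (illustrative names)
    addedBlacklist:
      - SYS_ADMIN
      - NET_ADMIN
    # or, alternatively, only allow additions from this list
    addedWhitelist:
      - NET_BIND_SERVICE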
Ideally we should be able to use the same base set of data structures for both the webhook and dashboard, that's not the case yet. It would be good to consolidate and simplify these types as a starting point.
I prefer having a values.yaml, but I could not find one. If there is one, can you point me to it?
TIA
This involves 4 bits of configuration:
- security.runAsPrivileged: this container check should fail if container.securityContext.privileged is true
- security.runAsRootAllowed: this pod and container check should fail if pod.securityContext.runAsNonRoot or container.securityContext.runAsNonRoot is not true
- security.notReadOnlyRootFileSystem: this container check should fail if container.securityContext.readOnlyRootFileSystem is not true
- security.privilegeEscalationAllowed: this container check should fail if container.securityContext.allowPrivilegeEscalation is true
If I don't get to it first, this will also involve some changes to the existing config package's Security struct.
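For reference, a rough sketch of how that struct might end up looking (the Severity type is assumed from the existing config package; this is a discussion aid, not the final shape):

// Sketch only: field names mirror the four checks above.
type Security struct {
	RunAsPrivileged            Severity `json:"runAsPrivileged"`
	RunAsRootAllowed           Severity `json:"runAsRootAllowed"`
	NotReadOnlyRootFileSystem  Severity `json:"notReadOnlyRootFileSystem"`
	PrivilegeEscalationAllowed Severity `json:"privilegeEscalationAllowed"`
}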
I launched Polaris in my cluster and it found some problems with my deployments. I fixed a few things and re-deployed the offending applications... but there was no change to the dashboard (even after an hour or two).
If I delete the Polaris pod, it restarts and shows the expected changed errors/warnings.
How often does Polaris scan the cluster? How do I trigger a new scan?
There is a call to config.GetConfigOrDie() in resource.go:123. This attempts to load the config, but if unsuccessful it will call os.Exit(1) from inside resource.go.
From a user experience perspective, it may be better to leverage config.GetConfig() and handle err != nil by returning an empty ResourceProvider and an error message. The places where this function is invoked already have error handling, which would surface a more usable message to the user instead of calling os.Exit(1).
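Sketched out, the change might look something like this (the function name and ResourceProvider fields are assumptions based on the description above, using controller-runtime's config package and client-go):

import (
	"fmt"

	"k8s.io/client-go/kubernetes"
	"sigs.k8s.io/controller-runtime/pkg/client/config"
)

// Sketch: return an error instead of exiting the process.
func createResourceProviderFromCluster() (*ResourceProvider, error) {
	kubeConf, err := config.GetConfig()
	if err != nil {
		// The caller's existing error handling can surface this message.
		return nil, fmt.Errorf("error fetching Kubernetes config: %v", err)
	}
	api, err := kubernetes.NewForConfig(kubeConf)
	if err != nil {
		return nil, fmt.Errorf("error creating Kubernetes client: %v", err)
	}
	return &ResourceProvider{API: api}, nil
}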
Is your feature request related to a problem? Please describe.
The webhook is the most potentially disruptive piece of code here, and is undertested. The reason for this is that it's difficult to unit test. That's a big part of why it's marked as experimental.
Describe the solution you'd like
My suggestion is that we write some end-to-end tests that add the webhook to a KIND cluster and make sure resources are accepted and rejected appropriately.
Describe alternatives you've considered
It would be nice if we could unit test as well...I wonder how well it plays with the fake k8s API we have in fixtures.go
The dashboard's results.json output is not sent with an application/json content-type; it's currently set to plain text. It would be nice to have this header set so that JSON viewers in the browser can detect it. This is a trivial, nice-to-have enhancement.
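The fix should just be one header write in the handler before the results are written; something like this (handler name assumed):

// Sketch: set the JSON content type before writing the audit payload.
func resultsJSONHandler(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	// ...encode and write the audit data as before...
}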
Describe the solution you'd like
We'd like to be able to save an audit as JSON/YAML, and run the dashboard using that file
We should add a logo, a link, and a copyright notice. The copyright can be dropped into a footer. For the logo and link, I'm curious to hear what people think. A few options I can think of:
For reference, Sonobuoy Scanner is referred to everywhere as "Heptio Sonobuoy Scanner", they put their logo at the bottom, along with a link saying "Made with love by Heptio".
It seems like there should be some info about packr in the CONTRIBUTING doc so that developers know how to build the project locally for dashboard development.
Values that should be validated here include the following (a possible config shape is sketched after the list):
.spec.hostAliases
.spec.hostIPC
.spec.hostNetwork
.spec.hostPID
.spec.ports.*.hostPort
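A hypothetical config shape for these checks, just to seed discussion (the key names are invented):

networking:
  hostAliasesSet: warning
  hostIPCSet: error
  hostNetworkSet: warning
  hostPIDSet: error
  hostPortSet: warning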
The deployment template for the dashboard references the webhook image values (.Values.webhook.image.repository), but it should use the dashboard ones (.Values.dashboard.image.repository).
These are not necessary to store in the repo; they make the repo larger and slower to download. Contributors can restore dependencies with the dep utility instead.
The dashboard web page will not work in an offline environment because it tries to pull in JavaScript and CSS files from the internet. By default we prevent accessing the internet and are generally opposed to allowing access. These JavaScript and CSS files should be included in the Docker image.
I'm noticing that we have lots of logic to create ResultSummaries - basically iterating over an array of ResultMessages to count the number of successes/errors/warnings. I think it'd simplify things to have a ResultSet type:
type ResultSet struct {
	Errors    []ResultMessage
	Warnings  []ResultMessage
	Successes []ResultMessage
}
Then rather than aggregating totals, we can just use len(results.Errors). It'll also make it easier for downstream consumers to segregate errors, warnings, and successes... right now the list of ResultMessages seems to implicitly sort with error > warning > success.
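Summary counts would then fall out of the type directly, e.g. (sketch):

// Sketch: derive totals from a ResultSet instead of aggregating separately.
func (rs ResultSet) Summary() (errors, warnings, successes int) {
	return len(rs.Errors), len(rs.Warnings), len(rs.Successes)
}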
Any concerns with this approach? If not I'm happy to implement.
Steps to reproduce:
1. Install polaris on my local machine with brew
2. Grab config-full.yaml from the root of this repo, change the name to config.yaml, and make sure I'm in the same directory
3. Run polaris --dashboard. Log output: INFO[0009] Starting Polaris dashboard server on port 8080
4. Visit localhost:8080
Expected behaviour:
I see some sort of dashboard
Actual behaviour:
Console logs show
INFO[0021] Error getting template data stat /Users/<my username>/Code/polaris/templates/dashboard.gohtml: no such file or directory
Browser shows Error getting template data
I think we should use the Helm chart to generate the manifests that we keep in the deploy folder. We can keep different variants if we want, using multiple values files.
As part of this, we should have a CI check that makes sure the manifests match the current iteration of the chart, as well as linting and testing the chart.
The question then becomes: once we go public, does the chart live in reactiveops/charts, or do we keep it here? Or do we keep it here and sync it to charts?
I was trying out 0.2.1, installed through Homebrew as suggested in the README. I wanted to make Polaris fail in CI/CD when there are any error-level issues, so I ran the command with the advised flags.
polaris --audit --audit-path ./deploy/ \
--set-exit-code-on-error \
--set-exit-code-below-score 90
However, I got the error: flag provided but not defined: -set-exit-code-on-error.
$ polaris --version
Polaris version 0.2.1
$ polaris --help
Usage of polaris:
  -audit
        Runs a one-time audit.
  -audit-path string
        If specified, audits one or more YAML files instead of a cluster
  -config string
        Location of Polaris configuration file
  -dashboard
        Runs the webserver for Polaris dashboard.
  -dashboard-base-path string
        Path on which the dashboard is served (default "/")
  -dashboard-port int
        Port for the dashboard webserver (default 8080)
  -disable-webhook-config-installer
        disable the installer in the webhook server, so it won't install webhook configuration resources during bootstrapping
  -display-name string
        An optional identifier for the audit
  -kubeconfig string
        Paths to a kubeconfig. Only required if out-of-cluster.
  -log-level string
        Logrus log level (default "info")
  -master string
        The address of the Kubernetes API server. Overrides any value in kubeconfig. Only required if out-of-cluster.
  -output-file string
        Destination file for audit results
  -output-format string
        Output format for results - json, yaml, or score (default "json")
  -output-url string
        Destination URL to send audit results
  -version
        Prints the version of Polaris
  -webhook
        Runs the webhook webserver.
  -webhook-port int
        Port for the webhook webserver (default 9876)
Did I miss anything?
As referenced in our roadmap, we want to add support for a variety of additional resources, starting with those that act as parent resources such as pods, including:
Going to http://localhost:8080/details/security yields this error:
template: check-details:18:33: executing "head" at <.JSON>: can't evaluate field JSON in type *dashboard.TemplateData
Version 0.1.3.
I've tried running polaris --audit --output-format score and polaris --audit --output-format yaml > report.yaml from the command line, and I'm seeing this error:
flag provided but not defined: -output-format
The commands work when I run them using the main.go file:
go run main.go --audit --output-format score
go run main.go --audit --output-format yaml > report.yaml
To better support CI/CD and other use cases, we should add an --exit-code flag (a la git diff) that will set a non-zero exit code when the audit contains error-level issues.
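Usage in a CI step might then look like this (sketch; flag name per this proposal):

# fail the pipeline when the audit reports error-level issues
polaris --audit --exit-code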
We'd like to look into OPA as a potential way of defining custom Polaris checks. Only deliverable here is a writeup or design doc.
Currently, polaris reports a warning when cpu/memory limits and requests are missing on initContainers, but per https://kubernetes.io/docs/concepts/workloads/pods/init-containers/#resources I don't think that setting limits/requests on initContainers is standard best practice. Correct me if I'm wrong!