kube-logging / logging-operator

Logging operator for Kubernetes

Home Page: https://kube-logging.dev

License: Apache License 2.0

Go 98.39% Shell 0.59% Dockerfile 0.10% Makefile 0.77% Mustache 0.15%
cloud-native kubernetes kubernetes-operator logging operator

logging-operator's Introduction


Logging operator

The Logging Operator is now a CNCF Sandbox project.

The Logging operator solves your logging-related problems in Kubernetes environments by automating the deployment and configuration of a Kubernetes logging pipeline.

  1. The operator deploys and configures a log collector (currently a Fluent Bit DaemonSet) on every node to collect container and application logs from the node file system.
  2. Fluent Bit queries the Kubernetes API and enriches the logs with metadata about the pods, and transfers both the logs and the metadata to a log forwarder instance.
  3. The log forwarder instance receives, filters, and transforms the incoming logs, and transfers them to one or more destination outputs. The Logging operator supports Fluentd and syslog-ng as log forwarders.
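
For illustration, a minimal Logging resource that sets up this pipeline might look like the following sketch (v1beta1 API; the controlNamespace value is an assumption):

apiVersion: logging.banzaicloud.io/v1beta1
kind: Logging
metadata:
  name: default-logging
spec:
  controlNamespace: logging   # namespace the operator manages; illustrative
  fluentbit: {}               # node-level log collector (DaemonSet)
  fluentd: {}                 # log forwarder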

Your logs are always transferred on authenticated and encrypted channels.

What is this operator for?

This operator helps you bundle logging information with your applications: you can describe the behavior of your application in its charts, and the Logging operator does the rest.

Feature highlights

  • Namespace isolation
  • Native Kubernetes label selectors
  • Secure communication (TLS)
  • Configuration validation
  • Multiple flow support (multiply logs for different transformations)
  • Multiple output support (store the same logs in multiple storage backends: S3, GCS, Elasticsearch, Loki, and more)
  • Multiple logging system support (multiple Fluentd and Fluent Bit deployments on the same cluster)

Architecture

The Logging operator manages the log collectors and log forwarders of your logging infrastructure, and the routing rules that specify where you want to send your different log messages.

The log collectors are endpoint agents that collect the logs of your Kubernetes nodes and send them to the log forwarders. Logging operator currently uses Fluent Bit as log collector agents.

The log forwarder instance receives, filters, and transforms the incoming logs, and transfers them to one or more destination outputs. The Logging operator supports Fluentd and syslog-ng as log forwarders. Which log forwarder is best for you depends on your logging requirements.

You can filter and process the incoming log messages using the flow custom resource of the log forwarder to route them to the appropriate output. The outputs are the destinations where you want to send your log messages, for example, Elasticsearch, or an Amazon S3 bucket. You can also define cluster-wide outputs and flows, for example, to use a centralized output that namespaced users can reference but cannot modify. Note that flows and outputs are specific to the type of log forwarder you use (Fluentd or syslog-ng).
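
For example, a namespaced Flow that selects pods by label and routes their logs to an S3 Output could look like this sketch (field names per recent operator releases; the bucket, secret, and region values are illustrative):

apiVersion: logging.banzaicloud.io/v1beta1
kind: Output
metadata:
  name: s3-output
spec:
  s3:
    aws_key_id:
      valueFrom:
        secretKeyRef:
          name: s3-secret        # illustrative secret name
          key: awsAccessKeyId
    aws_sec_key:
      valueFrom:
        secretKeyRef:
          name: s3-secret
          key: awsSecretAccessKey
    s3_bucket: logging-bucket
    s3_region: eu-central-1
---
apiVersion: logging.banzaicloud.io/v1beta1
kind: Flow
metadata:
  name: nginx-flow
spec:
  match:
    - select:
        labels:
          app: nginx             # native Kubernetes label selector
  localOutputRefs:
    - s3-output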

You can configure the Logging operator using the following Custom Resource Definitions.

  • Logging - The Logging resource defines the logging infrastructure (the log collectors and forwarders) for your cluster that collects and transports your log messages. It also contains configurations for Fluent Bit, Fluentd, and syslog-ng.
  • CRDs for Fluentd:
    • Output - Defines a Fluentd Output for a logging flow, where the log messages are sent using Fluentd. This is a namespaced resource. See also ClusterOutput. To configure syslog-ng outputs, see SyslogNGOutput.
    • Flow - Defines a Fluentd logging flow using filters and outputs. Basically, the flow routes the selected log messages to the specified outputs. This is a namespaced resource. See also ClusterFlow. To configure syslog-ng flows, see SyslogNGFlow.
    • ClusterOutput - Defines a Fluentd output that is available from all flows and clusterflows. The operator evaluates clusteroutputs in the controlNamespace only unless allowClusterResourcesFromAllNamespaces is set to true.
    • ClusterFlow - Defines a Fluentd logging flow that collects logs from all namespaces by default. The operator evaluates clusterflows in the controlNamespace only unless allowClusterResourcesFromAllNamespaces is set to true. To configure syslog-ng clusterflows, see SyslogNGClusterFlow.
  • CRDs for syslog-ng (these resources work like their Fluentd counterparts, but are tailored to the features available via syslog-ng):
    • SyslogNGOutput - Defines a syslog-ng output for a logging flow, where the log messages are sent using syslog-ng. This is a namespaced resource. See also SyslogNGClusterOutput. To configure Fluentd outputs, see output.
    • SyslogNGFlow - Defines a syslog-ng logging flow using filters and outputs. Basically, the flow routes the selected log messages to the specified outputs. This is a namespaced resource. See also SyslogNGClusterFlow. To configure Fluentd flows, see flow.
    • SyslogNGClusterOutput - Defines a syslog-ng output that is available from all flows and clusterflows. The operator evaluates clusteroutputs in the controlNamespace only unless allowClusterResourcesFromAllNamespaces is set to true.
    • SyslogNGClusterFlow - Defines a syslog-ng logging flow that collects logs from all namespaces by default. The operator evaluates clusterflows in the controlNamespace only unless allowClusterResourcesFromAllNamespaces is set to true. To configure Fluentd clusterflows, see clusterflow.
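
A rough syslog-ng counterpart of a flow/output pair, for comparison (the match syntax follows the syslog-ng CRD documentation; the output type and its values are illustrative assumptions):

apiVersion: logging.banzaicloud.io/v1beta1
kind: SyslogNGOutput
metadata:
  name: syslog-out
spec:
  syslog:
    host: syslog.example.com   # illustrative destination
    port: 601
---
apiVersion: logging.banzaicloud.io/v1beta1
kind: SyslogNGFlow
metadata:
  name: nginx-syslogng-flow
spec:
  match:
    regexp:
      value: json.kubernetes.labels.app
      pattern: nginx
      type: string
  localOutputRefs:
    - syslog-out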

See the detailed CRDs documentation.

Quickstart


Follow these quickstart guides to try out the Logging operator!

Install

Deploy Logging Operator with our Helm chart.

Caution: The master branch is under heavy development. Use releases instead of the master branch to get stable software.
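
A minimal helmfile sketch for deploying the chart (the repository URL reflects the current kube-logging Helm charts and may differ for older releases):

repositories:
  - name: kube-logging
    url: https://kube-logging.github.io/helm-charts

releases:
  - name: logging-operator
    namespace: logging
    chart: kube-logging/logging-operator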

Support

If you encounter problems that the documentation does not address while using the Logging operator, open an issue or talk to us on the #logging-operator Discord channel.

Documentation

You can find the complete documentation on the Logging operator documentation page 📘

Contributing

If you find this project useful, help us:

  • Support the development of this project and star this repo! ⭐
  • If you use the Logging operator in a production environment, add yourself to the list of production adopters! :metal:
  • Help new users with issues they may encounter 💪
  • Send a pull request with your new features and bug fixes 🚀

Please read the Organisation's Code of Conduct!

For more information, read the developer documentation.

License

Copyright (c) 2021-2023 Cisco Systems, Inc. and its affiliates
Copyright (c) 2017-2020 Banzai Cloud, Inc.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.


logging-operator's Issues

Add Loki example

Introduce example to use Loki with logging-operator

  • Install loki from dependency
  • Add loki output plugin to fluentd/fluent-bit
  • Update Readme and create example app
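
In the current API, such an example could be as small as a single Output resource (a sketch; the Loki URL is illustrative):

apiVersion: logging.banzaicloud.io/v1beta1
kind: Output
metadata:
  name: loki-output
spec:
  loki:
    url: http://loki:3100              # illustrative in-cluster Loki endpoint
    configure_kubernetes_labels: true  # map Kubernetes metadata to Loki labels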

Handle PersistentVolumeClaim update based on StorageClass

Kubernetes 1.11 does not support updating a PersistentVolumeClaim at all. Kubernetes 1.12+ allows updating a PVC's size if the described storage class supports expandable volumes.
Implement a solution that takes this into consideration.
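
For reference, expansion only works when the backing StorageClass is marked expandable, e.g.:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: expandable-ssd
provisioner: kubernetes.io/aws-ebs   # illustrative provisioner
allowVolumeExpansion: true           # required for PVC resize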

Feature Request: parity with latest operator-sdk

I realize this might not be high priority, but it'd be great to be able to use the latest operator-sdk to test changes "locally" using operator-sdk up local. That's not currently possible due to how the project is structured:

โฏ operator-sdk up local --namespace=default --kubeconfig=XXX
INFO[0000] Running the operator locally.
FATA[0000] failed to determine operator type

osdk is looking for <project_root>/cmd/manager/main.go to determine the project type

Refactor the operator naming

Describe the problem
As we have a growing number of operators, we should use a naming convention for operators and their resources.

Proposed solution
As the domain we should use banzaicloud.io (this is a company-wide rule).

For the application group we can use: logging or log.

For the configuration (like application logging): config

Example new application config crd: config.logging.banzaicloud.io
Example fluentd specific crd: fluentd.logging.banzaicloud.io

Operator crashes because lack of proper rights

On RBAC enabled clusters the operator fails with

time="2018-11-15T12:35:18Z" level=info msg="Registering plugin: s3"
time="2018-11-15T12:35:18Z" level=info msg="Registering plugin: gcs"
time="2018-11-15T12:35:18Z" level=info msg="Registering plugin: azure"
time="2018-11-15T12:35:18Z" level=info msg="Registering plugin: parser"
time="2018-11-15T12:35:18Z" level=info msg="Gettint current environment: ns: \"pipeline-system\" pod: \"logging-operator-cf6fdc488-489bd\""
time="2018-11-15T12:35:18Z" level=error msg="replicasets.apps \"logging-operator-cf6fdc488\" is forbidden: User \"system:serviceaccount:pipeline-system:logging-operator\" cannot get replicasets.apps in the namespace \"pipeline-system\""
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x60 pc=0xf205db]

goroutine 1 [running]:
main.main()
	/go/src/github.com/banzaicloud/logging-operator/cmd/logging-operator/main.go:55 +0x1fb

How to set this setting for fluentd plugin: log_es_400_reason

Is your feature request related to a problem? Please describe.
To debug error="400 - Rejected by Elasticsearch" errors, I need to set log_es_400_reason to true.

Describe the solution you'd like
An option to set log_es_400_reason via values.yaml
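
A hypothetical sketch of what this could look like, following the Plugin parameter style used in other issues here (the parameter is not currently exposed; the plugin name is illustrative):

    - type: elasticsearch
      name: es-output
      parameters:
        - name: log_es_400_reason
          value: "true"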

Describe alternatives you've considered
N/A

Additional context

2019-07-26 17:38:07 +0000 [warn]: #0 dump an error event: error_class=Fluent::Plugin::ElasticsearchErrorHandler::ElasticsearchError error="400 - Rejected by Elasticsearch" 

Default bufferPath settings are unique and prevent reuse

Describe the bug
When setting up two different pieces of fluentd config via Plugin definitions, both configuring a Forward output, the Fluentd container started crashing with the error
config error file="/fluentd/etc/fluent.conf" error_class=Fluent::ConfigError error="Other 'forward' plugin already use same buffer path: type = forward, buffer path = /buffers/forward"

To Reproduce
Steps to reproduce the behavior:

  1. Create a CR that defines a forward output, do not set bufferPath parameter
  2. Create another CR that also defines a forward output (but don't set bufferPath)
  3. Fluentd will go into a CrashLoop

Expected behavior
Fluentd does not crash and forwards log events to both forward locations.

Additional context
The error seems to be due to a non-unique default value for bufferPath. This seems to affect other output plugins as well. A potential solution might append some unique string to the default path; the K8s resource name could be an option, or the name value set during output configuration (like in https://github.com/banzaicloud/logging-operator/blob/master/charts/nginx-logging-demo/templates/logging.yaml#L17).
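
Until the defaults are made unique, a workaround sketch is to set distinct bufferPath values explicitly in each Plugin definition (names and paths are illustrative):

    - type: forward
      name: forward-a
      parameters:
        - name: bufferPath
          value: /buffers/forward-a
    - type: forward
      name: forward-b
      parameters:
        - name: bufferPath
          value: /buffers/forward-b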

Please add `reload_on_failure true` and `reload_connections false` to elasticsearch plugin

Is your feature request related to a problem? Please describe.
I am facing the read timeout reached error:

2019-07-31 13:18:45 +0000 [warn]: #0 failed to flush the buffer. retry_time=39 next_retry_seconds=2019-07-31 13:19:17 +0000 chunk="58ef9d0d26d38d31e2579a76d7b3ad21" error_class=Fluent::Plugin::ElasticsearchOutput::RecoverableRequestFailure error="could not push logs to Elasticsearch cluster ({:host=>\"elasticsearch-elasticsearch-cluster.default\", :port=>9200, :scheme=>\"https\"}): read timeout reached"

It is suggested here and here to add the following to the output config to resolve the issue:

reconnect_on_error true
reload_on_failure true
reload_connections false

The reconnect_on_error true is already there, but reload_on_failure and reload_connections are missing.

Describe the solution you'd like
Please add reload_on_failure true and reload_connections false to the elasticsearch plugin
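
In the Plugin parameter style used elsewhere in these issues, the requested defaults would look roughly like this (a sketch; the plugin name is illustrative):

    - type: elasticsearch
      name: es-output
      parameters:
        - name: reconnect_on_error
          value: "true"
        - name: reload_on_failure
          value: "true"
        - name: reload_connections
          value: "false"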

Describe alternatives you've considered
N/A

Additional context
N/A

fluentd-app-config not updated on CRD deletion

I have deleted the CRD and was expecting fluentd-app-config to be updated.

kubectl delete -f nginx-crd.yml 
loggingoperator.logging.banzaicloud.com "nginx-logging" deleted

The logs show:

ERROR: logging before flag.Parse: E1025 04:09:16.303748       1 memcache.go:153] couldn't get resource list for custom.metrics.k8s.io/v1beta1: <nil>
ERROR: logging before flag.Parse: E1025 04:10:16.304046       1 memcache.go:153] couldn't get resource list for custom.metrics.k8s.io/v1beta1: <nil>
ERROR: logging before flag.Parse: E1025 04:11:16.303040       1 memcache.go:153] couldn't get resource list for custom.metrics.k8s.io/v1beta1: <nil>
time="2018-10-25T04:11:17Z" level=info msg="New CRD arrived &v1alpha1.LoggingOperator{TypeMeta:v1.TypeMeta{Kind:\"LoggingOperator\", APIVersion:\"logging.banzaicloud.com/v1alpha1\"}, ObjectMeta:v1.ObjectMeta{Name:\"nginx-logging\", GenerateName:\"\", Namespace:\"\", SelfLink:\"/apis/logging.banzaicloud.com/v1alpha1/loggingoperators/nginx-logging\", UID:\"050a3be7-d80c-11e8-861e-066f61791264\", ResourceVersion:\"314920657\", Generation:1, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:63676037477, loc:(*time.Location)(0x1998a20)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string{\"kubectl.kubernetes.io/last-applied-configuration\":\"{\\\"apiVersion\\\":\\\"logging.banzaicloud.com/v1alpha1\\\",\\\"kind\\\":\\\"LoggingOperator\\\",\\\"metadata\\\":{\\\"annotations\\\":{},\\\"name\\\":\\\"nginx-logging\\\"},\\\"spec\\\":{\\\"filter\\\":[{\\\"name\\\":\\\"parser-nginx\\\",\\\"parameters\\\":[{\\\"name\\\":\\\"format\\\",\\\"value\\\":\\\"/^(?\\\\u003cremote\\\\u003e[^ ]*) (?\\\\u003chost\\\\u003e[^ ]*) (?\\\\u003cuser\\\\u003e[^ ]*) \\\\\\\\[(?\\\\u003ctime\\\\u003e[^\\\\\\\\]]*)\\\\\\\\] \\\\\\\"(?\\\\u003cmethod\\\\u003e\\\\\\\\S+)(?: +(?\\\\u003cpath\\\\u003e[^\\\\\\\\\\\\\\\"]*) +\\\\\\\\S*)?\\\\\\\" (?\\\\u003ccode\\\\u003e[^ ]*) (?\\\\u003csize\\\\u003e[^ ]*)(?: \\\\\\\"(?\\\\u003creferer\\\\u003e[^\\\\\\\\\\\\\\\"]*)\\\\\\\" \\\\\\\"(?\\\\u003cagent\\\\u003e[^\\\\\\\\\\\\\\\"]*)\\\\\\\")?$/\\\"},{\\\"name\\\":\\\"timeFormat\\\",\\\"value\\\":\\\"%d/%b/%Y:%H:%M:%S %z\\\"}],\\\"type\\\":\\\"parser\\\"}],\\\"input\\\":{\\\"label\\\":{\\\"app\\\":\\\"nginx\\\"}},\\\"output\\\":[{\\\"name\\\":\\\"outputS3\\\",\\\"parameters\\\":[{\\\"name\\\":\\\"aws_key_id\\\",\\\"value\\\":\\\"test\\\"},{\\\"name\\\":\\\"aws_sec_key\\\",\\\"value\\\":\\\"test\\\"},{\\\"name\\\":\\\"s3_bucket\\\",\\\"value\\\":\\\"logging-bucket\\\"},{\\\"name\\\":\\\"s3_region\\\",\\\"value\\\":\\\"ap-northeast-1\\\"}],\\\"type\\\":\\\"s3\\\"}]}}\\n\"}, OwnerReferences:[]v1.OwnerReference(nil), Initializers:(*v1.Initializers)(nil), Finalizers:[]string(nil), ClusterName:\"\"}, Spec:v1alpha1.LoggingOperatorSpec{Input:v1alpha1.Input{Label:map[string]string{\"app\":\"nginx\"}}, Filter:[]v1alpha1.Plugin{v1alpha1.Plugin{Type:\"parser\", Name:\"parser-nginx\", Parameters:[]v1alpha1.Parameter{v1alpha1.Parameter{Name:\"format\", ValueFrom:(*v1alpha1.ValueFrom)(nil), Value:\"/^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \\\\[(?<time>[^\\\\]]*)\\\\] \\\"(?<method>\\\\S+)(?: +(?<path>[^\\\\\\\"]*) +\\\\S*)?\\\" (?<code>[^ ]*) (?<size>[^ ]*)(?: \\\"(?<referer>[^\\\\\\\"]*)\\\" \\\"(?<agent>[^\\\\\\\"]*)\\\")?$/\"}, v1alpha1.Parameter{Name:\"timeFormat\", ValueFrom:(*v1alpha1.ValueFrom)(nil), Value:\"%d/%b/%Y:%H:%M:%S %z\"}}}}, Output:[]v1alpha1.Plugin{v1alpha1.Plugin{Type:\"s3\", Name:\"outputS3\", Parameters:[]v1alpha1.Parameter{v1alpha1.Parameter{Name:\"aws_key_id\", ValueFrom:(*v1alpha1.ValueFrom)(nil), Value:\"test\"}, v1alpha1.Parameter{Name:\"aws_sec_key\", ValueFrom:(*v1alpha1.ValueFrom)(nil), Value:\"test\"}, v1alpha1.Parameter{Name:\"s3_bucket\", ValueFrom:(*v1alpha1.ValueFrom)(nil), Value:\"logging-bucket\"}, v1alpha1.Parameter{Name:\"s3_region\", ValueFrom:(*v1alpha1.ValueFrom)(nil), Value:\"ap-northeast-1\"}}}}}, Status:v1alpha1.LoggingOperatorStatus{}}"
time="2018-10-25T04:11:17Z" level=info msg="Generating configuration."
time="2018-10-25T04:11:17Z" level=info msg="Applying filter"
ERROR: logging before flag.Parse: E1025 04:12:16.303963       1 memcache.go:153] couldn't get resource list for custom.metrics.k8s.io/v1beta1: <nil>
ERROR: logging before flag.Parse: E1025 04:13:16.306224       1 memcache.go:153] couldn't get resource list for custom.metrics.k8s.io/v1beta1: <nil>
time="2018-10-25T04:13:34Z" level=info msg="Delete CRD: nginx-logging"
time="2018-10-25T04:13:34Z" level=error msg="configmaps \"fluentd-app-config\" not found"

Support Fluentd out_forward plugin

I'd like to be able to forward the output from fluentd to another fluentd instance using logging-operator (I'm currently sending to elasticsearch). The idea is I have mulitple clusters each with logging-operator and would like to configured the target fluentd with the appropriate/common filters.

Additional outputs

Additional outputs that would be useful:

  • ElasticSearch
  • Google Stackdriver
  • Sumologic

Catch-all example

I have configured an nginx specific "Plugin" which is going to ElasticSearch. How do I configure a catch-all to get all the rest?

Support for multi-container pods

Is your feature request related to a problem? Please describe.
A common deployment scenario is to use sidecar containers (e.g. Istio) that have varying log formats. I see no way of parsing my application container's output while providing alternative formatting for something like the istio-proxy container in the same pod. This is doable via Fluentd, but the operator does not seem to support it. This currently prevents me from adopting the operator.

Describe the solution you'd like
It would be nice to be able to define filters/parsing and outputs at the container level.

Describe alternatives you've considered
N/A

Additional context
This is more of a side note, but one thing I thought of to work around these kinds of issues would be to allow the Plugin resource to define an explicit Fluentd configuration. That would let users adopt the operator's Fluentd config aggregation more readily.

Demo application

  • Create simple demo app Helm chart (e.g. nginx)
  • Define logging CRD for the demo app
  • Input by label (app: nginx)
  • Filter with the standard parser filter:
    format: /^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$/
    timeFormat: %d/%b/%Y:%H:%M:%S %z
  • Output to S3 on Amazon

PV is not being created

Describe the bug

Error message when looking at the fluentd deployment.


AttachVolume.Attach failed for volume "pvc-9257a4f3-dc00-11e8-861e-066f61791264" : InvalidVolume.NotFound: The volume 'vol-0e01f05ffc211c477' does not exist. status code: 400, request id: d7130385-fc3e-4070-a0af-d7608e9c634d
Unable to mount volumes for pod "fluentd-77c8668587-szh8k_devops(929cb55f-ba74-11e9-977f-02d8cf9744bc)": timeout expired waiting for volumes to attach or mount for pod "devops"/"fluentd-77c8668587-szh8k". list of unmounted volumes=[buffer]. list of unattached volumes=[config app-config buffer default-token-wxpp9]

To Reproduce
Helm charts deployed as per instructions.

helmfile details:

repositories:
  - name: banzaicloud-stable
    url: https://kubernetes-charts.banzaicloud.com
  - name: banzaicloud-incubator
    url: http://kubernetes-charts-incubator.banzaicloud.com

releases:
  - name: banzaicloud-logging-operator
    namespace: devops
    chart: banzaicloud-stable/logging-operator

  - name: banzaicloud-logging-operator-fluent
    namespace: devops
    chart: banzaicloud-stable/logging-operator-fluent

Expected behavior
Successful deploy without errors.

Refactor plugins registry

Extend plugins for better documentation

  • add type field to plugin
  • add description to plugin
  • add parameter description to plugin
  • Generate index page for plugins

Fluentd should run with a specified serviceAccountName

Is your feature request related to a problem? Please describe.
It would be better if fluentd used a specified serviceAccount. With a specified serviceAccount, it would be easier to define a strict PodSecurityPolicy.

Describe the solution you'd like
Fluentd should run with a specified serviceAccount, similarly to fluent-bit, which uses the logging serviceAccount.

Inherit labels

We already have a workload with app: fluentd running. The logging-operator interferes with it. The selector for the service is only using app, but should be using release as well.
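
What the reporter proposes amounts to a Service selector like this sketch (the release value is a hypothetical Helm release name):

spec:
  selector:
    app: fluentd
    release: logging-operator   # hypothetical release label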

Fluentd replica count should be exposed as a configurable parameter in the fluentd cr.

Currently, the replica count for fluentd is hardcoded to 1 in the deployment.

It should be exposed in the fluentd CR as a configurable parameter.

return &appsv1.Deployment{
	ObjectMeta: templates.FluentdObjectMeta(deploymentName, util.MergeLabels(r.Fluentd.Labels, labelSelector), r.Fluentd),
	Spec: appsv1.DeploymentSpec{
		Replicas: util.IntPointer(1),

Logging-operator should install the Fluentd app config to the right namespace

Describe the bug
The operator installs the fluentd app config to the wrong namespace.

To Reproduce
Install the logging operator helm chart to the logging namespace.
Install the nginx-logging-es-demo chart to a different namespace.

Expected behavior
The operator should install the config to the namespace where fluentd is present.

Geoip module

Is your feature request related to a problem? Please describe.
I would like to use the geoip module.

Describe the solution you'd like

    - type: geoip
      name: geoip-nginx
      parameters:
        - name: geoip_lookup_keys
          value: host

Error: unable to decode "": no kind "Fluentbit" is registered for version "logging.banzaicloud.com/v1alpha1"

I tried installing logging-operator on an EC2 machine, following the steps specified at the URL below.
https://github.com/banzaicloud/logging-operator#example-logging-operator-with-elasticsearch-operator

Issue faced
When I tried to run helm install, I received the error message below.

helm install --name loggingo banzaicloud-stable/logging-operator --debug 
[debug] Created tunnel using local port: '41795'

[debug] SERVER: "127.0.0.1:41795"

[debug] Original chart version: ""
[debug] Fetched banzaicloud-stable/logging-operator to /home/centos/.helm/cache/archive/logging-operator-0.1.7.tgz

[debug] CHART PATH: /home/centos/.helm/cache/archive/logging-operator-0.1.7.tgz

Error: unable to decode "": no kind "Fluentbit" is registered for version "logging.banzaicloud.com/v1alpha1"
$ kubectl version --short
Client Version: v1.14.1
Server Version: v1.14.0

Problem with fluentd - failed to reconcile resource: unexpected resource type","errorVerbose"

Log from the logging-operator pod:

{"level":"error","ts":1551878526.2104297,"logger":"kubebuilder.controller","msg":"Reconciler error","controller":"fluentd-controller","request":"monitoring/logging-operator-fluentd","error":"failed to reconcile resource: unexpected resource type","errorVerbose":"unexpected resource type\nfailed to reconcile resource\ngithub.com/banzaicloud/logging-operator/pkg/resources/fluentd.(*Reconciler).Reconcile\n\t/go/src/github.com/banzaicloud/logging-operator/pkg/resources/fluentd/fluentd.go:66\ngithub.com/banzaicloud/logging-operator/pkg/controller/fluentd.(*ReconcileFluentd).Reconcile\n\t/go/src/github.com/banzaicloud/logging-operator/pkg/controller/fluentd/fluentd_controller.go:107\ngithub.com/banzaicloud/logging-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/src/github.com/banzaicloud/logging-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:215\ngithub.com/banzaicloud/logging-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1\n\t/go/src/github.com/banzaicloud/logging-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:158\ngithub.com/banzaicloud/logging-operator/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/src/github.com/banzaicloud/logging-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133\ngithub.com/banzaicloud/logging-operator/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/src/github.com/banzaicloud/logging-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:134\ngithub.com/banzaicloud/logging-operator/vendor/k8s.io/apimachinery/pkg/util/wait.Until\n\t/go/src/github.com/banzaicloud/logging-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1333","stacktrace":"github.com/banzaicloud/logging-operator/vendor/github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/src/github.com/banzaicloud/logging-operator/vendor/github.com/go-logr/zapr/zapr.go:128\ngithub.com/banzaicloud/logging-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/src/github.com/banzaicloud/logging-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:217\ngithub.com/banzaicloud/logging-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1\n\t/go/src/github.com/banzaicloud/logging-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:158\ngithub.com/banzaicloud/logging-operator/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/src/github.com/banzaicloud/logging-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133\ngithub.com/banzaicloud/logging-operator/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/src/github.com/banzaicloud/logging-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:134\ngithub.com/banzaicloud/logging-operator/vendor/k8s.io/apimachinery/pkg/util/wait.Until\n\t/go/src/github.com/banzaicloud/logging-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88"}

k8s version: 1.11.7
version affected: 0.1.0 and 0.1.1

Provide serviceAccountName to configuration

Describe the problem
We should provide an easy way to manage the RBAC configuration for the fluent-bit and fluentd deployments.

Proposed solution
Provide a service account override configuration option to skip the RBAC creation part, though we should provide usable defaults if it is not set.
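
A hypothetical values.yaml sketch of such an override (these keys are illustrative, not existing chart options):

fluentd:
  serviceAccount:
    create: false            # hypothetical: skip RBAC/service account creation
    name: custom-fluentd     # hypothetical: use a pre-created service account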

Please add dedot for Kubernetes labels: ERROR: Can't merge a non object mapping [kubernetes.labels.app] with an object mapping [kubernetes.labels.app]

Is your feature request related to a problem? Please describe.
When there is a "." in the labels, ES returns 400 with this error message:

ERROR: Can't merge a non object mapping [kubernetes.labels.app] with an object mapping [kubernetes.labels.app]

Describe the solution you'd like
There is a filter provided in this issue that replaces the dots in the labels. Please add something similar to the configmaps.

Describe alternatives you've considered
This PR. However, it seems to be inactive (closed).

Additional context
N/A

Logging operator reports CrashLoopBackOff when I try the steps from README to install the operator

Describe the bug

$ kubectl get pods
NAME READY STATUS RESTARTS AGE
logging-operator-795f4f66d4-pr4tb 0/1 CrashLoopBackOff 5 5m33s

kubectl logs logging-operator-795f4f66d4-pr4tb
/usr/local/go/src/runtime/asm_amd64.s:1333
panic: assignment to entry in nil map [recovered]
panic: assignment to entry in nil map
{"level":"info","ts":1551810833.3330257,"logger":"controller_plugin","msg":"Reconciling Plugin","Request.Namespace":"default","Request.Name":"nginx-logging"}
time="2019-03-05T18:33:53Z" level=info msg="Applying filter"
E0305 18:33:53.434889 1 runtime.go:69] Observed a panic: "assignment to entry in nil map" (assignment to entry in nil map)

To Reproduce
Steps to reproduce the behavior:
Follow the exact steps of the README for the logging operator.

Expected behavior
The operator pod and the Fluent Bit and Fluentd daemon sets should have been created, and kubectl get pods should have returned all the created resources.


Errors in fluentbit logs "[error] [out_fw] no upstream connections available"

Describe the bug
Tried running logging-operator on:

  1. Minikube and
  2. a 5-node cluster.

On both setups, the logs never make their way to S3 when I run an nginx app with the correct label, as shown in the example.

Digging further, the issue is not with the upload but with fluent-bit.

To Reproduce
Steps to reproduce the behavior:

  1. Follow the README and bring up the logging-operator and setup the plugin.

  2. Create a NGINX app. For instance : https://raw.githubusercontent.com/Kurento/Kubernetes/master/nginx-deployment-service.yaml
    Logs are generated:
    {"log":"192.xxx.xx.x - - [08/Mar/2019:01:21:08 +0000] \"GET / HTTP/1.1\" 304 0 \"-\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.119 Safari/537.36\" \"-\"\n","stream":"stdout","time":"2019-03-08T01:21:08.365242215Z"}

  3. But even before doing step 2, fluent-bit has not been set up properly and errors are seen in the logs:

root@master-node:~/code/logging-operator# k8 get pods
NAME                                READY   STATUS    RESTARTS   AGE
fluent-bit-daemon-4h7rp             1/1     Running   0          143m
fluent-bit-daemon-8h4nt             1/1     Running   0          143m
fluent-bit-daemon-rjkv9             1/1     Running   0          143m
fluent-bit-daemon-w4lg5             1/1     Running   0          143m
logging-operator-79b48dbc4b-j9jzs   1/1     Running   0          144m
test-fluentd-6cc8b74749-6rgtk       2/2     Running   0          94m
....
root@master-node:~/code/logging-operator#  kubectl logs fluent-bit-daemon-w4lg5 
Fluent Bit v1.0.4
 Copyright (C) Treasure Data

 [2019/03/08 00:17:32] [ info] [storage] initializing...
 [2019/03/08 00:17:32] [ info] [storage] in-memory
 [2019/03/08 00:17:32] [ info] [storage] normal synchronization mode, checksum disabled
 [2019/03/08 00:17:32] [ info] [engine] started (pid=1)
 [2019/03/08 00:17:32] [ info] [filter_kube] https=1 host=kubernetes.default.svc port=443
 [2019/03/08 00:17:32] [ info] [filter_kube] local POD info OK
 [2019/03/08 00:17:32] [ info] [filter_kube] testing connectivity with API server...
 [2019/03/08 00:17:33] [ info] [filter_kube] API server connectivity OK
 [2019/03/08 00:17:33] [ info] [http_server] listen iface=0.0.0.0 tcp_port=2020
 [2019/03/08 00:17:34] [ warn] net_tcp_fd_connect: getaddrinfo(host='fluentd.default.svc'): Name or service not known
 [2019/03/08 00:17:34] [error] [out_fw] no upstream connections available
 [2019/03/08 00:17:34] [ warn] net_tcp_fd_connect: getaddrinfo(host='fluentd.default.svc'): Name or service not known
 [2019/03/08 00:17:34] [error] [out_fw] no upstream connections available
 [2019/03/08 00:17:34] [ warn] net_tcp_fd_connect: getaddrinfo(host='fluentd.default.svc'): Name or service not known
 [2019/03/08 00:17:34] [error] [out_fw] no upstream connections available
 [2019/03/08 00:17:34] [ warn] net_tcp_fd_connect: getaddrinfo(host='fluentd.default.svc'): Name or service not known
.....
.....
.....
.....

Given that it is looking for getaddrinfo(host='fluentd.default.svc'), do we expect the running host/Pod to be named this? Does this mean that we need a multi-node cluster (which I have tried anyway, without luck)?

Appreciate the inputs. Thanks.

Expected behavior
Fluent-bit should have made the connection and picked up the logs.


Fluentd initialisation vs CRD is a race condition now

Describe the problem
Fluentd creates the fluentd-app-config, which happens only after the first CRD request:

time="2018-08-28T08:20:32Z" level=error msg="configmaps "fluentd-app-config" not found"
time="2018-08-28T08:20:32Z" level=error msg="configmaps "fluentd-app-config" not found"
time="2018-08-28T08:20:32Z" level=info msg="Fluent-bit deployed successfully"
time="2018-08-28T08:20:32Z" level=info msg="Trying to init fluentd"
time="2018-08-28T08:20:32Z" level=info msg="Fluentd Deployment does not exists!"
time="2018-08-28T08:20:32Z" level=error msg="deployments.extensions "fluentd" not found"
time="2018-08-28T08:20:32Z" level=info msg="Fluentd Deployment initialized!"

Proposed solution
Make fluentd-app-config CreateOrUpdate in both contexts.

EndToEnd Example with helm chart

Describe the solution you'd like
Create a full end-to-end example based on the BanzaiCloud logging-operator Helm chart, with nginx and an S3 storage backend.

enable logging operator chart to configure fluentd-config config map

I would like to be able to customise the fluentd instance installed by logging operator.
E.g. I would like to add some tags like the following:

    <filter *.**>
      @type record_transformer
      <record>
        cluster "mgmt.cluster"
        tag ${tag}
      </record>
    </filter>

I would like to be able to specify custom values in the fluentd-config config map (above filter).

Implement addOwnerRefToObject(...)

Describe the problem
The problem is that we create resources when initialising the operator without any CRD defined. Deleting the operator should cascade delete these resources.

Proposed solution

  1. We set the owner reference on the initialised resources. For this we need to query the ObjectReference of the operator Deployment from the Kubernetes API and set it as the owner.

  2. We move the creation of fluentd and fluent-bit to CRDs. This way we can set these resources' IDs as owner references. The deletion will be initiated when someone deletes the CRD definition.
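
For reference, a resource owned by the operator Deployment would carry an ownerReferences block like this sketch (the uid is a placeholder, queried from the live object at runtime):

metadata:
  ownerReferences:
    - apiVersion: apps/v1
      kind: Deployment
      name: logging-operator
      uid: <uid of the operator Deployment>   # placeholder; set from the API
      controller: true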

Feature Request: ability to override default docker registry

I was trying to test this out inside a very locked-down VPC on AWS EKS, but it can't connect to the outside world to get busybox, which I notice is hard-coded. It'd be cool to be able to override this and other image URLs in order to customize things.
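
Hypothetically, the chart's values.yaml could expose the registries like this (illustrative keys, not existing chart options):

image:
  repository: my-registry.example.com/banzaicloud/logging-operator   # hypothetical mirror
  tag: "0.1.7"
busybox:
  repository: my-registry.example.com/busybox                        # the hard-coded image mentioned above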

Automated Test

  • Create simple end-to-end test case
  • Create make test command to run end-to-end test with docker4mac or minikube
  • Update CircleCI to run tests
