aws / amazon-cloudwatch-logs-for-fluent-bit
A Fluent Bit output plugin for CloudWatch Logs
License: Apache License 2.0
Log groups can be tagged on creation (see the API reference), which can be helpful for cost allocation or other purposes. It would be convenient if this plugin could tag log groups that are created when the auto_create_group option is set to true.
The one wrinkle with implementing this is that I'm not sure how well the Fluent Bit config file format handles embedded maps, since we'd likely want a config option like log_group_tags that contains a map of tag key-value pairs.
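One way to sidestep the embedded-map problem is a flat, comma-separated key=value encoding, which the INI-style config format handles fine. A minimal sketch, assuming a hypothetical log_group_tags option and syntax:
    [OUTPUT]
        Name              cloudwatch
        Match             *
        region            us-east-1
        log_group_name    my-logs
        auto_create_group true
        log_group_tags    team=platform,environment=dev
(The new_log_group_tags parameter that later appears in the plugin's startup logs below uses this flat style.)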
I have set up a Kubernetes install according to the "Reducing the Log Volume From Fluent Bit (Optional)" section of https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/Container-Insights-setup-logs-FluentBit.html
using the configuration file from that page, i.e.:
fluent-bit.conf: |
[SERVICE]
Flush 5
Log_Level info
Daemon off
Parsers_File parsers.conf
HTTP_Server ${HTTP_SERVER}
HTTP_Listen 0.0.0.0
HTTP_Port ${HTTP_PORT}
storage.path /var/fluent-bit/state/flb-storage/
storage.sync normal
storage.checksum off
storage.backlog.mem_limit 5M
@INCLUDE application-log.conf
@INCLUDE dataplane-log.conf
# @INCLUDE host-log.conf
application-log.conf: |
[INPUT]
Name tail
Tag application.*
Exclude_Path /var/log/containers/cloudwatch-agent*, /var/log/containers/fluent-bit*, /var/log/containers/aws-node*, /var/log/containers/kube-proxy*
Path /var/log/containers/*.log
Docker_Mode On
Docker_Mode_Flush 5
Docker_Mode_Parser container_firstline
Parser docker
DB /var/fluent-bit/state/flb_container.db
Mem_Buf_Limit 50MB
Skip_Long_Lines On
Refresh_Interval 10
Rotate_Wait 30
storage.type filesystem
Read_from_Head ${READ_FROM_HEAD}
[INPUT]
Name tail
Tag application.*
Path /var/log/containers/cloudwatch-agent*
Docker_Mode On
Docker_Mode_Flush 5
Docker_Mode_Parser cwagent_firstline
Parser docker
DB /var/fluent-bit/state/flb_cwagent.db
Mem_Buf_Limit 5MB
Skip_Long_Lines On
Refresh_Interval 10
Read_from_Head ${READ_FROM_HEAD}
[FILTER]
Name kubernetes
Match application.*
Kube_URL https://kubernetes.default.svc:443
Kube_Tag_Prefix application.var.log.containers.
Merge_Log On
Merge_Log_Key log_processed
K8S-Logging.Parser On
K8S-Logging.Exclude Off
Labels Off
Annotations Off
[OUTPUT]
Name cloudwatch_logs
Match application.*
region ${AWS_REGION}
log_group_name /aws/containerinsights/${CLUSTER_NAME}/application
log_stream_prefix ${HOST_NAME}-
auto_create_group true
extra_user_agent container-insights
dataplane-log.conf: |
[INPUT]
Name systemd
Tag dataplane.systemd.*
Systemd_Filter _SYSTEMD_UNIT=docker.service
DB /var/fluent-bit/state/systemd.db
Path /var/log/journal
Read_From_Tail ${READ_FROM_TAIL}
[INPUT]
Name tail
Tag dataplane.tail.*
Path /var/log/containers/aws-node*, /var/log/containers/kube-proxy*
Docker_Mode On
Docker_Mode_Flush 5
Docker_Mode_Parser container_firstline
Parser docker
DB /var/fluent-bit/state/flb_dataplane_tail.db
Mem_Buf_Limit 50MB
Skip_Long_Lines On
Refresh_Interval 10
Rotate_Wait 30
storage.type filesystem
Read_from_Head ${READ_FROM_HEAD}
[FILTER]
Name modify
Match dataplane.systemd.*
Rename _HOSTNAME hostname
Rename _SYSTEMD_UNIT systemd_unit
Rename MESSAGE message
Remove_regex ^((?!hostname|systemd_unit|message).)*$
[FILTER]
Name aws
Match dataplane.*
imds_version v1
[OUTPUT]
Name cloudwatch_logs
Match dataplane.*
region ${AWS_REGION}
log_group_name /aws/containerinsights/${CLUSTER_NAME}/dataplane
log_stream_prefix ${HOST_NAME}-
auto_create_group true
extra_user_agent container-insights
parsers.conf: |
[PARSER]
Name docker
Format json
Time_Key time
Time_Format %Y-%m-%dT%H:%M:%S.%LZ
[PARSER]
Name syslog
Format regex
Regex ^(?<time>[^ ]* {1,2}[^ ]* [^ ]*) (?<host>[^ ]*) (?<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?(?:[^\:]*\:)? *(?<message>.*)$
Time_Key time
Time_Format %b %d %H:%M:%S
[PARSER]
Name container_firstline
Format regex
Regex (?<log>(?<="log":")\S(?!\.).*?)(?<!\\)".*(?<stream>(?<="stream":").*?)".*(?<time>\d{4}-\d{1,2}-\d{1,2}T\d{2}:\d{2}:\d{2}\.\w*).*(?=})
Time_Key time
Time_Format %Y-%m-%dT%H:%M:%S.%LZ
[PARSER]
Name cwagent_firstline
Format regex
Regex (?<log>(?<="log":")\d{4}[\/-]\d{1,2}[\/-]\d{1,2}[ T]\d{2}:\d{2}:\d{2}(?!\.).*?)(?<!\\)".*(?<stream>(?<="stream":").*?)".*(?<time>\d{4}-\d{1,2}-\d{1,2}T\d{2}:\d{2}:\d{2}\.\w*).*(?=})
Time_Key time
Time_Format %Y-%m-%dT%H:%M:%S.%LZ
But I am still getting a lot of messages in CloudWatch like "Sent 1 events to CloudWatch", which are not very useful. Sorry if it is a stupid question, but how do I disable these messages?
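One thing to try, assuming those lines are emitted by the plugin at info level (an assumption, not confirmed here): raise the service log level so only warnings and errors are printed.
    [SERVICE]
        Flush        5
        Log_Level    warn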
Hi,
Fluent Bit v1.3 was recently released, and I have been looking forward to making use of one of the features in that version (collectd support). I was just wondering whether creating a new release of the Docker image is on the roadmap, and what the timeline might look like. Thanks.
I am getting lots of errors on an initial log sync:
time="2019-11-25T11:46:49Z" level=error msg="[cloudwatch 0] InvalidParameterException: The batch of log events in a single PutLogEvents request cannot span more than 24 hours.\n\tstatus code: 400, request id: 9e371dc0-5efe-4803-91d9-e779cb6bc165\n"
A possible solution is an appropriate duration limiter; it could be introduced here:
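A minimal sketch of such a limiter, with hypothetical type and field names (not the plugin's actual code): before appending an event, flush the pending batch if it would otherwise span more than 24 hours.
    package cloudwatch

    import (
        "time"

        "github.com/aws/aws-sdk-go/service/cloudwatchlogs"
    )

    const maxBatchSpan = 24 * time.Hour

    // eventBatch is a hypothetical stand-in for the plugin's pending PutLogEvents batch.
    type eventBatch struct {
        events []*cloudwatchlogs.InputLogEvent
        oldest time.Time // timestamp of the earliest event in the batch
    }

    // add appends an event, flushing first if the batch would span more than 24 hours.
    func (b *eventBatch) add(event *cloudwatchlogs.InputLogEvent, flush func([]*cloudwatchlogs.InputLogEvent)) {
        ts := time.Unix(0, *event.Timestamp*int64(time.Millisecond)) // PutLogEvents timestamps are ms since epoch
        if len(b.events) == 0 {
            b.oldest = ts
        } else if ts.Sub(b.oldest) >= maxBatchSpan {
            flush(b.events) // send the current batch before it exceeds the 24h limit
            b.events = nil
            b.oldest = ts
        }
        b.events = append(b.events, event)
    }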
I see that v1.6 of upstream/vanilla Fluent Bit now includes its own output plugin for writing to CloudWatch, and says this:
This is the documentation for the core Fluent Bit CloudWatch plugin written in C. It can replace the aws/amazon-cloudwatch-logs-for-fluent-bit Golang Fluent Bit plugin released last year.
...
Check the amazon repo for the Golang plugin for details on the deprecation/migration plan for the original plugin.
However, I don't see any corresponding deprecation notice here. So, to clarify:
Suggestion: add output examples at https://github.com/fluent/fluent-bit-kubernetes-logging.
I tried spinning up a task using a FireLens fluent-bit sidecar. However, I am seeing errors that prevent the containers from starting up. I cannot see logs in CloudWatch because the fluent-bit container is failing too. To understand what is going on, I changed the log driver to awslogs on all containers, but then I get the following error when creating the ECS task definition:
An error occurred (ClientException) when calling the RegisterTaskDefinition operation: When a firelensConfiguration object is specified, at least one container has to be configured with the awsfirelens log driver.
Is it necessary to enforce this validation? Just curious.
I am trying to route my AWS Fargate container app's logs to two outputs, in our case CloudWatch and Datadog. So the same JSON log event must be routed to both outputs at the same time.
The current setup uses ECS Fargate containers with FireLens enabled for streaming the logs to Datadog. A single [OUTPUT], in this case Datadog, is easy: add the config to your task definition and it routes the logs. The trick comes in when you want to route that same log event to both CloudWatch and Datadog.
Would we be able to route the same log file to two different outputs?
CloudWatch now supports automatically generating metrics from logs sent in a custom format.
It would be great if this plugin provided support for sending logs to CloudWatch in the expected format.
I would love to see support for built-in Fluent Bit plugins (e.g. in_cpu) as well as for logs coming from an application with application-specific metrics.
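For reference, the format in question is CloudWatch embedded metric format (EMF): a JSON envelope under an _aws key that declares which record fields are metrics. A rough illustration (the field values are made up):
    {
      "_aws": {
        "Timestamp": 1617235200000,
        "CloudWatchMetrics": [
          {
            "Namespace": "MyApp",
            "Dimensions": [["Service"]],
            "Metrics": [{ "Name": "Latency", "Unit": "Milliseconds" }]
          }
        ]
      },
      "Service": "api",
      "Latency": 12
    }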
With the new templating feature, in ECS FireLens you might be tempted to do something like:
log_stream_name something-$(ecs_task_arn)
But this leads to an error:
INFO[0008] [cloudwatch 0] Log group test already exists
ERRO[0008] [cloudwatch 0] InvalidParameterException: 1 validation error detected: Value 'test-arn:aws:ecs:ap-south-1:144718711470:task/737d73bf-8c6e-44f1-aa86-7b3ae3922011' at 'logStreamName' failed to satisfy constraint: Member must satisfy regular expression pattern: [^:*]*
Because the colon character makes it an invalid stream name.
Ideally, the plugin should just strip colons from the names in this case.
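A minimal sketch of that sanitization, with a hypothetical helper name (not the plugin's actual code): CloudWatch stream names must match [^:*]*, so drop the two forbidden characters.
    package cloudwatch

    import "strings"

    // sanitizeStreamName removes the characters CloudWatch forbids in
    // log stream names (':' and '*') from a templated name.
    func sanitizeStreamName(name string) string {
        return strings.Map(func(r rune) rune {
            if r == ':' || r == '*' {
                return -1 // returning -1 drops the rune
            }
            return r
        }, name)
    }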
I'm using the parser feature of Fluent Bit to send my nginx logs to CloudWatch. Sometimes the nginx container produces some extra logs.
I couldn't find any way to prevent sending these garbage logs to CloudWatch.
It would be better if I could skip sending a specific key to CloudWatch,
something like this:
log_key_ignore log
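As a possible workaround with stock Fluent Bit (separate from this plugin), the record_modifier filter can already drop a key from every record before it reaches the output:
    [FILTER]
        Name        record_modifier
        Match       *
        Remove_key  log
Note this removes the key for all outputs, not just CloudWatch, which is why a plugin-level option would still be useful.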
We started running the DaemonSet on our dev workload, where we have fewer logs, and within a few minutes of starting it began throwing exceptions such as:
stream processor started
ThrottlingException: Rate exceeded
status code: 400, request id: 3b8372dc-ab59-11e9-bbcf-6bdcac200b1d
Our config:
fluent-bit.conf: |
[SERVICE]
Parsers_File parsers.conf
@INCLUDE input-kubernetes.conf
@INCLUDE filter-kubernetes.conf
@INCLUDE output-firehose.conf
input-kubernetes.conf: |
[INPUT]
Name tail
Tag kube.*
Path /var/log/containers/*_app_*.log
Parser docker
DB /var/log/flb_kube.db
Mem_Buf_Limit 5MB
Skip_Long_Lines On
Refresh_Interval 10
filter-kubernetes.conf: |
[FILTER]
Name kubernetes
Match kube.*
Kube_URL https://kubernetes.default.svc:443
Kube_CA_File /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
Kube_Token_File /var/run/secrets/kubernetes.io/serviceaccount/token
Kube_Tag_Prefix kube.
Merge_Log On
Merge_Log_Key log_processed
K8S-Logging.Parser On
K8S-Logging.Exclude Off
Annotations Off
output-firehose.conf: |
[OUTPUT]
Name cloudwatch
Match **
region ap-south-1
log_group_name /logs/application
log_stream_name raw
log_key log
role_arn arn:aws:iam::xx:role/xx-dev-fluentd
auto_create_group true
parsers.conf: |
[PARSER]
Name json
Format json
Time_Key time
Time_Format %d/%b/%Y:%H:%M:%S %z
[PARSER]
Name docker
Format json
Time_Key time
Time_Format %Y-%m-%dT%H:%M:%S.%L
Time_Keep On
Also, I couldn't figure out how to name the log stream after the container name.
Can we support a CloudWatch input plugin for Fluent Bit? We already have something equivalent for Fluentd.
With an input plugin we could leverage Fluent Bit's SQL stream processing.
Hi, just fyi, I created a helm chart for your fluent-bit plugin.
helm/charts#15830
Would you mind checking the content, documentation and references?
Cheers!
This statement should be removed because it shows up so often in the logs that it crowds out other output. I was running Fluent Bit in Kubernetes, and this message made it impossible for me to see the actually useful debug logs:
time="2021-01-11T02:17:45Z" level=debug msg="[cloudwatch 0] Get ECS Metadata: {Cluster: TaskARN: TaskID:}\n" func="github.com/aws/amazon-cloudwatch-logs-for-fluent-bit/cloudwatch.(*OutputPlugin).AddEvent()" file="cloudwatch.go:353"
time="2021-01-11T02:17:45Z" level=debug msg="[cloudwatch 0] Get ECS Metadata: {Cluster: TaskARN: TaskID:}\n" func="github.com/aws/amazon-cloudwatch-logs-for-fluent-bit/cloudwatch.(*OutputPlugin).AddEvent()" file="cloudwatch.go:353"
time="2021-01-11T02:17:45Z" level=debug msg="[cloudwatch 0] Get ECS Metadata: {Cluster: TaskARN: TaskID:}\n" func="github.com/aws/amazon-cloudwatch-logs-for-fluent-bit/cloudwatch.(*OutputPlugin).AddEvent()" file="cloudwatch.go:353"
time="2021-01-11T02:17:45Z" level=debug msg="[cloudwatch 0] Get ECS Metadata: {Cluster: TaskARN: TaskID:}\n" func="github.com/aws/amazon-cloudwatch-logs-for-fluent-bit/cloudwatch.(*OutputPlugin).AddEvent()" file="cloudwatch.go:353"
time="2021-01-11T02:17:45Z" level=debug msg="[cloudwatch 0] Get ECS Metadata: {Cluster: TaskARN: TaskID:}\n" func="github.com/aws/amazon-cloudwatch-logs-for-fluent-bit/cloudwatch.(*OutputPlugin).AddEvent()" file="cloudwatch.go:353"
time="2021-01-11T02:17:45Z" level=debug msg="[cloudwatch 0] Get ECS Metadata: {Cluster: TaskARN: TaskID:}\n" func="github.com/aws/amazon-cloudwatch-logs-for-fluent-bit/cloudwatch.(*OutputPlugin).AddEvent()" file="cloudwatch.go:353"
time="2021-01-11T02:17:45Z" level=debug msg="[cloudwatch 0] Get ECS Metadata: {Cluster: TaskARN: TaskID:}\n" func="github.com/aws/amazon-cloudwatch-logs-for-fluent-bit/cloudwatch.(*OutputPlugin).AddEvent()" file="cloudwatch.go:353"
time="2021-01-11T02:17:45Z" level=debug msg="[cloudwatch 0] Get ECS Metadata: {Cluster: TaskARN: TaskID:}\n" func="github.com/aws/amazon-cloudwatch-logs-for-fluent-bit/cloudwatch.(*OutputPlugin).AddEvent()" file="cloudwatch.go:353"
time="2021-01-11T02:17:45Z" level=debug msg="[cloudwatch 0] Get ECS Metadata: {Cluster: TaskARN: TaskID:}\n" func="github.com/aws/amazon-cloudwatch-logs-for-fluent-bit/cloudwatch.(*OutputPlugin).AddEvent()" file="cloudwatch.go:353"
time="2021-01-11T02:17:45Z" level=debug msg="[cloudwatch 0] Get ECS Metadata: {Cluster: TaskARN: TaskID:}\n" func="github.com/aws/amazon-cloudwatch-logs-for-fluent-bit/cloudwatch.(*OutputPlugin).AddEvent()" file="cloudwatch.go:353"
time="2021-01-11T02:17:45Z" level=debug msg="[cloudwatch 0] Get ECS Metadata: {Cluster: TaskARN: TaskID:}\n" func="github.com/aws/amazon-cloudwatch-logs-for-fluent-bit/cloudwatch.(*OutputPlugin).AddEvent()" file="cloudwatch.go:353"
time="2021-01-11T02:17:45Z" level=debug msg="[cloudwatch 0] Get ECS Metadata: {Cluster: TaskARN: TaskID:}\n" func="github.com/aws/amazon-cloudwatch-logs-for-fluent-bit/cloudwatch.(*OutputPlugin).AddEvent()" file="cloudwatch.go:353"
time="2021-01-11T02:17:45Z" level=debug msg="[cloudwatch 0] Get ECS Metadata: {Cluster: TaskARN: TaskID:}\n" func="github.com/aws/amazon-cloudwatch-logs-for-fluent-bit/cloudwatch.(*OutputPlugin).AddEvent()" file="cloudwatch.go:353"
time="2021-01-11T02:17:45Z" level=debug msg="[cloudwatch 0] Get ECS Metadata: {Cluster: TaskARN: TaskID:}\n" func="github.com/aws/amazon-cloudwatch-logs-for-fluent-bit/cloudwatch.(*OutputPlugin).AddEvent()" file="cloudwatch.go:353"
time="2021-01-11T02:17:45Z" level=debug msg="[cloudwatch 0] Get ECS Metadata: {Cluster: TaskARN: TaskID:}\n" func="github.com/aws/amazon-cloudwatch-logs-for-fluent-bit/cloudwatch.(*OutputPlugin).AddEvent()" file="cloudwatch.go:353"
time="2021-01-11T02:17:45Z" level=debug msg="[cloudwatch 0] Get ECS Metadata: {Cluster: TaskARN: TaskID:}\n" func="github.com/aws/amazon-cloudwatch-logs-for-fluent-bit/cloudwatch.(*OutputPlugin).AddEvent()" file="cloudwatch.go:353"
time="2021-01-11T02:17:45Z" level=debug msg="[cloudwatch 0] Get ECS Metadata: {Cluster: TaskARN: TaskID:}\n" func="github.com/aws/amazon-cloudwatch-logs-for-fluent-bit/cloudwatch.(*OutputPlugin).AddEvent()" file="cloudwatch.go:353"
time="2021-01-11T02:17:45Z" level=debug msg="[cloudwatch 0] Get ECS Metadata: {Cluster: TaskARN: TaskID:}\n" func="github.com/aws/amazon-cloudwatch-logs-for-fluent-bit/cloudwatch.(*OutputPlugin).AddEvent()" file="cloudwatch.go:353"
time="2021-01-11T02:17:45Z" level=debug msg="[cloudwatch 0] Get ECS Metadata: {Cluster: TaskARN: TaskID:}\n" func="github.com/aws/amazon-cloudwatch-logs-for-fluent-bit/cloudwatch.(*OutputPlugin).AddEvent()" file="cloudwatch.go:353"
time="2021-01-11T02:17:45Z" level=debug msg="[cloudwatch 0] Get ECS Metadata: {Cluster: TaskARN: TaskID:}\n" func="github.com/aws/amazon-cloudwatch-logs-for-fluent-bit/cloudwatch.(*OutputPlugin).AddEvent()" file="cloudwatch.go:353"
time="2021-01-11T02:17:45Z" level=debug msg="[cloudwatch 0] Get ECS Metadata: {Cluster: TaskARN: TaskID:}\n" func="github.com/aws/amazon-cloudwatch-logs-for-fluent-bit/cloudwatch.(*OutputPlugin).AddEvent()" file="cloudwatch.go:353"
time="2021-01-11T02:17:45Z" level=debug msg="[cloudwatch 0] Get ECS Metadata: {Cluster: TaskARN: TaskID:}\n" func="github.com/aws/amazon-cloudwatch-logs-for-fluent-bit/cloudwatch.(*OutputPlugin).AddEvent()" file="cloudwatch.go:353"
time="2021-01-11T02:17:45Z" level=debug msg="[cloudwatch 0] Get ECS Metadata: {Cluster: TaskARN: TaskID:}\n" func="github.com/aws/amazon-cloudwatch-logs-for-fluent-bit/cloudwatch.(*OutputPlugin).AddEvent()" file="cloudwatch.go:353"
time="2021-01-11T02:17:45Z" level=debug msg="[cloudwatch 0] Get ECS Metadata: {Cluster: TaskARN: TaskID:}\n" func="github.com/aws/amazon-cloudwatch-logs-for-fluent-bit/cloudwatch.(*OutputPlugin).AddEvent()" file="cloudwatch.go:353"
time="2021-01-11T02:17:45Z" level=debug msg="[cloudwatch 0] Get ECS Metadata: {Cluster: TaskARN: TaskID:}\n" func="github.com/aws/amazon-cloudwatch-logs-for-fluent-bit/cloudwatch.(*OutputPlugin).AddEvent()" file="cloudwatch.go:353"
time="2021-01-11T02:17:45Z" level=debug msg="[cloudwatch 0] Get ECS Metadata: {Cluster: TaskARN: TaskID:}\n" func="github.com/aws/amazon-cloudwatch-logs-for-fluent-bit/cloudwatch.(*OutputPlugin).AddEvent()" file="cloudwatch.go:353"
time="2021-01-11T02:17:45Z" level=debug msg="[cloudwatch 0] Get ECS Metadata: {Cluster: TaskARN: TaskID:}\n" func="github.com/aws/amazon-cloudwatch-logs-for-fluent-bit/cloudwatch.(*OutputPlugin).AddEvent()" file="cloudwatch.go:353"
time="2021-01-11T02:17:45Z" level=debug msg="[cloudwatch 0] Get ECS Metadata: {Cluster: TaskARN: TaskID:}\n" func="github.com/aws/amazon-cloudwatch-logs-for-fluent-bit/cloudwatch.(*OutputPlugin).AddEvent()" file="cloudwatch.go:353"
time="2021-01-11T02:17:45Z" level=debug msg="[cloudwatch 0] Get ECS Metadata: {Cluster: TaskARN: TaskID:}\n" func="github.com/aws/amazon-cloudwatch-logs-for-fluent-bit/cloudwatch.(*OutputPlugin).AddEvent()" file="cloudwatch.go:353"
time="2021-01-11T02:17:45Z" level=debug msg="[cloudwatch 0] Get ECS Metadata: {Cluster: TaskARN: TaskID:}\n" func="github.com/aws/amazon-cloudwatch-logs-for-fluent-bit/cloudwatch.(*OutputPlugin).AddEvent()" file="cloudwatch.go:353"
time="2021-01-11T02:17:46Z" level=debug msg="[cloudwatch 0] Get ECS Metadata: {Cluster: TaskARN: TaskID:}\n" func="github.com/aws/amazon-cloudwatch-logs-for-fluent-bit/cloudwatch.(*OutputPlugin).AddEvent()" file="cloudwatch.go:353"
time="2021-01-11T02:17:46Z" level=debug msg="[cloudwatch 0] Get ECS Metadata: {Cluster: TaskARN: TaskID:}\n" func="github.com/aws/amazon-cloudwatch-logs-for-fluent-bit/cloudwatch.(*OutputPlugin).AddEvent()" file="cloudwatch.go:353"
time="2021-01-11T02:17:46Z" level=debug msg="[cloudwatch 0] Get ECS Metadata: {Cluster: TaskARN: TaskID:}\n" func="github.com/aws/amazon-cloudwatch-logs-for-fluent-bit/cloudwatch.(*OutputPlugin).AddEvent()" file="cloudwatch.go:353"
time="2021-01-11T02:17:46Z" level=debug msg="[cloudwatch 0] Get ECS Metadata: {Cluster: TaskARN: TaskID:}\n" func="github.com/aws/amazon-cloudwatch-logs-for-fluent-bit/cloudwatch.(*OutputPlugin).AddEvent()" file="cloudwatch.go:353"
time="2021-01-11T02:17:46Z" level=debug msg="[cloudwatch 0] Get ECS Metadata: {Cluster: TaskARN: TaskID:}\n" func="github.com/aws/amazon-cloudwatch-logs-for-fluent-bit/cloudwatch.(*OutputPlugin).AddEvent()" file="cloudwatch.go:353"
time="2021-01-11T02:17:46Z" level=debug msg="[cloudwatch 0] Get ECS Metadata: {Cluster: TaskARN: TaskID:}\n" func="github.com/aws/amazon-cloudwatch-logs-for-fluent-bit/cloudwatch.(*OutputPlugin).AddEvent()" file="cloudwatch.go:353"
time="2021-01-11T02:17:46Z" level=debug msg="[cloudwatch 0] Get ECS Metadata: {Cluster: TaskARN: TaskID:}\n" func="github.com/aws/amazon-cloudwatch-logs-for-fluent-bit/cloudwatch.(*OutputPlugin).AddEvent()" file="cloudwatch.go:353"
time="2021-01-11T02:17:46Z" level=debug msg="[cloudwatch 0] Get ECS Metadata: {Cluster: TaskARN: TaskID:}\n" func="github.com/aws/amazon-cloudwatch-logs-for-fluent-bit/cloudwatch.(*OutputPlugin).AddEvent()" file="cloudwatch.go:353"
I get the following panic message on one of our EKS clusters (all the clusters have the same fluent-bit config, but only this one is getting a panic). It uses the "compatible" config described in the AWS docs, which currently uses the amazon/aws-for-fluent-bit:2.10.0 Docker image.
panic: reflect: call of reflect.Value.Index on int64 Value
goroutine 17 [running, locked to thread]:
reflect.Value.Index(0x7f4ccb2e5840, 0x1c0008cdad8, 0x86, 0x0, 0x0, 0x7f4ccb366da0, 0x1c001234c90)
/home/.gimme/versions/go1.13.linux.amd64/src/reflect/value.go:966 +0x1c8
github.com/fluent/fluent-bit-go/output.GetRecord(0x1c0014f0050, 0x1c0011c5c70, 0x1, 0x1c00125c9e0, 0x1c001234b70)
/home/go/pkg/mod/github.com/fluent/[email protected]/output/decoder.go:80 +0x106
main.FLBPluginFlushCtx(0x7f4cbc86d0e8, 0x7f4cd32050b0, 0x7f4c00068d24, 0x7f4cb76761e0, 0x28)
/cloudwatch/fluent-bit-cloudwatch.go:174 +0x1e4
main._cgoexpwrap_19a10b653c9e_FLBPluginFlushCtx(0x7f4cbc86d0e8, 0x7f4cd32050b0, 0x68d24, 0x7f4cb76761e0, 0x0)
_cgo_gotypes.go:88 +0x49
Potentially relates to fluent/fluent-bit-go#34 and fluent/fluent-bit-go#29 which seem to have been fixed over 8 months ago but the fix never made it to this repo.
In the context of service-mesh setups like App Mesh, the task or pod network is not ready to send external traffic until the Envoy proxy is ready. This results in connect failures. Container launch ordering would probably work (I could not find it in the documentation), but it is not currently possible in a Kubernetes environment.
To address this, I would recommend that this process retry before giving up.
In many cases the log_group is a function of what is being logged, e.g. instances/$SERVICENAME.
I wonder if there is a way to interpolate that, or to make log_group take its value from a key.
Thanks
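The plugin's name templating appears to cover this: log_group_name accepts $(key) variables resolved from each log record (see the formatting feature discussed elsewhere on this page). A sketch, assuming each record carries a servicename key:
    [OUTPUT]
        Name              cloudwatch
        Match             *
        region            us-east-1
        log_group_name    instances/$(servicename)
        log_stream_name   $(tag)
        auto_create_group true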
Thank you for this.
We currently mirror this repository so we can build the plugin for use on EC2 instances without Docker.
For EKS we use the AWS-provided Docker image.
Would it be possible to publish GitHub releases of the binaries alongside the existing Docker images?
If there is a public release process that is amenable to a PR, I am happy to submit one.
When running the cloudwatch output plugin with Fluent Bit 1.6.8, it gets stuck in a crash loop.
This is on a Raspberry Pi 3 running Raspbian. I compiled the cloudwatch plugin from source at the 1.6.0 tag using Go 1.12.
The system logs show this stack trace:
Dec 11 09:57:47 logr-edge-7 td-agent-bit[15876]: Fluent Bit v1.6.8
Dec 11 09:57:47 logr-edge-7 td-agent-bit[15876]: * Copyright (C) 2019-2020 The Fluent Bit Authors
Dec 11 09:57:47 logr-edge-7 td-agent-bit[15876]: * Copyright (C) 2015-2018 Treasure Data
Dec 11 09:57:47 logr-edge-7 td-agent-bit[15876]: * Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
Dec 11 09:57:47 logr-edge-7 td-agent-bit[15876]: * https://fluentbit.io
Dec 11 09:57:47 logr-edge-7 td-agent-bit[15876]: [2020/12/11 09:57:47] [ info] Configuration:
Dec 11 09:57:47 logr-edge-7 td-agent-bit[15876]: [2020/12/11 09:57:47] [ info] flush time | 5.000000 seconds
Dec 11 09:57:47 logr-edge-7 td-agent-bit[15876]: [2020/12/11 09:57:47] [ info] grace | 5 seconds
Dec 11 09:57:47 logr-edge-7 td-agent-bit[15876]: [2020/12/11 09:57:47] [ info] daemon | 0
Dec 11 09:57:47 logr-edge-7 td-agent-bit[15876]: [2020/12/11 09:57:47] [ info] ___________
Dec 11 09:57:47 logr-edge-7 td-agent-bit[15876]: [2020/12/11 09:57:47] [ info] inputs:
Dec 11 09:57:47 logr-edge-7 td-agent-bit[15876]: [2020/12/11 09:57:47] [ info] systemd
Dec 11 09:57:47 logr-edge-7 td-agent-bit[15876]: [2020/12/11 09:57:47] [ info] ___________
Dec 11 09:57:47 logr-edge-7 td-agent-bit[15876]: [2020/12/11 09:57:47] [ info] filters:
Dec 11 09:57:47 logr-edge-7 td-agent-bit[15876]: [2020/12/11 09:57:47] [ info] rewrite_tag.0
Dec 11 09:57:47 logr-edge-7 td-agent-bit[15876]: [2020/12/11 09:57:47] [ info] record_modifier.1
Dec 11 09:57:47 logr-edge-7 td-agent-bit[15876]: [2020/12/11 09:57:47] [ info] record_modifier.2
Dec 11 09:57:47 logr-edge-7 td-agent-bit[15876]: [2020/12/11 09:57:47] [ info] modify.3
Dec 11 09:57:47 logr-edge-7 td-agent-bit[15876]: [2020/12/11 09:57:47] [ info] ___________
Dec 11 09:57:47 logr-edge-7 td-agent-bit[15876]: [2020/12/11 09:57:47] [ info] outputs:
Dec 11 09:57:47 logr-edge-7 td-agent-bit[15876]: [2020/12/11 09:57:47] [ info] cloudwatch.0
Dec 11 09:57:47 logr-edge-7 td-agent-bit[15876]: [2020/12/11 09:57:47] [ info] ___________
Dec 11 09:57:47 logr-edge-7 td-agent-bit[15876]: [2020/12/11 09:57:47] [ info] collectors:
....
Dec 11 09:57:52 logr-edge-7 td-agent-bit[15876]: [signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x751c86d0]
Dec 11 09:57:52 logr-edge-7 td-agent-bit[15876]: goroutine 17 [running, locked to thread]:
Dec 11 09:57:52 logr-edge-7 td-agent-bit[15876]: runtime/internal/atomic.goLoad64(0x4468410c, 0x0, 0x0)
Dec 11 09:57:52 logr-edge-7 td-agent-bit[15876]: /usr/local/go/src/runtime/internal/atomic/atomic_arm.go:127 +0x1c
Dec 11 09:57:52 logr-edge-7 td-agent-bit[15876]: github.com/valyala/bytebufferpool.(*Pool).Get(0x44684064, 0x33)
Dec 11 09:57:52 logr-edge-7 td-agent-bit[15876]: /home/enabled/.go/pkg/mod/github.com/valyala/[email protected]/pool.go:54 +0x5c
Dec 11 09:57:52 logr-edge-7 td-agent-bit[15876]: github.com/aws/amazon-cloudwatch-logs-for-fluent-bit/cloudwatch.(*OutputPlugin).setGroupStreamNames(0x44684000, 0x445f6630)
Dec 11 09:57:52 logr-edge-7 td-agent-bit[15876]: /home/enabled/amazon-cloudwatch-logs-for-fluent-bit/cloudwatch/cloudwatch.go:503 +0x84
Dec 11 09:57:52 logr-edge-7 td-agent-bit[15876]: github.com/aws/amazon-cloudwatch-logs-for-fluent-bit/cloudwatch.(*OutputPlugin).AddEvent(0x44684000, 0x445f6630, 0x75b85c68)
Dec 11 09:57:52 logr-edge-7 td-agent-bit[15876]: /home/enabled/amazon-cloudwatch-logs-for-fluent-bit/cloudwatch/cloudwatch.go:356 +0x238
Dec 11 09:57:52 logr-edge-7 td-agent-bit[15876]: main.FLBPluginFlushCtx(0x6449c090, 0x6440c052, 0x1cf, 0x6446b340, 0x75224d50)
Dec 11 09:57:52 logr-edge-7 td-agent-bit[15876]: /home/enabled/amazon-cloudwatch-logs-for-fluent-bit/fluent-bit-cloudwatch.go:191 +0x2b8
Dec 11 09:57:52 logr-edge-7 td-agent-bit[15876]: main._cgoexpwrap_19a10b653c9e_FLBPluginFlushCtx(0x6449c090, 0x6440c052, 0x1cf, 0x6446b340, 0x0)
Dec 11 09:57:52 logr-edge-7 td-agent-bit[15876]: _cgo_gotypes.go:86 +0x34
Dec 11 09:57:52 logr-edge-7 systemd[1]: td-agent-bit.service: Main process exited, code=killed, status=6/ABRT
Dec 11 09:57:52 logr-edge-7 systemd[1]: td-agent-bit.service: Unit entered failed state.
Dec 11 09:57:52 logr-edge-7 systemd[1]: td-agent-bit.service: Failed with result 'signal'.
Any help as to what could be causing the segfault would be greatly appreciated.
I was wondering if there is a way to have the CloudWatch log stream within a log group rotate hourly.
For example, could I define the log_stream_prefix to be service-%MM-%DD-%HH and have the log stream name contain the month, day, and hour?
The goal is for the CloudWatch log group to have one stream per hour.
Thanks.
When running fluent-bit in background mode (i.e. Daemon set to yes in the conf file) and with the output plugin set to cloudwatch, the fluent-bit process seems to be always sleeping. I didn't see this issue when the output plugin was set to, say, the file plugin.
Doing an strace on the process shows it's stuck at:
epoll_pwait(1,
Is this expected, or am I doing something wrong?
Other than shipping it as a sidecar inside every application pod, or building a complex filtering configuration with potentially dozens of instances of the cloudwatch logs output plugin (one that depends on informing every node of every possible application that may or may not be scheduled on it, so it can decide ahead of time where logs should go), there appears to be no way to use this plugin to route logs for multiple applications to separate CloudWatch log groups.
A perfect solution to my problem would be configuring the log group directly from Kubernetes annotations, similar to how the kubernetes filter integrates this feature: https://docs.fluentbit.io/manual/pipeline/filters/kubernetes#annotation-examples-in-pod-definition. Failing that, it would be good to have some basic functionality for constructing a log_group plus log_stream from the tag: define a separating character/string/index and have the logs routed to multiple log groups based on their "log group prefix", so to speak.
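With the record-key templating the plugin already supports, one sketch of per-namespace routing (assuming the kubernetes filter has enriched each record with metadata) looks like this:
    [OUTPUT]
        Name              cloudwatch
        Match             kube.*
        region            us-east-1
        log_group_name    /eks/$(kubernetes['namespace_name'])
        log_stream_name   $(kubernetes['pod_name'])/$(kubernetes['container_name'])
        auto_create_group true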
It should be possible to update the log retention period via the plugin. Trying it out by setting the log_retention_days option on an existing, already-created log group didn't change the retention policy.
I'm getting an invalid time format error message, but I'm not asking to convert any time, and I don't know where it is coming from. Where is this config coming from? Which field is it trying to parse? How can I change it? Thanks a lot.
I'm using the default config from https://github.com/aws/aws-for-fluent-bit/blob/master/configs/parse-json.conf by doing:
"options":{
"config-file-type": "file",
"config-file-value": "/fluent-bit/configs/parse-json.conf"
}
Error message:
[2020/09/11 10:33:56] [ warn] [parser:json] invalid time format %d/%b/%Y:%H:%M:%S %z for '2020-09-11T10:33:56Z'
Service container definition:
{
"name": "${name}",
"image": "${image}",
"essential": true,
"portMappings": [
{
"containerPort": ${port},
"hostPort": ${port}
}
],
"logConfiguration": {
"logDriver": "awsfirelens",
"options": {
"AWS_Region": "${region}",
"AWS_Auth": "On",
"Name": "es",
"Host": "${es_logs_host}",
"Port": "443",
"tls": "On",
"Index": "logs",
"Logstash_Format": "On",
"Logstash_Prefix": "logs"
}
}
}
Firelens container definition:
{
"name": "log_router",
"image": "docker.io/amazon/aws-for-fluent-bit:latest",
"essential": true,
"firelensConfiguration": {
"type": "fluentbit",
"options":{
"config-file-type": "file",
"config-file-value": "/fluent-bit/configs/parse-json.conf"
}
},
"logConfiguration": {
"logDriver": "awslogs",
"options": {
"awslogs-create-group": "true",
"awslogs-group": "${name}_log_router",
"awslogs-region": "${region}",
"awslogs-stream-prefix": "ecs"
}
},
"memoryReservation": 50
}
@davidnewhall's awesome new formatting code means you can make the log stream and group names pretty much anything.
But what if something goes wrong?
As shown here, the plugin will build the names using the literal keys it couldn't find: #16 (comment)
This is probably OK. But the ideal user experience would be to have fallback log stream or group names that would be used if the necessary keys are not found in the logs.
I am using multiple OUTPUT configurations to CloudWatch.
The same metrics for name="cloudwatch.1" then appear several times in the Prometheus output, and it seems the metrics for name="cloudwatch.2" are somehow mislabeled as name="cloudwatch.1".
Example output is as follows:
...
fluentbit_output_proc_records_total{name="cloudwatch.0"} 4 1569458812274
fluentbit_output_proc_bytes_total{name="cloudwatch.0"} 112 1569458812274
fluentbit_output_errors_total{name="cloudwatch.0"} 0 1569458812274
fluentbit_output_retries_total{name="cloudwatch.0"} 0 1569458812274
fluentbit_output_retries_failed_total{name="cloudwatch.0"} 0 1569458812274
fluentbit_output_proc_records_total{name="cloudwatch.1"} 4 1569458812274
fluentbit_output_proc_bytes_total{name="cloudwatch.1"} 112 1569458812274
fluentbit_output_errors_total{name="cloudwatch.1"} 0 1569458812274
fluentbit_output_retries_total{name="cloudwatch.1"} 0 1569458812274
fluentbit_output_retries_failed_total{name="cloudwatch.1"} 0 1569458812274
fluentbit_output_proc_records_total{name="cloudwatch.1"} 8 1569458812274
fluentbit_output_proc_bytes_total{name="cloudwatch.1"} 224 1569458812274
fluentbit_output_errors_total{name="cloudwatch.1"} 0 1569458812274
fluentbit_output_retries_total{name="cloudwatch.1"} 0 1569458812274
fluentbit_output_retries_failed_total{name="cloudwatch.1"} 0 1569458812274
...
I used the following configuration to get the output:
[SERVICE]
Flush 5
HTTP_Server on
HTTP_Port 2020
[INPUT]
Name dummy
Dummy {"message":"dummy-A"}
Tag input-A
[INPUT]
Name dummy
Dummy {"message":"dummy-B"}
Tag input-B
[OUTPUT]
Name cloudwatch
Match input-A
log_group_name input-A
log_stream_name ${HOSTNAME}
auto_create_group true
region ${AWS_REGION}
[OUTPUT]
Name cloudwatch
Match input-B
log_group_name input-B
log_stream_name ${HOSTNAME}
auto_create_group true
region ${AWS_REGION}
[OUTPUT]
Name cloudwatch
Match *
log_group_name input-AB
log_stream_name ${HOSTNAME}
auto_create_group true
region ${AWS_REGION}
Hi,
we are using this tool in K8s as a DaemonSet and sometimes get the following error (it depends on which node the pod is running on and which other pods are running on that node):
time="2020-08-26T08:24:17Z" level=error msg="[cloudwatch 0] InvalidParameterException: Log event too large: 635616 bytes exceeds limit of 262144\n\tstatus code: 400, request id: 9d79f780-3842-4438-a73d-dc6cc54864c8\n" [2020/08/26 08:24:17] [ warn] [engine] chunk '1-1598430244.412167060.flb' cannot be retried: task_id=2, input=tail .0 > output=cloudwatch.0
So I know that there is this limit in AWS CW: https://docs.aws.amazon.com/AmazonCloudWatch/latest/events/CalculatePutEventsEntrySize.html
Is it possible to handle this issue (e.g. truncation, ...)?
Best regards,
Albert
The chart creates a service account, but how does one specify the role ARN for that service account? It is not clear from the docs.
Any reason why this is not listed in the official Fluent Bit output plugins documentation?
As shown in the screenshot below, the log messages are quoted. This is not a huge problem, but ideally they should not be in quotes.
The log value is marshaled because its underlying type is unknown: https://github.com/aws/amazon-cloudwatch-logs-for-fluent-bit/blob/master/cloudwatch/cloudwatch.go#L384
Instead, check whether its underlying type is string or []byte. If it is, convert it to a string and send that. Otherwise, marshal it to JSON.
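A minimal sketch of that check (the helper name is hypothetical):
    package cloudwatch

    import "encoding/json"

    // logString sends strings and byte slices as-is and only JSON-marshals
    // values of any other (unknown) type.
    func logString(value interface{}) (string, error) {
        switch v := value.(type) {
        case string:
            return v, nil
        case []byte:
            return string(v), nil
        default:
            data, err := json.Marshal(v)
            if err != nil {
                return "", err
            }
            return string(data), nil
        }
    }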
Hello maintainers and contributors,
I have a question: is there a way to specify a delay interval?
For example, instead of pushing logs in real time, I would like to push logs to CloudWatch every 10 minutes or more.
Hi,
How can I set log_stream_name to a <pod_name><container_name><namespace_name> format instead of the one we get with a prefix like kube.var.log.containers?
Or how can I remove the kube.var.log.containers prefix from the stream name?
FireLens supports automatically creating CloudWatch log groups and log streams during start-up. However, if the log groups and streams are deleted afterwards, they will not be recreated by FireLens, and all the logs will be lost.
Currently, log groups created by this plugin do not have a retention period set, meaning logs forwarded using this plugin will never expire from CloudWatch. But you may want to set a lower retention period for cost-saving or compliance reasons. I'm proposing a new configuration option, named something like retentionInDays, that would be applied to log groups created when auto_create_group is set to true.
I may try to open a PR for this myself; I think I understand where to make the change. But I wanted to open a feature request first to facilitate discussion.
One potential point of discussion: if we add this setting, should we only apply it to log groups this plugin creates, or should it be applied to the log group the plugin sends to regardless? I propose only created groups, to avoid the plugin making changes to existing infrastructure. (A sketch of the mechanics follows.)
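Mechanically, the AWS SDK for Go already exposes PutRetentionPolicy, so the change could look roughly like this (the option and function names come from the proposal, not existing plugin code):
    package cloudwatch

    import (
        "github.com/aws/aws-sdk-go/aws"
        "github.com/aws/aws-sdk-go/service/cloudwatchlogs"
    )

    // setRetention applies the proposed retentionInDays option to a log group
    // right after the plugin creates it.
    func setRetention(client *cloudwatchlogs.CloudWatchLogs, group string, days int64) error {
        _, err := client.PutRetentionPolicy(&cloudwatchlogs.PutRetentionPolicyInput{
            LogGroupName:    aws.String(group),
            RetentionInDays: aws.Int64(days),
        })
        return err
    }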
One of the key limitations of CloudWatch is that each logging agent must write logs to a unique log stream: if multiple agents are concurrently writing to a single log stream, you will soon get sequence token errors.
In most cases the tag, or some field in the logs, can be used to uniquely identify each instance.
However, there are some niche cases where this is not possible: for example, suppose each instance tails the same two files, application.log and service.log, and (for the sake of argument) has no access to an instance ID or unique host name in Fluent Bit. Then there is nothing each Fluent Bit instance can use to uniquely identify itself and write to a unique log stream.
Those cases are a bit contrived, but I do think this problem of "nothing uniquely identifies each Fluent Bit instance to allow it to create a different log stream" is a real one. Not a common problem, but something worth solving if there's a simple solution.
One possible solution is to support a new special template, $(uuid), which would add a UUID to your log stream or group name.
The Fluentd S3 plugin has this feature: https://github.com/fluent/fluent-plugin-s3
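A minimal sketch of the idea, assuming github.com/google/uuid and a hypothetical helper (the plugin would substitute once per instance, not per record):
    package cloudwatch

    import (
        "strings"

        "github.com/google/uuid"
    )

    // instanceUUID is generated once, so every name rendered by this Fluent Bit
    // instance shares the same unique suffix.
    var instanceUUID = uuid.NewString()

    // expandUUID substitutes the proposed $(uuid) template in a configured name.
    func expandUUID(pattern string) string {
        return strings.ReplaceAll(pattern, "$(uuid)", instanceUUID)
    }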
The README mentions connection timeouts to Firehose. Is that a copy/paste error, or is there actually a requirement for Kinesis Firehose?
In order to do advanced routing with Fluent Bit, we need access to the options fields on the task-definition side, but they are not available from the console. See the image below for a suggestion.
The options I am looking at on the task-definition side are shown in the image below.
It would be really handy if the AWS console provided options for either file or S3 config, and the ability to use custom Fluent Bit images.
Cross-posting this here from this issue in the aws-for-fluent-bit repo. Based on the discussion there, this seems to be a regression.
It looks like Fluent Bit 1.5.6 ignores the auto_create_group parameter for the AWS CloudWatch output plugin. For comparison, here's a sample log from aws-for-fluent-bit 2.6.1:
AWS for Fluent Bit Container Image Version 2.6.1
Fluent Bit v1.5.2
* Copyright (C) 2019-2020 The Fluent Bit Authors
* Copyright (C) 2015-2018 Treasure Data
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io
[2020/09/18 08:13:03] [ info] [engine] started (pid=1)
[2020/09/18 08:13:03] [ info] [storage] version=1.0.4, initializing...
[2020/09/18 08:13:03] [ info] [storage] in-memory
[2020/09/18 08:13:03] [ info] [storage] normal synchronization mode, checksum disabled, max_chunks_up=128
[2020/09/18 08:13:03] [ info] [filter:kubernetes:kubernetes.0] https=1 host=xxx.gr7.eu-west-1.eks.amazonaws.com port=443
[2020/09/18 08:13:03] [ info] [filter:kubernetes:kubernetes.0] local POD info OK
[2020/09/18 08:13:03] [ info] [filter:kubernetes:kubernetes.0] testing connectivity with API server...
[2020/09/18 08:13:03] [ info] [filter:kubernetes:kubernetes.0] API server connectivity OK
time="2020-09-18T08:13:03Z" level=info msg="[cloudwatch 0] plugin parameter log_group = '/applications/eks-fluentbit/sandbox-cluster'"
time="2020-09-18T08:13:03Z" level=info msg="[cloudwatch 0] plugin parameter log_stream_prefix = 'fluentbit-'"
time="2020-09-18T08:13:03Z" level=info msg="[cloudwatch 0] plugin parameter log_stream = ''"
time="2020-09-18T08:13:03Z" level=info msg="[cloudwatch 0] plugin parameter region = 'eu-west-1'"
time="2020-09-18T08:13:03Z" level=info msg="[cloudwatch 0] plugin parameter log_key = ''"
time="2020-09-18T08:13:03Z" level=info msg="[cloudwatch 0] plugin parameter role_arn = ''"
time="2020-09-18T08:13:03Z" level=info msg="[cloudwatch 0] plugin parameter auto_create_group = 'false'"
time="2020-09-18T08:13:03Z" level=info msg="[cloudwatch 0] plugin parameter endpoint = ''"
time="2020-09-18T08:13:03Z" level=info msg="[cloudwatch 0] plugin parameter sts_endpoint = ''"
time="2020-09-18T08:13:03Z" level=info msg="[cloudwatch 0] plugin parameter credentials_endpoint = "
time="2020-09-18T08:13:03Z" level=info msg="[cloudwatch 0] plugin parameter log_format = ''"
[2020/09/18 08:13:03] [ info] [sp] stream processor started
[2020/09/18 08:13:04] [ info] inotify_fs_add(): inode=81808779 watch_fd=1 name=/var/log/containers/app-2048-744b65db67-7vp92_default_app-2048-395f15844e966d1df1d574b7684136c8a155b32da5f07d18cc549c19e5874c2f.log
[2020/09/18 08:13:04] [ info] inotify_fs_add(): inode=95421297 watch_fd=2 name=/var/log/containers/alb-ingress-controller-66dfcf4c7b-wk4bx_kube-system_alb-ingress-controller-67c96153d9a7d84e82c5fbaaec336658d29d31791ecc19a217ad70c2e683999a.log
[2020/09/18 08:13:04] [ info] inotify_fs_add(): inode=38798863 watch_fd=3 name=/var/log/containers/aws-node-7vt9x_kube-system_aws-node-14b94d55972a532cb6962c1d3500afce936fa7bf78cac1b05ca56b78a6f9ece1.log
[2020/09/18 08:13:04] [ info] inotify_fs_add(): inode=38799693 watch_fd=4 name=/var/log/containers/coredns-6658f9f447-jftdc_kube-system_coredns-4b27f73b9ba0dae778c3eba0e899c6753bf817cd11b0b98c2fedaa7c1eda43c2.log
[2020/09/18 08:13:04] [ info] inotify_fs_add(): inode=8465476 watch_fd=5 name=/var/log/containers/metrics-server-6bdf64df8c-9qbmq_kube-system_metrics-server-694a23ea90fde6c4f2dfd5eebf1b91beb8aa58340a1f2ca68787345d53bd196b.log
... happy times ...
And here's one for 2.7.0 in our environment:
AWS for Fluent Bit Container Image Version 2.7.0
Fluent Bit v1.5.6
* Copyright (C) 2019-2020 The Fluent Bit Authors
* Copyright (C) 2015-2018 Treasure Data
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io
[2020/09/18 08:19:11] [ info] [engine] started (pid=1)
[2020/09/18 08:19:11] [ info] [storage] version=1.0.5, initializing...
[2020/09/18 08:19:11] [ info] [storage] in-memory
[2020/09/18 08:19:11] [ info] [storage] normal synchronization mode, checksum disabled, max_chunks_up=128
[2020/09/18 08:19:11] [ info] [filter:kubernetes:kubernetes.0] https=1 host=xxx.gr7.eu-west-1.eks.amazonaws.com port=443
[2020/09/18 08:19:11] [ info] [filter:kubernetes:kubernetes.0] local POD info OK
[2020/09/18 08:19:11] [ info] [filter:kubernetes:kubernetes.0] testing connectivity with API server...
[2020/09/18 08:19:11] [ info] [filter:kubernetes:kubernetes.0] API server connectivity OK
time="2020-09-18T08:19:11Z" level=info msg="[cloudwatch 0] plugin parameter log_group = '/applications/eks-fluentbit/sandbox-cluster'"
time="2020-09-18T08:19:11Z" level=info msg="[cloudwatch 0] plugin parameter log_stream_prefix = 'fluentbit-'"
time="2020-09-18T08:19:11Z" level=info msg="[cloudwatch 0] plugin parameter log_stream_name = ''"
time="2020-09-18T08:19:11Z" level=info msg="[cloudwatch 0] plugin parameter region = 'eu-west-1'"
time="2020-09-18T08:19:11Z" level=info msg="[cloudwatch 0] plugin parameter log_key = ''"
time="2020-09-18T08:19:11Z" level=info msg="[cloudwatch 0] plugin parameter role_arn = ''"
time="2020-09-18T08:19:11Z" level=info msg="[cloudwatch 0] plugin parameter new_log_group_tags = ''"
time="2020-09-18T08:19:11Z" level=info msg="[cloudwatch 0] plugin parameter log_retention_days = '0'"
time="2020-09-18T08:19:11Z" level=info msg="[cloudwatch 0] plugin parameter endpoint = ''"
time="2020-09-18T08:19:11Z" level=info msg="[cloudwatch 0] plugin parameter sts_endpoint = ''"
time="2020-09-18T08:19:11Z" level=info msg="[cloudwatch 0] plugin parameter credentials_endpoint = "
time="2020-09-18T08:19:11Z" level=info msg="[cloudwatch 0] plugin parameter log_format = ''"
[2020/09/18 08:19:11] [ info] [sp] stream processor started
[2020/09/18 08:19:11] [ info] inotify_fs_add(): inode=24126733 watch_fd=1 name=/var/log/containers/app-2048-744b65db67-p9gqg_default_app-2048-c1a892b15f1eec7813e08bb52561895073c6b0706d73f0d5dc6592bdbcbb11f0.log
[2020/09/18 08:19:11] [ info] inotify_fs_add(): inode=28402853 watch_fd=2 name=/var/log/containers/ubuntu-toolbox-79bfd58fb8-pwmjs_default_ubuntu-toolbox-940c495f99d2b36b4773dffe912679acb67abc3ee684be6c62ba168e7d8e101e.log
[2020/09/18 08:19:11] [ info] inotify_fs_add(): inode=38798125 watch_fd=3 name=/var/log/containers/aws-node-qcq2v_kube-system_aws-node-5e9d32e8025abd0ee567e3cc0259fedbe24eeab16676718d53e71002983ab101.log
[2020/09/18 08:19:11] [ info] inotify_fs_add(): inode=14686721 watch_fd=4 name=/var/log/containers/coredns-6658f9f447-wtfcj_kube-system_coredns-384a068a4e42b18a3b8739055507cd4526fe7d69f86b3c1c20a61a5030f633dd.log
[2020/09/18 08:19:11] [ info] inotify_fs_add(): inode=263537 watch_fd=5 name=/var/log/containers/vpc-admission-webhook-67646bbf89-xg94g_kube-system_vpc-admission-webhook-a03047e4ff26a078391f4bb3fc76529dd3bcd71341d95c19b3a216a543872f0d.log
[2020/09/18 08:19:11] [ info] inotify_fs_add(): inode=68163896 watch_fd=6 name=/var/log/containers/vpc-admission-webhook-deployment-6c4d68f76c-r9x49_kube-system_vpc-admission-webhook-aa186f5137b9b7d6d295710b2e30e29cc2200aeb06b4d1c838c5f883e01e6fa5.log
[2020/09/18 08:19:11] [ info] inotify_fs_add(): inode=69262005 watch_fd=7 name=/var/log/containers/vpc-resource-controller-5b5bc46646-vxh4h_kube-system_vpc-resource-controller-b0a9fd2ec356f91674156e1745d681c111fb8bbdf6c1b8a6acbbfb5349680593.log
[2020/09/18 08:19:11] [ info] inotify_fs_add(): inode=5255668 watch_fd=8 name=/var/log/containers/kube-proxy-nqc6g_kube-system_kube-proxy-0fbf8e1040048ea863d59cb7b9f74a260d2d36d03b8f6764993b0543c38ec2b8.log
[2020/09/18 08:19:11] [ info] inotify_fs_add(): inode=97625584 watch_fd=9 name=/var/log/containers/fluentbit-f4wls_kube-system_aws-for-fluent-bit-92814b8b79b757088aab1de7842e2e4a82b4a6fea800a06ffb9ff8e68705fab2.log
time="2020-09-18T08:19:17Z" level=error msg="AccessDeniedException: User: arn:aws:iam::xxx:user/fluentbit-sandbox-clus-eu-west-1 is not authorized to perform: logs:CreateLogGroup on resource: arn:aws:logs:eu-west-1:xxx:log-group:/applications/eks-fluentbit/sandbox-cluster:log-stream:\n\tstatus code: 400, request id: 069f9041-cae2-4d20-8e33-4e5706cfe914"
time="2020-09-18T08:19:17Z" level=error msg="AccessDeniedException: User: arn:aws:iam::xxx:user/fluentbit-sandbox-clus-eu-west-1 is not authorized to perform: logs:CreateLogGroup on resource: arn:aws:logs:eu-west-1:xxx:log-group:/applications/eks-fluentbit/sandbox-cluster:log-stream:\n\tstatus code: 400, request id: 1103959f-e52e-488f-b547-28a85cd591db"
time="2020-09-18T08:19:17Z" level=error msg="AccessDeniedException: User: arn:aws:iam::xxx:user/fluentbit-sandbox-clus-eu-west-1 is not authorized to perform: logs:CreateLogGroup on resource: arn:aws:logs:eu-west-1:xxx:log-group:/applications/eks-fluentbit/sandbox-cluster:log-stream:\n\tstatus code: 400, request id: 01c6e885-5ffc-4418-8c87-4f87981217dc"
time="2020-09-18T08:19:17Z" level=error msg="AccessDeniedException: User: arn:aws:iam::xxx:user/fluentbit-sandbox-clus-eu-west-1 is not authorized to perform: logs:CreateLogGroup on resource: arn:aws:logs:eu-west-1:xxx:log-group:/applications/eks-fluentbit/sandbox-cluster:log-stream:\n\tstatus code: 400, request id: fe14da71-8558-4873-997e-9fa467fa0e92"
time="2020-09-18T08:19:18Z" level=error msg="AccessDeniedException: User: arn:aws:iam::xxx:user/fluentbit-sandbox-clus-eu-west-1 is not authorized to perform: logs:CreateLogGroup on resource: arn:aws:logs:eu-west-1:xxx:log-group:/applications/eks-fluentbit/sandbox-cluster:log-stream:\n\tstatus code: 400, request id: b6f3cdd8-ef66-4d8d-bdbb-56ff9d854f38"
time="2020-09-18T08:19:18Z" level=error msg="AccessDeniedException: User: arn:aws:iam::xxx:user/fluentbit-sandbox-clus-eu-west-1 is not authorized to perform: logs:CreateLogGroup on resource: arn:aws:logs:eu-west-1:xxx:log-group:/applications/eks-fluentbit/sandbox-cluster:log-stream:\n\tstatus code: 400, request id: cd0bc34a-e4c8-44a3-ab0c-70e71ec273d6"
time="2020-09-18T08:19:18Z" level=error msg="AccessDeniedException: User: arn:aws:iam::xxx:user/fluentbit-sandbox-clus-eu-west-1 is not authorized to perform: logs:CreateLogGroup on resource: arn:aws:logs:eu-west-1:xxx:log-group:/applications/eks-fluentbit/sandbox-cluster:log-stream:\n\tstatus code: 400, request id: aac21c1f-e687-4313-93e2-02481ca40919"
time="2020-09-18T08:19:20Z" level=error msg="AccessDeniedException: User: arn:aws:iam::xxx:user/fluentbit-sandbox-clus-eu-west-1 is not authorized to perform: logs:CreateLogGroup on resource: arn:aws:logs:eu-west-1:xxx:log-group:/applications/eks-fluentbit/sandbox-cluster:log-stream:\n\tstatus code: 400, request id: a679e4b3-ccc5-41bf-b070-918c43193cfe"
Note that in the new version the following line is missing:
time="2020-09-18T08:13:03Z" level=info msg="[cloudwatch 0] plugin parameter auto_create_group = 'false'"
Also note that the log group already exists (it is provisioned via CloudFormation in our environment), so even if the auto_create_group property is not read, the plugin should not attempt to create the group, since it already exists.
Using the helm chart from: https://github.com/aws/eks-charts/tree/master/stable/aws-for-fluent-bit
Chart version: 0.1.6
amazon-cloudwatch-logs-for-fluent-bit version: 2.7.0
With the values:
values:
- firehose:
enabled: false
kinesis:
enabled: false
elasticsearch:
enabled: false
serviceAccount:
annotations:
eks.amazonaws.com/role-arn: {{ .Values.role_arn }}
cloudWatch:
logGroupName: /aws/eks/cluster/application
logStreamName: $(kubernetes['namespace_name'])/$(kubernetes['container_name'])
logStreamPrefix:
logRetentionDays: 90
I see log streams show up that contain the namespace name and container name.
But I also see one log stream with the literal name kubernetes['namespace_name']/kubernetes['container_name'].
Its log contents look related to Fluent Bit itself:
{"log":"[2021/03/20 00:41:17] [ warn] [input] tail.0 paused (mem buf overlimit)\n","stream":"stderr","time":"2021-03-20T00:41:17.799911083Z"}
...
Can this be removed or renamed?
The example in the README is for Firehose.
I am using the below config with image 1.2.2:
fluent-bit.conf: |
[SERVICE]
Flush 1
Log_Level info
Parsers_File parsers.conf
@INCLUDE input-kubernetes.conf
@INCLUDE filter-kubernetes.conf
@INCLUDE output-cloudwatch.conf
input-kubernetes.conf: |
[INPUT]
Name tail
Tag kube.*
Path              /var/log/containers/*.log
Parser docker
DB /var/log/flb_kube.db
Mem_Buf_Limit 5MB
Skip_Long_Lines On
Refresh_Interval 10
[INPUT]
Name systemd
Tag                 host.*
Systemd_Filter _SYSTEMD_UNIT=docker.service
Systemd_Filter _SYSTEMD_UNIT=kubelet.service
Systemd_Filter _SYSTEMD_UNIT=kubeproxy.service
filter-kubernetes.conf: |
[FILTER]
Name kubernetes
Match kube.*
Kube_URL https://kubernetes.default.svc:443
Kube_CA_File /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
Kube_Token_File /var/run/secrets/kubernetes.io/serviceaccount/token
Kube_Tag_Prefix kube.var.log.containers.
Merge_Log On
Merge_Log_Key log_processed
K8S-Logging.Parser On
K8S-Logging.Exclude Off
output-cloudwatch.conf: |
[OUTPUT]
Name cloudwatch
Match *
region ${REGION}
log_group_name /eks/${CLUSTER_NAME}/logs
log_stream_prefix eks
auto_create_group true
My fluent-bit DaemonSet pods are crashing with the below error:
Fluent Bit v1.2.2
Copyright (C) Treasure Data
Input plugin 'systemd' cannot be loaded
Error: You must specify an output target. Aborting
Why is it not able to detect the output for systemd?
Hi,
Any idea about this build issue?
$ make
PATH=/home/jagan/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin golint ./cloudwatch
mkdir -p ./bin
go build -buildmode c-shared -o ./bin/cloudwatch.so ./
fluent-bit-cloudwatch.go:23:2: cannot find package "github.com/aws/amazon-cloudwatch-logs-for-fluent-bit/cloudwatch" in any of:
/usr/lib/go-1.10/src/github.com/aws/amazon-cloudwatch-logs-for-fluent-bit/cloudwatch (from $GOROOT)
/home/jagan/go/src/github.com/aws/amazon-cloudwatch-logs-for-fluent-bit/cloudwatch (from $GOPATH)
fluent-bit-cloudwatch.go:24:2: cannot find package "github.com/aws/amazon-kinesis-firehose-for-fluent-bit/plugins" in any of:
/usr/lib/go-1.10/src/github.com/aws/amazon-kinesis-firehose-for-fluent-bit/plugins (from $GOROOT)
/home/jagan/go/src/github.com/aws/amazon-kinesis-firehose-for-fluent-bit/plugins (from $GOPATH)
fluent-bit-cloudwatch.go:25:2: cannot find package "github.com/fluent/fluent-bit-go/output" in any of:
/usr/lib/go-1.10/src/github.com/fluent/fluent-bit-go/output (from $GOROOT)
/home/jagan/go/src/github.com/fluent/fluent-bit-go/output (from $GOPATH)
fluent-bit-cloudwatch.go:27:2: cannot find package "github.com/sirupsen/logrus" in any of:
/usr/lib/go-1.10/src/github.com/sirupsen/logrus (from $GOROOT)
/home/jagan/go/src/github.com/sirupsen/logrus (from $GOPATH)
Makefile:26: recipe for target 'bin/cloudwatch.so' failed
make: *** [bin/cloudwatch.so] Error 1
Are you planning to add Kubernetes metadata to the logs? E.g.:
[FILTER]
Name kubernetes
Match kube.*
Merge_Log On
Keep_Log Off
K8S-Logging.Parser On
K8S-Logging.Exclude On
Or, how can I collect all the logs of a namespace and send them to a specific log_group? With the default configuration all the logs go to the same log_group. Default configuration:
[OUTPUT]
Name cloudwatch_logs
Match *
region us-east-1
log_group_name fluent-bit-cloudwatch
log_stream_prefix from-fluent-bit-
auto_create_group true
Without the k8s metadata in the logs, and without being able to send the logs of a namespace to a particular log_group, I cannot distinguish the different applications.
Hello all!
When messages are delivered to CloudWatch, their order does not match the original order. I see that the keys within each message have been sorted alphabetically:
2021-04-09T05:57:20.000+03:00
{"az":"us-west-1a","ec2_instance_id":"i-053fe634ac7beb193","log_level":"error","message":"*1088800 recv() failed (104: Co.....}
But the original order of the keys in these messages is different:
{"log_level":"error","message":"*1088800 recv() failed (104: Co.....", "az":"us-west-1a","ec2_instance_id":"i-053fe634ac7beb193"}
Is this an undocumented feature? Or is this behavior an issue in the fluent-bit parser?
Is it possible to disable this behavior (key sorting)?
PS:
I just changed the output plugin from CloudWatch (amazon-cloudwatch-logs-for-fluent-bit) to stdout and got the original key order in the messages, so this issue is related to the CW plugin.
I tried to set up multiple OUTPUTs for different log groups using Tag, but both INPUTs' events are pushed to the same log group.
This is my configuration:
[SERVICE]
Flush 5
[INPUT]
Name dummy
Dummy {"message":"dummy-A"}
Tag input-A
[INPUT]
Name dummy
Dummy {"message":"dummy-B"}
Tag input-B
[OUTPUT]
Name cloudwatch
Match input-A
log_group_name input-A
log_stream_name ${HOSTNAME}
auto_create_group true
region ${AWS_REGION}
[OUTPUT]
Name cloudwatch
Match input-B
log_group_name input-B
log_stream_name ${HOSTNAME}
auto_create_group true
region ${AWS_REGION}
Log group input-A is empty.
Log group input-B got both INPUTs' events:
07:41:28 {"message":"dummy-A"}
07:41:28 {"message":"dummy-B"}
07:41:29 {"message":"dummy-A"}
07:41:29 {"message":"dummy-B"}
07:41:30 {"message":"dummy-A"}
07:41:30 {"message":"dummy-B"}
07:41:31 {"message":"dummy-A"}
07:41:31 {"message":"dummy-B"}
...
At the moment I can't find an option for creating KMS-encrypted log groups, but I think it would be an excellent feature.
It should be possible via the Go SDK to create an encrypted log group: https://docs.aws.amazon.com/sdk-for-go/api/service/cloudwatchlogs/#CreateLogGroupInput
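A sketch of the pass-through, assuming a hypothetical kms_key_id plugin option (CreateLogGroupInput already accepts a KmsKeyId):
    package cloudwatch

    import (
        "github.com/aws/aws-sdk-go/aws"
        "github.com/aws/aws-sdk-go/service/cloudwatchlogs"
    )

    // createEncryptedGroup creates a log group whose contents CloudWatch
    // encrypts with the given customer-managed KMS key.
    func createEncryptedGroup(client *cloudwatchlogs.CloudWatchLogs, group, kmsKeyARN string) error {
        _, err := client.CreateLogGroup(&cloudwatchlogs.CreateLogGroupInput{
            LogGroupName: aws.String(group),
            KmsKeyId:     aws.String(kmsKeyARN),
        })
        return err
    }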
Rationale
I know that default_log_group_name and default_log_stream_name were introduced specifically as fallbacks for cases where the variables in log_group_name and log_stream_name fail to parse, so I can understand the logic behind having the fallbacks just be "dumb strings", without any variable/templating support.
However, I think there is a benefit to having even the fallbacks support variables. For example, falling back to $(tag) when $(optional-var) is missing: both of these dynamic log stream names are preferable to a fixed, static name, which is currently my only fallback option.
[OUTPUT]
Name cloudwatch
Match *
region us-east-1
log_group_name test-log-group
auto_create_group true
log_stream_name $(optional-var)
default_log_stream_name $(tag)
Currently, in this case, when my primary variable ($(optional-var)) doesn't exist, Fluent Bit literally creates a log stream named $(tag), instead of substituting in the Fluent Bit tag.
Edge Case
A quick note on the obvious edge case here, i.e. when the variable in the fallback/default log stream name also fails to resolve.
At this point, given that the user has now had two chances to choose a variable that will always exist, I think it's reasonable to just throw a hard error and fail, rather than attempting to automatically recover somehow (or, god forbid, creating a new default_default_log_stream_name option). But that's just my suggestion.
When I add a systemd input to my Fluent Bit config using the amazon/aws-for-fluent-bit:1.2.0 Docker image inside a Kubernetes cluster, I get the following error:
Input plugin 'systemd' cannot be loaded
Here's a configuration file that reproduces the problem:
[SERVICE]
Flush 1
Log_Level info
Parsers_File parsers.conf
[INPUT]
Name systemd
Tag host.*
Systemd_Filter _SYSTEMD_UNIT=docker.service
Path /var/log/journal
[OUTPUT]
Name stdout
Match *
In order to fix the problem, I rebuilt the Docker image using the Dockerfile below, and everything seems to be working fine now:
FROM golang:1.12 as go-build
RUN go get github.com/aws/amazon-kinesis-firehose-for-fluent-bit
WORKDIR /go/src/github.com/aws/amazon-kinesis-firehose-for-fluent-bit
RUN make release
RUN go get github.com/aws/amazon-cloudwatch-logs-for-fluent-bit
WORKDIR /go/src/github.com/aws/amazon-cloudwatch-logs-for-fluent-bit
RUN make release
FROM fluent/fluent-bit:1.2.2
COPY --from=go-build /go/src/github.com/aws/amazon-kinesis-firehose-for-fluent-bit/bin/firehose.so /fluent-bit/firehose.so
COPY --from=go-build /go/src/github.com/aws/amazon-cloudwatch-logs-for-fluent-bit/bin/cloudwatch.so /fluent-bit/cloudwatch.so
# Optional Metrics endpoint
EXPOSE 2020
# Entry point
CMD ["/fluent-bit/bin/fluent-bit", "-e", "/fluent-bit/firehose.so", "-e", "/fluent-bit/cloudwatch.so", "-c", "/fluent-bit/etc/fluent-bit.conf"]