
fluent-plugin-s3's Introduction

Amazon S3 plugin for Fluentd


Overview

The s3 output plugin buffers event logs in local files and uploads them to S3 periodically.

This plugin splits files by the time of the event logs (not the time the logs are received). For example, if a log '2011-01-02 message A' arrives and then another log '2011-01-03 message B' arrives, in that order, the former is stored in the "20110102.gz" file and the latter in the "20110103.gz" file.

The s3 input plugin reads data from S3 periodically. It uses an SQS queue in the same region as the S3 bucket. You must set up the SQS queue and S3 event notification before using this plugin.
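For reference, a minimal input configuration is sketched below (bucket, region, and queue names are placeholders, and credentials are omitted; the queue must already be wired to the bucket's event notifications):

<source>
  @type s3
  s3_bucket your-bucket
  s3_region us-east-1
  <sqs>
    queue_name your-queue
  </sqs>
</source>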

⚠️ Be sure to keep a close eye on S3 costs, as a few users have reported unexpectedly high costs.

Requirements

fluent-plugin-s3    fluentd        ruby
>= 1.0.0            >= v0.14.0     >= 2.1
<  1.0.0            >= v0.12.0     >= 1.9

Installation

Simply use RubyGems:

# install latest version
$ gem install fluent-plugin-s3 --no-document # for fluentd v1.0 or later
# If you need to install a specific version, use the -v option
$ gem install fluent-plugin-s3 -v 1.3.0 --no-document
# For v0.12. This is for existing v0.12 users; don't use v0.12 for new deployments
$ gem install fluent-plugin-s3 -v "~> 0.8" --no-document # for fluentd v0.12

Configuration: credentials

Both the S3 input and output plugins provide several credential methods for authentication/authorization.

See Configuration: credentials for details.

Output Plugin

See Configuration: Output for details.

Input Plugin

See Configuration: Input for details.

Tips and How to

Migration guide

See Migration guide from v0.12 for details.

Website, license, et al.

Web site http://fluentd.org/
Documents http://docs.fluentd.org/
Source repository http://github.com/fluent/fluent-plugin-s3
Discussion http://groups.google.com/group/fluentd
Author Sadayuki Furuhashi
Copyright (c) 2011 FURUHASHI Sadayuki
License Apache License, Version 2.0

fluent-plugin-s3's People

Contributors

ashie, atkinsj, baconmania, cmarodrigues, cosmo0920, daipom, dependabot[bot], duck8823, frsyuki, funtusov, ganmacs, gregsymons, hotchpotch, ijin, jensraaby, jordo1138, kanga333, kenhys, kzk, mumoshu, naoya, okkez, petergillardmoss, repeatedly, riquemon, skizot722, sonots, takus, yannicktekulve, zohararad


fluent-plugin-s3's Issues

Pluggable compression for user-specific algorithms

The base class is below:

class Compressor
  include Configurable

  # Read compressor-specific options from the plugin configuration.
  def configure(conf)
  end

  # File extension for uploaded objects, e.g. "gz".
  def ext
  end

  # Content type for uploaded objects, e.g. "application/x-gzip".
  def mime_type
  end

  # Compress the buffered chunk into the tmp file before upload.
  def compress(chunk, tmp)
  end
end

Use Fluent::Registry to register new algorithms with the fluent/plugin/s3_compress prefix.
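For illustration, a hypothetical compressor following this base class might look like the sketch below; the Fluent::Registry call mirrors the proposal, and its exact signature and the file-naming scheme are assumptions here:

class XzCompressor < Compressor
  def ext
    'xz'
  end

  def mime_type
    'application/x-xz'
  end

  def compress(chunk, tmp)
    # Sketch: a real implementation would pipe the chunk through the
    # external xz command; here we just write it out as-is.
    chunk.write_to(tmp)
  end
end

# Register under the proposed prefix so `store_as xz` could be lazily
# loaded from fluent/plugin/s3_compress_xz.rb (assumed lookup scheme):
COMPRESSOR_REGISTRY = Fluent::Registry.new(:s3_compressor, 'fluent/plugin/s3_compress_')
COMPRESSOR_REGISTRY.register(:xz, XzCompressor)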

What are the minimum permissions on S3 policy?

I want to attach the most restrictive policy possible to the key used by the plugin. I have this fairly restrictive policy in place, and it seems to work.

Am I missing any actions that I should allow?

Are there any actions I can remove from the list below?

{ "Version": "2012-10-17", "Statement": [ { "Action": [ "s3:AbortMultipartUpload", "s3:CreateBucket", "s3:ListAllMyBuckets", "s3:ListBucket", "s3:ListBucketMultipartUploads", "s3:ListBucketVersions", "s3:ListMultipartUploadParts", "s3:PutObject" ], "Sid": "Stmt1XXXXXXXXXXXXX", "Resource": [ "arn:aws:s3:::myfluentdbucket" ], "Effect": "Allow" } ] }

s3_object_key_format variable assignments

Hi,

I want to create logs like path/current_date/current_time.gz, with a new file created every hour.

We have time_slice, but what do I need to set for current_date?
s3_object_key_format %{path}/%current_date/%{time_slice}.%{file_extension}

time_slice_format %Y%m%d-%H

Please help.
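For reference, a sketch that should produce this layout (untested; it assumes time_slice_format may contain slashes, with the formatted slice substituted into %{time_slice}):

s3_object_key_format %{path}%{time_slice}.%{file_extension}
time_slice_format %Y%m%d/%H

With path logs/, this yields keys like logs/20140925/11.gz, i.e. one file per hour inside a per-date directory.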

Config interpolation no longer working

Hi

Since upgrading to 0.6.0 I've had to update my config files because the interpolation has suddenly stopped working.

I used to have this:

aws_key_id "#{ENV['S3_KEY']}"
aws_sec_key "#{ENV['S3_SECRET']}"

These were both set in the /etc/environment file and were working perfectly with version 0.5.11. I only upgraded to 0.6.0 and restarted my td-agent service, and then got this log trace:

2015-10-13 10:40:24 +0000 [error]: unexpected error error_class=RuntimeError error=#<RuntimeError: can't call S3 API. Please check your aws_key_id / aws_sec_key or s3_region configuration. error = #<Aws::Errors::MissingCredentialsError: unable to sign request without credentials set>>
  2015-10-13 10:40:24 +0000 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluent-plugin-s3-0.6.0/lib/fluent/plugin/out_s3.rb:193:in `rescue in check_apikeys'
  2015-10-13 10:40:24 +0000 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluent-plugin-s3-0.6.0/lib/fluent/plugin/out_s3.rb:189:in `check_apikeys'
  2015-10-13 10:40:24 +0000 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluent-plugin-s3-0.6.0/lib/fluent/plugin/out_s3.rb:112:in `start'
  2015-10-13 10:40:24 +0000 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.12/lib/fluent/plugin/out_copy.rb:50:in `block in start'
  2015-10-13 10:40:24 +0000 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.12/lib/fluent/plugin/out_copy.rb:49:in `each'
  2015-10-13 10:40:24 +0000 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.12/lib/fluent/plugin/out_copy.rb:49:in `start'
  2015-10-13 10:40:24 +0000 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.12/lib/fluent/agent.rb:67:in `block in start'
  2015-10-13 10:40:24 +0000 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.12/lib/fluent/agent.rb:66:in `each'
  2015-10-13 10:40:24 +0000 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.12/lib/fluent/agent.rb:66:in `start'
  2015-10-13 10:40:24 +0000 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.12/lib/fluent/root_agent.rb:104:in `start'
  2015-10-13 10:40:24 +0000 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.12/lib/fluent/engine.rb:201:in `start'
  2015-10-13 10:40:24 +0000 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.12/lib/fluent/engine.rb:151:in `run'
  2015-10-13 10:40:24 +0000 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.12/lib/fluent/supervisor.rb:539:in `run_engine'
  2015-10-13 10:40:24 +0000 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.12/lib/fluent/supervisor.rb:143:in `block in start'
  2015-10-13 10:40:24 +0000 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.12/lib/fluent/supervisor.rb:306:in `call'
  2015-10-13 10:40:24 +0000 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.12/lib/fluent/supervisor.rb:306:in `main_process'
  2015-10-13 10:40:24 +0000 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.12/lib/fluent/supervisor.rb:281:in `block in supervise'
  2015-10-13 10:40:24 +0000 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.12/lib/fluent/supervisor.rb:280:in `fork'
  2015-10-13 10:40:24 +0000 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.12/lib/fluent/supervisor.rb:280:in `supervise'
  2015-10-13 10:40:24 +0000 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.12/lib/fluent/supervisor.rb:137:in `start'
  2015-10-13 10:40:24 +0000 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.12/lib/fluent/command/fluentd.rb:167:in `<top (required)>'
  2015-10-13 10:40:24 +0000 [error]: /opt/td-agent/embedded/lib/ruby/site_ruby/2.1.0/rubygems/core_ext/kernel_require.rb:73:in `require'
  2015-10-13 10:40:24 +0000 [error]: /opt/td-agent/embedded/lib/ruby/site_ruby/2.1.0/rubygems/core_ext/kernel_require.rb:73:in `require'
  2015-10-13 10:40:24 +0000 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.12.12/bin/fluentd:6:in `<top (required)>'
  2015-10-13 10:40:24 +0000 [error]: /opt/td-agent/embedded/bin/fluentd:23:in `load'
  2015-10-13 10:40:24 +0000 [error]: /opt/td-agent/embedded/bin/fluentd:23:in `<top (required)>'
  2015-10-13 10:40:24 +0000 [error]: /usr/sbin/td-agent:7:in `load'
  2015-10-13 10:40:24 +0000 [error]: /usr/sbin/td-agent:7:in `<main>'
2015-10-13 10:40:24 +0000 [info]: shutting down fluentd

It's not a huge issue but unexpected.

Thanks,
Ed.

'auto_create_bucket' does not seem to work.

On my Linux server, I installed td-agent 1.1.21 and fluent-plugin-s3 0.4.1, which is installed together with td-agent.
My td-agent.conf looks like this:

<match **>
  type s3
  s3_bucket foobar # this bucket does not exist.
  s3_endpoint s3-ap-northeast-1.amazonaws.com

  path ${tag}/
  time_slice_format %Y%m%d/%H/%M/
  time_slice_wait 1m
  s3_object_key_format %{path}%{time_slice}${hostname}.%{index}.%{file_extension}

  buffer_path /var/log/td-agent/buffer/${tag}
</match>

and I expected the S3 bucket to be created because auto_create_bucket defaults to true. But td-agent raised an error, and td-agent.log shows the errors below:

2014-12-05 02:18:40 +0000 [error]: unexpected error error_class=RuntimeError error=#<RuntimeError: aws_key_id or aws_sec_key is invalid. Please check your configuration>
  2014-12-05 02:18:40 +0000 [error]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluent-plugin-s3-0.4.1/lib/fluent/plugin/out_s3.rb:190:in `rescue in check_apikeys'
  2014-12-05 02:18:40 +0000 [error]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluent-plugin-s3-0.4.1/lib/fluent/plugin/out_s3.rb:188:in `check_apikeys'
  2014-12-05 02:18:40 +0000 [error]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluent-plugin-s3-0.4.1/lib/fluent/plugin/out_s3.rb:111:in `start'
  2014-12-05 02:18:40 +0000 [error]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/match.rb:40:in `start'
  2014-12-05 02:18:40 +0000 [error]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/engine.rb:263:in `block in start'
  2014-12-05 02:18:40 +0000 [error]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/engine.rb:262:in `each'
  2014-12-05 02:18:40 +0000 [error]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/engine.rb:262:in `start'
  2014-12-05 02:18:40 +0000 [error]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/engine.rb:213:in `run'
  2014-12-05 02:18:40 +0000 [error]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/supervisor.rb:464:in `run_engine'
  2014-12-05 02:18:40 +0000 [error]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/supervisor.rb:135:in `block in start'
  2014-12-05 02:18:40 +0000 [error]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/supervisor.rb:250:in `call'
  2014-12-05 02:18:40 +0000 [error]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/supervisor.rb:250:in `main_process'
  2014-12-05 02:18:40 +0000 [error]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/supervisor.rb:225:in `block in supervise'
  2014-12-05 02:18:40 +0000 [error]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/supervisor.rb:224:in `fork'
  2014-12-05 02:18:40 +0000 [error]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/supervisor.rb:224:in `supervise'
  2014-12-05 02:18:40 +0000 [error]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/supervisor.rb:128:in `start'
  2014-12-05 02:18:40 +0000 [error]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/lib/fluent/command/fluentd.rb:164:in `<top (required)>'
  2014-12-05 02:18:40 +0000 [error]: /usr/lib64/fluent/ruby/lib/ruby/1.9.1/rubygems/custom_require.rb:55:in `require'
  2014-12-05 02:18:40 +0000 [error]: /usr/lib64/fluent/ruby/lib/ruby/1.9.1/rubygems/custom_require.rb:55:in `require'
  2014-12-05 02:18:40 +0000 [error]: /usr/lib64/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluentd-0.10.55/bin/fluentd:6:in `<top (required)>'
  2014-12-05 02:18:40 +0000 [error]: /usr/lib64/fluent/ruby/bin/fluentd:23:in `load'
  2014-12-05 02:18:40 +0000 [error]: /usr/lib64/fluent/ruby/bin/fluentd:23:in `<top (required)>'
  2014-12-05 02:18:40 +0000 [error]: /usr/sbin/td-agent:7:in `load'
  2014-12-05 02:18:40 +0000 [error]: /usr/sbin/td-agent:7:in `<main>'

I'm running this server on EC2 with an S3Admin IAM role set, so the API keys are correct.
I read out_s3.rb and checked aws-sdk, and it seems that bucket.empty? in check_apikeys raises AWS::S3::Errors::NoSuchBucket when the specified bucket does not exist. Therefore, the auto_create_bucket configuration never seems to be used.

I'll submit a pull request, but before I write the patch, please check whether my suspicion is correct.

$ rpm -qa | grep td-agent
td-agent-1.1.21-0.x86_64

Optional zero padding for index field

I'm running into problems with an installation which is using a daily time_slice_format and a flush_interval of 5 minutes.

The problem is that the only sorting available with GET Bucket (List Objects) is alphabetical.

http://docs.aws.amazon.com/AmazonS3/latest/API/RESTBucketGET.html

I'd very much like to use the marker feature when listing the objects in the bucket, but (for example) if the last file I downloaded was 20140925_11.gz, using this as a marker would mean files 12-109 (uploaded in the meantime) would be missed.

An option like "zero_padding_length 4" would be really useful in this example.

Order returned when listing the bucket:

[mod date] filename
[24/09/14 23:04:56] 20140925_0.gz
[24/09/14 23:09:50] 20140925_1.gz
[24/09/14 23:54:58] 20140925_10.gz
[25/09/14 07:26:30] 20140925_100.gz
[25/09/14 07:31:32] 20140925_101.gz
....
[25/09/14 08:06:39] 20140925_108.gz
[25/09/14 08:11:39] 20140925_109.gz
[24/09/14 23:59:59] 20140925_11.gz
[25/09/14 08:16:41] 20140925_110.gz
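For illustration, the proposed option would amount to formatting the index with a fixed width (plain Ruby sketch; zero_padding_length is the hypothetical parameter name from above):

# With zero_padding_length 4, index 12 becomes "0012", so the
# lexicographic order of S3 listings matches numeric order.
zero_padding_length = 4
index = 12
padded = format("%0*d", zero_padding_length, index)  # => "0012"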

Documented feature acl not supported

There are two issues.
First, acl is documented in the Configuration section, but it doesn't work: the acl config_param is never added to put_options.

        put_options[:ssekms_key_id] = @ssekms_key_id if @ssekms_key_id
        put_options[:acl] = @acl if @acl
        @bucket.object(s3path).put(put_options)

Second, the README.md documentation for acl is wrong: it uses '_' where it should be '-'.
Current:
bucket_owner_read
Should be:
bucket-owner-read
See: http://docs.aws.amazon.com/AmazonS3/latest/dev/acl-overview.html#canned-acl

Some buffer files are not flushed when using the in_multiprocess plugin

I'm using fluent-plugin-s3 with in_multiprocess, and my AWS instance has 4 cores.
One of the child processes works fine, but the rest of the fluent-plugin-s3 instances don't flush their buffers.
I'm not familiar enough with this plugin's internals to know whether this is purely a fluent-plugin-s3 issue or an in_multiprocess one,
but it only happens with fluent-plugin-s3. Other plugins like out_mongo work fine.

Here are my environment and config files.

Environment

OS : Amazon Linux AMI release 2014.09
Instance type : m3.xlarge
Core : 4

td-agent : 1.1.21

Other Plugins
Input Plugins

in_multiprocess
in_forward

Output Plugins

fluent-plugin-forest
out_mongo

Config Files
td-agent.conf
<source>
        type multiprocess

        <process>
                cmdline -c /etc/td-agent/td-agent-child-1.conf
        </process>
        <process>
                cmdline -c /etc/td-agent/td-agent-child-2.conf
        </process>
        <process>
                cmdline -c /etc/td-agent/td-agent-child-3.conf
        </process>
        <process>
                cmdline -c /etc/td-agent/td-agent-child-4.conf
        </process>
</source>
td-agent-child-1.conf
# input forward
<source>
        type forward
        port 24224
        bind 0.0.0.0
</source>
td-agent-child-2.conf
# input forward
<source>
        type forward
        port 24225
        bind 0.0.0.0
</source>
td-agent-child-3.conf
# input forward
<source>
        type forward
        port 24226
        bind 0.0.0.0
</source>

include /etc/td-agent/td-agent-common.conf
td-agent-child-4.conf
# input forward
<source>
        type forward
        port 24227
        bind 0.0.0.0
</source>

include /etc/td-agent/td-agent-common.conf

td-agent-common.conf

<match aaa.bbb.**>
        type copy
        <store>
                type mongo
                buffer_type file
                buffer_path /var/log/td-agent/buffer/
                buffer_chunk_limit 256m
                buffer_queue_limit 128
                flush_interval 60s
                retry_limit 20
                retry_wait 1s

                host *******
                port *******
                database *******
                tag_mapped
                remove_tag_prefix aaa.bbb.
                time_key _time
        </store>
        <store>
                type forest
                subtype s3
                <template>
                        aws_key_id *******
                        aws_sec_key *******
                        s3_bucket *******
                        s3_region *******
                        s3_object_key_format %{path}%{time_slice}_%{index}.%{file_extension}
                        path app_log/${tag_parts[2]}/
                        time_slice_format %Y-%m-%d-%H
                        buffer_path /var/log/td-agent/buffer/s3/${tag_parts[2]}/
                </template>
        </store>

</match>

Thanks for helping!

MissingRegionError when using assume_role_credentials

When using assume_role_credentials, even though s3_region was defined in the configuration file, I got this error:

2015-11-05 01:50:19 +0000 [error]: fluent/engine.rb:172:rescue in run: unexpected error error_class=Aws::Errors::MissingRegionError error=#<Aws::Errors::MissingRegionError: missing region; use :region option or export region name to ENV['AWS_REGION']>

This was after upgrading to the latest gem, which also picked up the latest AWS SDK.

After editing:
/opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluent-plugin-s3-0.6.1/lib/fluent/plugin/out_s3.rb

and adding the :region option, the error was resolved.

s3 input plugin?

Is there any intention to also consume the logs that this plugin writes, as an input plugin?

Add bz2 storage option.

We currently store our logs bz2-compressed; I was hoping to be able to do the same when we switch to fluentd.

Thanks

Add xz / LZMA2 compression

xz has a higher compression ratio than gzip and was also chosen by the Linux kernel team for distributing source code. It's well suited for archival purposes.

The "xz" spec is located at http://tukaani.org/xz/xz-file-format.txt.

In terms of compression tools, xz and p7zip are available on most platforms. p7zip compresses faster because it can make use of multiple cores.

How to write from fluent to Amazon S3

Hello, this is Homma (CkReal).
I'm currently trying out fluent experimentally.

I ran into an error while testing writes from fluent to Amazon S3,
so if anyone has an idea of the cause, I'd appreciate your advice.

I have confirmed that forwarding between the source server and the destination server (to a local file) works.

Below are the test environment and the stack trace.

Test environment:
I'm testing on AWS with the server layout below.
Access logs on the source server are sent to the destination server,
which then attempts to write them to Amazon S3.

[Source server]
OS: CentOS 5.6
Apache: 2.2.3
fluent: 0.9.18

Config file:
[source]
type tail
format apache
path /var/log/httpd/[access log name]
tag apache.access

[match apache.access]
type tcp
host [hostname]
port 24224
buffer_type file
buffer_path /tmp/fluent_buf_myforward
flush_interval 10s

[Destination server]
OS: CentOS 6.0
fluent: 0.9.18
fluent-plugin-s3: 0.1.1

Config file:
[source]
type tcp
port 24224
bind 0.0.0.0
[match apache.access]
type s3
aws_key_id [AWS access key ID]
aws_sec_key [AWS secret access key]
s3_bucket [bucket name]
buffer_path /tmp/hoge
path /fluent/fluent.log

Stack trace:
2011-10-06 17:53:53 +0900: fluent/engine.rb:119:rescue in emit_stream: emit transaction faild error="undefined method `iso8601' for 2011-10-06 17:53:51 +0900:Time"
2011-10-06 17:53:53 +0900: fluent/engine.rb:120:rescue in emit_stream: /usr/local/lib/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluent-plugin-s3-0.1.1/lib/fluent/plugin/out_s3.rb:46:in `block in configure'
2011-10-06 17:53:53 +0900: fluent/engine.rb:120:rescue in emit_stream: /usr/local/lib/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluent-plugin-s3-0.1.1/lib/fluent/plugin/out_s3.rb:64:in `call'
2011-10-06 17:53:53 +0900: fluent/engine.rb:120:rescue in emit_stream: /usr/local/lib/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluent-plugin-s3-0.1.1/lib/fluent/plugin/out_s3.rb:64:in `format'
2011-10-06 17:53:53 +0900: fluent/engine.rb:120:rescue in emit_stream: /usr/local/lib/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluent-0.9.18/lib/fluent/output.rb:407:in `block in emit'
2011-10-06 17:53:53 +0900: fluent/engine.rb:120:rescue in emit_stream: /usr/local/lib/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluent-0.9.18/lib/fluent/event.rb:81:in `each'
2011-10-06 17:53:53 +0900: fluent/engine.rb:120:rescue in emit_stream: /usr/local/lib/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluent-0.9.18/lib/fluent/event.rb:81:in `each'
2011-10-06 17:53:53 +0900: fluent/engine.rb:120:rescue in emit_stream: /usr/local/lib/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluent-0.9.18/lib/fluent/output.rb:398:in `emit'
2011-10-06 17:53:53 +0900: fluent/engine.rb:120:rescue in emit_stream: /usr/local/lib/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluent-0.9.18/lib/fluent/match.rb:33:in `emit'
2011-10-06 17:53:53 +0900: fluent/engine.rb:120:rescue in emit_stream: /usr/local/lib/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluent-0.9.18/lib/fluent/engine.rb:117:in `emit_stream'
2011-10-06 17:53:53 +0900: fluent/engine.rb:120:rescue in emit_stream: /usr/local/lib/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluent-0.9.18/lib/fluent/engine.rb:106:in `emit_array'
2011-10-06 17:53:53 +0900: fluent/engine.rb:120:rescue in emit_stream: /usr/local/lib/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluent-0.9.18/lib/fluent/plugin/in_stream.rb:89:in `on_message'
2011-10-06 17:53:53 +0900: fluent/engine.rb:120:rescue in emit_stream: /usr/local/lib/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluent-0.9.18/lib/fluent/plugin/in_stream.rb:119:in `each'
2011-10-06 17:53:53 +0900: fluent/engine.rb:120:rescue in emit_stream: /usr/local/lib/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluent-0.9.18/lib/fluent/plugin/in_stream.rb:119:in `on_read'
2011-10-06 17:53:53 +0900: fluent/engine.rb:120:rescue in emit_stream: /usr/local/lib/fluent/ruby/lib/ruby/gems/1.9.1/gems/cool.io-1.0.0/lib/cool.io/io.rb:108:in `on_readable'
2011-10-06 17:53:53 +0900: fluent/engine.rb:120:rescue in emit_stream: /usr/local/lib/fluent/ruby/lib/ruby/gems/1.9.1/gems/cool.io-1.0.0/lib/cool.io/io.rb:170:in `on_readable'
2011-10-06 17:53:53 +0900: fluent/engine.rb:120:rescue in emit_stream: /usr/local/lib/fluent/ruby/lib/ruby/gems/1.9.1/gems/cool.io-1.0.0/lib/cool.io/loop.rb:96:in `run_once'
2011-10-06 17:53:53 +0900: fluent/engine.rb:120:rescue in emit_stream: /usr/local/lib/fluent/ruby/lib/ruby/gems/1.9.1/gems/cool.io-1.0.0/lib/cool.io/loop.rb:96:in `run'
2011-10-06 17:53:53 +0900: fluent/engine.rb:120:rescue in emit_stream: /usr/local/lib/fluent/ruby/lib/ruby/gems/1.9.1/gems/fluent-0.9.18/lib/fluent/plugin/in_stream.rb:44:in `run'
2011-10-06 17:53:53 +0900: plugin/in_stream.rb:46:rescue in run: unexpected error error="undefined method `iso8601' for 2011-10-06 17:53:51 +0900:Time"
2011-10-06 17:53:53 +0900: plugin/in_stream.rb:47:rescue in run: (same backtrace as above, from out_s3.rb:46 `block in configure' through in_stream.rb:44 `run')

ArgumentError: invalid configuration option `:s3_server_side_encryption'

Hello, we are getting this error in our td-agent.log

2015-10-21 23:35:28 +0000 [error]: unexpected error error_class=ArgumentError error=#<ArgumentError: invalid configuration option `:s3_server_side_encryption'>

The relevant line of config is...

<store>
  use_server_side_encryption aes256
</store>

Any advice / guidance / fixes would be helpful

Add format_csv

A lot of AWS tools (EMR, Redshift) work really well with CSV files stored in S3. Would it make sense for me or my colleague @dterror to open a pull request adding support for CSV in addition to JSON?
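For what it's worth, recent Fluentd versions ship a core csv formatter that v1-style outputs such as this plugin can use via a <format> section; a minimal sketch, with placeholder field names:

<match pattern>
  @type s3
  # bucket and buffer settings omitted
  <format>
    @type csv
    fields time,host,message
  </format>
</match>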

Other 's3' plugin already use same buffer_path

Using the copy output plugin, I get this error: [error]: config error file="/etc/td-agent/td-agent.conf" error="Other 's3' plugin already use same buffer_path: type = s3, buffer_path = /tmp/fluent-plugin-s3"

Here's my config.

<source>
  type tail
  path /var/log/nginx/web__error.log
  pos_file /var/tmp/nginx_web__error.pos
  tag web__error
  format /^(?<time>[^ ]+ [^ ]+) \[(?<log_level>.*)\] (?<pid>\d*).(?<tid>[^:]*): (?<message>.*)$/
</source>
<match web__error>
  type copy
  <store>
    type s3
    aws_key_id ACC_KEY
    aws_sec_key SEC_KEY
    s3_bucket log-bucket
    path web__error/
    buffer_path /tmp/fluent-plugin-s3
    s3_object_key_format %{path}%{time_slice}_%{index}.%{file_extension}
    time_slice_format %Y-%m-%d/%H
    flush_interval 15s
    utc
  </store>
  <store>
    type elasticsearch
    logstash_format true
    logstash_prefix web__error
    flush_interval 15s
    include_tag_key true
    utc_index true
  </store>
</match>

Not sure if this is an issue with this plugin or fluentd (or if I should be posting this here either way), but I can't use either without the capability of sending logs to multiple locations, S3 being the primary location.
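The error message itself points at the fix: each s3 output needs its own buffer_path. A sketch of the changed <store> (the path suffix is arbitrary):

<store>
  type s3
  ...
  buffer_path /tmp/fluent-plugin-s3-web__error
  ...
</store>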

REQ: Object expiry

I would like to be able to set an expiry time / TTL on the files uploaded to S3.

This should be a straightforward passthrough of a config option, as is done for reduced_redundancy.

Thanks.

Check if logs are sent to s3

Hi,

Can someone please tell me how to check whether fluentd is really sending logs to S3? Does this happen as soon as fluentd receives input logs, as with the ES and file types?
What is the interval for fluentd to send logs to S3? At midnight? Instantly?

Here is my configuration :

    <match nginx.api1.mydomain.error.*>
        type copy   
        <store>
        type file
        path /var/log/td-agent/nginx-api1.mydomain-error.log
        </store>
        <store>
        type elasticsearch
        hosts es_ip_address
        logstash_format true
        include_tag_key true
        tag_key _key
        flush_interval 10s
        </store>
        <store>
        type s3
        s3_bucket s3.yyy.extra.websites.eu-west-1.xxx
        s3_region eu-west-1
        s3_object_key_format %{path}%{time_slice}_%{index}.%{file_extension}
        path logs/
        buffer_path /var/log/td-agent/nginx-api1.mydomain-error-s3.log 
        time_slice_format %Y%m%d-%H
        time_slice_wait 10m
        utc
        </store>
    </match>

Thanks

fluentd fails to start because "uninitialized constant Fluent::S3Output::AWS"

Installing fluentd using Chef on AWS OpsWorks using gh://dmytro/fluentd-cookbook

I get the following error on startup:

2014-10-16 02:01:38 +0000 [info]: fluent/engine.rb:107:block in configure: adding source type="tail"
2014-10-16 02:01:38 +0000 [info]: fluent/engine.rb:124:block in configure: adding match pattern="nginx.*" type="s3"
2014-10-16 02:01:38 +0000 [trace]: fluent/plugin.rb:72:register_impl: registered output plugin 's3'
2014-10-16 02:01:38 +0000 [info]: fluent/engine.rb:124:block in configure: adding match pattern="nginx.access" type="elasticsearch"
2014-10-16 02:01:38 +0000 [trace]: fluent/plugin.rb:72:register_impl: registered output plugin 'elasticsearch'
2014-10-16 02:01:38 +0000 [error]: fluent/engine.rb:234:rescue in run: unexpected error error_class=NameError error=#<NameError: uninitialized constant Fluent::S3Output::AWS>
  2014-10-16 02:01:38 +0000 [error]: fluent/supervisor.rb:135:block in start: /usr/local/lib/ruby/gems/2.1.0/gems/fluent-plugin-s3-0.4.0/lib/fluent/plugin/out_s3.rb:106:in `start'
  2014-10-16 02:01:38 +0000 [error]: fluent/supervisor.rb:135:block in start: /usr/local/lib/ruby/gems/2.1.0/gems/fluentd-0.10.53/lib/fluent/match.rb:40:in `start'
  2014-10-16 02:01:38 +0000 [error]: fluent/supervisor.rb:135:block in start: /usr/local/lib/ruby/gems/2.1.0/gems/fluentd-0.10.53/lib/fluent/engine.rb:263:in `block in start'
  2014-10-16 02:01:38 +0000 [error]: fluent/supervisor.rb:135:block in start: /usr/local/lib/ruby/gems/2.1.0/gems/fluentd-0.10.53/lib/fluent/engine.rb:262:in `each'
  2014-10-16 02:01:38 +0000 [error]: fluent/supervisor.rb:135:block in start: /usr/local/lib/ruby/gems/2.1.0/gems/fluentd-0.10.53/lib/fluent/engine.rb:262:in `start'
  2014-10-16 02:01:38 +0000 [error]: fluent/supervisor.rb:135:block in start: /usr/local/lib/ruby/gems/2.1.0/gems/fluentd-0.10.53/lib/fluent/engine.rb:213:in `run'
  2014-10-16 02:01:38 +0000 [error]: fluent/supervisor.rb:135:block in start: /usr/local/lib/ruby/gems/2.1.0/gems/fluentd-0.10.53/lib/fluent/supervisor.rb:464:in `run_engine'
  2014-10-16 02:01:38 +0000 [error]: fluent/supervisor.rb:135:block in start: /usr/local/lib/ruby/gems/2.1.0/gems/fluentd-0.10.53/lib/fluent/supervisor.rb:135:in `block in start'
  2014-10-16 02:01:38 +0000 [error]: fluent/supervisor.rb:135:block in start: /usr/local/lib/ruby/gems/2.1.0/gems/fluentd-0.10.53/lib/fluent/supervisor.rb:250:in `call'
  2014-10-16 02:01:38 +0000 [error]: fluent/supervisor.rb:135:block in start: /usr/local/lib/ruby/gems/2.1.0/gems/fluentd-0.10.53/lib/fluent/supervisor.rb:250:in `main_process'
  2014-10-16 02:01:38 +0000 [error]: fluent/supervisor.rb:135:block in start: /usr/local/lib/ruby/gems/2.1.0/gems/fluentd-0.10.53/lib/fluent/supervisor.rb:225:in `block in supervise'
  2014-10-16 02:01:38 +0000 [error]: fluent/supervisor.rb:135:block in start: /usr/local/lib/ruby/gems/2.1.0/gems/fluentd-0.10.53/lib/fluent/supervisor.rb:224:in `fork'
  2014-10-16 02:01:38 +0000 [error]: fluent/supervisor.rb:135:block in start: /usr/local/lib/ruby/gems/2.1.0/gems/fluentd-0.10.53/lib/fluent/supervisor.rb:224:in `supervise'
  2014-10-16 02:01:38 +0000 [error]: fluent/supervisor.rb:135:block in start: /usr/local/lib/ruby/gems/2.1.0/gems/fluentd-0.10.53/lib/fluent/supervisor.rb:128:in `start'
  2014-10-16 02:01:38 +0000 [error]: fluent/supervisor.rb:135:block in start: /usr/local/lib/ruby/gems/2.1.0/gems/fluentd-0.10.53/lib/fluent/command/fluentd.rb:160:in `<top (required)>'
  2014-10-16 02:01:38 +0000 [error]: fluent/supervisor.rb:135:block in start: /usr/local/lib/ruby/site_ruby/2.1.0/rubygems/core_ext/kernel_require.rb:55:in `require'
  2014-10-16 02:01:38 +0000 [error]: fluent/supervisor.rb:135:block in start: /usr/local/lib/ruby/site_ruby/2.1.0/rubygems/core_ext/kernel_require.rb:55:in `require'
  2014-10-16 02:01:38 +0000 [error]: fluent/supervisor.rb:135:block in start: /usr/local/lib/ruby/gems/2.1.0/gems/fluentd-0.10.53/bin/fluentd:6:in `<top (required)>'
  2014-10-16 02:01:38 +0000 [error]: fluent/supervisor.rb:135:block in start: /usr/local/bin/fluentd:23:in `load'
  2014-10-16 02:01:38 +0000 [error]: fluent/supervisor.rb:135:block in start: /usr/local/bin/fluentd:23:in `<main>'
2014-10-16 02:01:38 +0000 [info]: fluent/engine.rb:237:run: shutting down fluentd
2014-10-16 02:01:38 +0000 [info]: fluent/supervisor.rb:240:supervise: process finished code=0
2014-10-16 02:01:38 +0000 [warn]: fluent/supervisor.rb:243:supervise: process died within 1 second. exit.

It appears there is an issue loading the AWS gem. Looking at other issues, they often recommend: gem 'aws-s3', :require => 'aws/s3'

Gems

gem list

*** LOCAL GEMS ***

aws-sdk (2.0.2.pre)
aws-sdk-core (2.0.2)
aws-sdk-resources (2.0.2.pre)
bigdecimal (1.2.4)
builder (3.2.2)
bundler (1.5.1)
cool.io (1.2.4)
elasticsearch (1.0.5)
elasticsearch-api (1.0.5)
elasticsearch-transport (1.0.5)
faraday (0.9.0)
fluent-mixin-config-placeholders (0.2.4)
fluent-plugin-elasticsearch (0.5.1)
fluent-plugin-s3 (0.4.0)
fluentd (0.10.53)
http_parser.rb (0.6.0)
io-console (0.4.2)
jamespath (0.5.1)
json (1.8.1)
kgio (2.9.2)
minitest (4.7.5)
msgpack (0.5.9)
multi_json (1.10.1)
multi_xml (0.5.5)
multipart-post (2.0.0)
patron (0.4.18)
psych (2.0.5)
rack (1.6.0.beta)
raindrops (0.13.0)
rake (10.1.0)
rdoc (4.1.0)
sigdump (0.2.2)
test-unit (2.1.2.0)
unicorn (4.7.0)
uuidtools (2.1.5)
yajl-ruby (1.2.1)

Logs are never shipped to s3? version 0.5.6

Logs are never shipped to S3, and write never seems to be called. Any pointers on what's wrong?

fluent-gem list | grep s3
fluent-plugin-s3 (0.5.6)

fluent configuration

<source>
  type tail
  path /var/lib/docker/containers/*/*-json.log
  pos_file /var/log/fluentd-docker.pos
  #time_format %Y-%m-%dT%H:%M:%S 
  tag docker.*
  format json
  refresh_interval 1
  log_level debug
</source>

<match docker.var.lib.docker.containers.*.*.log>
  type record_reformer
  container_id ${tag_parts[5]}
  tag docker.all
  log_level debug
</match>

<match docker.all>
  type file
  path /var/log/docker/*.log
  format json
  include_time_key true
  log_level debug
</match>

<match docker.all>
  type s3
  s3_bucket photobox-services-logs
  s3_region eu-west-1
  path logs
  buffer_type memory
  #buffer_path /var/log/fluent/s3
  s3_object_key_format %{path}/events/ts=%{time_slice}/events_%{index}-%{hostname}.%{file_extension}
  time_slice_format %Y%m%d-%H
  auto_create_bucket true
  check_apikey_on_start true
  format out_file
  time_slice_format %Y%m%d-%H
  time_slice_wait 1m
  utc
  log_level debug
</match>

Logs show the plugin being initialised:

2015-03-02 16:43:57 +0000 [info]: fluent/engine.rb:124:block in configure: adding match pattern="docker.all" type="s3"
2015-03-02 16:43:57 +0000 [trace]: fluent/plugin.rb:72:register_impl: registered output plugin 's3'
2015-03-02 16:43:57 +0000 [info]: plugin/in_tail.rb:475:initialize: following tail of /var/lib/docker/containers/3907f6c79687f6ceafcebbeecf4c35d1e4866aebd9019671d9dc373a4bf74a7a/3907f6c79687f6ceafcebbeecf4c35d1e4866aebd9019671d9dc373a4bf74a7a-json.log
2015-03-02 16:43:57 +0000 [info]: plugin/in_tail.rb:475:initialize: following tail of /var/lib/docker/containers/8c4149cd7aa731155700026a5f6db0b9f26b170b766a67795f8a1a1c93d4ba21/8c4149cd7aa731155700026a5f6db0b9f26b170b766a67795f8a1a1c93d4ba21-json.log

Permissions are correct on the IAM role:

 aws s3 ls photobox-services-logs
                           PRE elb/

docker.all has logs written

 tail -n1 /var/log/docker/19700101.b51050ecb0d011c7c.log 
{"log":"[raft]16:44:28.648224 setCommitIndex.set.result index: 1120514, entries index: 273\n","stream":"stdout","container_id":"c76e5773b44592d08d883a74070934b5e4bab34b286e8f3faf7ee93bd74698b7","log_level":"debug","time":"1970-01-01T00:33:35+00:00"}

tail -n1 /var/lib/docker/containers/*/*-json.log | head -n10
==> /var/lib/docker/containers/379cc31c1efd3b4df5c8b2df7cfc1a6f13c0aa5e58031b00948b44ce89ce6175/379cc31c1efd3b4df5c8b2df7cfc1a6f13c0aa5e58031b00948b44ce89ce6175-json.log <==
{"log":"\n","stream":"stdout","time":"2015-03-02T16:47:40.809578959Z"}
...
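A note for readers with the same symptom: fluentd routes each event to the first <match> that fits, so with two <match docker.all> blocks the s3 one never receives events. The usual fix is a single match wrapping both outputs in copy; a sketch:

<match docker.all>
  type copy
  <store>
    type file
    path /var/log/docker/*.log
    format json
    include_time_key true
  </store>
  <store>
    type s3
    # s3 settings as in the original match block
  </store>
</match>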

Add support for ${} substitution

We have set up all of our plugins to use environment variables for configuration items. The s3 plugin does not support this syntax:

aws_api_key #{ENV['AWS_API_KEY']}

Flushing the buffer fails with AWS::S3::Errors::Forbidden

Whenever a file already exists in S3, fluentd fails to flush the buffer with the following error message.

2015-08-06 13:39:01 +0900 [warn]: temporarily failed to flush the buffer. next_retry=2015-08-06-13:39:17 +0900 error_class="AWS::S3::Errors::Forbidden" error="AWS::S3::Errors::Forbidden" plugin_id="object:3fa6da00f774"
2015-08-06 13:39:01 +0900 [warn]: suppressed same stacktrace

However, if I delete the already existing file, the flush works fine. I am guessing that the plugin doesn't correctly set the index of a new file. How can I fix this problem? Below is the relevant snippet of the fluentd configuration file.

<match rails>
  type copy
  <store>
      type s3
      aws_key_id AWS_KEY_ID
      aws_sec_key AWS_SEC_KEY
      s3_bucket S3_BUCKET
      s3_region ap-northeast-1
     path logs/
     buffer_path /var/log/td-agent/buffer/s3
     s3_object_key_format %{path}%{time_slice}_%{index}.%{file_extension}
     time_slice_format %Y-%m-%d/%H
     flush_interval 1s
  </store>
</match>
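One plausible cause, an assumption based on how %{index} is chosen: to find an unused index, the plugin checks whether each candidate key already exists, which issues a HEAD/GET on the object; if the credentials lack s3:GetObject, that existence check comes back 403 Forbidden precisely when a file already exists. A sketch of the extra policy statement to test this theory:

{
  "Action": ["s3:GetObject"],
  "Effect": "Allow",
  "Resource": ["arn:aws:s3:::S3_BUCKET/logs/*"]
}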

Split S3 files by date

Hi

Is it possible to split S3 logfiles by date?

I want each S3 logfile to contain only events that occurred during a specified time period (e.g. the same day). This would make it very easy to grab all my data for a specific day for processing (or re-processing).

Thanks,
Ed.
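As the overview above notes, files are already split by event time according to time_slice_format, so daily files are a configuration matter; a minimal sketch:

time_slice_format %Y%m%d
time_slice_wait 10m
utc

This yields one set of objects per calendar day (e.g. 20110102.gz), with time_slice_wait giving late-arriving events time to land in the correct day.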

S3 output error

I found an error that completely blocked fluentd from piping any data into S3, though it kept filling its buffers along the way. I found a workaround: after creating an empty version of the missing file, fluentd started emptying its buffers. One buffer emptied correctly; the other only recovered 2 days of data instead of 13.

Any help would be appreciated

unexpected error while shutting down output plugins plugin=Fluent::S3Output plugin_id="object:3fa4b1162764" error_class=Errno::ENOENT error=#<Errno::ENOENT: No such file or directory @ sys_fail2 - (/site/fluentd/run/s3.email.2016030501.b52d42c157d319535.log, /site/fluentd/run/s3.email.2016030501.q52d42c157d319535.log)>
  2016-03-17 21:19:00 +0000 [warn]: /site/fluentd/vendor/bundle/ruby/2.1.0/gems/fluentd-0.12.20/lib/fluent/plugin/buf_file.rb:68:in `rename'
  2016-03-17 21:19:00 +0000 [warn]: /site/fluentd/vendor/bundle/ruby/2.1.0/gems/fluentd-0.12.20/lib/fluent/plugin/buf_file.rb:68:in `mv'
  2016-03-17 21:19:00 +0000 [warn]: /site/fluentd/vendor/bundle/ruby/2.1.0/gems/fluentd-0.12.20/lib/fluent/plugin/buf_file.rb:185:in `enqueue'
  2016-03-17 21:19:00 +0000 [warn]: /site/fluentd/vendor/bundle/ruby/2.1.0/gems/fluentd-0.12.20/lib/fluent/buffer.rb:296:in `block (2 levels) in push'
  2016-03-17 21:19:00 +0000 [warn]: /opt/rbenv/versions/2.1.2/lib/ruby/2.1.0/monitor.rb:211:in `mon_synchronize'
  2016-03-17 21:19:00 +0000 [warn]: /site/fluentd/vendor/bundle/ruby/2.1.0/gems/fluentd-0.12.20/lib/fluent/buffer.rb:295:in `block in push'
  2016-03-17 21:19:00 +0000 [warn]: /opt/rbenv/versions/2.1.2/lib/ruby/2.1.0/monitor.rb:211:in `mon_synchronize'
  2016-03-17 21:19:00 +0000 [warn]: /site/fluentd/vendor/bundle/ruby/2.1.0/gems/fluentd-0.12.20/lib/fluent/buffer.rb:289:in `push'
  2016-03-17 21:19:00 +0000 [warn]: /site/fluentd/vendor/bundle/ruby/2.1.0/gems/fluentd-0.12.20/lib/fluent/output.rb:544:in `block (2 levels) in configure'
  2016-03-17 21:19:00 +0000 [warn]: /site/fluentd/vendor/bundle/ruby/2.1.0/gems/fluentd-0.12.20/lib/fluent/output.rb:542:in `each'
  2016-03-17 21:19:00 +0000 [warn]: /site/fluentd/vendor/bundle/ruby/2.1.0/gems/fluentd-0.12.20/lib/fluent/output.rb:542:in `block in configure'
  2016-03-17 21:19:00 +0000 [warn]: /site/fluentd/vendor/bundle/ruby/2.1.0/gems/fluentd-0.12.20/lib/fluent/output.rb:579:in `call'
  2016-03-17 21:19:00 +0000 [warn]: /site/fluentd/vendor/bundle/ruby/2.1.0/gems/fluentd-0.12.20/lib/fluent/output.rb:579:in `enqueue_buffer'
  2016-03-17 21:19:00 +0000 [warn]: /site/fluentd/vendor/bundle/ruby/2.1.0/gems/fluentd-0.12.20/lib/fluent/output.rb:297:in `block in try_flush'
  2016-03-17 21:19:00 +0000 [warn]: /opt/rbenv/versions/2.1.2/lib/ruby/2.1.0/monitor.rb:211:in `mon_synchronize'
  2016-03-17 21:19:00 +0000 [warn]: /site/fluentd/vendor/bundle/ruby/2.1.0/gems/fluentd-0.12.20/lib/fluent/output.rb:295:in `try_flush'
  2016-03-17 21:19:00 +0000 [warn]: /site/fluentd/vendor/bundle/ruby/2.1.0/gems/fluentd-0.12.20/lib/fluent/output.rb:140:in `run'
2016-03-17 21:19:00 +0000 [info]: shutting down output type="s3" plugin_id="object:3fa4b1511d24"

Authentication method

Hello,

I am using the plugin with a third party storage compatible with S3.

Which authentication method is used with fluent-plugin-s3? I am getting errors since my application uses authentication V2.
Is authentication V2 implemented?
Thanks

Support external command compression for gzip

We have a couple of customers hitting CPU saturation. By running gzip compression as an external command, they could use multiple cores efficiently without needing a multi-process plugin.
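A sketch of what such a compressor could look like under the pluggable-compression base class proposed above, shelling out to the gzip binary via Ruby's standard Open3 (class and method names here are illustrative, not the plugin's actual API):

require 'open3'

class GzipCommandCompressor < Compressor
  def ext
    'gz'
  end

  def mime_type
    'application/x-gzip'
  end

  def compress(chunk, tmp)
    # Run gzip as a child process so compression happens outside the
    # Ruby GVL and can occupy a separate core.
    Open3.popen2('gzip', '-c') do |stdin, stdout, wait_thr|
      writer = Thread.new do
        chunk.write_to(stdin)   # stream the buffered chunk into gzip
        stdin.close
      end
      IO.copy_stream(stdout, tmp)  # collect compressed bytes into tmp
      writer.join
      raise 'gzip exited non-zero' unless wait_thr.value.success?
    end
  end
end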

Specifying the Amazon S3 endpoint

Hello.
I've been interested in fluent-plugin-s3 and evaluating it for about a week now, and I'd like to report something.
(I'm new to this, so if I've posted this in the wrong place, please feel free to delete it.)

This may already be planned, but I'd like to ask about
specifying s3_endpoint.

Is the assumption that this should be handled on the aws-sdk side,
or do you plan to support embedding it in the configuration file as below?

Version: fluent-plugin-s3 (0.2.1)
Target file: https://github.com/fluent/fluent-plugin-s3/blob/master/lib/fluent/plugin/out_s3.rb

Sketch of inserting the endpoint:



# diff -u /tmp/fluent-plugin-s3/lib/fluent/plugin/out_s3.rb /tmp/fluent-plugin-s3/lib/fluent/plugin/out_s3a.rb
--- /tmp/fluent-plugin-s3/lib/fluent/plugin/out_s3.rb   2011-11-21 15:03:09.474556318 +0900
+++ /tmp/fluent-plugin-s3/lib/fluent/plugin/out_s3a.rb  2011-11-21 15:04:41.110590881 +0900
@@ -18,6 +18,7 @@
   config_param :aws_key_id, :string
   config_param :aws_sec_key, :string
   config_param :s3_bucket, :string
+  config_param :s3_endpoint, :string

   def configure(conf)
     super
@@ -28,6 +29,7 @@
   def start
     super
     @s3 = AWS::S3.new(
+      :s3_endpoint=>@s3_endpoint,
       :access_key_id=>@aws_key_id,
       :secret_access_key=>@aws_sec_key)
     @bucket = @s3.buckets[@s3_bucket]

Example configuration file:


<match pattern>
  type s3

  aws_key_id YOUR_AWS_KEY_ID
  aws_sec_key YOUR_AWS_SECRET/KEY
  s3_bucket YOUR_S3_BUCKET_NAME
  path logs/
  buffer_path /var/log/fluent/s3

  ## Tokyo
  s3_endpoint s3-ap-northeast-1.amazonaws.com

  time_slice_format %Y%m%d-%H
  time_slice_wait 10m
  utc
</match>

Any feedback would be appreciated.

Feature request: want names of new S3 files as they are saved

For a fully streaming infrastructure it would be very handy to have a new event generated inside Fluentd whenever a new S3 file is created. I would use this to trigger loading this file into our data warehouse as soon as it was created.

For example, a new directive in the s3 output plugin:

tag s3_file

would send a message to the tag 's3_file' with a 'filename' field that contains the complete s3 path:

{"filename":"s3://bucket/a/b/file-deadbeef.gz"}

The message might also include metadata like time of creation and length of file.

{"filename":"s3://bucket/a/b/file-deadbeef.gz", "timestamp":"2016-01-01 01:01:01", "length": 56737}

Invalid AWS Credentials

I am using td-agent version 2.1.0-0, which automatically uses plugin out-s3 v0.4.1.
I get this strange error where it says my credentials are invalid. I'm sure I've set my instance profile to allow all S3 access, but it still fails. I then set aws_key_id and aws_sec_key, and I've rechecked again and again to make sure they're correct, but td-agent still can't start up successfully, saying aws_key_id or aws_sec_key is invalid.

Care to help?

Here's the log:

2014-10-25 12:51:34 +0000 [info]: starting fluentd-0.10.55
2014-10-25 12:51:34 +0000 [info]: reading config file path="/etc/td-agent/td-agent.conf"
2014-10-25 12:51:34 +0000 [info]: gem 'fluent-mixin-config-placeholders' version '0.3.0'
2014-10-25 12:51:34 +0000 [info]: gem 'fluent-mixin-config-placeholders' version '0.2.4'
2014-10-25 12:51:34 +0000 [info]: gem 'fluent-mixin-plaintextformatter' version '0.2.6'
2014-10-25 12:51:34 +0000 [info]: gem 'fluent-plugin-flume' version '0.1.1'
2014-10-25 12:51:34 +0000 [info]: gem 'fluent-plugin-mongo' version '0.7.3'
2014-10-25 12:51:34 +0000 [info]: gem 'fluent-plugin-rewrite-tag-filter' version '1.4.1'
2014-10-25 12:51:34 +0000 [info]: gem 'fluent-plugin-s3' version '0.4.1'
2014-10-25 12:51:34 +0000 [info]: gem 'fluent-plugin-scribe' version '0.10.12'
2014-10-25 12:51:34 +0000 [info]: gem 'fluent-plugin-td' version '0.10.22'
2014-10-25 12:51:34 +0000 [info]: gem 'fluent-plugin-td-monitoring' version '0.1.3'
2014-10-25 12:51:34 +0000 [info]: gem 'fluent-plugin-webhdfs' version '0.3.1'
2014-10-25 12:51:34 +0000 [info]: gem 'fluentd' version '0.10.55'
2014-10-25 12:51:34 +0000 [info]: using configuration file: <ROOT>
  <source>
    type forward
    port 24224
  </source>
  <match tracker.**>
    type s3
    aws_key_id AKIAIXXXXXXXXXXXXXXXX
    aws_sec_key XXXXXXXXXXXXXXXXXXXXXX/XXXXXXXXXxXXXXxxXxXxXxxxXXxxxxXxXxxX
    buffer_chunk_limit 10m
    buffer_path /var/log/fluent/s3
    flush_interval 10m
    path path
    s3_bucket some_bucket
    s3_endpoint s3-ap-southeast-1.amazonaws.com
    store_as txt
    time_slice_format %Y%m%d%H
    utc 
  </match>
</ROOT>
2014-10-25 12:51:34 +0000 [info]: adding source type="forward"
2014-10-25 12:51:34 +0000 [info]: adding match pattern="tracker.**" type="s3"
2014-10-25 12:51:35 +0000 [error]: unexpected error error_class=RuntimeError error=#<RuntimeError: aws_key_id or aws_sec_key is invalid. Please check your configuration>
  2014-10-25 12:51:35 +0000 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluent-plugin-s3-0.4.1/lib/fluent/plugin/out_s3.rb:190:in `rescue in check_apikeys'
  2014-10-25 12:51:35 +0000 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluent-plugin-s3-0.4.1/lib/fluent/plugin/out_s3.rb:188:in `check_apikeys'
  2014-10-25 12:51:35 +0000 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluent-plugin-s3-0.4.1/lib/fluent/plugin/out_s3.rb:111:in `start'
  2014-10-25 12:51:35 +0000 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.10.55/lib/fluent/match.rb:40:in `start'
  2014-10-25 12:51:35 +0000 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.10.55/lib/fluent/engine.rb:263:in `block in start'
  2014-10-25 12:51:35 +0000 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.10.55/lib/fluent/engine.rb:262:in `each'
  2014-10-25 12:51:35 +0000 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.10.55/lib/fluent/engine.rb:262:in `start'
  2014-10-25 12:51:35 +0000 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.10.55/lib/fluent/engine.rb:213:in `run'
  2014-10-25 12:51:35 +0000 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.10.55/lib/fluent/supervisor.rb:464:in `run_engine'
  2014-10-25 12:51:35 +0000 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.10.55/lib/fluent/supervisor.rb:135:in `block in start'
  2014-10-25 12:51:35 +0000 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.10.55/lib/fluent/supervisor.rb:250:in `call'
  2014-10-25 12:51:35 +0000 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.10.55/lib/fluent/supervisor.rb:250:in `main_process'
  2014-10-25 12:51:35 +0000 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.10.55/lib/fluent/supervisor.rb:225:in `block in supervise'
  2014-10-25 12:51:35 +0000 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.10.55/lib/fluent/supervisor.rb:224:in `fork'
  2014-10-25 12:51:35 +0000 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.10.55/lib/fluent/supervisor.rb:224:in `supervise'
  2014-10-25 12:51:35 +0000 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.10.55/lib/fluent/supervisor.rb:128:in `start'
  2014-10-25 12:51:35 +0000 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.10.55/lib/fluent/command/fluentd.rb:164:in `<top (required)>'
  2014-10-25 12:51:35 +0000 [error]: /opt/td-agent/embedded/lib/ruby/site_ruby/2.1.0/rubygems/custom_require.rb:36:in `require'
  2014-10-25 12:51:35 +0000 [error]: /opt/td-agent/embedded/lib/ruby/site_ruby/2.1.0/rubygems/custom_require.rb:36:in `require'
  2014-10-25 12:51:35 +0000 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluentd-0.10.55/bin/fluentd:6:in `<top (required)>'
  2014-10-25 12:51:35 +0000 [error]: /opt/td-agent/embedded/bin/fluentd:23:in `load'
  2014-10-25 12:51:35 +0000 [error]: /opt/td-agent/embedded/bin/fluentd:23:in `<top (required)>'
  2014-10-25 12:51:35 +0000 [error]: /usr/sbin/td-agent:7:in `load'
  2014-10-25 12:51:35 +0000 [error]: /usr/sbin/td-agent:7:in `<main>'
2014-10-25 12:51:35 +0000 [info]: shutting down fluentd
2014-10-25 12:51:35 +0000 [info]: process finished code=0
2014-10-25 12:51:35 +0000 [error]: fluentd main process died unexpectedly. restarting.

check_apikeys fails when limiting permissions to a subdirectory

check_apikeys raises an exception when the IAM user is limited to reading/writing a certain subdirectory. The following permission setting allows us to limit an IAM user to reading/writing under sub_dir/; this is useful for letting multiple IAM users write under the same bucket.

{
  "Statement": [
    {
      "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
      "Effect": "Allow",
      "Resource": "arn:aws:s3:::bucket_name",
      "Condition": {
        "StringLike": {
          "s3:prefix": "sub_dir/*"
        }
      }
    },
    {
      "Action": [
        "s3:PutObject", "s3:GetObject"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::bucket_name/sub_dir/*"
      ]
    }
  ]
}

Since v0.3.0, check_apikeys uses @bucket.empty?. I suppose it lists the top of the bucket, which then causes the exception.

  def check_apikeys
    @bucket.empty?
  rescue
    raise "aws_key_id or aws_sec_key is invalid. Please check your configuration"
  end

Removing check_apikeys would solve this issue, and an error would still be raised when actually writing logs. Is that a good solution?
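One gentler alternative (a sketch, assuming the aws-sdk v1 collection API) would be to probe within the configured path prefix instead of the bucket root, so prefix-scoped policies like the one above still pass:

def check_apikeys
  # List at most one key under the configured prefix rather than the
  # bucket root, so a prefix-limited IAM policy is sufficient.
  @bucket.objects.with_prefix(@path).first
rescue
  raise "aws_key_id or aws_sec_key is invalid. Please check your configuration"
end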

Configuration for S3 compatible service

Would you mind checking the s3_endpoint configuration for S3-compatible services?
I got the error message below when I tried to transport log files to RiakCS via fluent-plugin-s3.

2015-03-30 15:36:45 +0900 [error]: unexpected error error_class=AWS::S3::Errors::InvalidAccessKeyId error=#<AWS::S3::Errors::InvalidAccessKeyId: The AWS Access Key Id you provided does not exist in our records.>
2015-03-30 15:36:45 +0900 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/aws-sdk-v1-1.63.0/lib/aws/core/client.rb:375:in `return_or_raise'
2015-03-30 15:36:45 +0900 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/aws-sdk-v1-1.63.0/lib/aws/core/client.rb:476:in `client_request'
2015-03-30 15:36:45 +0900 [error]: (eval):3:in `get_bucket_versioning'
2015-03-30 15:36:45 +0900 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/aws-sdk-v1-1.63.0/lib/aws/s3/bucket.rb:456:in `versioning_state'
2015-03-30 15:36:45 +0900 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/aws-sdk-v1-1.63.0/lib/aws/s3/bucket.rb:444:in `versioning_enabled?'
2015-03-30 15:36:45 +0900 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/aws-sdk-v1-1.63.0/lib/aws/s3/bucket.rb:507:in `exists?'
2015-03-30 15:36:45 +0900 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluent-plugin-s3-0.5.6/lib/fluent/plugin/out_s3.rb:139:in `ensure_bucket'
2015-03-30 15:36:45 +0900 [error]: /opt/td-agent/embedded/lib/ruby/gems/2.1.0/gems/fluent-plugin-s3-0.5.6/lib/fluent/plugin/out_s3.rb:96:in `start'

I think we need to pass the "s3_endpoint" option to aws-sdk instead of "endpoint" when using an S3-compatible service.
I changed line 79 of out_s3.rb as below, and that resolved the error.

Before:
options[:endpoint] = @s3_endpoint if @s3_endpoint

After:
options[:s3_endpoint] = @s3_endpoint if @s3_endpoint

Thanks.

Provide ability to integrate tag or event attributes in s3_object_key.

I am rebuilding an existing log system with a new implementation using fluentd, and for legacy reasons it would be helpful to integrate either the tag (I can build a tag using the rewrite-tag plugin) or event attributes as part of the s3_object_key or path, so I have more control over the name under which the object is stored.

Thanks.

Plugin needlessly requires ListBucketVersions permission

    def check_apikeys
      @bucket.empty?
    rescue AWS::S3::Errors::NoSuchBucket
      # ignore NoSuchBucket Error because ensure_bucket checks it.
    rescue => e
      raise "can't call S3 API. Please check your aws_key_id / aws_sec_key or s3_region configuration. error = #{e.inspect}"
    end

This code attempts to run "bucket.empty?" after initializing an S3 connection, ostensibly to check whether the plugin can successfully access the bucket.

What bucket.empty? actually does is perform a ListBucketVersions call, and returns a boolean depending on whether there were any records in the response. As a result, the plugin effectively requires permissions to run ListBucketVersions, even though this permission has nothing to do with the basic task of uploading messages to S3.

Moreover, there's nothing actually in the documentation (or in the code) to indicate that this permission is required. To discover it, I had to open the plugin source, add debug statements, trace my way through to find the failing line, then check the Ruby SDK source to find out what the method was actually trying to do.

The plugin should be updated to simply print error messages to the log when it fails to upload a file due to permissions. Or, at the very least, the docs should explicitly say that the ListBucketVersions permission is required for the plugin to work at all.

Insert time variable in the s3_object_key_format

Hi,

Please, is there a time variable for s3_object_key_format?
I would like to get logs in my bucket in folders per year, month, and day of month, with a file name containing the time and the file extension. This is my configuration:

<match my_tag>
        type s3
        s3_bucket my_bucket
        s3_region eu-west-1
        #s3_object_key_format %{path}%{time_slice}_%{index}.%{file_extension}
        s3_object_key_format %{path}%{time_slice}/stats_log-%{time}_%{index}.%{file_extension}
        path logs/
        buffer_path /var/log/td-agent/buffer_file.log
        time_slice_format %Y/%m/%d
        time_format %H:%M:%S
        time_slice_wait 10m
        utc
</match>

As you can see above, I added forward slashes ("/") between the time_slice_format fields to separate them into directories. But it seems the %{time} variable I used does not exist, and as a result that part of the object key comes out empty. How can I achieve this?
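For what it's worth, s3_object_key_format does not appear to have a %{time} placeholder. A sketch of one workaround, assuming only the stock placeholders, is to fold the time components (and even literal text) into time_slice_format itself, since it is passed to strftime:

time_slice_format     %Y/%m/%d/stats_log-%H%M
s3_object_key_format  %{path}%{time_slice}_%{index}.%{file_extension}

Note that time_slice_format also determines how buffer chunks are keyed, so a finer-grained time means more, smaller objects.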

Thank you.

Best regards,

Enable use with "riak-cs"

I would like the following options to be configurable so that the plugin can also be used with riak-cs:

use_ssl
proxy_uri
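For illustration, the kind of configuration this implies (use_ssl and proxy_uri are the options being requested here; the endpoint value is hypothetical):

<match **>
  type s3
  s3_endpoint riak-cs.example.com
  use_ssl false                            # requested option
  proxy_uri http://proxy.example.com:8080  # requested option
  ...
</match>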

Moreover, check_apikeys fails with "aws_key_id or aws_sec_key is invalid. Please check your configuration" even though my aws_key_id and aws_sec_key are correct, so I have commented out the check:

@s3 = AWS::S3.new(options)
@bucket = @s3.buckets[@s3_bucket]

ensure_bucket
#check_apikeys

P.S. (translated from Japanese): In short, I would like the two options (use_ssl and proxy_uri) to be configurable so that I can also send to riak-cs, and since check_apikeys raises an error even though the values are correct, I am currently running with it commented out.

td-agent won't stop if a file for the corresponding time slice has already been uploaded to s3

I'm experiencing a problem when using td-agent 1.1.20 (rhel package on Amazon Linux); didn't try on earlier versions so not sure if that's relevant.

The problem: if there is a file in the buffer/ directory for a time slice whose file has already been uploaded to S3, I can no longer stop td-agent. The shutdown handlers time out (even after raising STOPTIMEOUT in /etc/sysconfig/td-agent) and leave nothing useful in the logs, even with -vv. The only way to stop (or restart) td-agent is then to kill -9 it.

2014-07-10 13:39:23 +0000 [debug]: fluent/supervisor.rb:287:block in install_supervisor_signal_handlers: fluentd supervisor process get SIGTERM
2014-07-10 13:39:23 +0000 [debug]: fluent/supervisor.rb:388:block in install_main_process_signal_handlers: fluentd main process get SIGTERM
2014-07-10 13:39:23 +0000 [debug]: fluent/supervisor.rb:391:block in install_main_process_signal_handlers: getting start to shutdown main process
2014-07-10 13:39:23 +0000 [info]: fluent/engine.rb:231:run: shutting down fluentd

I get to this state easily by trying to stop td-agent twice in one hour (time_slice_format %Y/%m/%d/%H).

Also, log_level does not seem to have any effect for s3. I don't see any s3 flushing related log messages in td-agent.log.

dynamic values for path

I am trying to use a dynamic value for path based on the last part of the tag:

path log/${tag_parts.last}/

This doesn't seem to work, though: it creates a folder literally named "${tag_parts.last}" in S3. Is path not supposed to work with dynamic values?
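As far as I can tell, out_s3 does not expand such placeholders itself. One approach (used in the aggregator configuration in a later report on this page) is to wrap the output in fluent-plugin-forest, which expands ${tag} in its template:

<match app.**>
  type forest
  subtype s3
  <template>
    path log/${tag}/
    ...
  </template>
</match>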

hyphen in bucket name breaks check_apikeys

Ran across this when using the plugin for the first time:

Bucket named foo-bar would cause the following error:

2013-12-06 23:32:25 +0000 [error]: fluent/engine.rb:218:rescue in run: unexpected error error_class=RuntimeError error=#<RuntimeError: aws_key_id or aws_sec_key is invalid. Please check your configuration>

coming from:

    def check_apikeys
      @bucket.empty?
    rescue
      raise "aws_key_id or aws_sec_key is invalid. Please check your configuration"
    end

Removing the hyphen and creating a bucket named foobar worked flawlessly.

Experiencing delayed upload to S3 dependent on secure_forward emptying buffers

Hi there,

We are experiencing some unexplained delays with chunks not being pushed to s3 until all of the secure_forward instances have fully flushed their buffers.

The image below depicts the behavior we are seeing. The chart on the left is the count of buffer files in the aggregator's out_s3 buffer directory; the chart on the right is the count of buffer files in the secure_forward buffer directory. As you can see, the buffer counts begin to increase in both charts at ~6:00, when our data-generating process starts. Neither log file contains any information hinting at the cause. The aggregator's out_s3 buffers then begin to fill rapidly, and the frequency of chunk uploads to S3 drops dramatically (from ~450 uploads/hour to ~100/hour, even though the record count increased exponentially). This condition persists until the secure_forward instances have flushed all of their buffers (seen at ~8:40 in the chart on the right), after which the out_s3 plugin rapidly flushes all of the chunks it was buffering.

Our configs are below. We appreciate any insight you could give into the problem.

Thanks,
Jeremy

Notes: 1) We are using flush_interval instead of time_slice_wait because we want logs delivered to S3 at a regular interval, even if the chunks aren't full, to ensure timely delivery. 2) The start time of the buffer backlog moves around a bit (today it started at 3:30), but it happens consistently every morning.

[Screenshot: buffer file counts over time; aggregator out_s3 buffer (left), secure_forward buffer (right)]

Log generating servers:

<match **>
  type                secure_forward
  shared_key          key
  self_hostname       ${hostname}
  send_timeout        60s

  buffer_type         file
  buffer_path         /var/spool/td-agent/forward
  buffer_chunk_limit  16m
  buffer_queue_limit  2048
  flush_interval      10s
  retry_wait          20s
  disable_retry_limit true
  flush_at_shutdown  true

  <server>
    host     listener-01.domain.com
    port     <port1>
    standby  no
  </server>
  <server>
    host     listener-01.domain.com
    port     <port2>
    standby  no
  </server>
  ...
</match>

Fluentd listener/secure forward hosts:

<match tagname.**>
  type copy

  # Forward to the aggregator instance
  <store>
    type               secure_forward
    shared_key         key
    self_hostname      ${hostname}

    buffer_type         file
    buffer_path         /var/spool/td-agent/listener-24284.forward.tagname
    buffer_chunk_limit  16m
    buffer_queue_limit  2048
    flush_interval      10s
    retry_wait          20s
    max_retry_wait      30s
    disable_retry_limit true
    flush_at_shutdown   true
    secure              false

    <server>
      host     aggregator-01.domain.com
      port     <agg port>
      standby  no
    </server>
  </store>
</match>

# ...other match blocks similar to the above

<match **>
  type copy

  # Forward to the aggregator instance
  <store>
    type               secure_forward
    shared_key         key
    self_hostname      ${hostname}

    # File buffering - 16m * 512 = 8g
    buffer_type         file
    buffer_path         /var/spool/td-agent/listener-24284.forward
    buffer_chunk_limit  16m
    buffer_queue_limit  512
    flush_interval      10s
    retry_wait          20s
    max_retry_wait      30s
    disable_retry_limit true
    flush_at_shutdown   true
    secure              false

    <server>
      host     aggregator-01.domain.com
      port     <agg port>
      standby  no
    </server>
  </store>

  # Forward to elasticsearch
  <store>
    type                elasticsearch
    logstash_format     true
    logstash_prefix     logstash
    include_tag_key     true
    tag_key             tag
    type_name           fluentd
    type_key            _type

    buffer_type         file
    buffer_path         /var/spool/td-agent/listener-24284.elasticsearch
    buffer_chunk_limit  16m
    buffer_queue_limit  512
    flush_interval      5s
    retry_wait          15s
    max_retry_wait      30s
    disable_retry_limit true
    flush_at_shutdown   true

    resurrect_after     0  # default is 60
    utc_index           false
  </store>
</match>

Aggregator/out_s3 conf.
Note: this uses the forest plugin, but we have the same <store> block (via templates) under other dedicated <match> sections, and those exhibit the same behavior.

<match **>
  type     forest
  subtype  copy
  <template>

    # Store in s3
    <store>
      @log_level            debug

      type                  s3
      aws_key_id            key
      aws_sec_key           key
      s3_region             region
      s3_bucket             bucketname
      path                  <path>/${tag}/
      s3_object_key_format  %{path}%{year}/%{month}/%{day}/${tag}.%{time_slice}_%{uuid_flush}.gz
      format                json
      include_tag_key       true
      tag_key               @tag
      include_time_key      true
      time_key              @timestamp
      time_format           %Y-%m-%dT%H:%M:%S%z
      time_slice_format     %Y%m%d%H
      flush_interval        5m
      store_as              gzip
      use_ssl               yes
      num_threads           4

      buffer_type           file
      buffer_path           /var/spool/td-agent/s3/${tag}
      buffer_chunk_limit    32m
      buffer_queue_limit    4098
      retry_wait            1s
      max_retry_wait        5s
      disable_retry_limit   true
    </store>

    # Store in a local file
    <store>
      type               file
      path               <path>/${tag}
      append             true
      time_slice_format  %Y%m%d-%H
      time_format        %Y%m%dT%H%M%S%z
      time_slice_wait    10m
      format             json
      include_tag_key    true
      tag_key            @tag
      include_time_key   true
      time_format        %Y-%m-%dT%H:%M:%S%z
      time_key           @timestamp
      buffer_chunk_limit 32m
    </store>
  </template>
</match>
