Comments (8)

emptywee commented on August 19, 2024

Same here for us.

Happening on the nodes where apiservers, schedulers and controllers are running (master nodes). Doesn't happen on the worker nodes here.

Running v1.6 here.

from fluentd-kubernetes-sumologic.

nhoughto commented on August 19, 2024

We gave up on this. It looks like fluentd isn't a great fit for doing this without taking a huge amount of memory, so we wrote this instead:

https://github.com/bsycorp/log-forwarder

Does basically the same job as this project, but is written in golang and is way more efficient.

frankreno commented on August 19, 2024

@nhoughto @emptywee: Can you provide me with the following to help me do some investigation?

  1. What version of k8s?
  2. Settings for the DaemonSet (default, or list what was customized)
  3. Do you have the audit log enabled in k8s?
  4. Approximately how many containers per node?
  5. Approximately how much log volume?

nhoughto commented on August 19, 2024

  1. v1.7.2
  2. Default, copied and pasted from the rbac example
  3. No, I don't think so. How do I check? Standard kops build
  4. 120 containers across 3 nodes (1 master). It also happened when scaled out to 200 containers across 5 nodes.
  5. Log volume for the HTTP endpoint for the days in question is 1 to 3 million records.

The behaviour I'm seeing is a fast increase in a Ruby process's memory usage. If I delete the DaemonSet and re-apply it, the nodes that had high memory usage start up and immediately begin ramping memory usage again (it doesn't take hours to happen). It feels like fluentd (I assume that's the Ruby process?) is scanning for files it needs to process, and as it finds more and more files, up goes its memory usage.

frankreno commented on August 19, 2024

@nhoughto By default, READ_FROM_HEAD is set to true, so when you restart we read everything in again. As a test, can you try altering the DaemonSet to set READ_FROM_HEAD to false? I'm curious whether the behavior changes.

kops does not enable audit logging by default, but you can turn it on. It's quite noisy, so I was more curious than anything. https://github.com/kubernetes/kops/blob/master/docs/cluster_spec.md#audit-logging

nhoughto commented on August 19, 2024

Oh, READ_FROM_HEAD would explain other behaviour I was seeing: when re-applying the DaemonSet, it seemed to re-parse and upload everything rather than just the latest logs. I guess that is the safest setting, but it's not what we want at all. We would have had quite significant existing log files for it to read.

I'll try that. Although I'd assume that, by design, you would want your process to always work within a given memory envelope, even when reading mountains of logs? So READ_FROM_HEAD might work around the issue, but it isn't the underlying cause?

frankreno commented on August 19, 2024

So I have just learned of a bug in the documentation. The README indicates that FLUSH_INTERVAL is set to 5s. This is not true: it is actually set to 30s by default in the Dockerfile. This means that we are not flushing the logs to Sumo as quickly as documented, and the messages are held in the buffer in the meantime. Those buffers could grow quickly (depending on how much data we are reading) and increase memory usage. For the folks on this issue who are seeing high memory usage, can you please try setting FLUSH_INTERVAL to 5s? Don't forget to set READ_FROM_HEAD to false if you want to avoid reading from the beginning of each file after making this change and bouncing the pods.
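
A minimal sketch of what those two changes might look like in the DaemonSet manifest, assuming both settings are passed to the container as environment variables. The variable names and values come from this thread; the container name and the surrounding structure are placeholders rather than a copy of the project's manifest:

```yaml
# Fragment of a DaemonSet pod template; only the env entries matter here.
spec:
  template:
    spec:
      containers:
        - name: fluentd            # hypothetical container name
          env:
            # Flush buffered records every 5 seconds instead of the
            # 30s default mentioned above.
            - name: FLUSH_INTERVAL
              value: "5s"
            # Start from the end of each log file rather than re-reading
            # existing content when the pods are restarted.
            - name: READ_FROM_HEAD
              value: "false"
```

Environment variable values in a pod spec must be strings, which is why `false` is quoted.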

frankreno commented on August 19, 2024

I am going to close this issue due to inactivity on the comments above about tweaking the current settings. We have seen adjusting these settings have a large impact on performance, and the default values may need tweaking. If anyone disagrees, please feel free to re-open.
