elastic-agent-autodiscover's Introduction

elastic-agent-autodiscover

This repo contains packages required by autodiscover.

  • github.com/elastic/elastic-agent-autodiscover/bus
  • github.com/elastic/elastic-agent-autodiscover/docker
  • github.com/elastic/elastic-agent-autodiscover/kubernetes
  • github.com/elastic/elastic-agent-autodiscover/kubernetes/metadata
  • github.com/elastic/elastic-agent-autodiscover/utils

Releasing updates

Note: For every user-facing change, remember to update the changelog.

Whenever a merged PR needs to be made available to external repositories that use this library, a new tag must be created. Anyone with push privileges to this repository can create the tag locally and push it upstream, as follows:

$ git remote -v
origin	git@github.com:ChrsMark/elastic-agent-autodiscover.git (fetch)
origin	git@github.com:ChrsMark/elastic-agent-autodiscover.git (push)
upstream	https://github.com/elastic/elastic-agent-autodiscover.git (fetch)
upstream	https://github.com/elastic/elastic-agent-autodiscover.git (push)
$ git tag -a v0.2.1 -m "New patch release for minor codebase improvements"
$ git push upstream v0.2.1 
Enumerating objects: 1, done.
Counting objects: 100% (1/1), done.
Writing objects: 100% (1/1), 190 bytes | 190.00 KiB/s, done.
Total 1 (delta 0), reused 0 (delta 0)
To https://github.com/elastic/elastic-agent-autodiscover.git
 * [new tag]             v0.2.1 -> v0.2.1

The tag should then be available at https://github.com/elastic/elastic-agent-autodiscover/tags, and anyone can use the new version of the library in other projects. For example, to use v0.2.1 in Beats projects, run go get github.com/elastic/elastic-agent-autodiscover@v0.2.1.

Once the tag is available, a Release can be created from it using the corresponding content from the changelog.

Development

When one wants to edit and test the library as part of the Beats or Elastic Agent projects, the local version of the dependency can be referenced with the following:

go.mod:

replace github.com/elastic/elastic-agent-autodiscover => /home/user/go/src/github.com/elastic/elastic-agent-autodiscover

This will use the local code rather than the upstream dependency. Note: Do not forget to exclude this change from the final commits.

elastic-agent-autodiscover's Issues

cache.WaitForCacheSync may never exit on shutdown

This was discovered as the root cause of intermittent failures in the ECK operator integration tests for Fleet. See elastic/cloud-on-k8s#6331 (comment) for the logs we get when this happens. Diagnostics including a goroutine dump are attached.

Contents of the agent's state.yaml when this happens:

components: []
log_level: info
message: context canceled
state: 4

fleet-server-deadlock-diagnostics.tar.gz

The coordinator seems to be stuck at:

1 @ 0x559a9083e256 0x559a9080c1cc 0x559a9080bbf8 0x559a90ffd2e5 0x559a90ff9c05 0x559a91cb4617 0x559a9086ec61
#	0x559a90ffd2e4	github.com/elastic/elastic-agent/internal/pkg/agent/application/coordinator.(*Coordinator).runner+0x1084	github.com/elastic/elastic-agent/internal/pkg/agent/application/coordinator/coordinator.go:568
#	0x559a90ff9c04	github.com/elastic/elastic-agent/internal/pkg/agent/application/coordinator.(*Coordinator).Run+0x164		github.com/elastic/elastic-agent/internal/pkg/agent/application/coordinator/coordinator.go:399
#	0x559a91cb4616	github.com/elastic/elastic-agent/internal/pkg/agent/cmd.run.func6+0x36						github.com/elastic/elastic-agent/internal/pkg/agent/cmd/run.go:220

This is the relevant code block:
https://github.com/elastic/elastic-agent/blob/973af90d85dd81aaccfd42a1f81e7ad60f6780db/internal/pkg/agent/application/coordinator/coordinator.go#L552-L568

Changelog should have latest version first

The changelog in this repository is in chronological order (oldest at top), but I think it should be in reverse chronological order (latest at top).

Every changelog I've seen is structured in reverse chronological order. Also, keepachangelog.com says:

  • The latest version comes first.

Verify autodiscovery node conditions

Context

Node autodiscovery in Elastic Agent is based on the following function:

func DiscoverKubernetesNode(log *logp.Logger, nd *DiscoverKubernetesNodeParams) (string, error) {
	ctx := context.TODO()
	// Discover node by configuration file (NODE) if set
	if nd.ConfigHost != "" {
		log.Infof("kubernetes: Using node %s provided in the config", nd.ConfigHost)
		return nd.ConfigHost, nil
	}
	// Discover node by serviceaccount namespace and pod's hostname in case Beats is running in cluster
	if nd.IsInCluster {
		node, err := discoverInCluster(nd, ctx)
		if err == nil {
			log.Infof("kubernetes: Node %s discovered by in cluster pod node query", node)
			return node, nil
		}
		log.Debug(err)
	}
	// try discover node by machine id
	node, err := discoverByMachineID(nd, ctx)
	if err == nil {
		log.Infof("kubernetes: Node %s discovered by machine-id matching", node)
		return node, nil
	}
	log.Debug(err)
	// fallback to environment variable NODE_NAME
	node = os.Getenv("NODE_NAME")
	if node != "" {
		log.Infof("kubernetes: Node %s discovered by NODE_NAME environment variable", node)
		return node, nil
	}
	return "", errors.New("kubernetes: Node could not be discovered with any known method. Consider setting env var NODE_NAME")
}

The node discovery methods, in the order they are tried, are:

  • Use the node provided in the configuration
  • Discover the node by serviceaccount namespace and pod's hostname, in case Beats is running in cluster
  • Discover the node by machine ID
  • Fall back to the environment variable NODE_NAME
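
The precedence above can be sketched as an ordered list of strategies in which the first success wins. This is a simplified, standard-library-only illustration, not the library's code. The types discoverFunc and strategy, the function discoverNode, and the node names are hypothetical. Expressing the order as data makes it easy to experiment with a different precedence, for example trying NODE_NAME before machine ID.

```go
package main

import (
	"errors"
	"fmt"
)

// discoverFunc is a hypothetical signature for one discovery strategy.
// It returns the node name, or an error if the strategy does not apply.
type discoverFunc func() (string, error)

// strategy pairs a human-readable name with its discovery function.
type strategy struct {
	name string
	fn   discoverFunc
}

// discoverNode tries each strategy in order and returns the first non-empty
// result, together with the name of the strategy that produced it.
func discoverNode(strategies []strategy) (node, matched string, err error) {
	for _, s := range strategies {
		n, e := s.fn()
		if e == nil && n != "" {
			return n, s.name, nil
		}
	}
	return "", "", errors.New("node could not be discovered with any known method")
}

func main() {
	notApplicable := func() (string, error) { return "", errors.New("not applicable") }

	// Same order as the list above; the stubbed results are made up.
	strategies := []strategy{
		{"config", notApplicable},
		{"in-cluster", notApplicable},
		{"machine-id", func() (string, error) { return "node-a", nil }},
		{"NODE_NAME", func() (string, error) { return "node-b", nil }},
	}

	node, matched, err := discoverNode(strategies)
	fmt.Println(node, matched, err) // prints "node-a machine-id <nil>"
}
```

Because machine-id precedes NODE_NAME in the slice, it wins whenever both succeed. Swapping the last two entries would make NODE_NAME take precedence.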

While investigating https://github.com/elastic/sdh-beats/issues/3107, we observed that discovery by machine ID was triggered rather than any of the earlier methods.

Goals

  • Verify the conditions of each discovery method and why the earlier ones did not match in the reported issue
  • Check whether machine-id, rather than the NODE_NAME variable, should be the final fallback option
  • Open PRs to fix any issues identified

`add_resource_metadata.cronjob` overloads the memory usage

As reported at elastic/beats#33307, the Kubernetes autodiscover provider can lead to OOM kills of Beats Pods in clusters with specific types of workloads, i.e. CronJobs.

The purpose of this issue is the following:

  1. Consider making add_resource_metadata.cronjob: false the default, since we know it's an "expensive" feature.
  2. Document the nature of this setting and its implications.
  3. Consider improving how we retrieve the objects, or derive the "owner" name by trimming the suffix of the object's name, e.g. hello (the name of the CronJob) from hello-1234 (the name of the Job).
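
The suffix-trimming idea in the third item could look roughly like the sketch below. It assumes that a Job created by a CronJob is named <cronjob-name>-<suffix>, as in the hello-1234 example. The helper cronJobNameFromJob is hypothetical and not part of this library.

```go
package main

import (
	"fmt"
	"strings"
)

// cronJobNameFromJob is a hypothetical helper: it derives the owning CronJob
// name by dropping the last "-<suffix>" segment of a Job name. It reports
// false when the name has no suffix that could be trimmed.
func cronJobNameFromJob(jobName string) (string, bool) {
	i := strings.LastIndex(jobName, "-")
	if i <= 0 || i == len(jobName)-1 {
		// No hyphen, a leading hyphen, or a trailing hyphen: nothing to trim.
		return "", false
	}
	return jobName[:i], true
}

func main() {
	for _, job := range []string{"hello-1234", "nightly-backup-27891234", "standalone"} {
		name, ok := cronJobNameFromJob(job)
		fmt.Printf("%q -> %q (trimmed: %t)\n", job, name, ok)
	}
}
```

String trimming avoids fetching the parent object, but on its own it cannot tell whether a Job was really created by a CronJob. A standalone Job whose name happens to contain a hyphen would be attributed to a CronJob that does not exist, so the Job's owner reference would still need to be checked before trimming.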

For full context and previous analysis see the summary at elastic/beats#33307 (comment).

cc: @eedugon @gizas @jsoriano

Upgrade `github.com/docker/docker` dependency to `v26.0.2+incompatible`

Taken from elastic/elastic-agent#4615 (comment):

In elastic/elastic-agent#4615, we're trying to upgrade the github.com/elastic/elastic-agent-system-metrics dependency from 0.9.2 to 0.9.3. However, this upgrade is indirectly also upgrading the github.com/docker/docker dependency from v25.0.5+incompatible to v26.0.2+incompatible.

This indirect upgrade is conflicting with the version of the github.com/elastic/elastic-agent-autodiscover dependency being used by github.com/elastic/elastic-agent: v0.6.14 because that version depends on an older version of the github.com/docker/docker dependency: v24.0.9+incompatible.

We need to:

  • upgrade the github.com/docker/docker dependency to v26.0.2+incompatible in this repository
  • make the necessary API adjustments (see elastic/elastic-agent#4615 (comment))
  • cut a new release
  • upgrade the github.com/elastic/elastic-agent-autodiscover dependency in github.com/elastic/elastic-agent to the new release's version

This will unblock elastic/elastic-agent#4615.
