
megaease / easeagent

An agent component for the Java system

License: Apache License 2.0

Java 99.91% Shell 0.07% Dockerfile 0.02%
apm spring-cloud javaagent observability microservices zipkin java servicemesh zipkin-brave metrics

easeagent's People

Contributors

akwei, asasas234, buptzouy, chihuopub, dependabot[bot], haoel, hesstina-yui, jack47, jackpan123, jeraxxxxxxx, jiweiyuan, jxd134, jxd1990, landyking, lanxenet, michaelygzhang, observeralone, oewang, oseenix, pengjiejason, qdongxu, robxyy, samutamm, snyk-bot, zhao-kun, zhongl, zouyingjie

easeagent's Issues

replacement of nanohttpd

nanohttpd has not been maintained for several years, so consider replacing it with the JDK's built-in HTTP server or another lightweight HTTP server. Is anyone willing to help with the replacement?
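As a rough illustration of what the replacement could look like, here is a minimal sketch built on the JDK's `com.sun.net.httpserver.HttpServer`. The class name `AgentHttpServer`, the `/health` route, and the `OK` body are assumptions made for this example, not easeagent's actual code.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: names and routes are illustrative, not easeagent's API.
final class AgentHttpServer {
    private final HttpServer server;

    AgentHttpServer(int port) throws IOException {
        // Port 0 lets the OS pick any free port; backlog 0 uses the system default.
        server = HttpServer.create(new InetSocketAddress("127.0.0.1", port), 0);
        server.createContext("/health", exchange -> {
            byte[] body = "OK".getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
    }

    void start() {
        server.start();
    }

    void stop() {
        server.stop(0); // stop immediately, without waiting for open exchanges
    }

    int port() {
        return server.getAddress().getPort();
    }
}
```

The JDK server needs no extra dependency, which matters for an agent that shares the application's classpath.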

Support config system tag info

Background

In real-world deployments, customers need to manage services along a project dimension.

Expected

easeagent should support configuring project.id to set the system field.

build errors in Java 11

1) Java version

$ java -version                                                                                                                                                                                                                        
openjdk version "11.0.11" 2021-04-20
OpenJDK Runtime Environment AdoptOpenJDK-11.0.11+9 (build 11.0.11+9)
OpenJDK 64-Bit Server VM AdoptOpenJDK-11.0.11+9 (build 11.0.11+9, mixed mode)

2) git clone the source code

3) run mvn clean package, got the following errors.

-------------------------------------------------------
 T E S T S
-------------------------------------------------------
Running com.megaease.easeagent.log4j2.impl.MDCTest
Dec 24, 2021 11:11:05 AM com.megaease.easeagent.log4j2.LoggerFactory <clinit>
WARNING: build agent logger factory fail: java.lang.ClassNotFoundException<com.megaease.easeagent.log4j2.impl.LoggerProxyFactory>.
Tests run: 3, Failures: 0, Errors: 3, Skipped: 0, Time elapsed: 0.061 sec <<< FAILURE! - in com.megaease.easeagent.log4j2.impl.MDCTest
remove(com.megaease.easeagent.log4j2.impl.MDCTest)  Time elapsed: 0.056 sec  <<< ERROR!
java.lang.ExceptionInInitializerError: null
	at com.megaease.easeagent.log4j2.MDC.<clinit>(MDC.java:23)
	at com.megaease.easeagent.log4j2.impl.MDCTest.remove(MDCTest.java:40)

get(com.megaease.easeagent.log4j2.impl.MDCTest)  Time elapsed: 0.001 sec  <<< ERROR!
java.lang.NoClassDefFoundError: Could not initialize class com.megaease.easeagent.log4j2.MDC
	at com.megaease.easeagent.log4j2.impl.MDCTest.get(MDCTest.java:48)

put(com.megaease.easeagent.log4j2.impl.MDCTest)  Time elapsed: 0 sec  <<< ERROR!
java.lang.NoClassDefFoundError: Could not initialize class com.megaease.easeagent.log4j2.MDC
	at com.megaease.easeagent.log4j2.impl.MDCTest.put(MDCTest.java:33)

Running com.megaease.easeagent.log4j2.impl.AgentLoggerFactoryTest
Tests run: 3, Failures: 0, Errors: 3, Skipped: 0, Time elapsed: 0.001 sec <<< FAILURE! - in com.megaease.easeagent.log4j2.impl.AgentLoggerFactoryTest
newFactory(com.megaease.easeagent.log4j2.impl.AgentLoggerFactoryTest)  Time elapsed: 0 sec  <<< ERROR!
java.lang.NullPointerException: Cannot invoke "com.megaease.easeagent.log4j2.api.AgentLoggerFactory.getLogger(String)" because "factory" is null
	at com.megaease.easeagent.log4j2.impl.AgentLoggerFactoryTest.newFactory(AgentLoggerFactoryTest.java:61)

builder(com.megaease.easeagent.log4j2.impl.AgentLoggerFactoryTest)  Time elapsed: 0 sec  <<< ERROR!
java.lang.NullPointerException: urls must not be null.
	at java.base/java.util.Objects.requireNonNull(Objects.java:233)
	at com.megaease.easeagent.log4j2.supplier.URLClassLoaderSupplier.get(URLClassLoaderSupplier.java:35)
	at com.megaease.easeagent.log4j2.supplier.URLClassLoaderSupplier.get(URLClassLoaderSupplier.java:26)
	at com.megaease.easeagent.log4j2.api.AgentLoggerFactory.builder(AgentLoggerFactory.java:49)
	at com.megaease.easeagent.log4j2.impl.AgentLoggerFactoryTest.builder(AgentLoggerFactoryTest.java:45)

getLogger(com.megaease.easeagent.log4j2.impl.AgentLoggerFactoryTest)  Time elapsed: 0.001 sec  <<< ERROR!
java.lang.NoClassDefFoundError: Could not initialize class com.megaease.easeagent.log4j2.MDC
	at com.megaease.easeagent.log4j2.impl.AgentLoggerFactoryTest.getLogger(AgentLoggerFactoryTest.java:53)


Results :

Tests in error:
  AgentLoggerFactoryTest.builder:45 » NullPointer urls must not be null.
  AgentLoggerFactoryTest.getLogger:53 NoClassDefFound Could not initialize class...
  AgentLoggerFactoryTest.newFactory:61 NullPointer Cannot invoke "com.megaease.e...
  MDCTest.get:48 NoClassDefFound Could not initialize class com.megaease.easeage...
  MDCTest.put:33 NoClassDefFound Could not initialize class com.megaease.easeage...
  MDCTest.remove:40 ExceptionInInitializer

Tests run: 6, Failures: 0, Errors: 6, Skipped: 0

Proposal: easeagent plugins

Abstract

  • Move the built-in TracePoints to a separate jar that easeagent can load at JVM startup.

  • Provide tools for users to build their own TracePoint jars to extend easeagent.

    Inspired by btracec.

Background

easeagent currently ships the following built-in TracePoints, each backed by a JDK transformer:

  • com.megaease.easeagent.requests.GenCaptureTrace
  • com.megaease.easeagent.requests.GenCaptureExecuteSql
  • com.megaease.easeagent.requests.GenCaptureHttpRequest
  • com.megaease.easeagent.zipkin.GenTraceHttpServlet
  • com.megaease.easeagent.zipkin.GenTraceHttpClient
  • com.megaease.easeagent.zipkin.GenTraceRestTemplate
  • com.megaease.easeagent.zipkin.GenTraceJedis
  • com.megaease.easeagent.zipkin.GenTraceJdbcStatement
  • com.megaease.easeagent.metrics.GenMeasureJdbcStatement
  • com.megaease.easeagent.metrics.GenMeasureJdbcGetConnection
  • com.megaease.easeagent.metrics.GenMeasureHttpRequest
  • com.megaease.easeagent.metrics.GenCaptureCaller

The TracePoints are generated by the gen project, together with the template-like projects requests, zipkin and metrics.

Although easeagent has config files, it is still difficult for end users to customize it or add new features.
The proposal is to convert the existing TracePoint code into scripts that are consumed by a compiler, which yields JDK transformer .class files. The generated transformers are then packed into a plugin format for easeagent.

Details

easeagentc (was gen) and TracePoint script

The plugin compiler, aka easeagentc, should take a script as input and yield a plugin as output.

The script

The first version of the script spec might be the same as the current TracePoint Java file, which implements com.megaease.easeagent.core.Transformation.

For example, the current TraceHttpClient.java would be compiled by easeagentc into a plugin.

@Injection.Provider(Provider.class)
public abstract class TraceHttpClient implements Transformation {
    @Override
    public <T extends Definition> T define(Definition<T> def) {
        return def.type(hasSuperType(named("org.apache.http.impl.client.CloseableHttpClient")))
                  .transform(execute(ElementMatchers.<MethodDescription>named("doExecute")))
                  .end();
    }

    // ...
}

A better script spec will be discussed in a separate proposal later.
The new script format might support Groovy or other JVM-based scripting languages, plus an easeagent DSL for easier plugin coding.

Dependencies

A plugin may or may not have dependencies. easeagentc should pack the dependencies into the plugin file.

The dependencies can be described in a pom.xml or build.gradle.

Plugin format and its loader

Plugin format

The plugin file can be a jar with the features below:

  • libs: it should carry all dependency jars with it.
  • metadata: it should contain the info easeagent needs to load and search for all the transformers.

Dynamic load and plugin config

easeagent should support:

  • searching for and loading all visible plugins dynamically
  • a plugin-side config helper that loads the config provided by easeagent, which may be defined in a config file

Universal Reporting Stub (optional)

For performance, the easeagent runtime could provide a reporting stub that plugins use to report their data.
The stub should hold data in a ring-buffer-like cache and provide multiple interfaces for emitting data to the outside world.

In that case, plugins would have fewer dependencies and fewer class conflict issues.

The design of the reporting stub is inspired by ETW.
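To make the ring-buffer idea concrete, here is a small sketch of a fixed-capacity buffer that overwrites the oldest entry when full. `ReportRingBuffer`, `publish` and `drain` are hypothetical names, and the simple `synchronized` locking is an assumption chosen for clarity rather than throughput.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a reporting stub's cache; not part of easeagent.
final class ReportRingBuffer<T> {
    private final Object[] slots;
    private int head; // index of the oldest element
    private int size; // number of stored elements

    ReportRingBuffer(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        slots = new Object[capacity];
    }

    /** Stores an item; when the buffer is full, the oldest item is overwritten. */
    synchronized void publish(T item) {
        int tail = (head + size) % slots.length;
        slots[tail] = item;
        if (size == slots.length) {
            head = (head + 1) % slots.length; // the oldest item was just replaced
        } else {
            size++;
        }
    }

    /** Removes and returns all buffered items, oldest first. */
    synchronized List<T> drain() {
        List<T> out = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            int index = (head + i) % slots.length;
            @SuppressWarnings("unchecked")
            T item = (T) slots[index];
            out.add(item);
            slots[index] = null; // release the reference
        }
        head = 0;
        size = 0;
        return out;
    }
}
```

A background sender could call `drain()` periodically and forward the batch to whichever output is configured, so plugins never block on I/O.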

Add middleware type in the tags of client span.

The monitor service relies on the remoteEndPoint.serviceName field to determine the type of a remote service. If serviceName contains one of the keywords sql, oracle, or db2, the monitor service treats the remote service as a database. But sometimes the database name does not contain any of these keywords.

Please add a tag with the key remote.type to every span whose kind is client, using one of the following enumeration values:

  • database
  • redis
  • kafka
  • rabbitmq
  • elasticsearch
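One possible shape for this, sketched with a plain `Map` standing in for the span's tags: the enum `RemoteType` and the helper `tag` are hypothetical, and only the tag key and the five values come from the request above.

```java
import java.util.Map;

// Hypothetical sketch; a real implementation would set the tag on the tracer's span object.
enum RemoteType {
    DATABASE("database"),
    REDIS("redis"),
    KAFKA("kafka"),
    RABBITMQ("rabbitmq"),
    ELASTICSEARCH("elasticsearch");

    static final String TAG_KEY = "remote.type";

    private final String value;

    RemoteType(String value) {
        this.value = value;
    }

    String value() {
        return value;
    }

    /** Adds the remote.type tag, but only when the span kind is client. */
    static void tag(Map<String, String> tags, String spanKind, RemoteType type) {
        if ("client".equalsIgnoreCase(spanKind)) {
            tags.put(TAG_KEY, type.value());
        }
    }
}
```

With an explicit tag, the monitor service no longer has to guess the middleware type from keywords in the service name.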

Notes on the different configurations of the new and old versions of easeagent

Background

Starting from 2.0.0, easeagent uses a plugin-based approach for monitoring. With the move to plugins, the configuration rules have also changed.
The old rules were configured as follows:

observability.metrics.enabled=true
# metrics access
observability.metrics.access.enabled=true
observability.metrics.access.interval=30
observability.metrics.access.topic=application-log
observability.metrics.access.appendType=kafka

The new rule configuration is as follows:

plugin.observability.global.metric.enabled=true
plugin.observability.access.metric.enabled=true
plugin.observability.access.metric.interval=30
plugin.observability.access.metric.topic=application-log
plugin.observability.access.metric.appendType=kafka

Migrating to plugins makes the old configuration invalid.
Because other control systems modify the configuration through the API, simply discarding the old keys would not be friendly to them.

Action

We decided to stay compatible with the old configuration API: configuration received through the API is converted to the new format by rule, for example:
observability.metrics.enabled to plugin.observability.global.metric.enabled
For new configuration items, please use the new configuration rules.
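The conversion could be sketched as follows. The rule is inferred only from the two examples in this note (a key with no sub-name maps to `global`, otherwise the sub-name moves in front of `metric`), so treat it as an assumption; `LegacyConfigKeys` and `toPluginKey` are hypothetical names.

```java
// Hypothetical sketch of the old-to-new key conversion; the rule is inferred from the examples.
final class LegacyConfigKeys {
    private static final String OLD_PREFIX = "observability.metrics.";

    private LegacyConfigKeys() {
    }

    /** Converts an old-style metrics key to the plugin-style key; other keys pass through. */
    static String toPluginKey(String oldKey) {
        if (!oldKey.startsWith(OLD_PREFIX)) {
            return oldKey;
        }
        String rest = oldKey.substring(OLD_PREFIX.length());
        int dot = rest.indexOf('.');
        if (dot < 0) {
            // e.g. observability.metrics.enabled: no sub-name, so it is a global switch
            return "plugin.observability.global.metric." + rest;
        }
        String name = rest.substring(0, dot);      // e.g. access
        String property = rest.substring(dot + 1); // e.g. interval
        return "plugin.observability." + name + ".metric." + property;
    }
}
```

For instance, `toPluginKey("observability.metrics.access.interval")` yields `plugin.observability.access.metric.interval`, matching the new configuration shown above.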

Improve installation script

  • download the jar to a user-given path, together with
  • an initialized log4j2.xml and application.conf, and
  • the script for importing the SSL cert into the JRE's trust store

Operating the middleware's connection string

In some cases, such as configuration testing, scaling, or migration, we need to redirect middleware connections to another instance, so we would like to use the Java agent to do this dynamically.

The following middleware needs to be covered:

  • JDBC
  • RabbitMQ
  • Kafka
  • Redis
  • Elasticsearch
  • others

Collect apm data by logging file

Instead of sending data over HTTP, easeagent writes metrics and request data into log files, which are then collected by Filebeat. This is more reliable.

Exception Handle

There are two principles that the implementation must follow:

  • No enhancement implemented by EaseAgent should throw any checked or unchecked exception.
  • No log written by an EaseAgent enhancement should be above the WARN level (WARN is permitted).
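A minimal sketch of how both principles could be enforced in one place: a wrapper that catches everything thrown by enhancement code and logs it at WARN. `SafeEnhancement` is a hypothetical name, and `java.util.logging` is used here only to keep the example dependency-free; the agent's own logger would take its place.

```java
import java.util.function.Supplier;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch; java.util.logging stands in for the agent's logger.
final class SafeEnhancement {
    private static final Logger LOG = Logger.getLogger(SafeEnhancement.class.getName());

    private SafeEnhancement() {
    }

    /**
     * Runs enhancement code and never lets a failure reach the application.
     * Failures are logged at WARNING, never higher, and the fallback is returned.
     */
    static <T> T call(Supplier<T> body, T fallback) {
        try {
            return body.get();
        } catch (Throwable t) {
            // Catching Throwable is deliberate here: it also swallows Errors,
            // which is a trade-off made to keep the application unaffected.
            LOG.log(Level.WARNING, "enhancement failed, continuing without it", t);
            return fallback;
        }
    }
}
```

The application's own call should run outside this wrapper, so that exceptions thrown by user code still propagate unchanged.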

Support to collect GC information

There are two solutions:

Get Last GcInfo

package com.sun.management;

import jdk.Exported;

@Exported
public interface GarbageCollectorMXBean extends java.lang.management.GarbageCollectorMXBean {
    GcInfo getLastGcInfo();
}
package com.sun.management;

@Exported
public class GcInfo implements CompositeData, CompositeDataView {
    public long getStartTime() { ... }
    public long getEndTime() { ... }
    public long getDuration() { ... }    
    public Map<String, MemoryUsage> getMemoryUsageBeforeGc() { ... }
    public Map<String, MemoryUsage> getMemoryUsageAfterGc() { ...  }
    ...
}
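Using the interfaces above, reading the last GC of each collector might look like the sketch below. `LastGcReader` is a hypothetical name; the cast only succeeds on JVMs that expose the `com.sun.management` extension, and `getLastGcInfo()` returns null for a collector that has not run yet, so both cases are skipped.

```java
import com.sun.management.GcInfo;
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of polling the last GC event per collector.
final class LastGcReader {
    private LastGcReader() {
    }

    /** Returns one line per collector that has already run at least once. */
    static List<String> describeLastGc() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            if (!(bean instanceof com.sun.management.GarbageCollectorMXBean)) {
                continue; // this JVM does not expose the com.sun.management extension
            }
            GcInfo info = ((com.sun.management.GarbageCollectorMXBean) bean).getLastGcInfo();
            if (info == null) {
                continue; // this collector has not run yet
            }
            lines.add(bean.getName()
                    + " start=" + info.getStartTime()
                    + " end=" + info.getEndTime()
                    + " duration=" + info.getDuration());
        }
        return lines;
    }
}
```

Because only the last event is visible, a poller that runs less often than the GC will miss events; that is one limitation of this approach.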

Set VMOption -XX:-PrintGCDetails in Runtime

package com.sun.management;

public interface HotSpotDiagnosticMXBean extends PlatformManagedObject {
    void dumpHeap(String var1, boolean var2) throws IOException;

    List<VMOption> getDiagnosticOptions();

    VMOption getVMOption(String var1);

    void setVMOption(String var1, String var2);
}
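A hedged sketch of the second approach follows. Whether a given flag can be changed at runtime depends on the JVM version, so the code checks `VMOption.isWriteable()` first and treats an unknown or read-only flag as a soft failure; `RuntimeVmOptions` and `trySet` are hypothetical names.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import com.sun.management.VMOption;
import java.lang.management.ManagementFactory;

// Hypothetical sketch; which flags are writeable varies between JVM versions.
final class RuntimeVmOptions {
    private RuntimeVmOptions() {
    }

    /** Tries to set a VM option at runtime; returns false if that is not possible. */
    static boolean trySet(String name, String value) {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            if (bean == null) {
                return false; // not a HotSpot-compatible JVM
            }
            VMOption option = bean.getVMOption(name); // throws if the flag does not exist
            if (!option.isWriteable()) {
                return false; // flag exists but cannot be changed at runtime
            }
            bean.setVMOption(name, value);
            return true;
        } catch (IllegalArgumentException e) {
            return false; // unknown flag, invalid value, or unsupported platform bean
        }
    }
}
```

A caller would fall back to the first solution whenever `trySet` returns false.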

Pass extra information with HTTP Headers between threads

A request may reach the webserver with some extra information in its headers. When the webserver handles this request, it probably calls another web service. These calls would start in the current thread or the other threads. What we should do is to pass the extra information from the original request to the destination web service correctly across the threads.
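One common way to do this is to snapshot the headers when a task is created and install the snapshot in whichever thread runs it. The sketch below assumes a `ThreadLocal` holder; `HeaderContext`, `put`, `get` and `wrap` are hypothetical names, not easeagent's API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of carrying request headers across threads.
final class HeaderContext {
    private static final ThreadLocal<Map<String, String>> CURRENT =
            ThreadLocal.withInitial(HashMap::new);

    private HeaderContext() {
    }

    static void put(String name, String value) {
        CURRENT.get().put(name, value);
    }

    static String get(String name) {
        return CURRENT.get().get(name);
    }

    /** Captures the caller's headers now and installs them while the task runs. */
    static Runnable wrap(Runnable task) {
        Map<String, String> snapshot = new HashMap<>(CURRENT.get());
        return () -> {
            Map<String, String> previous = CURRENT.get();
            CURRENT.set(new HashMap<>(snapshot));
            try {
                task.run();
            } finally {
                CURRENT.set(previous); // restore, since pool threads are reused
            }
        };
    }
}
```

An agent would typically apply `wrap` automatically by instrumenting executor submissions, so application code does not have to change.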

data report module

Design and implement a data report module. The following features are needed:

  • Support sending metrics and trace data to Kafka.
  • Offer a loose coupling interface that other modules can use for data reporting.

Create Custom Logger to prevent conflict with user's application

The agent's log configuration conflicts with the user's application.

Easeagent uses log4j for two things:

  • Output log info to easeaget.log
  • Send metrics information to kafka.

How to remove conflict

In tomcat

The internal logging for Apache Tomcat uses JULI, a packaged renamed fork of Apache Commons Logging that is hard-coded to use the java.util.logging framework. This ensures that Tomcat's internal logging and any web application logging will remain independent, even if a web application uses Apache Commons Logging.

Reference: https://tomcat.apache.org/tomcat-9.0-doc/logging.html

In pinpoint and skywalking

Pinpoint and SkyWalking create their own custom Logger classes. They use a sender class to transfer traces and metrics to the collector server.

EaseAgent Solution

We should create a custom logger class for logging agent info, to prevent conflicts with the user's application, and a sender class for sending metric information.

Reference:

Add a Simple tracing view UI

The quick start uses Prometheus to view metrics, but it recommends nothing for viewing traces.

I propose adding a simple web server, consisting of a single source file, that shows traces in YAML. The server only needs to collect and aggregate traces in memory for evaluating easeagent; easeagent sends traces to the server directly, without any broker. YAML is chosen deliberately: a production-grade tracing viewer links many facts together and is too heavy to deploy for a quick-start evaluation.

health check, reflect different metrics collected by the agent, support different endpoint, like Prometheus

The health check API should respond with status code 200 when the service is ready to start accepting traffic.

Kubernetes

The kubelet uses liveness probes to know when to restart a container, and uses readiness probes to know when a container is ready to start accepting traffic. Any code greater than or equal to 200 and less than 400 indicates success. Any other code indicates failure.
Reference: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/

SpringBoot 2.3

It supports K8s Probes (/actuator/health/liveness, /actuator/health/readiness)

Reference: https://docs.spring.io/spring-boot/docs/2.3.9.RELEASE/reference/html/production-ready-features.html#production-ready-kubernetes-probes

Prometheus

It uses static_configs to configure the API to pull from. The API should respond with status 200 when ready to start accepting traffic. Additionally, the API should output metric info in the Prometheus data model format.

Currently, the agent uses Jolokia for JMX over HTTP. It outputs JSON, but Prometheus cannot parse JSON.
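For comparison with Jolokia's JSON, here is a sketch that renders gauges in the Prometheus text exposition format. `PrometheusText` and `render` are hypothetical names; the example handles only unlabeled gauges and simply replaces characters that are not allowed in metric names, without checking that the name starts with a legal character.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: unlabeled gauges only, one sample per metric.
final class PrometheusText {
    private PrometheusText() {
    }

    /** Renders the gauges, sorted by name, in the Prometheus text exposition format. */
    static String render(Map<String, Double> gauges) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Double> entry : new TreeMap<>(gauges).entrySet()) {
            // Metric names may contain only letters, digits, '_' and ':'.
            String name = entry.getKey().replaceAll("[^a-zA-Z0-9_:]", "_");
            sb.append("# TYPE ").append(name).append(" gauge\n");
            sb.append(name).append(' ').append(entry.getValue()).append('\n');
        }
        return sb.toString();
    }
}
```

Serving this string from the pull endpoint is what lets Prometheus scrape the agent directly.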

Replace JMX with HTTP Server

The agent uses Jolokia for JMX over HTTP.

But Jolokia has one limitation we can't work around: the response format can't be customized, because Jolokia wraps the output itself. It therefore can't be made compatible with other formats, such as the Prometheus format. To support custom formatting, we can use the JDK's built-in HTTP server.

New API

[PUT] /config-service

- Request
Content-Type: application/json
{
    [key1]:[value1],
    [key2]:[value2],
    .....
    [keyn]:[valuen],
}

 - example:
{
    "version": "123",
    "observability.outputServer.bootstrapServer": "127.0.0.1:9092",
    "observability.outputServer.timeout": "10000",
    "observability.outputServer.enabled": "true"
}

- Response 
    success: 
        http statusCode: 200

[PUT] /config-canary

- Request
Content-Type: application/json
{
    [key1]:[value1],
    [key2]:[value2],
    .....
    [keyn]:[valuen],
}

- Response 
    success: 
        http statusCode: 200
    

[GET] /health

- Request
    Empty body

- Response 
    success: 
        http statusCode: 200

[GET] /prometheus/metrics

- Request
    Empty body

- Response 
    success: 
        http statusCode: 200

More comments request

I am wondering whether we could add more comments and explanations for some critical components, so that their purpose is clearer.
As a newcomer to the Java agent area, I'd like to learn from this project, but it is a bit hard for me to follow the code and design.

Reduce overhead of capturing call stack

The overhead depends on two factors:

  1. the number of methods (or classes) captured,
  2. the TPS of requests.

So reducing overhead means reducing those two factors.

Using configuration to reduce

In v0.2.x, there were configuration options to reduce both factors:

  1. requests.trace.include_class_prefix_list and requests.trace.exclude_class_prefix_list (See CaptureTrace.java, AnyCall.java),
  2. requests.report.capture_rate of sampling (See CaptureHttpRequest.java, Provider.java)
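As an illustration of the sampling side, the sketch below decides per request whether to capture it. It assumes the capture rate is a fraction between 0 and 1, which the text above does not state; `CaptureSampler` is a hypothetical name, and the injectable random source exists only to make the behavior testable.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.DoubleSupplier;

// Hypothetical sketch; assumes the capture rate is a fraction in [0, 1].
final class CaptureSampler {
    private final double rate;
    private final DoubleSupplier random;

    CaptureSampler(double rate) {
        this(rate, () -> ThreadLocalRandom.current().nextDouble());
    }

    CaptureSampler(double rate, DoubleSupplier random) {
        if (rate < 0.0 || rate > 1.0) {
            throw new IllegalArgumentException("rate must be in [0, 1]");
        }
        this.rate = rate;
        this.random = random;
    }

    /** Returns true for roughly rate * 100 percent of calls. */
    boolean shouldCapture() {
        // The random source yields values in [0, 1), so rate 0 never captures
        // and rate 1 always captures.
        return random.getAsDouble() < rate;
    }
}
```

Making the decision once per request, before any call stack is collected, keeps the cost of unsampled requests close to zero.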

Using command to reduce

[portal] -- start/stop sample --> [easeagent]

To send start/stop sample commands from the portal to easeagent, we need to:

  1. make easeagent listen for and handle commands over HTTP or JMX,
  2. let the portal dynamically keep track of the addresses of all easeagents.

parameters management module

Design and implement a parameters management module. The following features are needed:

  • Export parameter switches via JMX and JMX over HTTP.
  • Support applying the modifications of parameter switches at runtime.

feign client default configuration cause the called service name lost

When the feign.client.config.default.* properties are used, the name of the called service is lost during the RPC process.
The original requirement is described in #53.

Use the following configuration to reproduce the bug:

feign.client.config.default.connectTimeout=5000
feign.client.config.default.readTimeout=5000

IMO, the following code is the key point:

// org.springframework.cloud.openfeign.ribbon.LoadBalancerFeignClient
IClientConfig getClientConfig(Request.Options options, String clientName) {
    IClientConfig requestConfig;
    if (options == DEFAULT_OPTIONS) {
        requestConfig = this.clientFactory.getClientConfig(clientName);
    }
    else {
        requestConfig = new FeignOptionsClientConfig(options);
    }
    return requestConfig;
}

kafka should not connect when output is closed

When the configuration file is configured as follows:

observability.outputServer.bootstrapServer=172.1.1.2:9092,172.1.1.3:9092,172.1.1.3:9092
observability.outputServer.enabled=false

With this configuration, the agent should not connect to Kafka at all.
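One way to get this behavior is to create the client lazily and only when the switch is on. The sketch below is generic over the client type, so it does not depend on the Kafka library; `LazyOutput` is a hypothetical name, and treating a missing key as disabled is an assumption.

```java
import java.util.Optional;
import java.util.Properties;
import java.util.function.Supplier;

// Hypothetical sketch; C would be the Kafka producer type in the agent.
final class LazyOutput<C> {
    static final String ENABLED_KEY = "observability.outputServer.enabled";

    private final boolean enabled;
    private final Supplier<C> connector;
    private C client;

    LazyOutput(Properties config, Supplier<C> connector) {
        // Assumption: a missing key means the output is disabled.
        this.enabled = Boolean.parseBoolean(config.getProperty(ENABLED_KEY, "false"));
        this.connector = connector;
    }

    /** Connects on first use, and only if the output is enabled. */
    synchronized Optional<C> client() {
        if (!enabled) {
            return Optional.empty(); // never touches the connector
        }
        if (client == null) {
            client = connector.get();
        }
        return Optional.ofNullable(client);
    }
}
```

Since the configuration can change at runtime through the API, a fuller version would also close the existing client when the switch is turned off.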
