webdevops / azure-loganalytics-exporter

Prometheus exporter for Azure LogAnalytics (Kusto queries)

License: MIT License


azure-loganalytics-exporter's Introduction

Azure LogAnalytics exporter


Prometheus exporter for Azure LogAnalytics Kusto queries with configurable fields and transformations.

azure-loganalytics-exporter can query configured workspaces or all workspaces in one or multiple subscriptions. The exporter can also cache metrics and service discovery information to reduce requests against workspaces and the Azure API.

Usage

Usage:
  azure-loganalytics-exporter [OPTIONS]

Application Options:
      --log.debug                     debug mode [$LOG_DEBUG]
      --log.trace                     trace mode [$LOG_TRACE]
      --log.json                      Switch log output to json format [$LOG_JSON]
      --azure.environment=            Azure environment name (default: AZUREPUBLICCLOUD) [$AZURE_ENVIRONMENT]
      --azure.servicediscovery.cache= Duration for caching Azure ServiceDiscovery of workspaces to reduce API calls
                                      (time.Duration) (default: 30m) [$AZURE_SERVICEDISCOVERY_CACHE]
      --loganalytics.workspace=       Loganalytics workspace IDs [$LOGANALYTICS_WORKSPACE]
      --loganalytics.concurrency=     Specifies how many workspaces should be queried concurrently (default: 5)
                                      [$LOGANALYTICS_CONCURRENCY]
  -c, --config=                       Config path [$CONFIG]
      --server.bind=                  Server address (default: :8080) [$SERVER_BIND]
      --server.timeout.read=          Server read timeout (default: 5s) [$SERVER_TIMEOUT_READ]
      --server.timeout.write=         Server write timeout (default: 10s) [$SERVER_TIMEOUT_WRITE]

Help Options:
  -h, --help                          Show this help message

For Azure API authentication (using environment variables) see https://docs.microsoft.com/en-us/azure/developer/go/azure-sdk-authentication
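The authentication mentioned above can be supplied entirely through environment variables. A minimal sketch using a service principal (all credential values below are placeholders; the exporter flags are taken from the usage section above):

```shell
# Service-principal credentials picked up by the Azure SDK's
# environment-based credential; all values below are placeholders.
export AZURE_TENANT_ID="00000000-0000-0000-0000-000000000000"
export AZURE_CLIENT_ID="11111111-1111-1111-1111-111111111111"
export AZURE_CLIENT_SECRET="replace-me"

# The exporter would then be started with the workspaces to query, e.g.:
#   azure-loganalytics-exporter --loganalytics.workspace=<workspace-id>
echo "auth environment prepared for tenant $AZURE_TENANT_ID"
```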

Configuration file

HTTP Endpoints

| Endpoint | Description |
|----------|-------------|
| `/query` | Query tester |
| `/metrics` | Default Prometheus Golang metrics |
| `/probe` | Execute LogAnalytics queries against workspaces (set on command line/env var) |
| `/probe/workspace` | Execute LogAnalytics queries against workspaces (defined as parameter) |
| `/probe/subscription` | Execute LogAnalytics queries against workspaces (using service discovery) |

HINT: parameters of type multiple can be specified multiple times and/or carry multiple comma-separated values.
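Both forms of the multiple-parameter convention produce the same probe; a small sketch (host and workspace IDs are placeholders):

```shell
# Base URL of a locally running exporter (placeholder).
base="http://localhost:8080/probe/workspace"

# Variant 1: the parameter is repeated.
url_repeated="${base}?workspace=wksp-id-1&workspace=wksp-id-2"

# Variant 2: multiple values are comma-separated in one parameter.
url_csv="${base}?workspace=wksp-id-1,wksp-id-2"

echo "$url_repeated"
echo "$url_csv"
```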

/probe parameters

Uses the predefined workspace list set via parameter/environment variable at startup.

| GET parameter | Default | Required | Multiple | Description |
|---------------|---------|----------|----------|-------------|
| `module` | | no | no | Filter queries by module name |
| `cache` | | no | no | Use of internal metrics caching (time.Duration) |
| `parallel` | `$LOGANALYTICS_PARALLEL` | no | no | Number (int) of workspaces to query concurrently |

/probe/workspace parameters

Uses workspaces passed dynamically via HTTP query parameter.

| GET parameter | Default | Required | Multiple | Description |
|---------------|---------|----------|----------|-------------|
| `module` | | no | no | Filter queries by module name |
| `workspace` | | yes | yes | Workspace IDs which are probed |
| `cache` | | no | no | Use of internal metrics caching (time.Duration) |
| `parallel` | `$LOGANALYTICS_PARALLEL` | no | no | Number (int) of workspaces to query concurrently |

/probe/subscription parameters

Uses Azure service discovery to find all workspaces in one or multiple subscriptions.

| GET parameter | Default | Required | Multiple | Description |
|---------------|---------|----------|----------|-------------|
| `module` | | no | no | Filter queries by module name |
| `subscription` | | yes | yes | Uses all workspaces inside the subscription |
| `filter` | | no | no | Advanced filter, inserted into the ResourceGraph query as `resources \| {filter} \| project id, customerId=properties.customerId` (available with 23.6.0) |
| `cache` | | no | no | Use of internal metrics caching (time.Duration) |
| `parallel` | `$LOGANALYTICS_PARALLEL` | no | no | Number (int) of workspaces to query concurrently |
| `optional` | `false` | no | no | Do not fail if service discovery does not find any workspaces |

Global metrics

Available on `/metrics`.

| Metric | Description |
|--------|-------------|
| `azure_loganalytics_status` | Status of whether the query was successful (per workspace, module, metric) |
| `azure_loganalytics_last_query_successfull` | Timestamp of the last successful query (per workspace, module, metric) |
| `azure_loganalytics_query_time` | Summary metric of query execution time (incl. all subqueries) |
| `azure_loganalytics_query_results` | Number of results from the query |
| `azure_loganalytics_query_requests` | Count of requests (e.g. paged subqueries) per query |
| `azure_loganalytics_workspace_query_count` | Count of discovered workspaces per module |
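These global metrics lend themselves to alerting. A hedged sketch of a Prometheus rule, assuming `azure_loganalytics_status` reports 0 for a failed query and carries a `workspaceId` label (verify both assumptions against your deployment before use):

```yaml
# Hypothetical alerting rule; assumes azure_loganalytics_status == 0
# indicates a failed query and that a workspaceId label is present.
groups:
  - name: azure-loganalytics-exporter
    rules:
      - alert: LogAnalyticsQueryFailing
        expr: azure_loganalytics_status == 0
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "LogAnalytics query failing for workspace {{ $labels.workspaceId }}"
```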

AzureTracing metrics

(with 22.2.0 and later)

AzureTracing metrics collect request latency and rate-limit information from azure-sdk-for-go and are controllable using environment variables (e.g. setting buckets, disabling metrics, or disabling autoreset).

| Metric | Description |
|--------|-------------|
| `azurerm_api_ratelimit` | Azure rate-limit metrics (only on `/metrics`; resets after query due to limited validity) |
| `azurerm_api_request_*` | Azure request count and latency as histogram |

Settings

| Environment variable | Example | Description |
|----------------------|---------|-------------|
| `METRIC_AZURERM_API_REQUEST_BUCKETS` | `1, 2.5, 5, 10, 30, 60, 90, 120` | Sets buckets for the `azurerm_api_request` histogram metric |
| `METRIC_AZURERM_API_REQUEST_ENABLE` | `false` | Enables/disables the `azurerm_api_request_*` metrics |
| `METRIC_AZURERM_API_REQUEST_LABELS` | `apiEndpoint, method, statusCode` | Controls labels of the `azurerm_api_request_*` metrics |
| `METRIC_AZURERM_API_RATELIMIT_ENABLE` | `false` | Enables/disables the `azurerm_api_ratelimit` metric |
| `METRIC_AZURERM_API_RATELIMIT_AUTORESET` | `false` | Enables/disables `azurerm_api_ratelimit` autoreset after fetch |

| `azurerm_api_request` label | Status | Description |
|-----------------------------|--------|-------------|
| `apiEndpoint` | enabled by default | Hostname of the endpoint (max 3 parts) |
| `routingRegion` | enabled by default | Detected region for the API call, either the routing region from the Azure Management API or the Azure resource location |
| `subscriptionID` | enabled by default | Detected subscription ID |
| `tenantID` | enabled by default | Detected tenant ID (extracted from the JWT auth token) |
| `resourceProvider` | enabled by default | Detected Azure Management API provider |
| `method` | enabled by default | HTTP method |
| `statusCode` | enabled by default | HTTP status code |
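Putting the settings table to work, a small sketch (values illustrative) that tunes the request-histogram buckets and keeps rate-limit metrics from resetting after each fetch:

```shell
# Illustrative AzureTracing settings: custom histogram buckets
# for azurerm_api_request and autoreset disabled for azurerm_api_ratelimit.
export METRIC_AZURERM_API_REQUEST_BUCKETS="1, 2.5, 5, 10, 30, 60, 90, 120"
export METRIC_AZURERM_API_RATELIMIT_AUTORESET="false"

echo "buckets: $METRIC_AZURERM_API_REQUEST_BUCKETS"
```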

Examples

See example.yaml for general ingestion metrics (number of rows per second and number of bytes per second per table).

See example.aks-single.yaml for AKS namespace ingestion metrics (number of rows per second and number of bytes per AKS namespace), sending queries to each LogAnalytics workspace individually (single mode).

See example.aks-multi.yaml for AKS namespace ingestion metrics (number of rows per second and number of bytes per AKS namespace), sending one query against multiple LogAnalytics workspaces (multi mode).

More examples of result processing can be found in azure-resourcegraph-exporter (uses the same processing library).

Config file:

queries:
  - metric: azure_loganalytics_operationstatus_count
    query: |-
      Operation
      | summarize count() by OperationStatus
    fields:
      - name: count_
        type: value

Metrics:

# HELP azure_loganalytics_operationstatus_count azure_loganalytics_operationstatus_count
# TYPE azure_loganalytics_operationstatus_count gauge
azure_loganalytics_operationstatus_count{OperationStatus="Succeeded",workspaceId="xxxxx-xxxx-xxxx-xxxx-xxxxxxxxx",workspaceTable="PrimaryResult"} 1

Prometheus configuration

predefined workspaces (at startup via parameter/environment variable)

- job_name: azure-loganalytics-exporter
  scrape_interval: 1m
  metrics_path: /probe
  params:
    cache: ["10m"]
    parallel: ["5"]
  static_configs:
  - targets: ["azure-loganalytics-exporter:8080"]

dynamic workspaces (defined in prometheus configuration)

- job_name: azure-loganalytics-exporter
  scrape_interval: 1m
  metrics_path: /probe/workspace
  params:
    workspace:
      - xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxx
      - xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxx
      - xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxx
    cache: ["10m"]
    parallel: ["5"]
  static_configs:
  - targets: ["azure-loganalytics-exporter:8080"]

find workspaces with servicediscovery via subscription

- job_name: azure-loganalytics-exporter
  scrape_interval: 1m
  metrics_path: /probe/subscription
  params:
    subscription:
      - xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxx
      - xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxx
      - xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxx
    cache: ["10m"]
    parallel: ["5"]
  static_configs:
  - targets: ["azure-loganalytics-exporter:8080"]


azure-loganalytics-exporter's Issues

Feature Request: Passing a header with requests to Azure

I would like to request an option to include a "cache-control: no-cache" header in the request(s) being sent to Azure. Azure LogAnalytics has a server-side caching interval of 2 minutes, which can be managed via the HTTP request header "cache-control". This would allow near real-time metrics to be produced from this exporter when using scrape intervals of 1m or 2m. Such an option would increase its usability in observability solutions and give the exporter better control over the data it produces.

Source: https://docs.microsoft.com/en-us/azure/azure-monitor/logs/api/cache

ServiceMonitor Configuration Issues with Azure Log Analytics Exporter

I am experiencing difficulties with the Azure Log Analytics Exporter in a Kubernetes environment where ServiceMonitor resources are used to manage scraping configurations for Prometheus. Despite configuring separate probes with distinct workspaces, the exporter is querying metrics across all workspaces, leading to the collection of irrelevant metrics and numerous entries with a value of 0. This behaviour persists even after configuring the ServiceMonitor with specific parameters for each probe, intending to isolate the metric queries to their respective workspaces.

Steps to Reproduce:

Deploy the Azure Log Analytics Exporter in a Kubernetes cluster.
Configure multiple probes in the exporter, each with a unique workspace ID.
Set up ServiceMonitor resources for Prometheus to scrape metrics from the exporter, utilizing the probe-specific endpoints and workspace parameters.
Observe the scraped metrics in Prometheus, noting the presence of metrics from all workspaces in each probe's dataset.
Expected Behaviour:
Each probe should only fetch and return metrics relevant to its configured workspace, thereby preventing the mix-up of data across different workspaces.

Actual Behaviour:
All probes are fetching metrics from all workspaces, not just the ones they are configured to query, resulting in a large number of irrelevant metrics.

I suspect there may be a bug in the exporter’s handling of workspace-specific queries or in the ServiceMonitor configuration process that prevents the proper isolation of metrics per workspace.

Could you please help me figure out this issue?

Alternatively: if the custom query returns no value, don't emit a 0-valued metric; just don't create it at all.

no azurerm_api_ratelimit metric in /metrics

Hi @mblaschke,
I am not able to see the azurerm_api_ratelimit metric on the /metrics endpoint, so I set these environment variables:
METRIC_AZURERM_API_RATELIMIT_AUTORESET=true
METRIC_AZURERM_API_RATELIMIT_ENABLE=true
I still cannot see the metric. Is there a specific reason for that?

Values not showing up at /metrics endpoint

Hey guys,

I am testing your log analytics exporter (great job by the way). I am running some queries locally with the debug and trace flags and I see that my queries are fetching the right amount of results when I compare and run the same query in Log Analytics. However, these results are not shown on the /probe endpoint.

The stack trace showing that the query fetched results:

DEBU[0004]/home/runner/work/azure-loganalytics-exporter/azure-loganalytics-exporter/loganalytics/prober.go:358 loganalytics.(*LogAnalyticsProber).executeQueries fetched 4 results metric=azure_metrics_loganalytics_exporter_missing_heartbeat module= results=4

The following config reproduces this issue:

queries:
  - metric: azure_metrics_loganalytics_exporter_missing_heartbeat
    query: |-
      Heartbeat
      | summarize LastHeartbeat=max(TimeGenerated) by Computer
      | where LastHeartbeat < ago(5h)
      | where Computer !contains_cs "avd"
    fields:
      - name: Computer
        type: id

Do you guys have any idea as to why this is happening? Is it because of a missing type: value field? I could not find any documentation on this, so it's mostly been trial and error, but I have managed to get it to work with multiple other queries.
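One thing worth trying, purely as a hypothesis: the processing library may need at least one value-typed field to emit a numeric sample. A sketch that aggregates the matching rows to a count and exposes it as the metric value (the added `count_` field mirrors the example config earlier in this README; untested):

```yaml
# Hypothetical variant: aggregate matching rows to a count and expose
# it as the metric value via a value-typed field.
queries:
  - metric: azure_metrics_loganalytics_exporter_missing_heartbeat
    query: |-
      Heartbeat
      | summarize LastHeartbeat=max(TimeGenerated) by Computer
      | where LastHeartbeat < ago(5h)
      | where Computer !contains_cs "avd"
      | summarize count_=count() by Computer
    fields:
      - name: Computer
        type: id
      - name: count_
        type: value
```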

Every pointer in the right direction is appreciated. Thanks.

Using Azure service discovery in K8s containers

Hello!

I have been attempting to configure the log analytics exporter to use Azure service discovery and detect all workspaces within multiple subscriptions in my Azure tenant.

I am struggling a bit to define this in my containers running in Kubernetes. Do you have any configuration examples of how to run the exporter like this?

I'm getting the following error message when attempting to reach the /probe/subscription endpoint locally:

ERROR: parameter "subscription" is missing

Which makes sense. I have tried defining the subscription list both as environment variables and arguments in the container, but it still fails.

Using the query tester when running the exporter binary on my local system, everything works fine. So what I can't seem to figure out is how to define the subscription parameter for the Kubernetes container.

Edit one:

Trying to run the container without subscriptions specified in the container itself and defining a prometheus job that scrapes the container with the subscription list gives me the following error in the container log:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x93a1d2]

goroutine 156 [running]:
github.com/webdevops/azure-loganalytics-exporter/loganalytics.(*LogAnalyticsProber).sendQueryToSingleWorkspace(_, _, {_, _}, {{0xc00020b868, {0xc000290780, 0x3, 0x3}, 0x0, {{0x0, ...}, ...}, ...}, ...}, ...)
	/go/src/github.com/webdevops/azure-loganalytics-exporter/loganalytics/prober.go:415 +0x312
github.com/webdevops/azure-loganalytics-exporter/loganalytics.(*LogAnalyticsProber).executeQueries.func1.2()
	/go/src/github.com/webdevops/azure-loganalytics-exporter/loganalytics/prober.go:282 +0xfe
created by github.com/webdevops/azure-loganalytics-exporter/loganalytics.(*LogAnalyticsProber).executeQueries.func1
	/go/src/github.com/webdevops/azure-loganalytics-exporter/loganalytics/prober.go:279 +0x638
	

The container is killed on every scrape.

Edit two:

I realized I forgot to set the queryMode to multi. After configuring queryMode as multi to query all the discovered workspaces, I'm getting pretty much the same error:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x939807]

goroutine 83 [running]:
github.com/webdevops/azure-loganalytics-exporter/loganalytics.(*LogAnalyticsProber).sendQueryToMultipleWorkspace(_, _, {_, _, _}, {{0x0, {0xc000310000, 0x3, 0x3}, 0x0, ...}, ...}, ...)
	/go/src/github.com/webdevops/azure-loganalytics-exporter/loganalytics/prober.go:359 +0x387
github.com/webdevops/azure-loganalytics-exporter/loganalytics.(*LogAnalyticsProber).executeQueries.func1.1()
	/go/src/github.com/webdevops/azure-loganalytics-exporter/loganalytics/prober.go:264 +0xfe
created by github.com/webdevops/azure-loganalytics-exporter/loganalytics.(*LogAnalyticsProber).executeQueries.func1
	/go/src/github.com/webdevops/azure-loganalytics-exporter/loganalytics/prober.go:261 +0x245

The container/pod is still getting killed.
