
cost-analyzer-helm-chart's Introduction

Kubecost Helm chart

This is the official Helm chart for Kubecost, an enterprise-grade application to monitor and manage Kubernetes spend. Please see the website for more details on what Kubecost can do for you, refer to the official documentation, or contact [email protected] for assistance.

Version Support

Kubecost strives to support as many versions of Kubernetes as possible. The version support matrix below lists the combinations that have been tested. Versions outside of the stated range may still work but are untested.

| Chart Version | Kubernetes Min | Kubernetes Max |
|---------------|----------------|----------------|
| 1.107         | 1.20           | 1.28           |
| 1.108         | 1.20           | 1.28           |
| 2.1           | 1.20           | 1.29           |
| 2.2           | 1.21           | 1.29           |
| 2.3           | 1.21           | 1.30           |
| 2.4           | 1.22           | 1.31           |

Installation

To install via Helm, run the following command.

helm upgrade --install kubecost -n kubecost --create-namespace \
  --repo https://kubecost.github.io/cost-analyzer/ cost-analyzer \
  --set kubecostToken="aGVsbUBrdWJlY29zdC5jb20=xm343yadf98"

Alternatively, add the Helm repository first and scan for updates.

helm repo add kubecost https://kubecost.github.io/cost-analyzer/
helm repo update

Next, install the chart.

helm install kubecost kubecost/cost-analyzer -n kubecost --create-namespace \
  --set kubecostToken="aGVsbUBrdWJlY29zdC5jb20=xm343yadf98"
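
Once the pods are running, the UI can be reached by port-forwarding the frontend service. This sketch assumes the default release name kubecost and the chart's default service name and port; adjust if you changed either.

kubectl get pods -n kubecost
kubectl port-forward -n kubecost service/kubecost-cost-analyzer 9090:9090
# then browse to http://localhost:9090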

While Helm is the recommended install path for Kubecost, especially in production, Kubecost can alternatively be deployed with a single-file manifest using the following command. Keep in mind that when choosing this method, Kubecost will be installed from a development branch and may include unreleased changes. We recommend using the manifest from a release branch, such as v1.108.

kubectl apply -f https://raw.githubusercontent.com/kubecost/cost-analyzer-helm-chart/develop/kubecost.yaml

Common Parameters

The following table lists commonly used configuration parameters for the Kubecost Helm chart and their default values. Please see the values file for the complete set of definable values.

| Parameter | Description | Default |
|-----------|-------------|---------|
| global.prometheus.enabled | If false, use an existing Prometheus install. | true |
| prometheus.server.persistentVolume.enabled | If true, the Prometheus server will create a Persistent Volume Claim. | true |
| prometheus.server.persistentVolume.size | Prometheus server data Persistent Volume size. Default is sized to retain ~6000 samples per second for 15 days. | 32Gi |
| prometheus.server.persistentVolume.storageClass | Storage class for the Prometheus server persistent volume | - |
| prometheus.server.retention | Determines when to remove old data. | 97h |
| prometheus.server.resources | Prometheus server resource requests and limits. | {} |
| prometheus.nodeExporter.resources | Node exporter resource requests and limits. | {} |
| prometheus.nodeExporter.enabled, prometheus.serviceAccounts.nodeExporter.create | If false, do not create the node-exporter DaemonSet. | true |
| prometheus.alertmanager.persistentVolume.enabled | If true, Alertmanager will create a Persistent Volume Claim. | false |
| prometheus.pushgateway.persistentVolume.enabled | If true, Prometheus Pushgateway will create a Persistent Volume Claim. | false |
| persistentVolume.enabled | If true, Kubecost will create a Persistent Volume Claim for product config data. | true |
| persistentVolume.size | PVC size for cost-analyzer | 32.0Gi |
| persistentVolume.dbSize | PVC size for cost-analyzer's flat-file database | 32.0Gi |
| persistentVolume.storageClass | Storage class for cost-analyzer's persistent volume | - |
| ingress.enabled | If true, an Ingress will be created | false |
| ingress.annotations | Ingress annotations | {} |
| ingress.className | Ingress class name | {} |
| ingress.paths | Ingress paths | ["/"] |
| ingress.hosts | Ingress hostnames | [cost-analyzer.local] |
| ingress.tls | Ingress TLS configuration (YAML) | [] |
| kubecostModel.ingress.* | Same as ingress.*, but creates an Ingress directly to the model | { enabled: false } |
| networkPolicy.enabled | If true, create a NetworkPolicy to deny egress | false |
| networkCosts.enabled | If true, collect network allocation metrics | false |
| networkCosts.podMonitor.enabled | If true, a PodMonitor for the network-cost DaemonSet is created | false |
| serviceMonitor.enabled | Set this to true to create a ServiceMonitor for the Prometheus Operator | false |
| serviceMonitor.additionalLabels | Additional labels so the ServiceMonitor will be discovered by Prometheus | {} |
| prometheusRule.enabled | Set this to true to create a PrometheusRule for the Prometheus Operator | false |
| prometheusRule.additionalLabels | Additional labels so the PrometheusRule will be discovered by Prometheus | {} |
| grafana.resources | Grafana resource requests and limits. | {} |
| grafana.sidecar.dashboards.enabled | Set this to false to disable creation of dashboards in Grafana | true |
| grafana.sidecar.datasources.defaultDatasourceEnabled | Set this to false to disable creation of the Prometheus datasource in Grafana | true |
| serviceAccount.create | Set this to false if you want to create the kubecost-cost-analyzer service account yourself | true |
| tolerations | Node taints to tolerate | [] |
| affinity | Pod affinity | {} |
| extraVolumes | A list of volumes to be added to the pod | [] |
| extraVolumeMounts | A list of volume mounts to be added to the pod | [] |
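
As an illustration only, a minimal values file overriding a few of the parameters above might look like the following; the storage class name and hostname are placeholders, not chart defaults.

# my-values.yaml
persistentVolume:
  size: 64Gi
  storageClass: standard        # placeholder; use a storage class that exists in your cluster
prometheus:
  server:
    retention: 15d
    persistentVolume:
      size: 64Gi
ingress:
  enabled: true
  hosts:
    - kubecost.example.com      # placeholder hostname

It can then be applied with:

helm upgrade --install kubecost kubecost/cost-analyzer -n kubecost -f my-values.yaml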

Adjusting Log Output

The log output can be customized during deployment by using the LOG_LEVEL and/or LOG_FORMAT environment variables.

Adjusting Log Level

Adjusting the log level increases or decreases the level of verbosity written to the logs. To set the log level to trace, the following flag can be added to the helm command.

--set 'kubecostModel.extraEnv[0].name=LOG_LEVEL,kubecostModel.extraEnv[0].value=trace'

Adjusting Log Format

Adjusting the log format changes the format in which the logs are written, making it easier for log aggregators to parse and display logged messages. The LOG_FORMAT environment variable accepts the values JSON, for structured output, and pretty, for human-readable output.

| Value | Output |
|-------|--------|
| JSON | {"level":"info","time":"2006-01-02T15:04:05.999999999Z07:00","message":"Starting cost-model (git commit \"1.91.0-rc.0\")"} |
| pretty | 2006-01-02T15:04:05.999999999Z07:00 INF Starting cost-model (git commit "1.91.0-rc.0") |
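
By analogy with the LOG_LEVEL example above, the format can be set at install time with a flag like the following; this assumes the same extraEnv mechanism, and the index should differ if LOG_LEVEL is also set.

--set 'kubecostModel.extraEnv[0].name=LOG_FORMAT,kubecostModel.extraEnv[0].value=JSON'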

cost-analyzer-helm-chart's People

Contributors

ajaytripathy, ameijer, avrodrigues5, biancaburtoiu, calvinwang, campfireremnants, chipzoller, cliffcolvin, dependabot[bot], dramich, dwbrown2, gracedo, ivankube, jessegoodier, kaelanspatel, keithhand, kirbsauce, linhlam-kc, mbolt35, michaelmdresser, nealormsbee, nickcurie, nik-kc, nikovacevic, saweber, sean-holcomb, sebastien-prudhomme, srpomeroy, teevans, thomasvn


cost-analyzer-helm-chart's Issues

Improve persistence customization

Hi,

A good practice in Helm charts is to let the user customize persistence further:

  • storageClass, for people having multiple storage backends
  • annotations, for instance for people using backup solutions that rely on volume annotations
  • accessModes
  • using a preexisting volume claim is also possible in a lot of charts

So I created this issue to discuss that. :-)
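
A hedged sketch of what such values could look like; these keys are hypothetical and only illustrate the shape of the request, not current chart options.

persistentVolume:
  enabled: true
  storageClass: fast-ssd                    # hypothetical
  accessModes:
    - ReadWriteOnce                         # hypothetical
  annotations:
    backup.example.com/enabled: "true"      # hypothetical
  existingClaim: my-preexisting-pvc         # hypothetical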

provide warning in the UI that use of non-valid Prometheus characters in label settings and/or filters can potentially merge with other labels

Describe the bug
From the kubernetes spec:

Labels are key/value pairs. Valid label keys have two segments: an optional prefix and name, separated by a slash (/). The name segment is required and must be 63 characters or less, beginning and ending with an alphanumeric character ([a-z0-9A-Z]) with dashes (-), underscores (_), dots (.), and alphanumerics between. The prefix is optional. If specified, the prefix must be a DNS subdomain: a series of DNS labels separated by dots (.), not longer than 253 characters in total, followed by a slash (/).

Prometheus only supports underscores in label names, so if you have an aggregation on label foo.bar and on foo-bar, they will both be merged to the label foo_bar in prometheus and therefore all costs will be charged to foo_bar in kubecost.

Expected behavior
Kubecost should reject invalid Prometheus label characters in first-class aggregation labels (owner, product, etc.).
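
A quick illustration of the collision described above: Prometheus replaces characters that are invalid in label names with underscores, so two distinct Kubernetes label keys can collapse into the same Prometheus label. This is only a sketch of the sanitization behavior, not Kubecost code.

echo "foo.bar foo-bar" | tr '.-' '__'
# both become: foo_bar foo_bar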

Expose an option to disable persistent storage

For testing purposes, it is easier if I don't have to worry about provisioning persistent storage.

It would be nice to have an option in the helm values file to use emptyDir volumes instead of persistent volume claims.
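
Until such an option exists end to end, the chart's existing persistence toggles (listed in the parameters table above) can be set to false, which, depending on chart version, should fall back to non-persistent volumes; a sketch:

helm upgrade --install kubecost kubecost/cost-analyzer -n kubecost --create-namespace \
  --set persistentVolume.enabled=false \
  --set prometheus.server.persistentVolume.enabled=false \
  --set prometheus.alertmanager.persistentVolume.enabled=false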

Many-to-many matching labels issue in Cluster-level Grafana dashboard

@mdaniel encountered the following error message in 4 separate "500 Internal Server Error" responses from Grafana:

{"status":"error","errorType":"internal","error":"many-to-many matching not allowed: matching labels must be unique on one side"}

CPU

sum(
  (
    (
      sum(kube_node_status_capacity_cpu_cores) by (node)
      * on (node) group_left (label_cloud_google_com_gke_preemptible)
      kube_node_labels{label_cloud_google_com_gke_preemptible="true"}
    ) * 5.10
  )
  or
  (
    (
      sum(kube_node_status_capacity_cpu_cores) by (node)
      * on (node) group_left (label_cloud_google_com_gke_preemptible)
      kube_node_labels{label_cloud_google_com_gke_preemptible!="true"}
    ) * (23.076 - (23.076 / 100 * 30))
  )
)

Storage

sum(
  sum(kube_persistentvolumeclaim_info{storageclass=~".*ssd.*"}) by (persistentvolumeclaim, namespace, storageclass)
  * on (persistentvolumeclaim, namespace) group_right (storageclass)
  sum(kube_persistentvolumeclaim_resource_requests_storage_bytes) by (persistentvolumeclaim, namespace) or up * 0
) / 1024 / 1024 / 1024 * .17

sum(
  sum(kube_persistentvolumeclaim_info{storageclass!~".*ssd.*"}) by (persistentvolumeclaim, namespace, storageclass)
  * on (persistentvolumeclaim, namespace) group_right (storageclass)
  sum(kube_persistentvolumeclaim_resource_requests_storage_bytes) by (persistentvolumeclaim, namespace) or up * 0
) / 1024 / 1024 / 1024 * 0.040

sum(container_fs_limit_bytes{id="/"}) / 1024 / 1024 / 1024 * 1.03 * 0.040

RAM

sum(
  (
    (
      sum(kube_node_status_capacity_memory_bytes) by (node)
      * on (node) group_left (label_cloud_google_com_gke_preemptible)
      kube_node_labels{label_cloud_google_com_gke_preemptible="true"}
    ) / 1024 / 1024 / 1024 * 0.6862
  )
  or
  (
    (
      sum(kube_node_status_capacity_memory_bytes) by (node)
      * on (node) group_left (label_cloud_google_com_gke_preemptible)
      kube_node_labels{label_cloud_google_com_gke_preemptible!="true"}
    ) / 1024 / 1024 / 1024 * (3.25 - (3.25 / 100 * 30))
  )
)

Network

sum(rate(node_network_transmit_bytes_total{device="eth0"}[60m]) / 1024 / 1024 / 1024) * (60 * 60 * 24 * 30) * .12


please mask off imagePullSecrets if there isn't one

https://github.com/kubecost/cost-analyzer-helm-chart/blob/v1.21.0/cost-analyzer/templates/cost-analyzer-deployment-template.yaml#L114-L115

      imagePullSecrets:
        - name: regcred

else kubelet whines oppressively:

W0505 06:20:09.240086    6808 kubelet_pods.go:832] Unable to retrieve pull secret kubecost/regcred for kubecost/kubecost-cost-analyzer-57c4b8fd7f-pfx27 due to secret "regcred" not found.  The image pull may not succeed.
W0505 06:21:14.241350    6808 kubelet_pods.go:832] Unable to retrieve pull secret kubecost/regcred for kubecost/kubecost-cost-analyzer-57c4b8fd7f-pfx27 due to secret "regcred" not found.  The image pull may not succeed.
W0505 06:22:35.240232    6808 kubelet_pods.go:832] Unable to retrieve pull secret kubecost/regcred for kubecost/kubecost-cost-analyzer-57c4b8fd7f-pfx27 due to secret "regcred" not found.  The image pull may not succeed.
W0505 06:24:00.240323    6808 kubelet_pods.go:832] Unable to retrieve pull secret kubecost/regcred for kubecost/kubecost-cost-analyzer-57c4b8fd7f-pfx27 due to secret "regcred" not found.  The image pull may not succeed.
W0505 06:25:30.240170    6808 kubelet_pods.go:832] Unable to retrieve pull secret kubecost/regcred for kubecost/kubecost-cost-analyzer-57c4b8fd7f-pfx27 due to secret "regcred" not found.  The image pull may not succeed.
W0505 06:26:38.241061    6808 kubelet_pods.go:832] Unable to retrieve pull secret kubecost/regcred for kubecost/kubecost-cost-analyzer-57c4b8fd7f-pfx27 due to secret "regcred" not found.  The image pull may not succeed.
W0505 06:28:02.240327    6808 kubelet_pods.go:832] Unable to retrieve pull secret kubecost/regcred for kubecost/kubecost-cost-analyzer-57c4b8fd7f-pfx27 due to secret "regcred" not found.  The image pull may not succeed.
W0505 06:29:22.240383    6808 kubelet_pods.go:832] Unable to retrieve pull secret kubecost/regcred for kubecost/kubecost-cost-analyzer-57c4b8fd7f-pfx27 due to secret "regcred" not found.  The image pull may not succeed.
W0505 06:30:32.242500    6808 kubelet_pods.go:832] Unable to retrieve pull secret kubecost/regcred for kubecost/kubecost-cost-analyzer-57c4b8fd7f-pfx27 due to secret "regcred" not found.  The image pull may not succeed.
W0505 06:31:45.240160    6808 kubelet_pods.go:832] Unable to retrieve pull secret kubecost/regcred for kubecost/kubecost-cost-analyzer-57c4b8fd7f-pfx27 due to secret "regcred" not found.  The image pull may not succeed.
W0505 06:32:59.240258    6808 kubelet_pods.go:832] Unable to retrieve pull secret kubecost/regcred for kubecost/kubecost-cost-analyzer-57c4b8fd7f-pfx27 due to secret "regcred" not found.  The image pull may not succeed.
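
A minimal sketch of the kind of template guard being requested, assuming a values key named imagePullSecrets; the exact key name and indentation in the chart may differ.

{{- if .Values.imagePullSecrets }}
      imagePullSecrets:
{{ toYaml .Values.imagePullSecrets | indent 8 }}
{{- end }}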

ideally containers would specify resource limits

As of this issue, none of the resources: blocks contain limits: clauses, meaning those containers can grow without bound

That's not an ideal situation, and if the memory requests are only 55MB then setting a 256M upper bound seems like a perfectly reasonable starting point, although the correct answer is influenced by observing the actual memory pressure of a running instance.
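
For illustration, a resources block of the kind suggested; the surrounding values path for each container is not specified here, and the numbers are just the starting point proposed above.

resources:
  requests:
    cpu: 10m
    memory: 55Mi
  limits:
    memory: 256Mi     # assumed starting point; tune after observing real usage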

Support custom AWS tags for cost allocation

Today, the commercial Kubecost product allocates out-of-cluster AWS costs by a fixed set of tags (e.g. kubernetes_namespace). We should instead allow these to be easily configured on the frontend.

We should also support the ability to have multiple tags per individual Kubernetes concept (e.g. k8s_namespace and k8s/ns). This part can be considered out of scope for this issue if necessary.

Add totals to cost allocation page

Based on a conversation with a user, it would be helpful to show column totals on the cost allocation page. Otherwise, teams have to export data to Excel in order to determine total spend by asset class across all namespaces, etc.

Unexpected missing PVC data causing unsafe access

AWS Deployment missing region

Describe the bug
Error in the logs:

cost-analyzer-7d84df98f9-bwvwn:cost-model I0909 12:58:15.784836 1 awsprovider.go:647] Skipping AWS spot data download: MissingRegion: could not find region configuration

To Reproduce
Deploy via helm to AWS

Expected behavior
The ability to discover or set a region
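
As a possible workaround until region discovery works, the AWS SDK honors the AWS_REGION environment variable, which could be injected via the chart's extraEnv mechanism shown earlier in this README; the index and region value below are examples only.

--set 'kubecostModel.extraEnv[0].name=AWS_REGION,kubecostModel.extraEnv[0].value=us-east-1'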

Make container image registries customisable

Could you make the Docker images configurable through the values file? Where I work, I can't pull public images directly, so it would be nice to be able to change them in the values file instead of having to go through the templates every time.
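
A sketch of the kind of values override being asked for; the key names and registry below are hypothetical and only show the desired shape.

kubecostModel:
  image: my-registry.example.com/kubecost/cost-model      # hypothetical key and registry
kubecostFrontend:
  image: my-registry.example.com/kubecost/frontend        # hypothetical key and registry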

Consider Using Prometheus statefulsets by default

It should already work, and implementation would be as simple as setting server.statefulSet.enabled in the prometheus block of values.yaml.

One consideration is that StatefulSets only became GA as of Kubernetes 1.9.
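
Based on the value named in this issue, enabling it would look roughly like this in values.yaml:

prometheus:
  server:
    statefulSet:
      enabled: true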

Alternative Chart - NFS Volumes

What problem are you trying to solve?
Looking to run Kubecost on-premises (referencing cloud costs) as training for teams to see how cloud costs are represented, and to test out Kubecost capabilities for an enterprise offering that will run partially internally.

Describe the solution you'd like
A separate Kubecost YAML/chart with common NFS storage-provider resources added.

Ex.

spec:
  # Add the server as an NFS volume for the pod
  volumes:
    - name: {{volume-name}}
      nfs: 
        # URL for the NFS server
        server: {{ server_host_variable}} # Change this!
        path: {{path_variable}}

Describe alternatives you've considered
Forking the repository and updating the existing kubecost.yaml file to use an NFS configuration.

How would users interact with this feature?
This would just allow developers to deploy to on-premises and local clusters without a storage driver. It is relatively simple to create a set of NFS servers for a small cluster and stand up apps to use them, with no complicated storage providers, cloud, or cluster filesystems like Gluster/Ceph.

Allow to create grafana dashboards configmaps

Allow creating Grafana dashboard ConfigMaps even if Grafana is not installed with this chart. If we already have Grafana installed and we create these ConfigMaps, it should pick them up on restart as well, no?
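
A sketch of a standalone dashboard ConfigMap that an existing Grafana sidecar could pick up; the namespace and label name are assumptions about how the sidecar is configured, and the dashboard JSON is a placeholder.

apiVersion: v1
kind: ConfigMap
metadata:
  name: kubecost-dashboard
  namespace: monitoring            # assumption: wherever the Grafana sidecar watches
  labels:
    grafana_dashboard: "1"         # assumption: the label your sidecar is configured to match
data:
  kubecost.json: |
    {}                             # dashboard JSON goes here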

Upgrade libraries

Seeing new versions of urllib and openssl that we can upgrade to.

Start i18n work

The first feature request is to show cost in local currencies.

Single point of configuration for global tolerations and node selector

What problem are you trying to solve?

We at Astronomer (astronomer.io) use node taints in combination with node-affinity and tolerations to organize components in node pools. In our case, this is because we want multi-tenant components on separate node pool(s) from our platform components. We hope to use Kubecost in our platform components.

When I say 'node selector', I am actually referring to nodeAffinity + nodeSelectorTerms, which is the 'new and improved' way of doing node selectors.

Describe the solution you'd like

I would like there to be a global configuration in the top-level values.yaml.
Example, node selectors:

global:
  nodeSelectors:
    "astronomer.io/multi-tenant": "false"
    "astronomer.io/another-one": "ok"

I want this to end up on the containers using affinity.

# in the container spec
    affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: "astronomer.io/multi-tenant"
              operator: In
              values:
              - "false"
          - matchExpressions:
            - key: "astronomer.io/another-one"
              operator: In
              values:
              - "ok"

example, tolerations:
values.yaml

global:
  tolerations:
    - key: "platform"
      operator: "Equal"
      value: "true"
      effect: "NoSchedule"

outcome:

# (output of kubectl describe pod command on any component)
Tolerations:     
                 platform=true:NoSchedule

Describe alternatives you've considered

I have noticed some configurations like this exist in the subcharts. I will try to do it by configuring each sub-chart appropriately by passing the values from the top-level chart to the subcharts.

How would users interact with this feature?

helm values.
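
For the alternative described above (configuring each subchart), a rough sketch of how tolerations could be passed down via the top-level values.yaml; the exact keys depend on what each subchart supports.

prometheus:
  server:
    tolerations:
      - key: "platform"
        operator: "Equal"
        value: "true"
        effect: "NoSchedule"
grafana:
  tolerations:
    - key: "platform"
      operator: "Equal"
      value: "true"
      effect: "NoSchedule"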

Go binary deployed as +dirty

The Go binary in the production Docker images is showing up as +dirty?

I0430 19:44:55.279596       1 main.go:202] Starting cost-model (git commit "bd779830c98be5b101f2d9f0fe9b1e1f1fcea78f+dirty")

Move Grafana deployment dashboard to relative time window

The Deployment/StatefulSet/DaemonSet utilization dashboard is installed with a preset absolute time range of Feb 5 to Feb 19, 2019. Obviously this is not a generally helpful default. Please change the default to a relative time frame. I suggest "Last 7 days".
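
In Grafana dashboard JSON, a relative default such as "Last 7 days" is expressed roughly as the following snippet of the dashboard's time block:

"time": {
  "from": "now-7d",
  "to": "now"
}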

Move Grafana Dashboards to Files (JSON)

what

  • Can we move these dashboards to files instead, then modify the chart to include them?


why

this will make it easier for:

  • (a) you to maintain them, since you can just drop a file in without packing it all onto one line
  • (b) others to use them if they're not using your Grafana, since Grafana supports pulling dashboards from a URL, so we could pull them using the GitHub raw URL


Generated Service name is incorrect

In https://github.com/kubecost/cost-analyzer-helm-chart/pull/76/files, it looks like the name of the service was changed to be hard-coded to kubecost-cost-analyzer; however, that goes against Helm norms, where the release name is part of the generated service name.

It also broke the Ingress definition, since the Ingress spec uses the generated full name for the target service, which no longer matches.

Spot instance estimates not getting shown correctly in realtime view

Describe the bug
In certain cases, spot node labels are not getting picked up correctly on the real-time view. In these cases, getConfig() is not correctly reporting what's provided in settings.

This should be easy to track down, because it's not related to the cost-model directly. Let's push for a fix in our next build.

Ingress spec is hard to use

Several users have reported our Ingress as hard to use. Here are my initial ideas on improving it:

  1. Add suggested paths or make these not required.
  2. Is the default servicePort correct?
  3. Make it easy to connect this to a NodePort service.
  4. In comments, let users know they don't need to route to 9001 and 9003 directly, because the Kubecost nginx will handle it.

`ServiceMonitor` doesn't seem to work by default

Once the service monitor template is rendered, the selector includes an app: cost-analyzer label that is not included in the set of common labels, so it does not work out of the box.

{{/*
Create the common labels.
*/}}
{{- define "cost-analyzer.commonLabels" -}}
app.kubernetes.io/name: {{ include "cost-analyzer.name" . }}
helm.sh/chart: {{ include "cost-analyzer.chart" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
{{- end -}}
{{/*
Create the selector labels.
*/}}
{{- define "cost-analyzer.selectorLabels" -}}
app.kubernetes.io/name: {{ include "cost-analyzer.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
app: cost-analyzer
{{- end -}}

I've confirmed that either removing the selector from the ServiceMonitor or adding the label to the service fixes the problem.

New recording rules for memory usage

This should make it much easier to look at trends over time for these metrics. Here are my proposals:

record: kubecost_container_memory_working_set_bytes
Expr: sum(container_memory_working_set_bytes{container_name!="POD",container_name!=""}) by (container_name,pod_name,namespace)

record: kubecost_cluster_memory_working_set_bytes
Expr: sum(container_memory_working_set_bytes{container_name!="POD",container_name!=""})

[prometheus] Create ServiceMonitor and PrometheusRule resources

Because multiple applications installed in a single cluster all want to have visibility via Prometheus, Grafana, and Alertmanager, CoreOS has developed the Prometheus-Operator chart which installs a single instance of these tools that can be shared by all the applications in the cluster. Following the principle of Kubernetes that each application should specify its own resource needs, Prometheus Operator includes new Custom Resources that allow applications to add configuration items to Prometheus, Grafana, and Alertmanager.

I am grateful that you did this work for the Grafana dashboards and ask that you now do it for ServiceMonitors (which replace scrape_configs) and PrometheusRules.

One tricky thing is that ServiceMonitors are meant to monitor Services, although technically they monitor Endpoints that are typically created automatically by Kubernetes for Services. The important point is that ServiceMonitors target endpoints to scrape via LabelSelectors, which means your Services need to have stable and unique labels. Currently, your kubecost-cost-analyzer Service, which is what you target with your custom scrape_config and I now want you to target with a ServiceMonitor, only has 1 label: chart: cost-analyzer-1.26.0. This is not robust enough for this purpose. I suggest you add the app: cost-analyzer label to it as well. With that, the following ServiceMonitor should work:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    app: prometheus-operator
    release: prometheus-operator
  name: cost-analyzer-model
spec:
  endpoints:
  - honorLabels: true
    interval: 1m
    path: /metrics
    port: cost-analyzer-model
    scheme: http
    scrapeTimeout: 10s
  selector:
    matchLabels:
      app: cost-analyzer

I don't know if you have PrometheusRules meant for publishing. If the ones you have are just examples so your Prometheus installation is not empty, then they do not need to be converted to PrometheusRule resources. However, if you have some you want everyone to use, then please install them as PrometheusRule resources rather than as configuration files.

Implement workaround for https://github.com/helm/helm/issues/3742

We use Reckoner to manage our infrastructure charts, and this particular issue blocks us from being able to use it for installing Kubecost. There is a workaround for the issue here that would be nice to have.

I completely understand if you do not want to implement the workaround, but I thought I would request.

Postgres PVC Template missing

Describe the bug
cff5c39 introduced a persistent volume claim to the Postgres deployment, but there is no associated PVC creation. This blocks the deployment from being created.

To Reproduce
Steps to reproduce the behavior:

  1. Deploy the charts with the remoteWrite.postgres.enabled set to true

Expected behavior
I would expect there to be a PVC template either inline with the deployment or as a separate file in a manner similar to the cost-analyzer PVC template.
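
A minimal sketch of the kind of standalone PVC template the issue expects; the helper name, PVC name, and size are illustrative assumptions rather than the chart's actual values.

{{- if .Values.remoteWrite.postgres.enabled }}
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: {{ template "cost-analyzer.fullname" . }}-postgres   # assumed naming
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 32Gi                                           # assumed size
{{- end }}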

Deleting clusters is not working properly

In v1.35, there's a regression that prevents deleted clusters from being properly removed from local storage. The result is that clusters are removed from the DOM but then unexpectedly reappear after page reload.

Currency Change request

Hi,

I need price listings in € (EUR). I am able to enable Custom Pricing and change the currency to euro, but some of the tabs still need to be modified as below:

1) The Savings panel/tab still shows price details in $ (dollars).

2) The 'Switch Clusters' tab shows the total price per month in $ instead of €.

3) After exporting prices to CSV via the Allocation tab, the CSV contains a garbled character instead of '€'.

Kindly address these fixes in your next release.

cost-analyzer-checks resource limit is too low

Ironic, given #12, but one can see in the Reason: field that node is being OOMKilled:

  cost-analyzer-checks:
    Container ID:  docker://d9d86367a7c7b54e22331f2bc6db2178a94667239105d491f3bde9877e4b4171
    Image:         ajaytripathy/kubecost-checks:prod-1.18.2
    Image ID:      docker-pullable://ajaytripathy/kubecost-checks@sha256:f9c45b42c8facd366a0515544f2a9fbfc8b75af8dea5e54f29bcbd4ecbdfeff8
    Port:          <none>
    Host Port:     <none>
    Args:
      node
      ./node/cron.js
    State:          Terminated
      Reason:       OOMKilled
      Exit Code:    0
      Started:      Tue, 30 Apr 2019 15:00:21 -0700
      Finished:     Tue, 30 Apr 2019 15:01:48 -0700
    Ready:          False
    Restart Count:  0
    Limits:
      cpu:     10m
      memory:  55M
    Requests:
      cpu:     10m
      memory:  55M

Service should use fullname

When installed via Helm, the Service currently does not use the fullname helper. This results in cost-analyzer-cost-analyzer if the release name is the same as the chart name, vs. the proper cost-analyzer.

version 1.41.0
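
For reference, the usual Helm convention is to name the Service with the chart's fullname helper, roughly as below; the helper name is assumed, and the chart may define it differently.

apiVersion: v1
kind: Service
metadata:
  name: {{ template "cost-analyzer.fullname" . }}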
