druid-operator's People

Contributors

achetronic, adheipsingh, avtarops, beni20, camerondavison, christian-schlichtherle, cintosunny, cyril-corbon, dependabot[bot], eandrewjones, farhadf, gurjotkaur20, harinirajendran, himanshug, itamar-marom, jwitko, kentakozuka, layoaster, mrlarssonjr, nitisht, rbankar7, renatocron, roelofkuijpers, samwheating, satyakuppam, schmichri, vladislavpv, youngwookim, yurmix, zhangluva

druid-operator's Issues

Can only run 1 task at a time?

I followed the installation steps to create the operator and tiny cluster.

helm repo add datainfra https://charts.datainfra.io
helm repo update
kubectl create namespace druid
helm -n druid-operator-system upgrade -i --create-namespace --set env.WATCH_NAMESPACE="druid" namespaced-druid-operator datainfra/druid-operator
helm -n druid-operator-system upgrade -i --create-namespace --set env.DENY_LIST="kube-system" namespaced-druid-operator datainfra/druid-operator
kubectl apply -f tiny-cluster-zk.yaml -n druid
kubectl apply -f tiny-cluster.yaml -n druid

The big problem is that I can only run one task at a time. For example, if I'm running a Kafka supervisor, then all other tasks, such as index_parallel or other index_kafka tasks, are stuck in a "pending" status.
I never had this problem with the previous druid-io repo. Any help is appreciated.


Example of pending kafka task:

{
  "id": "index_kafka_crypto_bulk_15m_6fa92e3a918604c_kfcgggfh",
  "groupId": "index_kafka_crypto_bulk_15m",
  "type": "index_kafka",
  "createdTime": "2023-09-15T03:58:34.787Z",
  "queueInsertionTime": "1970-01-01T00:00:00.000Z",
  "statusCode": "RUNNING",
  "status": "RUNNING",
  "runnerStatusCode": "PENDING",
  "duration": -1,
  "location": {
    "host": null,
    "port": -1,
    "tlsPort": -1
  },
  "dataSource": "crypto_bulk_15m",
  "errorMsg": null
}
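A hedged note for anyone hitting the same wall: the tiny-cluster example (shown in full later on this page) runs the coordinator as overlord with druid.indexer.runner.type=local, so concurrent task capacity is whatever that local runner allows, and a single running index_kafka task can exhaust the available slots. One knob to check, assuming that setup, is druid.worker.capacity in the coordinator's runtime.properties; the value below is illustrative.

  runtime.properties: |
    druid.service=druid/coordinator
    druid.coordinator.asOverlord.enabled=true
    druid.coordinator.asOverlord.overlordService=druid/overlord
    druid.indexer.runner.type=local
    # assumption: raise the number of task slots available to the local runner
    druid.worker.capacity=4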

deleteOrphanPvc deleted PVC in use

I'm trying to create a Druid cluster using druid-operator on AWS EKS. I'm using EBS GP2 for the persistent volume.

When trying to scale up the historical pods (e.g. from 4 to 8), the first pod is stuck in Pending while the remaining 7 pods work fine. The first PVC was mistakenly deleted as an orphan PVC even though it was still in use.

druid-operator log:
1.6798315940261655e+09 INFO druid_operator_handler Deleted orphaned pvc [data-volume-druid-workload-historicals-4:default] successfully {"name": "workload", "namespace": "default"}
1.679831594026486e+09 DEBUG events Normal {"object": {"kind":"Druid","namespace":"default","name":"workload","uid":"2c6b92b9-73cb-408f-a670-a3ee7fc307ff","apiVersion":"druid.apache.org/v1alpha1","resourceVersion":"3088566"}, "reason": "DruidOperatorDeleteSuccess", "message": "Successfully deleted object [data-volume-druid-workload-historicals-4:PersistentVolumeClaim] in namespace [default]"}

This issue is reproducible in the following environments:
druid-operator (0.0.9), kubernetes (1.23).

Storage Class:
Name: gp2
IsDefaultClass: Yes
Annotations: kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"storage.k8s.io/v1","kind":"StorageClass","metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"},"name":"gp2"},"parameters":{"fsType":"ext4","type":"gp2"},"provisioner":"kubernetes.io/aws-ebs","volumeBindingMode":"WaitForFirstConsumer"}
,storageclass.kubernetes.io/is-default-class=true
Provisioner: kubernetes.io/aws-ebs
Parameters: fsType=ext4,type=gp2
AllowVolumeExpansion:
MountOptions:
ReclaimPolicy: Delete
VolumeBindingMode: WaitForFirstConsumer
Events:
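A hedged stop-gap while the race is investigated: the cleanup seen in the log above is driven by a deleteOrphanPvc flag in the Druid spec (the exact field name is an assumption), so turning it off during scale-ups keeps a slow-to-schedule pod's PVC from being reaped. The rest of the spec is elided.

apiVersion: "druid.apache.org/v1alpha1"
kind: "Druid"
metadata:
  name: workload
spec:
  # disable automatic orphan-PVC deletion until all replicas are scheduled
  deleteOrphanPvc: false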

Documentation arrangement

We need to write much better documentation for the operator and arrange its topics and order.
@AdheipSingh, should we use mkdocs, or do you want to do some integration between the project and Datainfra?

Using Kubebuilder markers for object validation

Kubebuilder supports object validation markers. Instead of validating the object inside the reconcile function (in the verifyDruidSpec function), we should let Kubebuilder do it.
There are some validations that we cannot express with markers (like the cluster-level image vs. node-level image check); those should live in a Kubernetes validating admission webhook.
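A minimal sketch of marker-based validation on the API types; the struct and field names below are illustrative, not the operator's actual definitions, and cross-field checks still need the webhook mentioned above.

package v1alpha1

// IllustrativeNodeSpec shows the kind of checks Kubebuilder markers can
// express instead of verifyDruidSpec.
type IllustrativeNodeSpec struct {
	// +kubebuilder:validation:Enum=Deployment;StatefulSet
	// Kind restricts which workload kind a node spec may request.
	Kind string `json:"kind,omitempty"`

	// +kubebuilder:validation:Minimum=1
	// Replicas must be at least one.
	Replicas int32 `json:"replicas"`

	// +kubebuilder:validation:MinLength=1
	// NodeConfigMountPath must not be empty.
	NodeConfigMountPath string `json:"nodeConfigMountPath"`
}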

Enhance e2e tests

  • E2e tests - deploy a full cluster using kind.
  • TODO
  1. indexing job
  2. query datasource

No amd64 docker image for 1.1.1

This happened while attempting to upgrade from 1.1.0 to 1.1.1:

I0518 19:10:59.514979 1 main.go:218] Valid token audiences:
I0518 19:10:59.515043 1 main.go:344] Generating self signed cert as no cert is provided
I0518 19:11:00.480061 1 main.go:394] Starting TCP socket on 0.0.0.0:8443
I0518 19:11:00.480301 1 main.go:401] Listening securely on 0.0.0.0:8443
exec /manager: exec format error

I just checked Docker Hub and it looks like the image was only built for arm64. There is no amd64 image.
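A hedged sketch of how a multi-arch image is usually published with Docker Buildx; the tag and the exact release pipeline are assumptions, not the project's actual process.

# assumption: run from the repo root with a buildx builder available
docker buildx create --use --name druid-operator-builder
docker buildx build \
  --platform linux/amd64,linux/arm64 \
  -t datainfrahq/druid-operator:v1.1.1 \
  --push .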

Help getting started with `tiny-cluster`

Hi, I am trying to set up druid-operator in minikube (for now), which is running on an EC2 instance.

Here's what I have done:

cd druid-operator
git checkout -b v1.0.0 v1.0.0

k create ns druid-ns
k config set-context --current --namespace=druid-ns

k create -f deploy/service_account.yaml
k create -f deploy/role.yaml
k create -f deploy/role_binding.yaml
k create -f deploy/crds/druid.apache.org_druids.yaml
k create -f deploy/operator.yaml

# k apply -f examples/tiny-cluster-zk.yaml
k apply -f examples/tiny-cluster.yaml

I initially started with https://github.com/datainfrahq/druid-operator/blob/master/docs/getting_started.md, but that did not bring up all the services for me; the steps above did.

Now I wish to access the web console.

Here's the output of k get all:

$ k get all
NAME                                    READY   STATUS             RESTARTS        AGE
pod/druid-operator-7ccbfc66b-2mm7q      1/1     Running            0               6m36s
pod/druid-tiny-cluster-brokers-0        0/1     Running            1 (2m49s ago)   6m30s
pod/druid-tiny-cluster-coordinators-0   0/1     Running            3 (69s ago)     6m30s
pod/druid-tiny-cluster-historicals-0    0/1     CrashLoopBackOff   5 (2m26s ago)   6m30s
pod/druid-tiny-cluster-routers-0        1/1     Running            0               6m30s

NAME                                      TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)    AGE
service/druid-tiny-cluster-brokers        ClusterIP   None         <none>        8088/TCP   6m30s
service/druid-tiny-cluster-coordinators   ClusterIP   None         <none>        8088/TCP   6m30s
service/druid-tiny-cluster-historicals    ClusterIP   None         <none>        8088/TCP   6m30s
service/druid-tiny-cluster-routers        ClusterIP   None         <none>        8088/TCP   6m30s

NAME                             READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/druid-operator   1/1     1            1           6m36s

NAME                                       DESIRED   CURRENT   READY   AGE
replicaset.apps/druid-operator-7ccbfc66b   1         1         1       6m36s

NAME                                               READY   AGE
statefulset.apps/druid-tiny-cluster-brokers        0/1     6m30s
statefulset.apps/druid-tiny-cluster-coordinators   0/1     6m30s
statefulset.apps/druid-tiny-cluster-historicals    0/1     6m30s
statefulset.apps/druid-tiny-cluster-routers        1/1     6m30s

I tried running:

k port-forward service/druid-tiny-cluster-routers --address 0.0.0.0 8888:8088

And then tried to access the console using http://ec2-public-ip:8888 but that did not work.

I am pretty sure I'm doing something wrong but I am a k8s beginner so any help would be really great.
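A hedged first diagnostic step for the CrashLoopBackOff historicals (the console will stay incomplete while they are down), using the same namespace as the commands above:

k -n druid-ns describe pod druid-tiny-cluster-historicals-0
k -n druid-ns logs druid-tiny-cluster-historicals-0 --previous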

Example single node cluster fails to apply

Running on kubernetes 1.25.9 through Docker Desktop local cluster mode.

$ kubectl apply -f examples/tiny-cluster.yaml
Error from server (BadRequest): error when creating "examples/tiny-cluster.yaml": Druid in version "v1alpha1" cannot be handled as a Druid: strict decoding error: unknown field "spec.nodes.brokers.volumeClaimTemplates[0].metadata.name"

Pulled the latest commit off the main branch of the repo to apply the operator (e010411).
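One likely cause of a strict-decoding "unknown field" error is that the CRD installed in the cluster is older than the example being applied. A hedged way to check, grepping for the field named in the error message (the CRD path is an assumption based on the Kubebuilder layout):

kubectl get crd druids.druid.apache.org -o yaml | grep -n volumeClaimTemplates
# if the field is missing, re-apply the CRDs from the same commit as the example:
kubectl apply --server-side -f config/crd/bases/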

Dump of the pod object:

apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubectl.kubernetes.io/default-container: manager
  creationTimestamp: "2023-05-03T13:32:21Z"
  generateName: druid-operator-controller-manager-f4bf77f54-
  labels:
    control-plane: controller-manager
    pod-template-hash: f4bf77f54
  name: druid-operator-controller-manager-f4bf77f54-tw4wd
  namespace: druid-operator-system
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: ReplicaSet
    name: druid-operator-controller-manager-f4bf77f54
    uid: 5311981e-0ced-4a2d-96c3-f6d76aeb41bf
  resourceVersion: "27927"
  uid: 260ab173-6aaa-49c2-9003-db8b59e9e552
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/arch
            operator: In
            values:
            - amd64
            - arm64
            - ppc64le
            - s390x
          - key: kubernetes.io/os
            operator: In
            values:
            - linux
  containers:
  - args:
    - --secure-listen-address=0.0.0.0:8443
    - --upstream=http://127.0.0.1:8080/
    - --logtostderr=true
    - --v=0
    image: gcr.io/kubebuilder/kube-rbac-proxy:v0.13.1
    imagePullPolicy: IfNotPresent
    name: kube-rbac-proxy
    ports:
    - containerPort: 8443
      name: https
      protocol: TCP
    resources:
      limits:
        cpu: 500m
        memory: 128Mi
      requests:
        cpu: 5m
        memory: 64Mi
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
        - ALL
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-z75tr
      readOnly: true
  - args:
    - --health-probe-bind-address=:8081
    - --metrics-bind-address=127.0.0.1:8080
    - --leader-elect
    command:
    - /manager
    image: datainfrahq/druid-operator:latest
    imagePullPolicy: Always
    livenessProbe:
      failureThreshold: 3
      httpGet:
        path: /healthz
        port: 8081
        scheme: HTTP
      initialDelaySeconds: 15
      periodSeconds: 20
      successThreshold: 1
      timeoutSeconds: 1
    name: manager
    readinessProbe:
      failureThreshold: 3
      httpGet:
        path: /readyz
        port: 8081
        scheme: HTTP
      initialDelaySeconds: 5
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 1
    resources:
      limits:
        cpu: 500m
        memory: 128Mi
      requests:
        cpu: 10m
        memory: 64Mi
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
        - ALL
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-z75tr
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  nodeName: docker-desktop
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext:
    runAsNonRoot: true
  serviceAccount: druid-operator-controller-manager
  serviceAccountName: druid-operator-controller-manager
  terminationGracePeriodSeconds: 10
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - name: kube-api-access-z75tr
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
          items:
          - key: ca.crt
            path: ca.crt
          name: kube-root-ca.crt
      - downwardAPI:
          items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
            path: namespace
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2023-05-03T13:32:21Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2023-05-03T13:32:32Z"
    status: "True"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2023-05-03T13:32:32Z"
    status: "True"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2023-05-03T13:32:21Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: docker://36d70093e7494a286098150203296ebb247ad207d9f3a0df0b5ffa7df9d7cf97
    image: gcr.io/kubebuilder/kube-rbac-proxy:v0.13.1
    imageID: docker-pullable://gcr.io/kubebuilder/kube-rbac-proxy@sha256:d4883d7c622683b3319b5e6b3a7edfbf2594c18060131a8bf64504805f875522
    lastState: {}
    name: kube-rbac-proxy
    ready: true
    restartCount: 0
    started: true
    state:
      running:
        startedAt: "2023-05-03T13:32:22Z"
  - containerID: docker://dc215fd45f7bef2036c832184c8b4a8ec37e986807a516b3784cc804933bae6d
    image: datainfrahq/druid-operator:latest
    imageID: docker-pullable://datainfrahq/druid-operator@sha256:c5dc3f12f28695fea7c3849ffd4e83a729b5b049a0fa13f20ad7904c58410256
    lastState: {}
    name: manager
    ready: true
    restartCount: 0
    started: true
    state:
      running:
        startedAt: "2023-05-03T13:32:26Z"
  hostIP: 192.168.65.4
  phase: Running
  podIP: 10.1.0.126
  podIPs:
  - ip: 10.1.0.126
  qosClass: Burstable
  startTime: "2023-05-03T13:32:21Z"

Dump of the current CRD object I'm trying to apply:

# This spec only works on a single node kubernetes cluster(e.g. typical k8s cluster setup for dev using kind/minikube or single node AWS EKS cluster etc)
# as it uses local disk as "deep storage".
#
apiVersion: "druid.apache.org/v1alpha1"
kind: "Druid"
metadata:
  name: tiny-cluster
spec:
  image: apache/druid:25.0.0
  # Optionally specify image for all nodes. Can be specify on nodes also
  # imagePullSecrets:
  # - name: tutu
  startScript: /druid.sh
  podLabels:
    environment: stage
    release: alpha
  podAnnotations:
    dummykey: dummyval
  readinessProbe:
    httpGet:
      path: /status/health
      port: 8088
  securityContext:
    fsGroup: 1000
    runAsUser: 1000
    runAsGroup: 1000
  services:
    - spec:
        type: ClusterIP
        clusterIP: None
  commonConfigMountPath: "/opt/druid/conf/druid/cluster/_common"
  jvm.options: |-
    -server
    -XX:MaxDirectMemorySize=10240g
    -Duser.timezone=UTC
    -Dfile.encoding=UTF-8
    -Dlog4j.debug
    -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager
    -Djava.io.tmpdir=/druid/data
  log4j.config: |-
    <?xml version="1.0" encoding="UTF-8" ?>
    <Configuration status="WARN">
        <Appenders>
            <Console name="Console" target="SYSTEM_OUT">
                <PatternLayout pattern="%d{ISO8601} %p [%t] %c - %m%n"/>
            </Console>
        </Appenders>
        <Loggers>
            <Root level="info">
                <AppenderRef ref="Console"/>
            </Root>
        </Loggers>
    </Configuration>
  common.runtime.properties: |

    # Zookeeper
    druid.zk.service.host=tiny-cluster-zk-0.tiny-cluster-zk
    druid.zk.paths.base=/druid
    druid.zk.service.compress=false

    # Metadata Store
    druid.metadata.storage.type=derby
    druid.metadata.storage.connector.connectURI=jdbc:derby://localhost:1527/druid/data/derbydb/metadata.db;create=true
    druid.metadata.storage.connector.host=localhost
    druid.metadata.storage.connector.port=1527
    druid.metadata.storage.connector.createTables=true

    # Deep Storage
    druid.storage.type=local
    druid.storage.storageDirectory=/druid/deepstorage
    #
    # Extensions
    #
    druid.extensions.loadList=["druid-kafka-indexing-service"]

    #
    # Service discovery
    #
    druid.selectors.indexing.serviceName=druid/overlord
    druid.selectors.coordinator.serviceName=druid/coordinator

    druid.indexer.logs.type=file
    druid.indexer.logs.directory=/druid/data/indexing-logs
    druid.lookup.enableLookupSyncOnStartup=false

  metricDimensions.json: |-
    {
      "query/time" : { "dimensions" : ["dataSource", "type"], "type" : "timer"},
      "query/bytes" : { "dimensions" : ["dataSource", "type"], "type" : "count"},
      "query/node/time" : { "dimensions" : ["server"], "type" : "timer"},
      "query/node/ttfb" : { "dimensions" : ["server"], "type" : "timer"},
      "query/node/bytes" : { "dimensions" : ["server"], "type" : "count"},
      "query/node/backpressure": { "dimensions" : ["server"], "type" : "timer"},
      "query/intervalChunk/time" : { "dimensions" : [], "type" : "timer"},

      "query/segment/time" : { "dimensions" : [], "type" : "timer"},
      "query/wait/time" : { "dimensions" : [], "type" : "timer"},
      "segment/scan/pending" : { "dimensions" : [], "type" : "gauge"},
      "query/segmentAndCache/time" : { "dimensions" : [], "type" : "timer" },
      "query/cpu/time" : { "dimensions" : ["dataSource", "type"], "type" : "timer" },

      "query/count" : { "dimensions" : [], "type" : "count" },
      "query/success/count" : { "dimensions" : [], "type" : "count" },
      "query/failed/count" : { "dimensions" : [], "type" : "count" },
      "query/interrupted/count" : { "dimensions" : [], "type" : "count" },
      "query/timeout/count" : { "dimensions" : [], "type" : "count" },

      "query/cache/delta/numEntries" : { "dimensions" : [], "type" : "count" },
      "query/cache/delta/sizeBytes" : { "dimensions" : [], "type" : "count" },
      "query/cache/delta/hits" : { "dimensions" : [], "type" : "count" },
      "query/cache/delta/misses" : { "dimensions" : [], "type" : "count" },
      "query/cache/delta/evictions" : { "dimensions" : [], "type" : "count" },
      "query/cache/delta/hitRate" : { "dimensions" : [], "type" : "count", "convertRange" : true },
      "query/cache/delta/averageBytes" : { "dimensions" : [], "type" : "count" },
      "query/cache/delta/timeouts" : { "dimensions" : [], "type" : "count" },
      "query/cache/delta/errors" : { "dimensions" : [], "type" : "count" },

      "query/cache/total/numEntries" : { "dimensions" : [], "type" : "gauge" },
      "query/cache/total/sizeBytes" : { "dimensions" : [], "type" : "gauge" },
      "query/cache/total/hits" : { "dimensions" : [], "type" : "gauge" },
      "query/cache/total/misses" : { "dimensions" : [], "type" : "gauge" },
      "query/cache/total/evictions" : { "dimensions" : [], "type" : "gauge" },
      "query/cache/total/hitRate" : { "dimensions" : [], "type" : "gauge", "convertRange" : true },
      "query/cache/total/averageBytes" : { "dimensions" : [], "type" : "gauge" },
      "query/cache/total/timeouts" : { "dimensions" : [], "type" : "gauge" },
      "query/cache/total/errors" : { "dimensions" : [], "type" : "gauge" },

      "ingest/events/thrownAway" : { "dimensions" : ["dataSource"], "type" : "count" },
      "ingest/events/unparseable" : { "dimensions" : ["dataSource"], "type" : "count" },
      "ingest/events/duplicate" : { "dimensions" : ["dataSource"], "type" : "count" },
      "ingest/events/processed" : { "dimensions" : ["dataSource", "taskType", "taskId"], "type" : "count" },
      "ingest/events/messageGap" : { "dimensions" : ["dataSource"], "type" : "gauge" },
      "ingest/rows/output" : { "dimensions" : ["dataSource"], "type" : "count" },
      "ingest/persists/count" : { "dimensions" : ["dataSource"], "type" : "count" },
      "ingest/persists/time" : { "dimensions" : ["dataSource"], "type" : "timer" },
      "ingest/persists/cpu" : { "dimensions" : ["dataSource"], "type" : "timer" },
      "ingest/persists/backPressure" : { "dimensions" : ["dataSource"], "type" : "gauge" },
      "ingest/persists/failed" : { "dimensions" : ["dataSource"], "type" : "count" },
      "ingest/handoff/failed" : { "dimensions" : ["dataSource"], "type" : "count" },
      "ingest/merge/time" : { "dimensions" : ["dataSource"], "type" : "timer" },
      "ingest/merge/cpu" : { "dimensions" : ["dataSource"], "type" : "timer" },

      "ingest/kafka/lag" : { "dimensions" : ["dataSource"], "type" : "gauge" },
      "ingest/kafka/maxLag" : { "dimensions" : ["dataSource"], "type" : "gauge" },
      "ingest/kafka/avgLag" : { "dimensions" : ["dataSource"], "type" : "gauge" },

      "task/success/count" : { "dimensions" : ["dataSource"], "type" : "count" },
      "task/failed/count" : { "dimensions" : ["dataSource"], "type" : "count" },
      "task/running/count" : { "dimensions" : ["dataSource"], "type" : "gauge" },
      "task/pending/count" : { "dimensions" : ["dataSource"], "type" : "gauge" },
      "task/waiting/count" : { "dimensions" : ["dataSource"], "type" : "gauge" },

      "taskSlot/total/count" : { "dimensions" : [], "type" : "gauge" },
      "taskSlot/idle/count" : { "dimensions" : [], "type" : "gauge" },
      "taskSlot/busy/count" : { "dimensions" : [], "type" : "gauge" },
      "taskSlot/lazy/count" : { "dimensions" : [], "type" : "gauge" },
      "taskSlot/blacklisted/count" : { "dimensions" : [], "type" : "gauge" },

      "task/run/time" : { "dimensions" : ["dataSource", "taskType"], "type" : "timer" },
      "segment/added/bytes" : { "dimensions" : ["dataSource", "taskType"], "type" : "count" },
      "segment/moved/bytes" : { "dimensions" : ["dataSource", "taskType"], "type" : "count" },
      "segment/nuked/bytes" : { "dimensions" : ["dataSource", "taskType"], "type" : "count" },

      "segment/assigned/count" : { "dimensions" : ["tier"], "type" : "count" },
      "segment/moved/count" : { "dimensions" : ["tier"], "type" : "count" },
      "segment/dropped/count" : { "dimensions" : ["tier"], "type" : "count" },
      "segment/deleted/count" : { "dimensions" : ["tier"], "type" : "count" },
      "segment/unneeded/count" : { "dimensions" : ["tier"], "type" : "count" },
      "segment/unavailable/count" : { "dimensions" : ["dataSource"], "type" : "gauge" },
      "segment/underReplicated/count" : { "dimensions" : ["dataSource", "tier"], "type" : "gauge" },
      "segment/cost/raw" : { "dimensions" : ["tier"], "type" : "count" },
      "segment/cost/normalization" : { "dimensions" : ["tier"], "type" : "count" },
      "segment/cost/normalized" : { "dimensions" : ["tier"], "type" : "count" },
      "segment/loadQueue/size" : { "dimensions" : ["server"], "type" : "gauge" },
      "segment/loadQueue/failed" : { "dimensions" : ["server"], "type" : "gauge" },
      "segment/loadQueue/count" : { "dimensions" : ["server"], "type" : "gauge" },
      "segment/dropQueue/count" : { "dimensions" : ["server"], "type" : "gauge" },
      "segment/size" : { "dimensions" : ["dataSource"], "type" : "gauge" },
      "segment/overShadowed/count" : { "dimensions" : [], "type" : "gauge" },

      "segment/max" : { "dimensions" : [], "type" : "gauge"},
      "segment/used" : { "dimensions" : ["dataSource", "tier", "priority"], "type" : "gauge" },
      "segment/usedPercent" : { "dimensions" : ["dataSource", "tier", "priority"], "type" : "gauge", "convertRange" : true },
      "segment/pendingDelete" : { "dimensions" : [], "type" : "gauge"},

      "jvm/pool/committed" : { "dimensions" : ["poolKind", "poolName"], "type" : "gauge" },
      "jvm/pool/init" : { "dimensions" : ["poolKind", "poolName"], "type" : "gauge" },
      "jvm/pool/max" : { "dimensions" : ["poolKind", "poolName"], "type" : "gauge" },
      "jvm/pool/used" : { "dimensions" : ["poolKind", "poolName"], "type" : "gauge" },
      "jvm/bufferpool/count" : { "dimensions" : ["bufferpoolName"], "type" : "gauge" },
      "jvm/bufferpool/used" : { "dimensions" : ["bufferpoolName"], "type" : "gauge" },
      "jvm/bufferpool/capacity" : { "dimensions" : ["bufferpoolName"], "type" : "gauge" },
      "jvm/mem/init" : { "dimensions" : ["memKind"], "type" : "gauge" },
      "jvm/mem/max" : { "dimensions" : ["memKind"], "type" : "gauge" },
      "jvm/mem/used" : { "dimensions" : ["memKind"], "type" : "gauge" },
      "jvm/mem/committed" : { "dimensions" : ["memKind"], "type" : "gauge" },
      "jvm/gc/count" : { "dimensions" : ["gcName", "gcGen"], "type" : "count" },
      "jvm/gc/cpu" : { "dimensions" : ["gcName", "gcGen"], "type" : "count" },

      "ingest/events/buffered" : { "dimensions" : ["serviceName", "bufferCapacity"], "type" : "gauge"},

      "sys/swap/free" : { "dimensions" : [], "type" : "gauge"},
      "sys/swap/max" : { "dimensions" : [], "type" : "gauge"},
      "sys/swap/pageIn" : { "dimensions" : [], "type" : "gauge"},
      "sys/swap/pageOut" : { "dimensions" : [], "type" : "gauge"},
      "sys/disk/write/count" : { "dimensions" : ["fsDevName"], "type" : "count"},
      "sys/disk/read/count" : { "dimensions" : ["fsDevName"], "type" : "count"},
      "sys/disk/write/size" : { "dimensions" : ["fsDevName"], "type" : "count"},
      "sys/disk/read/size" : { "dimensions" : ["fsDevName"], "type" : "count"},
      "sys/net/write/size" : { "dimensions" : [], "type" : "count"},
      "sys/net/read/size" : { "dimensions" : [], "type" : "count"},
      "sys/fs/used" : { "dimensions" : ["fsDevName", "fsDirName", "fsTypeName", "fsSysTypeName", "fsOptions"], "type" : "gauge"},
      "sys/fs/max" : { "dimensions" : ["fsDevName", "fsDirName", "fsTypeName", "fsSysTypeName", "fsOptions"], "type" : "gauge"},
      "sys/mem/used" : { "dimensions" : [], "type" : "gauge"},
      "sys/mem/max" : { "dimensions" : [], "type" : "gauge"},
      "sys/storage/used" : { "dimensions" : ["fsDirName"], "type" : "gauge"},
      "sys/cpu" : { "dimensions" : ["cpuName", "cpuTime"], "type" : "gauge"},

      "coordinator-segment/count" : { "dimensions" : ["dataSource"], "type" : "gauge" },
      "historical-segment/count" : { "dimensions" : ["dataSource", "tier", "priority"], "type" : "gauge" },

      "jetty/numOpenConnections" : { "dimensions" : [], "type" : "gauge" },
      "query/cache/caffeine/total/requests" : { "dimensions" : [], "type" : "gauge" },
      "query/cache/caffeine/total/loadTime" : { "dimensions" : [], "type" : "gauge" },
      "query/cache/caffeine/total/evictionBytes" : { "dimensions" : [], "type" : "gauge" },
      "query/cache/memcached/total" : { "dimensions" : ["[MEM] Reconnecting Nodes (ReconnectQueue)",
        "[MEM] Request Rate: All",
        "[MEM] Average Bytes written to OS per write",
        "[MEM] Average Bytes read from OS per read",
        "[MEM] Response Rate: All (Failure + Success + Retry)",
        "[MEM] Response Rate: Retry",
        "[MEM] Response Rate: Failure",
        "[MEM] Response Rate: Success"],
        "type" : "gauge" },
      "query/cache/caffeine/delta/requests" : { "dimensions" : [], "type" : "count" },
      "query/cache/caffeine/delta/loadTime" : { "dimensions" : [], "type" : "count" },
      "query/cache/caffeine/delta/evictionBytes" : { "dimensions" : [], "type" : "count" },
      "query/cache/memcached/delta" : { "dimensions" : ["[MEM] Reconnecting Nodes (ReconnectQueue)",
        "[MEM] Request Rate: All",
        "[MEM] Average Bytes written to OS per write",
        "[MEM] Average Bytes read from OS per read",
        "[MEM] Response Rate: All (Failure + Success + Retry)",
        "[MEM] Response Rate: Retry",
        "[MEM] Response Rate: Failure",
        "[MEM] Response Rate: Success"],
        "type" : "count" }
    }

  volumeMounts:
    - mountPath: /druid/data
      name: data-volume
    - mountPath: /druid/deepstorage
      name: deepstorage-volume
  volumes:
    - name: data-volume
      emptyDir: {}
    - name: deepstorage-volume
      hostPath:
        path: /tmp/druid/deepstorage
        type: DirectoryOrCreate
  env:
    - name: POD_NAME
      valueFrom:
        fieldRef:
          fieldPath: metadata.name
    - name: POD_NAMESPACE
      valueFrom:
        fieldRef:
          fieldPath: metadata.namespace

  nodes:
    brokers:
      # Optionally specify for running broker as Deployment
      # kind: Deployment
      nodeType: "broker"
      # Optionally specify for broker nodes
      # imagePullSecrets:
      # - name: tutu
      druid.port: 8088
      nodeConfigMountPath: "/opt/druid/conf/druid/cluster/query/broker"
      replicas: 1
      volumeClaimTemplates:
       - metadata:
           name: data-volume
         spec:
           accessModes:
           - ReadWriteOnce
           resources:
             requests:
               storage: 2Gi
           storageClassName: standard
      runtime.properties: |
        druid.service=druid/broker
        # HTTP server threads
        druid.broker.http.numConnections=5
        druid.server.http.numThreads=10
        # Processing threads and buffers
        druid.processing.buffer.sizeBytes=1
        druid.processing.numMergeBuffers=1
        druid.processing.numThreads=1
        druid.sql.enable=true
      extra.jvm.options: |-
        -Xmx512M
        -Xms512M

    coordinators:
      # Optionally specify for running coordinator as Deployment
      # kind: Deployment
      nodeType: "coordinator"
      druid.port: 8088
      nodeConfigMountPath: "/opt/druid/conf/druid/cluster/master/coordinator-overlord"
      replicas: 1
      runtime.properties: |
        druid.service=druid/coordinator

        # HTTP server threads
        druid.coordinator.startDelay=PT30S
        druid.coordinator.period=PT30S

        # Configure this coordinator to also run as Overlord
        druid.coordinator.asOverlord.enabled=true
        druid.coordinator.asOverlord.overlordService=druid/overlord
        druid.indexer.queue.startDelay=PT30S
        druid.indexer.runner.type=local
      extra.jvm.options: |-
        -Xmx512M
        -Xms512M

    historicals:
      nodeType: "historical"
      druid.port: 8088
      nodeConfigMountPath: "/opt/druid/conf/druid/cluster/data/historical"
      replicas: 1
      runtime.properties: |
        druid.service=druid/historical
        druid.server.http.numThreads=5
        druid.processing.buffer.sizeBytes=536870912
        druid.processing.numMergeBuffers=1
        druid.processing.numThreads=1

        # Segment storage
        druid.segmentCache.locations=[{\"path\":\"/druid/data/segments\",\"maxSize\":10737418240}]
        druid.server.maxSize=10737418240
      extra.jvm.options: |-
        -Xmx512M
        -Xms512M
          
    routers:
      nodeType: "router"
      druid.port: 8088
      nodeConfigMountPath: "/opt/druid/conf/druid/cluster/query/router"
      replicas: 1
      runtime.properties: |
        druid.service=druid/router

        # HTTP proxy
        druid.router.http.numConnections=10
        druid.router.http.readTimeout=PT5M
        druid.router.http.numMaxThreads=10
        druid.server.http.numThreads=10

        # Service discovery
        druid.router.defaultBrokerServiceName=druid/broker
        druid.router.coordinatorServiceName=druid/coordinator

        # Management proxy to coordinator / overlord: required for unified web console.
        druid.router.managementProxy.enabled=true       
      extra.jvm.options: |-
        -Xmx512M
        -Xms512M

Configuration question for running Druid with the Operator

From the druid-operator Slack channel:

  • What are you doing with the -Djava.io.tmpdir= configuration?
  • How do you handle TLS?
  • Using autoscaling? on which component and how?
  • What Kubernetes kind are you using for each component?
  • Using ZooKeeper-less?
  • Using MiddleManager-less?
  • What are your -Xmx and -Xms values? How do they compare to the pods' resource requests/limits?
  • Setting CPU limit?
  • How do you spread your pods across the cluster?
  • Using Karpenter? What is your Provisioner and Launch Template
  • What are you doing with these configurations: druid.segmentCache.locations and druid.server.maxSize
  • Created Service objects? for which component?

Question: Can I install the druid-operator from operatorhub via Helm?

Hi team,

I have spent some time looking for a way to install druid-operator from OperatorHub via Helm. Do we happen to publish the druid-operator Helm chart anywhere other than GitHub? The reason is that I want to deploy druid-operator from a pipeline, and I don't want to clone the git repo every time for the deployment.

Thanks
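A hedged pointer: whether or not it is listed on OperatorHub, the chart repository used in the first issue on this page (https://charts.datainfra.io) can be added directly from a pipeline without cloning the repo; the release name and namespace below are illustrative.

helm repo add datainfra https://charts.datainfra.io
helm repo update
helm -n druid-operator-system upgrade -i --create-namespace druid-operator datainfra/druid-operator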

Using additionalContainer not able to add extension libraries

To configure MySQL as the metadata storage I need to add the mysql-connector-java library to the Druid extensions, but I am not able to figure out the best way to do so. I tried using additionalContainer as below, but the container fails to start.

additionalContainer:
  - containerName: download-mysql-connector
    image: apache/druid:25.0.0
    command: ["sh", "-c", "wget -O /tmp/mysql-connector-j-8.0.32.tar.gz https://downloads.mysql.com/archives/get/p/3/file/mysql-connector-j-8.0.32.tar.gz && cd /tmp && tar -xf /tmp/mysql-connector-j-8.0.32.tar.gz && cp /tmp/mysql-connector-j-8.0.32/mysql-connector-j-8.0.32.jar /opt/druid/extensions/mysql-metadata-storage/mysql-connector-java.jar"]
    volumeMounts:
      - name: mysql-connector-jar
        mountPath: /opt/druid/extensions/mysql-connector

If there were a way to add an initContainer per node type, we wouldn't have to add libraries to all containers, only to the specific services that need them.
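A hedged sketch of one pattern that avoids baking the library into every image, assuming the additionalContainer containers run alongside the Druid container and that the chosen image ships the mysql-metadata-storage extension directory plus wget/tar: stage the whole extension together with the connector jar in a shared emptyDir, then mount that emptyDir over the extension directory in the Druid containers (mounting a volume straight over /opt/druid/extensions/mysql-metadata-storage would otherwise hide the extension's bundled jars). The additionalContainer field names follow the snippet above; spec.volumes / spec.volumeMounts follow the tiny-cluster example later on this page, and the paths are illustrative.

spec:
  additionalContainer:
    - containerName: stage-mysql-connector
      image: apache/druid:25.0.0
      command:
        - sh
        - -c
        - >
          cp -r /opt/druid/extensions/mysql-metadata-storage/. /shared/ &&
          wget -O /tmp/mysql.tar.gz https://downloads.mysql.com/archives/get/p/3/file/mysql-connector-j-8.0.32.tar.gz &&
          tar -xf /tmp/mysql.tar.gz -C /tmp &&
          cp /tmp/mysql-connector-j-8.0.32/mysql-connector-j-8.0.32.jar /shared/mysql-connector-java.jar
      volumeMounts:
        - name: mysql-metadata-storage-ext
          mountPath: /shared
  volumeMounts:
    # shadows the image's extension directory in every Druid container
    - name: mysql-metadata-storage-ext
      mountPath: /opt/druid/extensions/mysql-metadata-storage
  volumes:
    - name: mysql-metadata-storage-ext
      emptyDir: {}

If additionalContainer containers are plain sidecars rather than init containers, the Druid process can race the staging step, which is exactly why a per-node initContainer option as requested above would be the cleaner fix.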

Druid components autoscaling best practices

Started in this slack thread

We need an answer for how to scale each component of Druid.
Middle Managers are on the way to becoming dynamically provisioned, which will solve this for them.
The biggest problem is autoscaling historicals where we should also take storage into our calculation.
Should the operator handle that? Should we have a smart third-party auto scaler (like KEDA)?
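For the stateless query tier the usual answer is a plain HPA (or KEDA driving one); below is a minimal sketch of a native autoscaling/v2 HPA against the brokers Deployment, using the object names from the tiny-cluster status shown further down this page. Thresholds and replica counts are illustrative, and none of this addresses the storage-aware logic historicals would need.

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: druid-tiny-cluster-brokers
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: druid-tiny-cluster-brokers
  minReplicas: 2
  maxReplicas: 6
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70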

K8s v1.26 is not supported

We have upgraded our K8s cluster to v1.26. We had set up druid-operator v1.0.0 on K8s v1.25 for testing, and it kept working fine even after the move to v1.26, but now when we try to set it up again it fails because of the HPA API version.

The druid-operator / K8s compatibility matrix shows that druid-operator v1.0.0 works on K8s v1.25 and above, which seems ambiguous.
(screenshot of the compatibility matrix)

How can we run Druid on K8s v1.26? We are completely blocked now.

Any help would really be appreciated, Thanks in advance.
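Kubernetes v1.26 removed the autoscaling/v2beta2 HPA API (autoscaling/v2 is its replacement), which is the usual culprit when an operator that creates HPAs starts failing right after such an upgrade. A quick way to see what the cluster still serves:

kubectl api-versions | grep autoscaling
# expected on v1.26: autoscaling/v1 and autoscaling/v2 only; v2beta2 is gone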

Additional ConfigMap For HDFS and Core site xml

I want to set up a Druid cluster that uses HDFS for deep storage. As documented in the Druid documentation, I need to add the core-site.xml and hdfs-site.xml files to the Druid classpath.

I searched the documentation for a way to mount a ConfigMap into /opt/druid/conf/druid/cluster/_common but didn't find any spec for it. Is this doable?

Not able to see PVC in CR Status

Not able to see PVC in CR Status.
Here is the list of PVC:

$ kubectl get pvc
NAME                                           STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
data-volume-druid-tiny-cluster-historicals-0   Bound    pvc-eab29f7e-73ec-4b92-ac7a-6f2add38647f   2Gi        RWO            standard       10m
data-volume-druid-tiny-cluster-historicals-1   Bound    pvc-8cd53678-49e1-4eca-a2e6-b62a954ffc40   2Gi        RWO            standard       10m

CR status:

status:
  configMaps:
  - druid-tiny-cluster-brokers-config
  - druid-tiny-cluster-coordinators-config
  - druid-tiny-cluster-historicals-config
  - druid-tiny-cluster-routers-config
  - tiny-cluster-druid-common-config
  deployments:
  - druid-tiny-cluster-brokers
  druidNodeStatus:
    druidNode: All
    druidNodeConditionStatus: "True"
    druidNodeConditionType: DruidClusterReady
    reason: All Druid Nodes are in Ready Condition
  hpAutoscalers:
  - druid-tiny-cluster-brokers
  ingress:
  - druid-tiny-cluster-routers
  podDisruptionBudgets:
  - druid-tiny-cluster-brokers
  pods:
  - druid-tiny-cluster-brokers-f58678f48-snvbd
  - druid-tiny-cluster-coordinators-0
  - druid-tiny-cluster-historicals-0
  - druid-tiny-cluster-historicals-1
  - druid-tiny-cluster-routers-0
  services:
  - druid-tiny-cluster-brokers
  - druid-tiny-cluster-coordinators
  - druid-tiny-cluster-historicals
  - druid-tiny-cluster-routers
  statefulSets:
  - druid-tiny-cluster-coordinators
  - druid-tiny-cluster-historicals
  - druid-tiny-cluster-routers

Default readiness on historicals breaks operator.

In PR #72, default probes are set for various components.

This breaks the operator because, while /druid/historical/v1/readiness is fine to call, /druid/historical/v1/loadstatus is a privileged call that requires authentication. Since the operator sets this readinessProbe by default when you don't define one, the only way to avoid it is to hard-wire your own probe against another endpoint.

curl -vvv http://127.0.0.1:4124/druid/historical/v1/readiness
HTTP/1.1 200 OK

curl -vvv http://127.0.0.1:4124/druid/historical/v1/loadstatus
HTTP/1.1 401 Unauthorized
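Until the default changes, a hedged workaround is to pin the historicals' probe to the unauthenticated endpoint shown above; the node-level readinessProbe field and the port follow the examples elsewhere on this page.

  nodes:
    historicals:
      nodeType: "historical"
      druid.port: 8088
      readinessProbe:
        httpGet:
          path: /druid/historical/v1/readiness
          port: 8088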

Support node-specific startupProbes

Since operator 1.2.0 the setup of node-specific probes doesn't work any more. In my example, the config of a coordinator looks like the following:

nodes:
    coordinator:
      nodeType: "coordinator"
      druid.port: 8281
      nodeConfigMountPath: "/opt/druid/conf/druid/cluster/master/coordinator-overlord"
      replicas: 2
      podManagementPolicy: OrderedReady
      updateStrategy:
        type: RollingUpdate
      livenessProbe:
        failureThreshold: 60
        periodSeconds: 10
        httpGet:
          path: /status/health
          port: 8281
          scheme: HTTPS
      readinessProbe:
        failureThreshold: 60
        periodSeconds: 10
        httpGet:
          path: /status/health
          port: 8281
          scheme: HTTPS
      runtime.properties: |
        druid.service=druid/coordinator
        .... 

As you can see, the cluster has TLS enabled, so I've used the default TLS ports: 8281 for the coordinator, 9088 for the router, 8283 for historicals, etc.

With operator <= 1.1.1 this works fine. Since operator 1.2.0 the default probes (#98) are in place and the node-specific probe config stopped working.

Even when switching off the default probes with spec.defaultProbes: false, a default startupProbe is still set on the coordinator pod (via the StatefulSet, of course):

      livenessProbe:
        httpGet:
          path: /status/health
          port: 8281
          scheme: HTTPS
        timeoutSeconds: 1
        periodSeconds: 10
        successThreshold: 1
        failureThreshold: 60
      readinessProbe:
        httpGet:
          path: /status/health
          port: 8281
          scheme: HTTPS
        timeoutSeconds: 1
        periodSeconds: 10
        successThreshold: 1
        failureThreshold: 60
      startupProbe:
        httpGet:
          path: /status/health
          port: 8281
          scheme: HTTP
        initialDelaySeconds: 5
        timeoutSeconds: 5
        periodSeconds: 10
        successThreshold: 1
        failureThreshold: 10

The latest CRD doesn't allow setting node-specific startupProbes:

failed to create typed patch object (druid/fqmdruid; druid.apache.org/v1alpha1, Kind=Druid): .spec.nodes.coordinator.startupProbe: field not declared in schema

The change request is to allow setting node-specific startupProbes in the CRD.
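A hedged sketch of what the requested API change could look like, mirroring the existing node-level livenessProbe / readinessProbe fields; the exact field name, JSON casing, and placement inside the node spec type are assumptions (v1 here is k8s.io/api/core/v1).

// StartUpProbe for this node type, overriding the operator's default.
// +optional
StartUpProbe *v1.Probe `json:"startUpProbe,omitempty"`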

Fail to create controller manager container

commit 2a5d9f5
Go version 1.20
K8s version 1.23.12

I tried to use the master branch to deploy Druid on my company's dev K8s cluster.

I used make deploy to deploy.
I found that the manager container stays in the CreateContainerConfigError status.
When I described the controller manager, I found an error.

I got advice on the Slack channel suggesting I fix it with the security settings on the pod.
Then, I tried to change

druid-operator/config/manager/manager.yaml

securityContext:
   runAsUser: 1000
   runAsNonRoot: true

When I applied the new security config, I got a new error message in the manager container. I still can't successfully create the manager container.

flag provided but not defined: -metrics-bind-address
Usage of /manager:
  -enable-leader-election
        Enable leader election for controller manager. Enabling this will ensure there is only one active controller manager.
  -health-probe-bind-address string
        The address the probe endpoint binds to. (default ":8081")
  -kubeconfig string
        Paths to a kubeconfig. Only required if out-of-cluster.
  -metrics-addr string
        The address the metric endpoint binds to. (default ":8080")

Scale PVC for multi-tier component is broken

I noticed on my cluster that the operator is trying to resize PVCs (and failing).

We have two tiers of historicals, called histolder and histrecent. Each tier's spec requests the right size, but the operator seems to mix up the storage specs for the PVCs: it tries to resize one tier's PVCs to the other tier's size...

It seems the pvcLabels here are too loose for matching the PVCs:

pvcLabels := map[string]string{
		"component": nodeSpec.NodeType,
	}

	pvcList, err := readers.List(ctx, sdk, drd, pvcLabels, emitEvent, func() objectList { return &v1.PersistentVolumeClaimList{} }, func(listObj runtime.Object) []object {
		items := listObj.(*v1.PersistentVolumeClaimList).Items
		result := make([]object, len(items))
		for i := 0; i < len(items); i++ {
			result[i] = &items[i]
		}
		return result
	})
	if err != nil {
		return nil
	}

It matches all historical nodes, regardless of the node name.

Here is an excerpt of my config:

  scalePvcSts: true
  nodes:
    histrecent:
      nodeType: "historical"
      #...
      volumeClaimTemplates:
        - metadata:
            name: data
          spec:
            accessModes:
            - ReadWriteOnce
            resources:
              requests:
                storage: 400Gi
            storageClassName: csi-cinder-high-speed
      volumeMounts:
        - mountPath: /druid/data
          name: data
      volumes:
        - name: data
          emptyDir: {}

    histolder:
      nodeType: "historical"
      #...
      volumeClaimTemplates:
        - metadata:
            name: data
          spec:
            accessModes:
            - ReadWriteOnce
            resources:
              requests:
                storage: 1200Gi
            storageClassName: csi-cinder-classic
      volumeMounts:
        - mountPath: /druid/data
          name: data
      volumes:
        - name: data
          emptyDir: {}

And here is the bit of operator log that caught my eye:

2023-09-11T15:04:22Z    ERROR   Reconciler error        {"controller": "druid", "controllerGroup": "druid.apache.org", "controllerKind": "Druid", "Druid": {"name":"x-cluster","namespace":"default"}, "namespace": "default", "name": "x-cluster", "reconcileID": "f8be13e6-fd34-428a-8f14-e3916e2e3112", "error": "PersistentVolumeClaim \"data-druid-x-cluster-histrecent-0\" is invalid: spec: Forbidden: spec is immutable after creation except resources.requests for bound claims\n  core.PersistentVolumeClaimSpec{\n  \tAccessModes: {\"ReadWriteOnce\"},\n  \tSelector:    nil,\n  \tResources: core.ResourceRequirements{\n  \t\tLimits:   nil,\n- \t\tRequests: core.ResourceList{s\"storage\": {i: resource.int64Amount{value: 429496729600}, Format: \"BinarySI\"}},\n+ \t\tRequests: core.ResourceList{s\"storage\": {i: resource.int64Amount{value: 1288490188800}, Format: \"BinarySI\"}},\n  \t},\n  \tVolumeName:       \"\",\n  \tStorageClassName: &\"csi-cinder-high-speed\",\n  \t... // 3 identical fields\n  }\n"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
        /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:329
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
        /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:274
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
        /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:235
2023-09-11T15:04:22Z    DEBUG   events  Error patching object [data-druid-x-cluster-histrecent-0:*v1.PersistentVolumeClaim] in namespace [default] due to [PersistentVolumeClaim "data-druid-x-cluster-histrecent-0" is invalid: spec: Forbidden: spec is immutable after creation except resources.requests for bound claims
  core.PersistentVolumeClaimSpec{
        AccessModes: {"ReadWriteOnce"},
        Selector:    nil,
        Resources: core.ResourceRequirements{
                Limits:   nil,
-               Requests: core.ResourceList{s"storage": {i: resource.int64Amount{value: 429496729600}, Format: "BinarySI"}},
+               Requests: core.ResourceList{s"storage": {i: resource.int64Amount{value: 1288490188800}, Format: "BinarySI"}},
        },
        VolumeName:       "",
        StorageClassName: &"csi-cinder-high-speed",
        ... // 3 identical fields
  }

Here are two successive outputs (a few minutes apart) from k get pvc:

data-druid-x-cluster-histolder-8          Bound    xxx     RWO            csi-cinder-classic      2m7s
data-druid-x-cluster-histolder-9          Bound    xxx   1200Gi     RWO            csi-cinder-classic      2m6s
data-druid-x-cluster-histrecent-0         Bound    xxx   400Gi      RWO            csi-cinder-high-speed   2m8s
data-druid-x-cluster-histrecent-1         Bound    xxx   400Gi      RWO            csi-cinder-high-speed   2m8s
data-druid-x-cluster-histolder-8          Bound    xxx   1200Gi     RWO            csi-cinder-classic      25m
data-druid-x-cluster-histolder-9          Bound    xxx   1200Gi     RWO            csi-cinder-classic      25m
data-druid-x-cluster-histrecent-0         Bound    xxx   1200Gi     RWO            csi-cinder-high-speed   25m
data-druid-x-cluster-histrecent-1         Bound    xxx   1200Gi     RWO            csi-cinder-high-speed   25m

As you can see, it ended up scaling all PVCs to 1200Gi because it keeps mixing up the two values.
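A hedged sketch of the kind of tightening that could prevent the cross-tier match, assuming the per-node-spec name (the part that distinguishes histrecent from histolder in the StatefulSet name) is available where the listed PVCs are processed; the identifiers below are illustrative, not the operator's actual helpers.

// Illustrative only: after listing PVCs by the loose "component" label,
// keep only the claims that belong to this node spec's StatefulSet.
// nodeSpecUniqueStr stands in for the per-node name the operator already
// computes (e.g. "druid-x-cluster-histrecent"); "strings" must be imported.
filtered := make([]object, 0, len(pvcList))
for _, pvc := range pvcList {
	if strings.Contains(pvc.GetName(), nodeSpecUniqueStr) {
		filtered = append(filtered, pvc)
	}
}
pvcList = filtered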

Compatibility e2e tests

We're missing visibility into the combinations of these:

  • version of the operator
  • version of Druid
  • version of Kubernetes

We need to create an e2e test that runs on every supported combination.
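A hedged sketch of what that matrix could look like as a kind-based GitHub Actions job; the version strings are placeholders and the make e2e target is an assumption, not a statement of what the project actually supports today.

jobs:
  compat-e2e:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        kind_node: ["kindest/node:v1.25.11", "kindest/node:v1.26.6"]
        druid_version: ["25.0.0", "26.0.0"]
        operator_ref: ["v1.1.1", "v1.2.0"]
    steps:
      - uses: actions/checkout@v4
        with:
          ref: ${{ matrix.operator_ref }}
      - uses: helm/kind-action@v1
        with:
          node_image: ${{ matrix.kind_node }}
      - run: make e2e DRUID_VERSION=${{ matrix.druid_version }}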

Support TLS Certificates

While looking at the Druid CRD, I don't see any information on how we can pass a CA certificate to the operator.

Either via a K8s Secret or $CLOUD provider method to get the secret passed in.

I was thinking of using an ExternalSecret --> $CLOUD provider secret manager and then referencing the Secret via the operator, but no such luck.

Is this correct?

Thanks,
Shawn

Coordinator is not created by druid operator

We recently performed an upgrade of the Druid operator from version 1.0.0 to version 1.2.0, and during the process, we encountered an issue when attempting to create a new Druid cluster. It's worth noting that there were no changes made to the cluster manifest.

The specific problem we encountered was the absence of a coordinator created by the Druid operator. Upon inspecting the resource list, we noticed that there was no coordinator statefulset present. Strangely, there were no error messages recorded in the Druid operator log. This issue appears to be intermittent, as we have successfully used the Druid operator to create multiple clusters without encountering this problem, and it was only observed in one particular cluster.

Additionally, we observed that the Druid operator log does not seem to contain particularly useful information, and there is a lack of valuable info in the pod logs.

Question: Kubebuilder structure

Why did you choose to move away from Kubebuilder's default directory structure? As soon as you need to add a new API, you will need to make lots of changes to fit the current custom structure. I'm mainly talking about the deploy directory, but the Makefile too.

Welcome Cyril as Collaborator

@cyril-corbon is one of the core contributors to the project and is active in helping the community. He has also helped evangelize the druid-operator at CNCF community events.
Welcome @cyril-corbon as a collaborator; it is a pleasure to have you.

Ability to add custom files in _common directory

We currently can't add other files to the CommonConfigMountPath.

I thought about mounting files as subPath, but the Kubernetes documentation states that a container using a ConfigMap as a subPath volume will not receive ConfigMap updates (https://kubernetes.io/docs/concepts/storage/volumes/#using-subpath).

My suggestion is to add a new field:

// References to ConfigMaps holding more files to mount to the CommonConfigMountPath.
// +optional
ExtraCommonConfig []*v1.ObjectReference `json:"extraCommonConfig"`

It will give customers the flexibility to create and arrange ConfigMaps with extra files inside them, and the operator will mount them together with the CommonRuntimeProperties.

This will be done by changing the makeCommonConfigMap function to also attach the files in those extra config maps.
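If the field lands as proposed, usage from the cluster spec side could look like the sketch below; the ConfigMap names are illustrative.

spec:
  commonConfigMountPath: "/opt/druid/conf/druid/cluster/_common"
  extraCommonConfig:
    - name: hadoop-client-xml        # e.g. a ConfigMap holding core-site.xml and hdfs-site.xml
      namespace: druid
    - name: extra-metrics-config
      namespace: druid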

Support Hadoop Indexing

I'm not sure I'm right, but in order to support Hadoop indexing (at least in my company's setup), the following files are needed under /opt/druid/conf/druid/cluster/_common/:

  • capacity-scheduler.xml
  • common.runtime.properties
  • core-site.xml
  • hadoop-policy.xml
  • hdfs-site.xml
  • hive-site.xml
  • httpfs-site.xml
  • kms-acls.xml
  • kms-site.xml
  • mapred-site.xml
  • metric-dimensions.json
  • yarn-site.xml

I don't think we should support these in the CRD; we should have the ability to mount ConfigMaps and Secrets as files under /opt/druid/conf/druid/cluster/_common/.

[proposal] Setup default probe for each nodes types

Context & goal

All the Druid components deployed by the operator are deployed without probes by default.
IMHO, an operator should configure this kind of setting out of the box.

Probes should be configured for each node type, with any user-defined probe overriding the default.

All the probes should use the Druid API reference.

Probe definition

coordinator, overlord, middlemanager and router

      livenessProbe:
        httpGet:
          path: /status/health
          port: $druid.port
        failureThreshold: 20
        initialDelaySeconds: 5
        periodSeconds: 10
        successThreshold: 1
        timeoutSeconds: 5
      readinessProbe:
        httpGet:
          path: /status/health
          port: $druid.port
        failureThreshold: 10
        initialDelaySeconds: 5
        periodSeconds: 10
        successThreshold: 1
        timeoutSeconds: 5

broker

      livenessProbe:
        httpGet:
          path: /status/health
          port: $druid.port
        failureThreshold: 20
        initialDelaySeconds: 5
        periodSeconds: 10
        successThreshold: 1
        timeoutSeconds: 5
      readinessProbe:
        httpGet:
          path: /druid/broker/v1/readiness
          port: $druid.port
        failureThreshold: 10
        initialDelaySeconds: 5
        periodSeconds: 10
        successThreshold: 1
        timeoutSeconds: 5

historical

      livenessProbe:
        httpGet:
          path: /status/health
          port: $druid.port
        failureThreshold: 20
        initialDelaySeconds: 5
        periodSeconds: 10
        successThreshold: 1
        timeoutSeconds: 5
      readinessProbe:
        httpGet:
          path: /status/health
          port: $druid.port
        failureThreshold: 10
        initialDelaySeconds: 5
        periodSeconds: 10
        successThreshold: 1
        timeoutSeconds: 5
      startUpProbes:
        httpGet:
          path: /druid/historical/v1/loadstatus
          port: $druid.port
        failureThreshold: 20
        initialDelaySeconds: 180
        periodSeconds: 30
        successThreshold: 1
        timeoutSeconds: 10
      

Testing migration to Kubebuilder native framework

As part of the migration from the Operator SDK to Kubebuilder, we also need to refactor the tests into Kubebuilder's framework and structure in order to be aligned with the project.
That means generating suite_test.go from Kubebuilder and adding our current tests using the Ginkgo and Gomega frameworks.
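A minimal sketch of the Kubebuilder-style suite_test.go this would produce, assuming the standard envtest layout; the package name and CRD path are assumptions.

package controllers

import (
	"testing"

	. "github.com/onsi/ginkgo/v2"
	. "github.com/onsi/gomega"

	"k8s.io/client-go/kubernetes/scheme"
	"k8s.io/client-go/rest"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/envtest"
)

var (
	cfg       *rest.Config
	k8sClient client.Client
	testEnv   *envtest.Environment
)

// TestAPIs wires the Go test binary into the Ginkgo suite.
func TestAPIs(t *testing.T) {
	RegisterFailHandler(Fail)
	RunSpecs(t, "Druid Controller Suite")
}

var _ = BeforeSuite(func() {
	// Start a local control plane with the Druid CRDs loaded;
	// the CRD path is an assumption about the repo layout.
	testEnv = &envtest.Environment{
		CRDDirectoryPaths:     []string{"../config/crd/bases"},
		ErrorIfCRDPathMissing: true,
	}

	var err error
	cfg, err = testEnv.Start()
	Expect(err).NotTo(HaveOccurred())

	// Register the Druid API types with AddToScheme before creating the client.
	k8sClient, err = client.New(cfg, client.Options{Scheme: scheme.Scheme})
	Expect(err).NotTo(HaveOccurred())
})

var _ = AfterSuite(func() {
	Expect(testEnv.Stop()).To(Succeed())
})

Existing tests would then move into regular *_test.go files in the same package as Ginkgo Describe/It blocks.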
