
Production ready? (activemq-artemis-helm issue, closed)

vromero commented on July 19, 2024
Production ready?

from activemq-artemis-helm.

Comments (5)

vromero commented on July 19, 2024

I wouldn't use it in production. A number of reasons:

  • I haven't decided whether to generate the config or use KUBEPING.
  • Artemis can't handle dynamic cluster sizes (a cluster of static size has to be formed on start), and I have no idea what to do about this.
  • I haven't completed the integration with Prometheus; a messaging broker without metrics/alarms is more a problem than a solution.
  • I'm not sure what to do about the load balancing. Today the slave is not-ready, but not-ready interferes with things like helm install --wait and with deploying stateful sets with replicas > 1. No idea yet what to do about this.

DanSalt commented on July 19, 2024

Hi @vromero, (and @somejavadev for info)

Agree with your assessment, but it's close! If it helps, we've been using a modified version of your charts in our environment, with the aim of taking them to production. We have a number of changes, and at some point I'll aim to fold them into a PR for you to take a look at.

A few comments on your points above:

  • We have dynamic clustering working in Artemis, with a couple of caveats. It's currently using static connectors (as per your latest changes), which means that the set of nodes used for discovery is fixed. Artemis nodes do keep the whole cluster state in memory, but use the static connectors to determine the cluster topology. If you scale up the cluster, the new nodes have larger lists of static connectors (e.g. references to ALL nodes), but the existing nodes only know about the ones they defined at deploy time. Scaling down is more of an issue, because you're taking away nodes that might then be called upon to get the cluster topology. But as long as each node has at least one available node in its list, discovery works 'good enough'. The restart time on pods is pretty small too, so depending on your use case and number of nodes, it doesn't cause much of a problem.

  • Your Docker image was pretty Prometheus-ready, to be honest. All we had to do in the charts was enable the JMX_EXPORTER and create a ServiceMonitor for Prometheus Operator to scrape it. We have created a neat Grafana dashboard that shows all the instances and their important data. I much prefer this to using the ActiveMQ console, because hooking up all the individual console instances via Ingress is a pain. It would be better if the AMQ Console could connect to all the other nodes (it can show them in the cluster diagram, so it's theoretically possible).

  • Load balancing was an interesting one, and we 'fixed' this by (a) telling Kubernetes not to wait for unready endpoints on slave nodes only and (b) changing the readiness probe to the 'core' endpoint, not the console. This way the slave nodes remain not-ready, which excludes them from the load balancer/DNS (which is what we want). As soon as a node drops and the slave becomes ready, it is included in DNS/load balancing. The only annoyance is that probing the core endpoint causes log entries for badly terminated connections. Whilst this isn't a perfect fix, it's good enough.

  • Finally, we do have a prototype version of the charts that uses JGroups for dynamic discovery (backed by DB), but there are a number of worrying version mismatches between JGroups and KUBEPING that prevent us from going fully down this route. Once the versions align better, we may resume this path.
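
To make the static-connector caveat from the first point concrete, here is a minimal Python sketch of the reasoning, not anything taken from the charts. It assumes StatefulSet-style pod names, and the release name `artemis` and all function names are hypothetical. Each node's connector list is frozen at the replica count it was deployed with, and a node can rediscover the topology only if at least one peer in its list is still alive.

```python
from typing import Dict, List, Set


def pod_names(release: str, replicas: int) -> List[str]:
    """StatefulSet-style pod names: <release>-0 .. <release>-(n-1) (hypothetical naming)."""
    return [f"{release}-{i}" for i in range(replicas)]


def static_connectors(release: str, replicas_at_deploy: int, self_index: int) -> List[str]:
    """Connector list fixed at deploy time: every peer known then, except the node itself."""
    return [
        name
        for i, name in enumerate(pod_names(release, replicas_at_deploy))
        if i != self_index
    ]


def connectors_after_scale_up(release: str, old: int, new: int) -> Dict[str, List[str]]:
    """Existing nodes keep their old lists; nodes added by the scale-up see all peers."""
    config: Dict[str, List[str]] = {}
    for i in range(new):
        known = old if i < old else new
        config[f"{release}-{i}"] = static_connectors(release, known, i)
    return config


def can_discover(connectors: List[str], live: Set[str]) -> bool:
    """Discovery works as long as at least one listed peer is available."""
    return any(peer in live for peer in connectors)


if __name__ == "__main__":
    config = connectors_after_scale_up("artemis", old=3, new=5)
    live = {"artemis-0", "artemis-3", "artemis-4"}  # assume pods 1 and 2 are down
    for node in sorted(live):
        print(node, config[node], can_discover(config[node], live))
```

In this scenario `artemis-0` only lists `artemis-1` and `artemis-2`, both of which are down, so it cannot rediscover the cluster, while the nodes added by the scale-up still can. That is the 'at least one available node in its list' condition described above.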

Hope all this helps.

Cheers,
Dan

vromero commented on July 19, 2024

This sounds awesome, I'm looking forward to seeing the PR. Feel free to drop small PRs whenever you feel like it; no need to wait for a big thing.

azman0101 commented on July 19, 2024

Is there any update on this project's production-readiness status?

vromero commented on July 19, 2024

I'm afraid not. I keep playing with this in my head, and even with the fantastic insights from @DanSalt I still believe the clustering model of Artemis does not play well with K8s. Hence, I'd probably end up reducing the chart to a master-slave configuration (which is what Artemis does anyway in a >2 cluster: it just picks two nodes to be master and (a single) slave). I'd also get rid of the load balancer, as the model in Artemis expects you to know the master and slave addresses, and it plays very badly with the K8s load balancers.
That is probably the only thing still missing. I'd probably also add some example Grafana dashboards and a great deal of documentation.
