emqx / ekka

Autocluster and Autoheal for EMQX Broker

Home Page: https://www.emqx.com

License: Apache License 2.0

Makefile 0.55% Erlang 99.29% Shell 0.16%
erlang-library autocluster autoheal clustering etcd kubernetes

ekka's Introduction

Ekka

Ekka - Autocluster for the EMQX Broker. Ekka helps build a new distribution layer for EMQX R2.3+.

----------             ----------
|  EMQX  |<---MQTT---->|  EMQX  |
|--------|             |--------|
|  Ekka  |<----RPC---->|  Ekka  |
|--------|             |--------|
| Mnesia |<--Cluster-->| Mnesia |
|--------|             |--------|
| Kernel |<----TCP---->| Kernel |
----------             ----------

Node discovery and Autocluster

Ekka supports Erlang node discovery and autocluster using various strategies:

Strategy  Description
manual    Join the cluster manually (see the sketch below)
static    Static node list
dns       DNS A records
etcd      etcd
k8s       Kubernetes

Example configuration files can be found in the 'etc/' folder.
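For the manual strategy, nodes are joined explicitly from code or the remote console. A minimal sketch, assuming ekka's cluster API (ekka:join/1, ekka:leave/0 and ekka:force_leave/1); the node names are illustrative:

%% Join this node to the cluster through an existing member.
ekka:join('ekka1@127.0.0.1').

%% Leave the cluster, or force-remove a stopped member.
ekka:leave().
ekka:force_leave('ekka2@127.0.0.1').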

Cluster using static node list

Cuttlefish style config:

cluster.discovery = static

cluster.static.seeds = ekka1@127.0.0.1,ekka2@127.0.0.1

Erlang config:

{cluster_discovery,
  {static, [
    {seeds, ['ekka1@127.0.0.1', 'ekka2@127.0.0.1']}
  ]}},
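The Erlang config above lives in ekka's application environment, so the same setting can be applied programmatically before ekka starts. A minimal sketch, assuming cluster_discovery is an ekka application environment key as the snippet above suggests:

application:set_env(ekka, cluster_discovery,
                    {static, [{seeds, ['ekka1@127.0.0.1', 'ekka2@127.0.0.1']}]}),
{ok, _} = application:ensure_all_started(ekka).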

Cluster using DNS A records

Cuttlefish style config:

cluster.discovery = dns

## DNS name.
##
## Value: String
cluster.dns.name = localhost

## The app name is used, together with the IP address, to build 'node.name'.
##
## Value: String
cluster.dns.app = ekka

Erlang config:

{cluster_discovery,
  {dns, [
    {name, "localhost"},
    {app, "ekka"}
  ]}},
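Conceptually, the dns strategy resolves the A records of cluster.dns.name and derives one node name per address as <app>@<ip>. A hedged sketch of that mapping (illustrative, not ekka's actual code):

%% Resolve the A records and build candidate node names.
Addrs = inet_res:lookup("localhost", in, a),
Nodes = [list_to_atom("ekka@" ++ inet:ntoa(Addr)) || Addr <- Addrs].
%% e.g. [{127,0,0,1}] -> ['ekka@127.0.0.1']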

Cluster using etcd

Cuttlefish style config:

cluster.discovery = etcd

## etcd server list, separated by ','.
##
## Value: String
cluster.etcd.server = http://127.0.0.1:2379

## The prefix helps build each node's path in etcd. Each node in the cluster
## creates a path in etcd: v2/keys/<prefix>/<cluster.name>/<node.name>
##
## Value: String
cluster.etcd.prefix = ekkacl

## The TTL for the node's path in etcd.
##
## Value: Duration
##
## Default: 1m, 1 minute
cluster.etcd.node_ttl = 1m

## Path to a file containing the client's private PEM-encoded key.
##
## Value: File
##
## cluster.etcd.keyfile = {{platform_etc_dir}}/certs/client-key.pem

## Path to the file containing the client's certificate
##
## Value: File
##
## cluster.etcd.certfile = {{platform_etc_dir}}/certs/client.pem

## Path to the file containing PEM-encoded CA certificates. The CA certificates
## are used during server authentication and when building the client certificate chain.
##
## Value: File
##
## cluster.etcd.cacertfile = {{platform_etc_dir}}/certs/ca.pem

Erlang config:

{cluster_discovery,
  {etcd, [
    {server, ["http://127.0.0.1:2379"]},
    {prefix, "ekkacluster"},
    %%{ssl_options, [
    %%    {keyfile, "path/to/client-key.pem"},
    %%    {certfile, "path/to/client.pem"},
    %%    {cacertfile, "path/to/ca.pem"}
    %%]},
    {node_ttl, 60000}
  ]}},
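Putting the above together (illustrative values): a node 'ekka1@192.168.0.10' in a cluster named emqcl, with the prefix ekkacl, owns the etcd key

v2/keys/ekkacl/emqcl/ekka1@192.168.0.10

and refreshes it every node_ttl, so a node that dies stops refreshing and falls out of the directory once its TTL expires.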

Cluster using Kubernetes

Cuttlefish style config:

cluster.discovery = k8s

## Kubernetes API server list, separated by ','.
##
## Value: String
## cluster.k8s.apiserver = http://10.110.111.204:8080

## The service name used to look up EMQX nodes in the cluster.
##
## Value: String
## cluster.k8s.service_name = ekka

## The k8s namespace.
##
## Value: String
## cluster.k8s.namespace = default

## The address type used to extract the host from the k8s service.
##
## Value: ip | dns | hostname
## cluster.k8s.address_type = ip

## The app name used to build 'node.name'.
##
## Value: String
## cluster.k8s.app_name = ekka

## The suffix appended to the DNS name or hostname obtained from the k8s service.
##
## Value: String
## cluster.k8s.suffix = pod.cluster.local

Erlang config:

{cluster_discovery,
  {k8s, [
    {apiserver, "http://10.110.111.204:8080"},
    {namespace, "default"},
    {service_name, "ekka"},
    {address_type, ip},
    {app_name, "ekka"},
    {suffix, "pod.cluster.local"}
  ]}}
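How discovered addresses become node names under the config above (addresses are illustrative):

  address_type = ip            -> 'ekka@172.16.0.5'
  address_type = dns/hostname  -> the name from k8s with the suffix appended,
                                  e.g. 'ekka@<pod-dns>.pod.cluster.local'

The suffix option only applies to the dns and hostname address types.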

Network partition and Autoheal

Autoheal Design

When a network partition occurs, the following steps heal the cluster if autoheal is enabled:

  1. A node reports the partition to the leader node, i.e. the node with the oldest guid (see the sketch after this list).

  2. The leader node creates a global netsplit view and chooses one node in the majority side as the coordinator.

  3. The leader node requests the coordinator to autoheal the network partition.

  4. The coordinator node reboots all the nodes in the minority side.
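A hedged sketch of the leader choice in step 1 (the member-list shape is an assumption, not ekka's internals; guids are time-ordered, so the smallest one is the oldest):

%% Hypothetical illustration: the leader is the member with the oldest guid.
-spec(leader([{node(), Guid :: binary()}]) -> node()).
leader(Members) ->
    [{Node, _Guid} | _] = lists:keysort(2, Members),
    Node.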

Enable autoheal

Erlang config:

{cluster_autoheal, true},

Cuttlefish style config:

cluster.autoheal = on

Lock Service

Ekka has implemented a simple distributed lock service since the 0.3 release. The lock APIs:

Acquire lock:

-spec(acquire(resource()) -> {boolean(), [node()]}).
ekka_locker:acquire(Resource).

-spec(acquire(atom(), resource(), lock_type()) -> lock_result()).
ekka_locker:acquire(ekka_locker, Resource, Type).

Release lock:

-spec(release(resource()) -> lock_result()).
ekka_locker:release(Resource).

-spec(release(atom(), resource()) -> lock_result()).
ekka_locker:release(Name, Resource).

The lock types:

-type(lock_type() :: local | leader | quorum | all).
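For example, taking a quorum lock around a critical section. A minimal sketch, assuming lock_result() is the {boolean(), [node()]} pair shown above; the resource term and do_work/0 are illustrative:

Resource = {client_id, <<"client-42">>},
case ekka_locker:acquire(ekka_locker, Resource, quorum) of
    {true, _Locked} ->
        %% do_work/0 stands in for the protected critical section.
        try do_work()
        after ekka_locker:release(ekka_locker, Resource)
        end;
    {false, _Failed} ->
        {error, locked}
end.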

Cluster without epmd

The ekka 0.6.0 release implements Erlang distribution without epmd.

See: http://erlang.org/pipermail/erlang-questions/2015-December/087013.html

For example:

## Dist port: 4370
erl -pa ebin -pa _build/default/lib/*/ebin -proto_dist ekka -start_epmd false -epmd_module ekka_epmd -name ekka1@127.0.0.1 -s ekka
## Dist port: 4371
erl -pa ebin -pa _build/default/lib/*/ebin -proto_dist ekka -start_epmd false -epmd_module ekka_epmd -name ekka2@127.0.0.1 -s ekka
## Dist port: 4372
erl -pa ebin -pa _build/default/lib/*/ebin -proto_dist ekka -start_epmd false -epmd_module ekka_epmd -name ekka3@127.0.0.1 -s ekka

The Erlang distribution port can be tuned via ekka's inet_dist_base_port application environment variable. The default base port is 4370.
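For example, to move the three nodes above to ports 4470-4472, set the base port in the release's sys.config. A minimal sketch, assuming inet_dist_base_port is read from the ekka application environment as described above:

%% sys.config
[{ekka, [{inet_dist_base_port, 4470}]}].

Each node then derives its own port from this base, following the same per-node offsets as in the example above.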

License

Apache License Version 2.0

Author

EMQX Team.

ekka's People

Contributors

dddhuang, emqplus, gilbertwong96, gorillainduction, hjianbo, huangdan, id, iequ1, k32, keynslug, lafirest, linjunjj, qzhuyan, rory-z, savonarola, sergetupchiy, terry-xiaoyu, thalesmg, tigercl, turtledeng, velimir, zhiw, zmstone


ekka's Issues

Auto cluster not working on Kubernetes

I'm running emqtt on k8s using the following docker image: https://github.com/emqtt/emq-docker

In each pod, the generated config is like this:

##===================================================================
## EMQ Configuration R2.3
##===================================================================

##--------------------------------------------------------------------
## Cluster
##--------------------------------------------------------------------

## Cluster name
cluster.name = emqcl

## Cluster discovery strategy: manual | static | mcast | dns | etcd | k8s
cluster.discovery = k8s

## Cluster Autoheal: on | off
cluster.autoheal = on

## Clean down node of the cluster
cluster.autoclean = 5m

##--------------------------------------------------------------------
## Cluster with static node list

## cluster.static.seeds = emq1@127.0.0.1,emq2@127.0.0.1

##--------------------------------------------------------------------
## Cluster with multicast

## cluster.mcast.addr = 239.192.0.1

## cluster.mcast.ports = 4369,4370

## cluster.mcast.iface = 0.0.0.0

## cluster.mcast.ttl = 255

## cluster.mcast.loop = on

##--------------------------------------------------------------------
## Cluster with DNS

## cluster.dns.name = localhost

## cluster.dns.app = emq

##--------------------------------------------------------------------
## Cluster with Etcd

## cluster.etcd.server = http://127.0.0.1:2379

## cluster.etcd.prefix = emqcl

## cluster.etcd.node_ttl = 1m

##--------------------------------------------------------------------
## Cluster with k8s

cluster.k8s.apiserver = https://kubernetes.default:443

cluster.k8s.service_name = emqtt

## Address Type: ip | dns
cluster.k8s.address_type = ip

## The Erlang application name
cluster.k8s.app_name = emqtt

##--------------------------------------------------------------------
## Node Args
##--------------------------------------------------------------------

## Node name
node.name = emqtt

## Cookie for distributed node
node.cookie = secretcookieeeetfggggg

## SMP support: enable, auto, disable
node.smp = auto

## vm.args: -heart
## Heartbeat monitoring of an Erlang runtime system
## Value should be 'on' or comment the line
## node.heartbeat = on

## Enable kernel poll
node.kernel_poll = on

## async thread pool
node.async_threads = 32

## Erlang Process Limit
node.process_limit = 2097152

## Sets the maximum number of simultaneously existing ports for this system
node.max_ports = 1048576

## Set the distribution buffer busy limit (dist_buf_busy_limit)
node.dist_buffer_size = 32MB

## Max ETS Tables.
## Note that mnesia and SSL will create temporary ets tables.
node.max_ets_tables = 2097152

## Tweak GC to run more often
node.fullsweep_after = 1000

## Crash dump
node.crash_dump = log/crash.dump

## Distributed node ticktime
node.dist_net_ticktime = 60

## Distributed node port range
node.dist_listen_min = 6369
node.dist_listen_max = 6379

##--------------------------------------------------------------------
## Log
##--------------------------------------------------------------------

## Set the log dir
log.dir = log

## Console log. Enum: off, file, console, both
log.console = console

## Console log level. Enum: debug, info, notice, warning, error, critical, alert, emergency
log.console.level = error

## Syslog. Enum: on, off
log.syslog = on

##  syslog level. Enum: debug, info, notice, warning, error, critical, alert, emergency
log.syslog.level = error

## Console log file
## log.console.file = log/console.log

## Info log file
## log.info.file = log/info.log

## Error log file
log.error.file = log/error.log

## Enable the crash log. Enum: on, off
log.crash = on

log.crash.file = log/crash.log

##--------------------------------------------------------------------
## Allow Anonymous and Default ACL
##--------------------------------------------------------------------

## Allow Anonymous authentication
mqtt.allow_anonymous = true

## ACL nomatch
mqtt.acl_nomatch = allow

## Default ACL File
mqtt.acl_file = etc/acl.conf

## Cache ACL for PUBLISH
mqtt.cache_acl = true

##--------------------------------------------------------------------
## MQTT Protocol
##--------------------------------------------------------------------

## Max ClientId Length Allowed.
mqtt.max_clientid_len = 1024

## Max Packet Size Allowed, 64K by default.
mqtt.max_packet_size = 64KB

## Check Websocket Protocol Header. Enum: on, off
mqtt.websocket_protocol_header = on

##--------------------------------------------------------------------
## MQTT Connection
##--------------------------------------------------------------------

## Force GC: integer. Value 0 disables the Force GC.
mqtt.conn.force_gc_count = 100

##--------------------------------------------------------------------
## MQTT Client
##--------------------------------------------------------------------

## Client Idle Timeout (Second)
mqtt.client.idle_timeout = 30s

## Max publish rate of Messages
## mqtt.client.max_publish_rate = 5

## Enable client Stats: on | off
mqtt.client.enable_stats = off

##--------------------------------------------------------------------
## MQTT Session
##--------------------------------------------------------------------

## Max Number of Subscriptions, 0 means no limit.
mqtt.session.max_subscriptions = 0

## Upgrade QoS?
mqtt.session.upgrade_qos = off

## Max Size of the Inflight Window for QoS1 and QoS2 messages
## 0 means no limit
mqtt.session.max_inflight = 32

## Retry Interval for redelivering QoS1/2 messages.
mqtt.session.retry_interval = 20s

## Client -> Broker: Max Packets Awaiting PUBREL, 0 means no limit
mqtt.session.max_awaiting_rel = 100

## Awaiting PUBREL Timeout
mqtt.session.await_rel_timeout = 20s

## Enable Statistics: on | off
mqtt.session.enable_stats = off

## Expired after 1 day:
## w - week
## d - day
## h - hour
## m - minute
## s - second
mqtt.session.expiry_interval = 2h

## Ignore message from self publish
mqtt.session.ignore_loop_deliver = false

##--------------------------------------------------------------------
## MQTT Message Queue
##--------------------------------------------------------------------

## Type: simple | priority
mqtt.mqueue.type = simple

## Topic Priority: 0~255, Default is 0
## mqtt.mqueue.priority = topic/1=10,topic/2=8

## Max queue length. Enqueued messages when persistent client disconnected,
## or inflight window is full. 0 means no limit.
mqtt.mqueue.max_length = 1000

## Low-water mark of queued messages
mqtt.mqueue.low_watermark = 20%

## High-water mark of queued messages
mqtt.mqueue.high_watermark = 60%

## Queue Qos0 messages?
mqtt.mqueue.store_qos0 = true

##--------------------------------------------------------------------
## MQTT Broker and PubSub
##--------------------------------------------------------------------

## System Interval of publishing broker $SYS Messages
mqtt.broker.sys_interval = 60

## PubSub Pool Size. Default should be scheduler numbers.
mqtt.pubsub.pool_size = 8

mqtt.pubsub.by_clientid = true

## Subscribe Asynchronously
mqtt.pubsub.async = true

##--------------------------------------------------------------------
## MQTT Bridge
##--------------------------------------------------------------------

## Bridge Queue Size
mqtt.bridge.max_queue_len = 10000

## Ping Interval of bridge node. Unit: Second
mqtt.bridge.ping_down_interval = 1

##-------------------------------------------------------------------
## MQTT Plugins
##-------------------------------------------------------------------

## Dir of plugins' config
mqtt.plugins.etc_dir = etc/plugins/

## File to store loaded plugin names.
mqtt.plugins.loaded_file = data/loaded_plugins

##--------------------------------------------------------------------
## MQTT Listeners
##--------------------------------------------------------------------

##--------------------------------------------------------------------
## External TCP Listener

## External TCP Listener: 1883, 127.0.0.1:1883, ::1:1883
listener.tcp.external = 0.0.0.0:1883

## Size of acceptor pool
listener.tcp.external.acceptors = 64

## Maximum number of concurrent clients
listener.tcp.external.max_clients = 1000000

#listener.tcp.external.mountpoint = external/

## Rate Limit. Format is 'burst,rate', Unit is KB/Sec
#listener.tcp.external.rate_limit = 100,10

#listener.tcp.external.access.1 = allow 192.168.0.0/24

listener.tcp.external.access.2 = allow all

## Proxy Protocol V1/2
## listener.tcp.external.proxy_protocol = on
## listener.tcp.external.proxy_protocol_timeout = 3s

## TCP Socket Options
listener.tcp.external.backlog = 1024

#listener.tcp.external.recbuf = 4KB

#listener.tcp.external.sndbuf = 4KB

listener.tcp.external.buffer = 4KB

listener.tcp.external.nodelay = true

##--------------------------------------------------------------------
## Internal TCP Listener

## Internal TCP Listener: 11883, 127.0.0.1:11883, ::1:11883
listener.tcp.internal = 127.0.0.1:11883

## Size of acceptor pool
listener.tcp.internal.acceptors = 16

## Maximum number of concurrent clients
listener.tcp.internal.max_clients = 102400

#listener.tcp.external.mountpoint = internal/

## Rate Limit. Format is 'burst,rate', Unit is KB/Sec
## listener.tcp.internal.rate_limit = 1000,100

## TCP Socket Options
listener.tcp.internal.backlog = 512

listener.tcp.internal.tune_buffer = on

listener.tcp.internal.buffer = 1MB

listener.tcp.internal.recbuf = 4KB

listener.tcp.internal.sndbuf = 1MB

listener.tcp.internal.nodelay = true

##--------------------------------------------------------------------
## External SSL Listener

## SSL Listener: 8883, 127.0.0.1:8883, ::1:8883
listener.ssl.external = 8883

## Size of acceptor pool
listener.ssl.external.acceptors = 32

## Maximum number of concurrent clients
listener.ssl.external.max_clients = 500000

## listener.ssl.external.mountpoint = inbound/

## Rate Limit. Format is 'burst,rate', Unit is KB/Sec
## listener.ssl.external.rate_limit = 100,10

## Proxy Protocol V1/2
## listener.ssl.external.proxy_protocol = on
## listener.ssl.external.proxy_protocol_timeout = 3s

listener.ssl.external.access.1 = allow all

### SSL Options. See http://erlang.org/doc/man/ssl.html

## Configuring SSL Options. See http://erlang.org/doc/man/ssl.html
### TLS only for POODLE attack
## listener.ssl.external.tls_versions = tlsv1.2,tlsv1.1,tlsv1

### The Ephemeral Diffie-Helman key exchange is a very effective way of
### ensuring Forward Secrecy by exchanging a set of keys that never hit
### the wire. Since the DH key is effectively signed by the private key,
### it needs to be at least as strong as the private key. In addition,
### the default DH groups that most of the OpenSSL installations have
### are only a handful (since they are distributed with the OpenSSL
### package that has been built for the operating system it’s running on)
### and hence predictable (not to mention, 1024 bits only).

### In order to escape this situation, first we need to generate a fresh,
### strong DH group, store it in a file and then use the option above,
### to force our SSL application to use the new DH group. Fortunately,
### OpenSSL provides us with a tool to do that. Simply run:
### openssl dhparam -out dh-params.pem 2048

listener.ssl.external.handshake_timeout = 15s

listener.ssl.external.keyfile = etc/certs/key.pem

listener.ssl.external.certfile = etc/certs/cert.pem

## listener.ssl.external.cacertfile = etc/certs/cacert.pem

## listener.ssl.external.dhfile = etc/certs/dh-params.pem

## listener.ssl.external.verify = verify_peer

## listener.ssl.external.fail_if_no_peer_cert = true

### This is the single most important configuration option of an Erlang SSL application.
### Ciphers (and their ordering) define the way the client and server encrypt information
### over the wire, from the initial Diffie-Helman key exchange, the session key encryption
### algorithm and the message digest algorithm. Selecting a good cipher suite is critical
### for the application’s data security, confidentiality and performance.
### The cipher list above offers:
###
### A good balance between compatibility with older browsers. It can get stricter for Machine-To-Machine scenarios.
### Perfect Forward Secrecy.
### No old/insecure encryption and HMAC algorithms
###
### Most of it was copied from Mozilla’s Server Side TLS article
## listener.ssl.external.ciphers = ECDHE-ECDSA-AES256-GCM-SHA384,ECDHE-RSA-AES256-GCM-SHA384,ECDHE-ECDSA-AES256-SHA384,ECDHE-RSA-AES256-SHA384,ECDHE-ECDSA-DES-CBC3-SHA,ECDH-ECDSA-AES256-GCM-SHA384,ECDH-RSA-AES256-GCM-SHA384,ECDH-ECDSA-AES256-SHA384,ECDH-RSA-AES256-SHA384,DHE-DSS-AES256-GCM-SHA384,DHE-DSS-AES256-SHA256,AES256-GCM-SHA384,AES256-SHA256,ECDHE-ECDSA-AES128-GCM-SHA256,ECDHE-RSA-AES128-GCM-SHA256,ECDHE-ECDSA-AES128-SHA256,ECDHE-RSA-AES128-SHA256,ECDH-ECDSA-AES128-GCM-SHA256,ECDH-RSA-AES128-GCM-SHA256,ECDH-ECDSA-AES128-SHA256,ECDH-RSA-AES128-SHA256,DHE-DSS-AES128-GCM-SHA256,DHE-DSS-AES128-SHA256,AES128-GCM-SHA256,AES128-SHA256,ECDHE-ECDSA-AES256-SHA,ECDHE-RSA-AES256-SHA,DHE-DSS-AES256-SHA,ECDH-ECDSA-AES256-SHA,ECDH-RSA-AES256-SHA,AES256-SHA,ECDHE-ECDSA-AES128-SHA,ECDHE-RSA-AES128-SHA,DHE-DSS-AES128-SHA,ECDH-ECDSA-AES128-SHA,ECDH-RSA-AES128-SHA,AES128-SHA

### SSL parameter renegotiation is a feature that allows a client and
### a server to renegotiate the parameters of the SSL connection on the fly.
### RFC 5746 defines a more secure way of doing this. By enabling secure renegotiation,
### you drop support for the insecure renegotiation, prone to MitM attacks.
## listener.ssl.external.secure_renegotiate = off

### A performance optimization setting, it allows clients to reuse
### pre-existing sessions, instead of initializing new ones.
### Read more about it here.
## listener.ssl.external.reuse_sessions = on

### An important security setting, it forces the cipher to be set based on
### the server-specified order instead of the client-specified order,
### hence enforcing the (usually more properly configured) security
### ordering of the server administrator.
## listener.ssl.external.honor_cipher_order = on

### Use the CN or DN value from the client certificate as a username.
### Notice: 'verify' should be configured as 'verify_peer'
## listener.ssl.external.peer_cert_as_username = cn

## SSL Socket Options
## listener.ssl.external.backlog = 1024

## listener.ssl.external.recbuf = 4KB

## listener.ssl.external.sndbuf = 4KB

## listener.ssl.external.buffer = 4KB

## listener.ssl.external.nodelay = true

##--------------------------------------------------------------------
## External MQTT/WebSocket Listener

listener.ws.external = 8083

listener.ws.external.acceptors = 16

listener.ws.external.max_clients = 250000

listener.ws.external.access.1 = allow all

## TCP Options
listener.ws.external.backlog = 1024

listener.ws.external.recbuf = 4KB

listener.ws.external.sndbuf = 4KB

listener.ws.external.buffer = 4KB

listener.ws.external.nodelay = true

##--------------------------------------------------------------------
## External MQTT/WebSocket/SSL Listener

listener.wss.external = 8084

listener.wss.external.acceptors = 4

listener.wss.external.max_clients = 64

listener.wss.external.access.1 = allow all

## SSL Options
listener.wss.external.handshake_timeout = 15s

listener.wss.external.keyfile = etc/certs/key.pem

listener.wss.external.certfile = etc/certs/cert.pem

## listener.wss.external.cacertfile = etc/certs/cacert.pem

## listener.wss.external.verify = verify_peer

## listener.wss.external.fail_if_no_peer_cert = true

##--------------------------------------------------------------------
## HTTP Management API Listener

listener.api.mgmt = 127.0.0.1:8080

listener.api.mgmt.acceptors = 4

listener.api.mgmt.max_clients = 64

listener.api.mgmt.access.1 = allow all

##-------------------------------------------------------------------
## System Monitor
##-------------------------------------------------------------------

## Long GC, don't monitor in production mode for:
## https://github.com/erlang/otp/blob/feb45017da36be78d4c5784d758ede619fa7bfd3/erts/emulator/beam/erl_gc.c#L421
sysmon.long_gc = false

## Long Schedule(ms)
sysmon.long_schedule = 240

## 8M words. 32MB on 32-bit VM, 64MB on 64-bit VM.
sysmon.large_heap = 8MB

## Busy Port
sysmon.busy_port = false

## Busy Dist Port
sysmon.busy_dist_port = true

This is the log when a node comes up:

listener.ssl.external.acceptors=32
node.max_ets_tables=2097152
node.process_limit=2097152
cluster.k8s.service_name=emqtt
cluster.k8s.service_name=emqtt
listener.ws.external.acceptors=16
cluster.discovery=k8s
cluster.discovery=k8s
node.cookie=secretcookieeeetfggggg
node.name=emqtt
cluster.k8s.app_name=emqtt
cluster.k8s.app_name=emqtt
listener.tcp.external.max_clients=1000000
cluster.k8s.apiserver=https:\/\/kubernetes.default:443
cluster.k8s.apiserver=https:\/\/kubernetes.default:443
cluster.autoclean=5m
cluster.autoclean=5m
cluster.autoheal=on
cluster.autoheal=on
listener.ssl.external.max_clients=500000
node.max_ports=1048576
cluster.k8s.address_type=ip
cluster.k8s.address_type=ip
listener.tcp.external.acceptors=64
log.console=console
cluster.name=emqcl
cluster.name=emqcl
listener.ws.external.max_clients=250000
name=emqtt-1
Node '[email protected]' not responding to pings.
['2017-07-28T23:07:05Z']:waiting emqttd
Node '[email protected]' not responding to pings.
Exec: /opt/emqttd/erts-8.1/bin/erlexec -noshell -noinput +Bd -boot /opt/emqttd/releases/2.3/emqttd -mode embedded -boot_var ERTS_LIB_DIR /opt/emqttd/erts-8.1/../lib -mnesia dir "/opt/emqttd/data/mnesia/emqtt" -config /opt/emqttd/data/configs/app.2017.07.28.23.07.05.config -args_file /opt/emqttd/data/configs/vm.2017.07.28.23.07.05.args -vm_args /opt/emqttd/data/configs/vm.2017.07.28.23.07.05.args -- foreground
Root: /opt/emqttd

=INFO REPORT==== 28-Jul-2017::23:07:06 ===
    alarm_handler: {set,{system_memory_high_watermark,[]}}
starting emqttd on node '[email protected]'
emqttd ctl is starting...[ok]
emqttd hook is starting...[ok]
emqttd router is starting...[ok]
emqttd pubsub is starting...[ok]
emqttd stats is starting...[ok]
emqttd metrics is starting...[ok]
emqttd pooler is starting...[ok]
emqttd trace is starting...[ok]
emqttd client manager is starting...[ok]
emqttd session manager is starting...[ok]
emqttd session supervisor is starting...[ok]
emqttd wsclient supervisor is starting...[ok]
emqttd broker is starting...[ok]
emqttd alarm is starting...[ok]
emqttd mod supervisor is starting...[ok]
emqttd bridge supervisor is starting...[ok]
emqttd access control is starting...[ok]
emqttd system monitor is starting...[ok]
emqttd 2.3 is running now
['2017-07-28T23:07:07Z']:waiting emqttd
['2017-07-28T23:07:07Z']:emqttd start
Load emq_mod_presence module successfully.
Load emq_mod_subscription module successfully.
dashboard:http listen on 0.0.0.0:18083 with 2 acceptors.
mqtt:tcp listen on 127.0.0.1:11883 with 16 acceptors.
mqtt:tcp listen on 0.0.0.0:1883 with 64 acceptors.
mqtt:ws listen on 0.0.0.0:8083 with 16 acceptors.
mqtt:ssl listen on 0.0.0.0:8883 with 32 acceptors.
mqtt:wss listen on 0.0.0.0:8084 with 4 acceptors.
mqtt:api listen on 127.0.0.1:8080 with 4 acceptors.

This is the YAML I'm using:

apiVersion: v1
kind: Service
metadata:
  name: emqtt
  namespace: emqtt
  labels:
    app: emqtt
spec:
  ports:
    - port: 1883
      name: mqtt
    - port: 8883
      name: mqttssl
    - port: 8080
      name: mgmt
    - port: 18083
      name: dashboard
    - port: 4369
      name: mapping
  clusterIP: None
  selector:
    app: emqtt
---
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: emqtt
  namespace: emqtt
spec:
  replicas: 3
  serviceName: emqtt
  template:
    metadata:
      labels: 
        app: emqtt
    spec:
      containers:
      - name: emqtt
        image: felipejfc/emqtt:v1
        ports:
          - containerPort: 1883
          - containerPort: 8883
          - containerPort: 8080
          - containerPort: 18083
          - containerPort: 4369
          - containerPort: 4370
          - containerPort: 6369
          - containerPort: 6370
          - containerPort: 6371
          - containerPort: 6372
          - containerPort: 6373
          - containerPort: 6374
          - containerPort: 6375
          - containerPort: 6376
          - containerPort: 6377
          - containerPort: 6378
        readinessProbe:
          tcpSocket:
            port: 1883
          initialDelaySeconds: 5
          periodSeconds: 10
        env:
          - name: EMQ_CLUSTER__DISCOVERY
            value: "k8s"
          - name: EMQ_CLUSTER__K8S__APISERVER
            value: "https:\\/\\/kubernetes.default:443"
          - name: EMQ_CLUSTER__K8S__SERVICE_NAME
            value: "emqtt"
          - name: EMQ_CLUSTER__K8S__ADDRESS_TYPE
            value: "ip"
          - name: EMQ_CLUSTER__K8S__APP_NAME
            value: "emqtt"
          - name: EMQ_NODE__NAME
            value: "emqtt"
          - name: EMQ_NODE__COOKIE
            value: "secretcookieeeetfggggg"
          - name: EMQ_CLUSTER__NAME
            value: "emqcl"
          - name: EMQ_CLUSTER__AUTOHEAL
            value: "on"
          - name: EMQ_CLUSTER__AUTOCLEAN
            value: "5m"

I don't see any feedback from ekka in the logs; any hints?

consul.io auto cluster support

Any plans to add support for emqx autocluster via consul.io?
Or maybe there is some workaround to achieve emqx peer discovery with consul.io?

Cluster discovery using k8s error

error

19:29:57.653 [error] Supervisor 'esockd_connection_sup - <0.1418.0>' had child connection started with mochiweb_http:start_link({emq_dashboard,handle_request,[{state,"/opt/emqttd/lib/emq_dashboard-2.3.11/priv/www",#Fun<emq_dash..>}]}) at <0.1766.0> exit with reason {json_encode,{bad_term,{error,nodedown}}} in context connection_crashed

version

emqttd 2.3.11
k8s 1.11.3

Crash when starting a replicant node.

I start a core node with make run and want to join a replicant node to the core node.
But the replicant node just crashes:

EMQX_CLUSTER__CORE_NODES='[email protected]' EMQX_NODE__DB_ROLE=replicant EMQX_LOG__FILE_HANDLERS__DEFAULT__ENABLE=false EMQX_LOG__CONSOLE_HANDLER__LEVEL=info EMQX_LOG__CONSOLE_HANDLER__ENABLE=true [email protected] EMQX_LISTENERS__WS__DEFAULT__PROXY_PROTOCOL=true EMQX_LISTENERS__TCP__DEFAULT__PROXY_PROTOCOL=true EMQX_LISTENERS__TCP__DEFAULT__BIND='0.0.0.0:1881' EMQX_LISTENERS__SSL__DEFAULT__BIND='0.0.0.0:8881' EMQX_LISTENERS__WS__DEFAULT__BIND='0.0.0.0:8081' EMQX_LISTENERS__WSS__DEFAULT__BIND='0.0.0.0:8094' EMQX_DASHBOARD__LISTENERS__HTTP__BIND='0.0.0.0:18081' EMQX_NODE__DATA_DIR='./data2' ./_build/emqx/rel/emqx/bin/emqx console
2022-12-28T20:34:35.955734+08:00 [error] crasher: initial call: mria_lb:init/1, pid: <0.2147.0>, registered_name: mria_lb, error: {function_clause,[{lists,usort,[ignore],[{file,"lists.erl"},{line,1063}]},{mria_lb,list_core_nodes,1,[{file,"mria_lb.erl"},{line,164}]},{mria_lb,do_update,1,[{file,"mria_lb.erl"},{line,140}]},{mria_lb,handle_info,2,[{file,"mria_lb.erl"},{line,89}]},{gen_server,try_dispatch,4,[{file,"gen_server.erl"},{line,695}]},{gen_server,handle_msg,6,[{file,"gen_server.erl"},{line,771}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,226}]}]}, ancestors: [mria_rlog_sup,mria_sup,<0.2075.0>], message_queue_len: 0, messages: [], links: [<0.2145.0>], dictionary: [{'$logger_metadata$',#{domain => [mria,rlog,lb]}},{rand_seed,{#{bits => 58,jump => #Fun<rand.3.92093067>,next => #Fun<rand.0.92093067>,type => exsss,uniform => #Fun<rand.1.92093067>,uniform_n => #Fun<rand.2.92093067>},[11155160292792974|260709835617318854]}}], trap_exit: true, status: running, heap_size: 4185, stack_size: 29, reductions: 10281; neighbours:

k8s clustering doesn't work

I have followed the documentation and used this setup:

apiVersion: v1
kind: Service
metadata:
  name: emqx
spec:
  ports:
  - port: 32333
    nodePort: 32333
    targetPort:  emqx-dashboard
    protocol: TCP
  selector:
    app: emqx
  type: NodePort
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: emqx
  labels:
        app: emqx
spec:
  replicas: 3
  template:
    metadata:
      labels:
        app: emqx
    spec:
      containers:
      - name: emqx
        image: emqx/emqx:latest
        ports:
        - name: emqx-dashboard
          containerPort: 18083
        env:
        - name: EMQX_CLUSTER__DISCOVERY
          value: k8s
        - name: EMQX_NAME
          value: emqx
        - name: EMQX_CLUSTER__K8S__APISERVER
          value: "https://kubernetes.default:443"
        - name: EMQX_CLUSTER__K8S__NAMESPACE
          value: default
        - name: EMQX_CLUSTER__K8S__SERVICE_NAME
          value: emqx
        - name: EMQX_CLUSTER__K8S__ADDRESS_TYPE
          value: ip
        - name: EMQX_CLUSTER__K8S__APP_NAME
          value: emqx
        tty: true

But when checking the logs I see that emqx gives the following message:

2019-04-01 07:56:53.613 [error] Ekka(AutoCluster): Discovery error: {403,
1-4-2019 09:56:53 "{\"kind\":\"Status\",\"apiVersion\":\"v1\",\"metadata\":{},\"status\":\"Failure\",\"message\":\"endpoints \\\"emqx\\\" is forbidden: User \\\"system:serviceaccount:default:default\\\" cannot get resource \\\"endpoints\\\" in API group \\\"\\\" in the namespace \\\"default\\\"\",\"reason\":\"Forbidden\",\"details\":{\"name\":\"emqx\",\"kind\":\"endpoints\"},\"code\":403}\n"}

Is there something I need to do to make the API accessible?
I am running k8s via Rancher.

New logs stop being printed to the console in Kubernetes

I have configured ekka to cluster using Kubernetes. After the EMQ broker has been running for a while, it stops printing any new logs to the console. The same thing happens with the "kubectl logs" command, which only returns the old logs, although I can find the latest logs in the erlang.log file.

EMQ version: 2.4
Kubernetes version: client 1.9.3, server version 1.9.8

If the first connection fails, all subsequent ones will fail

Hi all,

In the scenario where a node attempts to connect with a bad ssl_dist key/cert pair, the connection will fail. Restarting the same node with a correct pair will still fail every time thereafter, as if the node had been blacklisted.

The only way to remedy this as of now is to reboot a node in the active, connected cluster to clear its memory of the "blacklisted" node, and then reboot the blacklisted node.

Autoheal: handle multiple partitions

Currently, ekka_autoheal handles the situation where the cluster splits into two parts. In some cases, however, three or more partitions may occur at the same time; this situation cannot yet be recovered automatically.

[K8s] Deleting pods takes the EMQ cluster to an inconsistent state

Hey guys, firstly, thanks for this awesome project.

Here is a scenario: with 2 pods, if I delete both of them using kubectl, the Kubernetes deployment will create new ones in order to keep the minimum number of replicas (e.g. 2). That's OK.

It turns out that sometimes those new pods won't join the EMQ cluster successfully, so I end up with two separate nodes that are not joined into the same cluster. This happens maybe 50% of the time.

The configuration is OK (that's why it works sometimes); I guess that somehow ekka does not know which pods are safe to join during the discovery and join phase.

Do you have any clue? Is it an Ekka issue, or something I can investigate further?

Thanks in advance!

Feature: Support SRV DNS records?

Any chance ekka could support clustering via SRV record resolution? AWS allows this via ECS and Fargate, and I believe it's becoming a more common solution.
