Model              Controller           Cloud/Region         Version  SLA          Timestamp
test-backups-23v7  github-pr-2fa19-lxd  localhost/localhost  3.1.7    unsupported  14:16:34Z

App                       Version  Status       Scale  Charm                     Channel  Rev  Exposed  Message
opensearch                         maintenance  3      opensearch                         0    no       Beginning rolling service
s3-integrator                      active       1      s3-integrator             stable   13   no
self-signed-certificates           active       1      self-signed-certificates  stable   72   no

Unit                         Workload  Agent      Machine  Public address  Ports  Message
opensearch/0                 waiting   idle       2        10.241.94.24           Waiting for OpenSearch to start...
opensearch/1*                waiting   executing  3        10.241.94.127          Waiting for OpenSearch to start...
opensearch/2                 error     idle       4        10.241.94.7            hook failed: "opensearch-peers-relation-changed"
s3-integrator/0*             active    idle       1        10.241.94.199
self-signed-certificates/0*  active    idle       0        10.241.94.194

Machine  State    Address        Inst id        Base               AZ  Message
0        started  10.241.94.194  juju-ad79f3-0  [email protected]      Running
1        started  10.241.94.199  juju-ad79f3-1  [email protected]      Running
2        started  10.241.94.24   juju-ad79f3-2  [email protected]      Running
3        started  10.241.94.127  juju-ad79f3-3  [email protected]      Running
4        started  10.241.94.7    juju-ad79f3-4  [email protected]      Running

Integration provider                   Requirer                           Interface            Type     Message
opensearch:opensearch-peers            opensearch:opensearch-peers        opensearch_peers     peer
opensearch:service                     opensearch:service                 rolling_op           peer
s3-integrator:s3-credentials           opensearch:s3-credentials          s3                   regular
s3-integrator:s3-integrator-peers      s3-integrator:s3-integrator-peers  s3-integrator-peers  peer
self-signed-certificates:certificates  opensearch:certificates            tls-certificates     regular

Storage Unit  Storage ID         Type        Pool    Mountpoint                   Size    Status    Message
opensearch/0  opensearch-data/0  filesystem  rootfs  /var/snap/opensearch/common  72 GiB  attached
opensearch/1  opensearch-data/1  filesystem  rootfs  /var/snap/opensearch/common  72 GiB  attached
opensearch/2  opensearch-data/2  filesystem  rootfs  /var/snap/opensearch/common  72 GiB  attached

Unit opensearch/2 is failing because the shards held by the other two nodes disappeared at once. At 14:15:01, every shard is still STARTED on all three nodes:
2024-02-07 14:15:01 DEBUG unit.opensearch/2.juju-log server.go:325 opensearch-peers:1: https://10.241.94.7:9200 "GET /_cat/indices?v HTTP/1.1" 200 606
2024-02-07 14:15:01 DEBUG unit.opensearch/2.juju-log server.go:325 opensearch-peers:1: Getting secret app:admin-password
2024-02-07 14:15:01 DEBUG unit.opensearch/2.juju-log server.go:325 opensearch-peers:1: Starting new HTTPS connection (1): 10.241.94.7:9200
2024-02-07 14:15:01 DEBUG unit.opensearch/2.juju-log server.go:325 opensearch-peers:1: https://10.241.94.7:9200 "GET /_cat/shards?v HTTP/1.1" 200 1774
2024-02-07 14:15:01 DEBUG unit.opensearch/2.juju-log server.go:325 opensearch-peers:1: indices status:
[{'health': 'green', 'status': 'open', 'index': '.opensearch-observability', 'uuid': '_pC83dCcQVuYW5VWnbasfg', 'pri': '1', 'rep': '2', 'docs.count': '0', 'docs.deleted': '0', 'store.size': '416b', 'pri.store.size': '208b'}, {'health': 'green', 'status': 'open', 'index': '.plugins-ml-config', 'uuid': 'RXeUPKcXToOBpRQPM4VQEA', 'pri': '1', 'rep': '2', 'docs.count': '1', 'docs.deleted': '0', 'store.size': '7.7kb', 'pri.store.size': '3.8kb'}, {'health': 'green', 'status': 'open', 'index': '.opendistro_security', 'uuid': 'FXZtZFctTWyEjVt9ZDRqcw', 'pri': '1', 'rep': '2', 'docs.count': '10', 'docs.deleted': '2', 'store.size': '105.2kb', 'pri.store.size': '57.8kb'}]
indices shards:
[{'index': '.plugins-ml-config', 'shard': '0', 'prirep': 'r', 'state': 'STARTED', 'docs': '1', 'store': '0b', 'ip': '10.241.94.7', 'node': 'opensearch-2'}, {'index': '.plugins-ml-config', 'shard': '0', 'prirep': 'p', 'state': 'STARTED', 'docs': '1', 'store': '3.8kb', 'ip': '10.241.94.127', 'node': 'opensearch-1'}, {'index': '.plugins-ml-config', 'shard': '0', 'prirep': 'r', 'state': 'STARTED', 'docs': '1', 'store': '3.9kb', 'ip': '10.241.94.24', 'node': 'opensearch-0'}, {'index': '.opensearch-observability', 'shard': '0', 'prirep': 'r', 'state': 'STARTED', 'docs': '0', 'store': '0b', 'ip': '10.241.94.7', 'node': 'opensearch-2'}, {'index': '.opensearch-observability', 'shard': '0', 'prirep': 'p', 'state': 'STARTED', 'docs': '0', 'store': '208b', 'ip': '10.241.94.127', 'node': 'opensearch-1'}, {'index': '.opensearch-observability', 'shard': '0', 'prirep': 'r', 'state': 'STARTED', 'docs': '0', 'store': '208b', 'ip': '10.241.94.24', 'node': 'opensearch-0'}, {'index': '.opensearch-sap-log-types-config', 'shard': '0', 'prirep': 'r', 'state': 'STARTED', 'docs': None, 'store': None, 'ip': '10.241.94.7', 'node': 'opensearch-2'}, {'index': '.opensearch-sap-log-types-config', 'shard': '0', 'prirep': 'p', 'state': 'STARTED', 'docs': None, 'store': None, 'ip': '10.241.94.127', 'node': 'opensearch-1'}, {'index': '.opensearch-sap-log-types-config', 'shard': '0', 'prirep': 'r', 'state': 'STARTED', 'docs': None, 'store': None, 'ip': '10.241.94.24', 'node': 'opensearch-0'}, {'index': '.opendistro_security', 'shard': '0', 'prirep': 'r', 'state': 'STARTED', 'docs': '10', 'store': '0b', 'ip': '10.241.94.7', 'node': 'opensearch-2'}, {'index': '.opendistro_security', 'shard': '0', 'prirep': 'p', 'state': 'STARTED', 'docs': '10', 'store': '57.8kb', 'ip': '10.241.94.127', 'node': 'opensearch-1'}, {'index': '.opendistro_security', 'shard': '0', 'prirep': 'r', 'state': 'STARTED', 'docs': '10', 'store': '47.4kb', 'ip': '10.241.94.24', 'node': 'opensearch-0'}]
By 14:16:21, only opensearch-2 still holds shards, and two indices are UNASSIGNED:
2024-02-07 14:16:21 DEBUG unit.opensearch/2.juju-log server.go:325 opensearch-peers:1: https://10.241.94.7:9200 "GET /_cat/shards?v HTTP/1.1" 200 555
2024-02-07 14:16:21 DEBUG unit.opensearch/2.juju-log server.go:325 opensearch-peers:1: indices status:
[{'health': 'green', 'status': 'open', 'index': '.opensearch-observability', 'uuid': '_pC83dCcQVuYW5VWnbasfg', 'pri': '1', 'rep': '0', 'docs.count': '0', 'docs.deleted': '0', 'store.size': '208b', 'pri.store.size': '208b'}, {'health': 'green', 'status': 'open', 'index': '.plugins-ml-config', 'uuid': 'RXeUPKcXToOBpRQPM4VQEA', 'pri': '1', 'rep': '0', 'docs.count': '1', 'docs.deleted': '0', 'store.size': '3.9kb', 'pri.store.size': '3.9kb'}, {'health': 'red', 'status': 'open', 'index': '.opendistro_security', 'uuid': 'FXZtZFctTWyEjVt9ZDRqcw', 'pri': '1', 'rep': '0', 'docs.count': None, 'docs.deleted': None, 'store.size': None, 'pri.store.size': None}]
indices shards:
[{'index': '.opensearch-observability', 'shard': '0', 'prirep': 'p', 'state': 'STARTED', 'docs': '0', 'store': '208b', 'ip': '10.241.94.7', 'node': 'opensearch-2'}, {'index': '.plugins-ml-config', 'shard': '0', 'prirep': 'p', 'state': 'STARTED', 'docs': '1', 'store': '3.9kb', 'ip': '10.241.94.7', 'node': 'opensearch-2'}, {'index': '.opensearch-sap-log-types-config', 'shard': '0', 'prirep': 'p', 'state': 'UNASSIGNED', 'docs': None, 'store': None, 'ip': None, 'node': None}, {'index': '.opendistro_security', 'shard': '0', 'prirep': 'p', 'state': 'UNASSIGNED', 'docs': None, 'store': None, 'ip': None, 'node': None}]
This means that by the time opensearch/2 runs its routine, the other nodes are gone from the cluster and the hook fails.
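The two `_cat/shards` listings above can also be compared mechanically. The following is only a sketch for reading such dumps, not part of the charm: the helper names `nodes_with_started_shards` and `unassigned_indices` are hypothetical, and the sample data is a subset of the logged entries, abbreviated to the fields used.

```python
def nodes_with_started_shards(shards):
    """Return the names of nodes holding at least one STARTED shard."""
    return {s["node"] for s in shards if s["state"] == "STARTED" and s["node"]}


def unassigned_indices(shards):
    """Return the sorted names of indices that have an UNASSIGNED shard."""
    return sorted({s["index"] for s in shards if s["state"] == "UNASSIGNED"})


# Subset of the 14:15:01 listing (fields abbreviated).
before = [
    {"index": ".opendistro_security", "prirep": "r", "state": "STARTED", "node": "opensearch-2"},
    {"index": ".opendistro_security", "prirep": "p", "state": "STARTED", "node": "opensearch-1"},
    {"index": ".opendistro_security", "prirep": "r", "state": "STARTED", "node": "opensearch-0"},
]

# Subset of the 14:16:21 listing (fields abbreviated).
after = [
    {"index": ".plugins-ml-config", "prirep": "p", "state": "STARTED", "node": "opensearch-2"},
    {"index": ".opendistro_security", "prirep": "p", "state": "UNASSIGNED", "node": None},
]

# Nodes that held shards before but hold none now.
lost = nodes_with_started_shards(before) - nodes_with_started_shards(after)
print(sorted(lost))               # ['opensearch-0', 'opensearch-1']
print(unassigned_indices(after))  # ['.opendistro_security']
```

On this subset, the diff shows opensearch-0 and opensearch-1 dropping out. The `.opendistro_security` primary previously lived on opensearch-1, which is why that index is left without an assigned primary.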
This happens because both opensearch/0 and opensearch/1 are deferring their RunWithLocks event: