
osism / testbed
60 stars · 9 watchers · 25 forks · 11.54 MB

With this testbed, it is possible to run a full OSISM installation, the baseline of the Sovereign Cloud Stack, on an existing OpenStack environment such as Cleura or REGIO.cloud.

Home Page: https://osism.tech/docs/guides/other-guides/testbed

License: Apache License 2.0

Python 16.77% Shell 50.44% Perl 7.26% Makefile 12.02% HCL 6.77% Jinja 2.48% RobotFramework 4.26%
openstack ceph ansible terraform kubernetes

testbed's Introduction

OSISM testbed

Documentation

With the OSISM Testbed, it is possible to run a full Sovereign Cloud Stack deployment on an existing OpenStack environment such as Cleura or REGIO.cloud.

OSISM is the reference implementation for the Infrastructure as a Service (IaaS) layer in the Sovereign Cloud Stack (SCS) project. The OSISM Testbed is therefore used in the SCS project to test and work on the IaaS layer.

The OSISM Testbed is intended as a playground. Further services and integrations will be added over time, and more and more best practices and experience from productive deployments will be incorporated, making the testbed increasingly production-like. However, at no point does it claim to represent a production setup exactly.

testbed's People

Contributors

akafazov, arnaudmorin, artificial-intelligence, berendt, cah-link, curx, eddymaestrodev, ferenc-, fkr, frosty-geek, garloff, github-actions[bot], janhorstmann, juanptm, linwalth, matfechner, matofeder, nerdicbynature, nils98ar, osfrickler, ra-beer, renovate[bot], reqa, richie1710, scoopex, swaroopar, tibeer, tobberydberg, yeoldegrove


testbed's Issues

Autodetect drive names and allow overriding

With PR #70, we introduced drives_vdx to optionally support the commonly used vdX (virtio-blk) drive names in addition to the default sdX (virtio-scsi) names.
This should be improved two-fold:

  1. Drive names can be autodetected, which should work in 99% of cases. The main complication is that we need to detect the drive names on the HCI nodes and make the information available on the manager node (see the sketch after this list).
  2. We would still allow a manual override.
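
A minimal shell sketch of the autodetection idea, assuming it runs on each HCI node (e.g. via an Ansible ad-hoc command) and that whole-disk names such as sdX or vdX are what we are after:

# List block devices that are whole disks, yielding sdX or vdX names:
lsblk --nodeps --noheadings --output NAME,TYPE | awk '$2 == "disk" { print $1 }'

The manager node could then collect this output per host (e.g. as an Ansible fact) and only use it when no manual override is set.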

Fix broken refstack checks

tempest.api.object_storage.test_account_quotas_negative.AccountQuotasNegativeTest.test_user_modify_quota
tempest.api.object_storage.test_account_quotas.AccountQuotasTest.test_upload_valid_object
IdentityV3ProjectsTest.test_list_projects_returns_only_authorized_projects

testbed: playbook_cronjobs.yml fails with operator_user undefined

The task includes an option with an undefined variable. The error was: 'operator_user' is undefined.

And indeed, it is not set anywhere except for terraform/files/node.yml, which does not appear to be used.
Logging in, setting operator_user = dragon in environments/ansible.cfg, and rerunning the playbook makes it succeed.
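
Alternatively, the variable can be passed on the command line; a hedged sketch (the playbook location is an assumption and may differ in your checkout):

# Pass the missing variable explicitly instead of patching ansible.cfg:
ansible-playbook playbook_cronjobs.yml -e operator_user=dragon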

Autodetect NIC names and allow overriding

Like issue#79, just for NIC names.
The first hurdle is for the nodes to get their meta-data so that they can run the user-data script.
On clouds with networked meta-data (rather than a ConfigDrive), this means that cloud-init needs to correctly guess the first NIC, bring it up, and retrieve the meta-data/user-data. There is nothing the OSISM testbed can do here; it is up to the cloud provider and cloud-init. (But see cloud-init PRs #234 and #235.)

Now we should do:

  1. Autodetect NIC names. In some cases, one interface will already be up (brought up by cloud-init), which is then the first one. In other cases, the cloud may have provided network_data to us, which should then be trusted. Otherwise, use a heuristic based on the PCI bus order ... (see the sketch after this list).
  2. We should have the ability to override it from the heat environment if needed. (Heuristics can go wrong ...)
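
A sketch of the PCI-bus-order heuristic (an illustration, not existing testbed code):

# Print physical NICs sorted by their PCI address; the first line would be
# the first NIC under this heuristic.
for dev in /sys/class/net/*; do
    [ -e "$dev/device" ] || continue                     # skip lo, bridges, veths
    echo "$(basename "$(readlink -f "$dev/device")") ${dev##*/}"
done | sort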

Allow nested virt on physical installs

On the compute hosts (HCI nodes), we should have a file /etc/modprobe.d/kvm.conf with

options kvm_intel nested=1
options kvm_amd nested=1

to allow VMs with nested virtualization. This will be useful for self-hosting (running a virtual SCS (testbed) on a physical SCS). Whether the setting took effect can be checked as shown below.
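
After a reboot (or a reload of the kvm_intel/kvm_amd module), the setting can be verified like this:

# Prints Y (or 1 on older kernels) when nested virtualization is enabled:
cat /sys/module/kvm_intel/parameters/nested    # Intel hosts
cat /sys/module/kvm_amd/parameters/nested      # AMD hosts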

(cosmetic): 99-netbox.yml warnings

[WARNING]:  * Failed to parse /ansible/inventory/99-netbox.yml with auto
plugin: {"detail":"You do not have permission to perform this action."}
[WARNING]:  * Failed to parse /ansible/inventory/99-netbox.yml with yaml
plugin: Plugin configuration YAML file, not YAML inventory
[WARNING]:  * Failed to parse /ansible/inventory/99-netbox.yml with ini plugin:
Invalid host pattern '---' supplied, '---' is normally a sign this is a YAML file.
[WARNING]: Unable to parse /ansible/inventory/99-netbox.yml as an inventory source

Running basically any playbook in the testbed spits out these warnings. They seem to be harmless, but they are annoying ...
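
Until the plugin configuration (or the NetBox permission issue) is fixed, a hypothetical stopgap is to take the file out of the inventory path:

# Rename the broken inventory source so Ansible no longer tries to parse it:
mv /ansible/inventory/99-netbox.yml /ansible/inventory/99-netbox.yml.disabled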

Barbican not working

$ openstack --os-cloud admin
(openstack) secret list

(openstack) secret store --name mysecret --payload j4=]d21
5xx Server error: Internal Server Error: Secret creation failure seen - please contact site administrator.
Internal Server Error: Secret creation failure seen - please contact site administrator.
(openstack)
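
A first diagnostic step would be to look at the Barbican API logs on the control nodes; a sketch assuming kolla-style container names and log paths:

# Check the barbican_api container and its kolla log for the actual failure:
docker logs --tail 50 barbican_api
tail -n 50 /var/log/kolla/barbican/barbican-api.log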

Support ironic

In SCS deployments, we may have scenarios where the virtualization overhead is too high and does not outweigh the flexibility and sharing that it brings. In that case, we'd still want to manage the hardware with OpenStack and use ironic rather than nova (or go the OTC way, hiding ironic and exposing bare-metal flavors via nova).
I suspect that over time, more and more k8s focused deployments will use ironic.
https://metal3.io/blog/2019/10/31/OpenStack-Ironic-and-Bare-Metal-Infrastructure_All-Abstractions-Start-Somewhere.html

Now, ironic in the testbed will be interesting: conceptually, we would ask the underlying provisioning platform (the lower OpenStack) to provide a machine and then still hook it up to the networks ... of the upper OpenStack. I have to admit it is not clear to me whether this keeps enough similarity to a production (hardware) setup for the testbed to remain a meaningful place to explore and test it ... So this needs consideration, and we might conclude that ironic stays a feature outside the testbed.

Network numbering (192.168.x0.0/24)

When thinking about growing the environments, I would like us to be prepared.
Right now we have internal networks with 192.168.x0.0/24, which allows for ~250 hosts in one cloud region.
Our networking solution (OvS) does not scale that well, so we may not be able to do larger deployments soon, but we may succeed in addressing this down the line.
At some point we might cross that limit (especially in multi-AZ environments), so it might be less painful to make the change now.

I wonder whether it would be prudent to use properly aligned /20 networks, so 192.168.40.0/24 would become 192.168.16.0/20 (or maybe a /22 or /23 for now, which could easily be adjusted later):
192.168.40 -> 192.168.16
192.168.50 -> 192.168.32
192.168.60 -> 192.168.48
192.168.70 -> 192.168.64
192.168.80 -> 192.168.80
192.168.90 -> 192.168.96
192.168.100 -> 192.168.112
Thinking about multi-AZ setups, this might also leave clean ways to do things like 192.168.16/23 for AZ1, 192.168.18/23 for AZ2 ...
It would allow for regions with ~4k hosts -- something I do not expect us to cross in the next decade. The alignment is easy to sanity-check, as shown below.
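
For example, with ipcalc (from the ipcalc package):

# A /20 spans 16 consecutive /24s and offers 4094 usable addresses:
ipcalc 192.168.16.0/20    # network 192.168.16.0, broadcast 192.168.31.255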

/root/cleanup.sh does not exist

/root/manager.sh calls into /root/cleanup.sh, but that file does not exist. It's not injected via user-data ...
This may be harmless, but I'm not completely sure.
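
Until the script is injected (or the call removed), a defensive guard in /root/manager.sh would avoid the dangling call; a sketch:

# Only run cleanup.sh if it actually exists and is executable:
[ -x /root/cleanup.sh ] && /root/cleanup.sh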

Robustness against temporary internet/DNS failure

During testbed deployments, I saw a few "temporary failure in DNS resolution" issues.
Honestly, I don't know why these happen relatively frequently (more than 1%) these days on CityCloud ...
No retry was done; I had to manually re-roll out the affected playbooks.
Should there be some higher level of robustness against such issues, e.g. retrying once or twice (see the sketch below)?
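
A minimal retry wrapper sketch (an illustration, not existing testbed tooling; the playbook name is hypothetical):

# Retry a flaky command up to three times with a short pause in between:
retry() {
    local n
    for n in 1 2 3; do
        "$@" && return 0
        echo "attempt $n failed, retrying ..." >&2
        sleep 10
    done
    return 1
}
retry ansible-playbook site.yml    # hypothetical playbook name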

Testbed is failing in my OpenStack cloud

Hi berendt & team,

First of all, thank you very much for your effort and this amazing project of yours.

I have three bare-metal servers (16 CPUs, 64 GB RAM) with OpenStack deployed on top of them via kolla-ansible. Thus, I've decided to test how your testbed deployment behaves on top of my OpenStack. So far so good; however, I've been able to deploy everything except OpenStack :)

What happened:

I started:

openstack --os-cloud testbed \
  stack create \
  -e heat/environment.yml \
  --parameter deploy_ceph=true \
  --parameter deploy_infrastructure=true \
  --timeout 150 \
  -t heat/stack.yml testbed

As you can see, I've used only the Ceph and infrastructure deployment; accordingly, my heat stack finished with CREATE_COMPLETE.

+--------------------------------------+------------+----------------------------------+-----------------+----------------------+--------------+
| ID                                   | Stack Name | Project                          | Stack Status    | Creation Time        | Updated Time |
+--------------------------------------+------------+----------------------------------+-----------------+----------------------+--------------+
| a7bca5cc-e36a-4f6a-8408-193aca23c93b | testbed    | 3ef77e22e984498d98841e14bf6dfc08 | CREATE_COMPLETE | 2020-05-02T18:54:47Z | None         |
+--------------------------------------+------------+----------------------------------+-----------------+----------------------+--------------+

Before that, I had tried to deploy everything at once and it failed, hence I decided to run the OpenStack deployment script manually. So after the stack had been deployed, I ssh'd into the manager and ran: /opt/configuration/scripts/deploy_openstack_services_basic.sh

It failed with the following errors:

TASK [keystone : Check keystone containers] ************************************
changed: [testbed-node-1.osism.local] => (item={'key': 'keystone', 'value': {'container_name': 'keystone', 'group': 'keystone', 'enabled': True, 'image': 'quay.io/osism/keystone:train-latest', 'volumes': ['/etc/kolla/keystone/:/var/lib/kolla/config_files/:ro', '/etc/localtime:/etc/localtime:ro', '/etc/timezone:/etc/timezone:ro', '', 'kolla_logs:/var/log/kolla/', 'keystone_fernet_tokens:/etc/keystone/fernet-keys'], 'dimensions': {}, 'haproxy': {'keystone_internal': {'enabled': True, 'mode': 'http', 'external': False, 'port': '5000', 'listen_port': '5000'}, 'keystone_external': {'enabled': True, 'mode': 'http', 'external': True, 'port': '5000', 'listen_port': '5000'}, 'keystone_admin': {'enabled': True, 'mode': 'http', 'external': False, 'port': '35357', 'listen_port': '35357'}}}})
changed: [testbed-node-0.osism.local] => (item={'key': 'keystone', 'value': {'container_name': 'keystone', 'group': 'keystone', 'enabled': True, 'image': 'quay.io/osism/keystone:train-latest', 'volumes': ['/etc/kolla/keystone/:/var/lib/kolla/config_files/:ro', '/etc/localtime:/etc/localtime:ro', '/etc/timezone:/etc/timezone:ro', '', 'kolla_logs:/var/log/kolla/', 'keystone_fernet_tokens:/etc/keystone/fernet-keys'], 'dimensions': {}, 'haproxy': {'keystone_internal': {'enabled': True, 'mode': 'http', 'external': False, 'port': '5000', 'listen_port': '5000'}, 'keystone_external': {'enabled': True, 'mode': 'http', 'external': True, 'port': '5000', 'listen_port': '5000'}, 'keystone_admin': {'enabled': True, 'mode': 'http', 'external': False, 'port': '35357', 'listen_port': '35357'}}}})
changed: [testbed-node-1.osism.local] => (item={'key': 'keystone-ssh', 'value': {'container_name': 'keystone_ssh', 'group': 'keystone', 'enabled': True, 'image': 'quay.io/osism/keystone-ssh:train-latest', 'volumes': ['/etc/kolla/keystone-ssh/:/var/lib/kolla/config_files/:ro', '/etc/localtime:/etc/localtime:ro', '/etc/timezone:/etc/timezone:ro', 'kolla_logs:/var/log/kolla/', 'keystone_fernet_tokens:/etc/keystone/fernet-keys'], 'dimensions': {}}})
changed: [testbed-node-0.osism.local] => (item={'key': 'keystone-ssh', 'value': {'container_name': 'keystone_ssh', 'group': 'keystone', 'enabled': True, 'image': 'quay.io/osism/keystone-ssh:train-latest', 'volumes': ['/etc/kolla/keystone-ssh/:/var/lib/kolla/config_files/:ro', '/etc/localtime:/etc/localtime:ro', '/etc/timezone:/etc/timezone:ro', 'kolla_logs:/var/log/kolla/', 'keystone_fernet_tokens:/etc/keystone/fernet-keys'], 'dimensions': {}}})
changed: [testbed-node-1.osism.local] => (item={'key': 'keystone-fernet', 'value': {'container_name': 'keystone_fernet', 'group': 'keystone', 'enabled': True, 'image': 'quay.io/osism/keystone-fernet:train-latest', 'volumes': ['/etc/kolla/keystone-fernet/:/var/lib/kolla/config_files/:ro', '/etc/localtime:/etc/localtime:ro', '/etc/timezone:/etc/timezone:ro', 'kolla_logs:/var/log/kolla/', 'keystone_fernet_tokens:/etc/keystone/fernet-keys'], 'dimensions': {}}})
changed: [testbed-node-0.osism.local] => (item={'key': 'keystone-fernet', 'value': {'container_name': 'keystone_fernet', 'group': 'keystone', 'enabled': True, 'image': 'quay.io/osism/keystone-fernet:train-latest', 'volumes': ['/etc/kolla/keystone-fernet/:/var/lib/kolla/config_files/:ro', '/etc/localtime:/etc/localtime:ro', '/etc/timezone:/etc/timezone:ro', 'kolla_logs:/var/log/kolla/', 'keystone_fernet_tokens:/etc/keystone/fernet-keys'], 'dimensions': {}}})

TASK [keystone : include_tasks] ************************************************
skipping: [testbed-node-0.osism.local]
skipping: [testbed-node-1.osism.local]

TASK [keystone : include_tasks] ************************************************
included: /ansible/roles/keystone/tasks/bootstrap.yml for testbed-node-0.osism.local, testbed-node-1.osism.local

TASK [keystone : Creating keystone database] ***********************************
fatal: [testbed-node-0.osism.local -> 192.168.40.10]: FAILED! => {"changed": false, "msg": "kolla_toolbox container is not running."}

NO MORE HOSTS LEFT *************************************************************

PLAY RECAP *********************************************************************
testbed-manager.osism.local : ok=9    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
testbed-node-0.osism.local : ok=24   changed=1    unreachable=0    failed=1    skipped=8    rescued=0    ignored=0
testbed-node-1.osism.local : ok=22   changed=1    unreachable=0    failed=0    skipped=7    rescued=0    ignored=0
testbed-node-2.osism.local : ok=9    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0


PLAY [Prepare masquerading on the manager node] ********************************

TASK [Accpet FORWARD on the management interface (incoming)] *******************
ok: [testbed-manager.osism.local]

TASK [Accept FORWARD on the management interface (outgoing)] *******************
ok: [testbed-manager.osism.local]

TASK [Masquerade traffic on the management interface] **************************
ok: [testbed-manager.osism.local]

PLAY [Bootstrap basic OpenStack services] **************************************

TASK [Create test project] *****************************************************
fatal: [localhost]: FAILED! => {"changed": false, "module_stderr": "Failed to discover available identity versions when contacting http://api.osism.local:5000/v3. Attempting to parse version from URL.\nTraceback (most recent call last):\n  File \"/usr/local/lib/python3.6/dist-packages/urllib3/connection.py\", line 160, in _new_conn\n    (self._dns_host, self.port), self.timeout, **extra_kw\n  File \"/usr/local/lib/python3.6/dist-packages/urllib3/util/connection.py\", line 84, in create_connection\n    raise err\n  File \"/usr/local/lib/python3.6/dist-packages/urllib3/util/connection.py\", line 74, in create_connection\n    sock.connect(sa)\nOSError: [Errno 113] No route to host\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n  File \"/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py\", line 677, in urlopen\n    chunked=chunked,\n  File \"/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py\", line 392, in _make_request\n    conn.request(method, url, **httplib_request_kw)\n  File \"/usr/lib/python3.6/http/client.py\", line 1264, in request\n    self._send_request(method, url, body, headers, encode_chunked)\n  File \"/usr/lib/python3.6/http/client.py\", line 1310, in _send_request\n    self.endheaders(body, encode_chunked=encode_chunked)\n  File \"/usr/lib/python3.6/http/client.py\", line 1259, in endheaders\n    self._send_output(message_body, encode_chunked=encode_chunked)\n  File \"/usr/lib/python3.6/http/client.py\", line 1038, in _send_output\n    self.send(msg)\n  File \"/usr/lib/python3.6/http/client.py\", line 976, in send\n    self.connect()\n  File \"/usr/local/lib/python3.6/dist-packages/urllib3/connection.py\", line 187, in connect\n    conn = self._new_conn()\n  File \"/usr/local/lib/python3.6/dist-packages/urllib3/connection.py\", line 172, in _new_conn\n    self, \"Failed to establish a new connection: %s\" % e\nurllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7f0765c2e7b8>: Failed to establish a new connection: [Errno 113] No route to host\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n  File \"/usr/local/lib/python3.6/dist-packages/requests/adapters.py\", line 449, in send\n    timeout=timeout\n  File \"/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py\", line 725, in urlopen\n    method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]\n  File \"/usr/local/lib/python3.6/dist-packages/urllib3/util/retry.py\", line 439, in increment\n    raise MaxRetryError(_pool, url, error or ResponseError(cause))\nurllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='api.osism.local', port=5000): Max retries exceeded with url: /v3/auth/tokens (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f0765c2e7b8>: Failed to establish a new connection: [Errno 113] No route to host',))\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n  File \"/usr/local/lib/python3.6/dist-packages/keystoneauth1/session.py\", line 1004, in _send_request\n    resp = self.session.request(method, url, **kwargs)\n  File \"/usr/local/lib/python3.6/dist-packages/requests/sessions.py\", line 530, in request\n    resp = self.send(prep, **send_kwargs)\n  File \"/usr/local/lib/python3.6/dist-packages/requests/sessions.py\", line 643, in send\n    r = adapter.send(request, **kwargs)\n  File 
\"/usr/local/lib/python3.6/dist-packages/requests/adapters.py\", line 516, in send\n    raise ConnectionError(e, request=request)\nrequests.exceptions.ConnectionError: HTTPConnectionPool(host='api.osism.local', port=5000): Max retries exceeded with url: /v3/auth/tokens (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f0765c2e7b8>: Failed to establish a new connection: [Errno 113] No route to host',))\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n  File \"<stdin>\", line 102, in <module>\n  File \"<stdin>\", line 94, in _ansiballz_main\n  File \"<stdin>\", line 40, in invoke_module\n  File \"/usr/lib/python3.6/runpy.py\", line 205, in run_module\n    return _run_module_code(code, init_globals, run_name, mod_spec)\n  File \"/usr/lib/python3.6/runpy.py\", line 96, in _run_module_code\n    mod_name, mod_spec, pkg_name, script_name)\n  File \"/usr/lib/python3.6/runpy.py\", line 85, in _run_code\n    exec(code, run_globals)\n  File \"/tmp/ansible_os_project_payload_2snpvh5o/ansible_os_project_payload.zip/ansible/modules/cloud/openstack/os_project.py\", line 211, in <module>\n  File \"/tmp/ansible_os_project_payload_2snpvh5o/ansible_os_project_payload.zip/ansible/modules/cloud/openstack/os_project.py\", line 174, in main\n  File \"/usr/local/lib/python3.6/dist-packages/openstack/cloud/_identity.py\", line 99, in get_project\n    domain_id=domain_id)\n  File \"/usr/local/lib/python3.6/dist-packages/openstack/cloud/_utils.py\", line 205, in _get_entity\n    entities = search(name_or_id, filters, **kwargs)\n  File \"/usr/local/lib/python3.6/dist-packages/openstack/cloud/_identity.py\", line 84, in search_projects\n    domain_id=domain_id, name_or_id=name_or_id, filters=filters)\n  File \"/usr/local/lib/python3.6/dist-packages/openstack/cloud/_identity.py\", line 56, in list_projects\n    if self._is_client_version('identity', 3):\n  File \"/usr/local/lib/python3.6/dist-packages/openstack/cloud/openstackcloud.py\", line 461, in _is_client_version\n    client = getattr(self, client_name)\n  File \"/usr/local/lib/python3.6/dist-packages/openstack/cloud/_identity.py\", line 32, in _identity_client\n    'identity', min_version=2, max_version='3.latest')\n  File \"/usr/local/lib/python3.6/dist-packages/openstack/cloud/openstackcloud.py\", line 408, in _get_versioned_client\n    if adapter.get_endpoint():\n  File \"/usr/local/lib/python3.6/dist-packages/keystoneauth1/adapter.py\", line 282, in get_endpoint\n    return self.session.get_endpoint(auth or self.auth, **kwargs)\n  File \"/usr/local/lib/python3.6/dist-packages/keystoneauth1/session.py\", line 1225, in get_endpoint\n    return auth.get_endpoint(self, **kwargs)\n  File \"/usr/local/lib/python3.6/dist-packages/keystoneauth1/identity/base.py\", line 380, in get_endpoint\n    allow_version_hack=allow_version_hack, **kwargs)\n  File \"/usr/local/lib/python3.6/dist-packages/keystoneauth1/identity/base.py\", line 271, in get_endpoint_data\n    service_catalog = self.get_access(session).service_catalog\n  File \"/usr/local/lib/python3.6/dist-packages/keystoneauth1/identity/base.py\", line 134, in get_access\n    self.auth_ref = self.get_auth_ref(session)\n  File \"/usr/local/lib/python3.6/dist-packages/keystoneauth1/identity/generic/base.py\", line 208, in get_auth_ref\n    return self._plugin.get_auth_ref(session, **kwargs)\n  File \"/usr/local/lib/python3.6/dist-packages/keystoneauth1/identity/v3/base.py\", line 184, in get_auth_ref\n    authenticated=False, 
log=False, **rkwargs)\n  File \"/usr/local/lib/python3.6/dist-packages/keystoneauth1/session.py\", line 1131, in post\n    return self.request(url, 'POST', **kwargs)\n  File \"/usr/local/lib/python3.6/dist-packages/keystoneauth1/session.py\", line 913, in request\n    resp = send(**kwargs)\n  File \"/usr/local/lib/python3.6/dist-packages/keystoneauth1/session.py\", line 1020, in _send_request\n    raise exceptions.ConnectFailure(msg)\nkeystoneauth1.exceptions.connection.ConnectFailure: Unable to establish connection to http://api.osism.local:5000/v3/auth/tokens: HTTPConnectionPool(host='api.osism.local', port=5000): Max retries exceeded with url: /v3/auth/tokens (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f0765c2e7b8>: Failed to establish a new connection: [Errno 113] No route to host',))\n", "module_stdout": "", "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error", "rc": 1}

PLAY RECAP *********************************************************************
localhost                  : ok=0    changed=0    unreachable=0    failed=1    skipped=0    rescued=0    ignored=0
testbed-manager.osism.local : ok=3    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0

Then I ssh'd into one of the nodes and noticed that both keepalived and haproxy are restarting with errors:

docker logs haproxy
standard_init_linux.go:211: exec user process caused "exec format error"
standard_init_linux.go:211: exec user process caused "exec format error"


docker logs keepalived
standard_init_linux.go:211: exec user process caused "exec format error"
standard_init_linux.go:211: exec user process caused "exec format error"
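
"exec format error" usually points to an architecture mismatch between the image and the host; a quick hypothetical check with standard docker commands:

# Compare the host architecture with the architecture of the container's image
# (the container is called "haproxy" above):
uname -m
docker image inspect --format '{{.Architecture}}' "$(docker inspect --format '{{.Image}}' haproxy)"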

Thus, I could not figure out why they were failing and why haproxy cannot bring up the virtual IP. Maybe it is somehow related to this deployment running inside another OpenStack :) Nevertheless, Ceph is running fine, and the node setup finished successfully, I guess.

My environment file:

---
parameter_defaults:
  availability_zone: nova
  volume_availability_zone: nova
  network_availability_zone: nova
  flavor_node: 4C-16GB-40GB
  flavor_manager: 4C-4GB-20GB
  image: bionic
  public: external
  volume_size_storage: 10
  ceph_version: octopus
  openstack_version: train
  configuration_version: master

Please let me know if you can help me troubleshoot this and whether you need any additional info.

Thank you & Regards

Enable nested virtualization if possible

If KVM is available (e.g. because we're running on bare metal, or the host has enabled nested virtualization), we should use it,
i.e. nova should pass -machine accel=kvm to qemu-system-x86_64.
We should consider using -cpu host then as well to avoid any mismatch. (Currently we seem to have EPYC hardcoded; curious why not EPYC-IBPB.)
The kvm-ok tool in the cpu-checker package can detect this -- checking for the svm or vmx CPU flags works as well, as shown below.
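
Either check is a one-liner:

# kvm-ok reports whether KVM acceleration can be used on this host:
kvm-ok
# Alternatively, a non-zero count means the CPU advertises the virtualization flags:
grep -Ec '(vmx|svm)' /proc/cpuinfo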

Allow s3 access via https

In the testbed, I can create EC2 credentials and then access the Swift containers as S3 buckets with the same objects. Nice!
However, I could not find a way to make SSL/TLS connections via S3.

Here's what I did:
openstack ec2 credentials create
Store the result in the S3_ACCESS_KEY_ID and S3_SECRET_ACCESS_KEY environment variables, and set S3_HOSTNAME to api.osism.local:8080 (same as the swift endpoint).
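
In shell form (values elided; the hostname mirrors the swift endpoint):

export S3_ACCESS_KEY_ID=...        # "access" value from the credentials created above
export S3_SECRET_ACCESS_KEY=...    # "secret" value
export S3_HOSTNAME=api.osism.local:8080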

garloff@xps13kurt(testbed):/casa/Images/SUSE15 [0]$ openstack object list testbucket
+---------------------------------------------+
| Name                                        |
+---------------------------------------------+
| openSUSE-15.2-JeOS.x86_64-1.15.2.4.packages |
| openSUSE-15.2-JeOS.x86_64-1.15.2.4.qcow2    |
+---------------------------------------------+
garloff@xps13kurt(testbed):/casa/Images/SUSE15 [0]$ s3 -u list testbucket
                       Key                             Last Modified      Size 
--------------------------------------------------  --------------------  -----
openSUSE-15.2-JeOS.x86_64-1.15.2.4.packages         2020-09-10T14:00:46Z  92648
openSUSE-15.2-JeOS.x86_64-1.15.2.4.qcow2            2020-09-10T14:05:43Z   581M
garloff@xps13kurt(testbed):/casa/Images/SUSE15 [0]$ s3 list testbucket

ERROR: InternalError

ceph: fix degraded data redundancy

After activating the RGW service, the following warning appears if only 2 storage nodes are active. A possible remediation is sketched below the output.

dragon@testbed-manager:~$ ceph -s
  cluster:
    id:     ce766f84-6dde-4ba0-9c57-ddb62431f1cd
    health: HEALTH_WARN
            Degraded data redundancy: 6/682 objects degraded (0.880%), 5 pgs degraded, 32 pgs undersized
 
  services:
    mon: 2 daemons, quorum testbed-node-0,testbed-node-1 (age 61m)
    mgr: testbed-node-1(active, since 60m), standbys: testbed-node-0
    mds: cephfs:1 {0=testbed-node-1=up:active} 1 up:standby
    osd: 4 osds: 4 up (since 59m), 4 in (since 59m)
    rgw: 2 daemons active (testbed-node-0.rgw0, testbed-node-1.rgw0)
 
  data:
    pools:   13 pools, 424 pgs
    objects: 338 objects, 330 MiB
    usage:   4.7 GiB used, 35 GiB / 40 GiB avail
    pgs:     6/682 objects degraded (0.880%)
             392 active+clean
             27  active+undersized
             5   active+undersized+degraded
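
With only 2 storage nodes and a default replica count of 3, the placement groups can never become fully replicated. A hypothetical, testbed-only remediation (never do this in production) is to reduce the pool size:

# Reduce replication to match the two available nodes; min_size may need
# lowering as well, depending on the pool defaults:
for pool in $(ceph osd pool ls); do
    ceph osd pool set "$pool" size 2
done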
