scalecomputing / hypercoreansiblecollection

Official Ansible collection for Scale Computing SC//HyperCore (HC3) v1 API

License: GNU General Public License v3.0

Makefile 0.17% Python 97.67% Dockerfile 0.06% Shell 1.52% Jinja 0.58%

hypercoreansiblecollection's People

Contributors

alescernivec, alisonlhart, anazobec, bambuco2, ddemlow, domendobnikar, dradx, jtaubenheim, juremedvesek, justinc1, polonam, shoriminimoe, tomboscalecomputing


hypercoreansiblecollection's Issues

:rocket: Feature request: vm_clone module should provide a "preserve mac address" option like HyperCore UI

Is your feature request related to a problem? Please describe.

To simplify cloning identical virtual machines from the latest snapshot, or rolling back to a previous point-in-time snapshot, it would be helpful for the vm_clone module to offer an option equivalent to the "preserve MAC address" option available when cloning in the web UI.

Describe the solution you'd like

Simplify the process of cloning a VM (or cloning from a snapshot) while retaining the original MAC addresses of its virtual NICs, which may be required to preserve DHCP reservations or static network configurations on some operating systems.
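
A hypothetical sketch of what the requested option could look like on vm_clone; the preserve_mac_address parameter below does not exist in the collection and is named here only for illustration:

- name: Clone VM and keep the original NIC MAC addresses (proposed, not implemented)
  scale_computing.hypercore.vm_clone:
    vm_name: demo1-clone
    source_vm_name: demo1
    preserve_mac_address: true  # hypothetical parameter mirroring the HyperCore UI "preserve MAC address" clone option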

Describe alternatives you've considered

The MAC addresses could be extracted with vm_info, followed by a clone task and then a vm_params task, but this could be reduced to just the clone task and would be more declarative.

Additional context

Test automation that uses virtual nodes managed by Ansible could leverage this option.

[screenshot]

snapshot_schedule not idempotent

I believe this is a new or recent issue, as it seemed to work correctly at one point (possibly also related to a HyperCore OS version change).

The following task triggers a change on every execution:

scale_computing.hypercore.snapshot_schedule:
  name: snap-daily-midnight
  state: present
  recurrences:
    - name: daily-midnight
      frequency: "FREQ=DAILY;INTERVAL=1"  # RFC-2445
      start: "2010-01-01 00:00:00"
      local_retention: "{{ 7*24*60*60 }}"  # 7 days, unit seconds
      remote_retention: "{{ 1*24*60*60 }}" # optional, None or 0 means same as local_retention.

TASK [schedules : Setup standard snapshot schedule] *************************************************************************************************
task path: /Users/davedemlow/ansible_edge_playbooks/roles/schedules/tasks/main.yml:4
--- before
+++ after
@@ -8,7 +8,7 @@
         "remote_retention": 86400,
         "replication": true,
         "start": "2010-01-01 00:00:00",
-        "uuid": "92344fd7-3312-40f9-ab17-5dde86963bad"
+        "uuid": "9dba09cb-3196-4075-b8f5-a4a530b6c296"
       }
     ],
     "uuid": "4961e233-8200-408c-81f2-82bc1793999f"

changed: [192.168.1.246] => {
"ansible_facts": {
"discovered_interpreter_python": "/usr/local/bin/python3.10"
},
"changed": true,
"diff": {
"after": {
"name": "snap-daily-midnight",
"recurrences": [
{
"frequency": "FREQ=DAILY;INTERVAL=1",
"local_retention": 604800,
"name": "daily-midnight",
"remote_retention": 86400,
"replication": true,
"start": "2010-01-01 00:00:00",
"uuid": "9dba09cb-3196-4075-b8f5-a4a530b6c296"
}
],
"uuid": "4961e233-8200-408c-81f2-82bc1793999f"
},
"before": {
"name": "snap-daily-midnight",
"recurrences": [
{
"frequency": "FREQ=DAILY;INTERVAL=1",
"local_retention": 604800,
"name": "daily-midnight",
"remote_retention": 86400,
"replication": true,
"start": "2010-01-01 00:00:00",
"uuid": "92344fd7-3312-40f9-ab17-5dde86963bad"
}
],
"uuid": "4961e233-8200-408c-81f2-82bc1793999f"
}
},
"invocation": {
"module_args": {
"cluster_instance": {
"host": "https://192.168.1.246",
"password": "VALUE_SPECIFIED_IN_NO_LOG_PARAMETER",
"timeout": null,
"username": "VALUE_SPECIFIED_IN_NO_LOG_PARAMETER"
},
"name": "snap-daily-midnight",
"recurrences": [
{
"frequency": "FREQ=DAILY;INTERVAL=1",
"local_retention": 604800,
"name": "daily-midnight",
"remote_retention": 86400,
"start": "2010-01-01 00:00:00"
}
],
"state": "present"
}
},
"record": [
{
"name": "snap-daily-midnight",
"recurrences": [
{
"frequency": "FREQ=DAILY;INTERVAL=1",
"local_retention": 604800,
"name": "daily-midnight",
"remote_retention": 86400,
"replication": true,
"start": "2010-01-01 00:00:00",
"uuid": "9dba09cb-3196-4075-b8f5-a4a530b6c296"
}
],
"uuid": "4961e233-8200-408c-81f2-82bc1793999f"
}
]
}
META: role_complete for 192.168.1.246
META: ran handlers
META: ran handlers

PLAY RECAP ******************************************************************************************************************************************
192.168.1.246 : ok=1 changed=1 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0

attach_guest_tools_iso: true on vm create also requires operatingSystem: field = os_windows_server_2012

Currently the operatingSystem field is not exposed in the vm module and can't be set (it is available in the native REST API).

The simplest solution would be to add an operatingSystem field to the vm module and require the user to set both attach_guest_tools_iso AND operatingSystem correctly. Thinking about this further: even though the field doesn't really change any behavior (beyond the guest tools attach), it is useful for reporting and filtering. Some users may want to rely on this value to record which VMs are running Windows vs. Other (the only values the UI allows). The UI also allows it to be changed while the VM is powered off (likely one constraint the UI doesn't enforce).

[screenshot]

Additionally, the module might want to set operatingSystem to the Windows value automatically when attach_guest_tools_iso: true, since both are required to meet the user's desired state.
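
A hypothetical sketch of how the proposed field could be exposed on the vm module; the operating_system parameter and its coupling to attach_guest_tools_iso are assumptions, not current behavior:

- name: Create a Windows VM with the guest tools ISO attached (proposed usage)
  scale_computing.hypercore.vm:
    vm_name: win2012-demo
    state: present
    memory: "{{ '4 GB' | human_to_bytes }}"
    vcpu: 4
    attach_guest_tools_iso: true
    operating_system: os_windows_server_2012  # hypothetical parameter mapped to HyperCore operatingSystem
    disks:
      - type: virtio_disk
        disk_slot: 0
        size: "{{ '100 GB' | human_to_bytes }}"
    nics:
      - type: virtio
        vlan: 0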

Information about cluster availability during cluster management modules

The cluster is briefly unavailable while cluster management modules are running.
Should this be noted in the documentation of every cluster management module, like we currently do for cluster_instance?
Something like: "During the execution of this module other cluster management and virtual machine actions are briefly unavailable."

:lady_beetle: Bug: masked password appears in unrelated output

Describe the bug

If the password of the user running a playbook appears in an alert email address, VM name, etc., the portion of the output that contains the password is masked.

To Reproduce
Steps to reproduce the behavior:

  1. Create an alert email or VM whose name contains a user's password
  2. Use that user to read the created email/VM using the appropriate ansible module.
  3. The user's password will be masked in the output email/VM name

Expected behavior

The collection should not be masking passwords in unrelated content.

Screenshots

Here is an example playbook; the screenshot below shows example output.

---
- name: test masked password output
  hosts: all
  connection: local
  gather_facts: false
  environment:
    SC_HOST: "https://{{ inventory_hostname }}"
    SC_USERNAME: "{{ sc_username | default('someuser') }}"
    SC_PASSWORD: "{{ sc_password | default('password') }}"

  tasks:
    - name: "create alert email: [email protected]"
      scale_computing.hypercore.email_alert:
        email: [email protected]
        state: present
      register: alert_email

    - name: "Inspect email output"
      debug:
        msg: "Reported email address: '{{ alert_email.record.email }}'"

    - name: "create VM: the-password-manager-vm"
      scale_computing.hypercore.vm:
        state: present
        vm_name: the-password-manager-vm
        description: The name of this VM contains the test user's password
        tags:
          - mypasswordmanager
        vcpu: 2
        memory: "{{ '512 MB' | human_to_bytes }}"
        disks:
          - type: virtio_disk
            disk_slot: 0
            size: "{{ '100 GB' | human_to_bytes }}"
          - type: ide_cdrom
            disk_slot: 0
        nics:
          - type: virtio
      register: vm_created

    - name: "Inspect VM output"
      debug:
        msg:
          - "New VM name: {{ vm_created.record[0].vm_name }}"
          - "New VM tags: {{ vm_created.record[0].tags }}"
          - "New VM description: {{ vm_created.record[0].description }}"

[screenshot]

System Info (please complete the following information):

  • HyperCore Version: 9.2+
  • Ansible Version: 2.14.1
  • Collection Version: 1.4.0-dev (commit 0670cdc)

Additional context

This is concerning for a few reasons:

  • The password is masked in contexts that can be cross-referenced from other sources. This could reveal the user's password.
  • An affected user is unable to reference objects by the name reported by the collection because it is masked. Using the example above, attempting to delete the created email address using this task will silently pass because the email address with the masking characters does not exist:
    - name: "Delete created email"
      scale_computing.hypercore.email_alert:
        email: "{{ alert_email.record.email }}"
        state: absent

:lady_beetle: Bug: update_status_check.yml is not surviving upgrade node reboot

Describe the bug

My playbook for multi-node cluster updates uses this task to re-use the status check from the version_update_single_node role:

- name: use update check from sns update role  # has inner and outer retry loops
  ansible.builtin.import_role:
    name: scale_computing.hypercore.version_update_single_node
    tasks_from: update_status_check.yml

But it appears it doesn't survive the period when the node is actually down (no update response at all during a reboot). Do I need ignore_unreachable: true on the task above, or some kind of retry there? Or should the role be handling this? (Note the upgrade is still running when the error below is thrown.)

TASK [hypercore_version : apply desired version to cluster or SNS] *******************************************************************************************************
changed: [veb120a-01.lab.local]
Friday 18 August 2023 07:09:42 -0400 (0:00:04.309) 0:00:25.398 *********

TASK [scale_computing.hypercore.version_update_single_node : Increment version_update_single_node_retry_count] ***********************************************************
ok: [veb120a-01.lab.local]
Friday 18 August 2023 07:09:42 -0400 (0:00:00.063) 0:00:25.462 *********

TASK [scale_computing.hypercore.version_update_single_node : Pause before checking update status - checks will report FAILED-RETRYING until update COMPLETE/TERMINATED] ***
ok: [veb120a-01.lab.local -> localhost]
Friday 18 August 2023 07:10:43 -0400 (0:01:00.866) 0:01:26.329 *********
FAILED - RETRYING: [veb120a-01.lab.local]: Check update status - will report FAILED-RETRYING until update COMPLETE/TERMINATED (100 retries left).
FAILED - RETRYING: [veb120a-01.lab.local]: Check update status - will report FAILED-RETRYING until update COMPLETE/TERMINATED (99 retries left).
FAILED - RETRYING: [veb120a-01.lab.local]: Check update status - will report FAILED-RETRYING until update COMPLETE/TERMINATED (98 retries left).
FAILED - RETRYING: [veb120a-01.lab.local]: Check update status - will report FAILED-RETRYING until update COMPLETE/TERMINATED (97 retries left).
FAILED - RETRYING: [veb120a-01.lab.local]: Check update status - will report FAILED-RETRYING until update COMPLETE/TERMINATED (96 retries left).
FAILED - RETRYING: [veb120a-01.lab.local]: Check update status - will report FAILED-RETRYING until update COMPLETE/TERMINATED (95 retries left).
FAILED - RETRYING: [veb120a-01.lab.local]: Check update status - will report FAILED-RETRYING until update COMPLETE/TERMINATED (94 retries left).
FAILED - RETRYING: [veb120a-01.lab.local]: Check update status - will report FAILED-RETRYING until update COMPLETE/TERMINATED (93 retries left).
FAILED - RETRYING: [veb120a-01.lab.local]: Check update status - will report FAILED-RETRYING until update COMPLETE/TERMINATED (92 retries left).
FAILED - RETRYING: [veb120a-01.lab.local]: Check update status - will report FAILED-RETRYING until update COMPLETE/TERMINATED (91 retries left).
FAILED - RETRYING: [veb120a-01.lab.local]: Check update status - will report FAILED-RETRYING until update COMPLETE/TERMINATED (90 retries left).
FAILED - RETRYING: [veb120a-01.lab.local]: Check update status - will report FAILED-RETRYING until update COMPLETE/TERMINATED (89 retries left).

TASK [scale_computing.hypercore.version_update_single_node : Check update status - will report FAILED-RETRYING until update COMPLETE/TERMINATED] *************************
fatal: [veb120a-01.lab.local]: FAILED! => {"msg": "The conditional check 'version_update_single_node_update_status.record != None and (\n version_update_single_node_update_status.record.update_status == "COMPLETE" or\n version_update_single_node_update_status.record.update_status == "TERMINATING"\n)' failed. The error was: error while evaluating conditional (version_update_single_node_update_status.record != None and (\n version_update_single_node_update_status.record.update_status == "COMPLETE" or\n version_update_single_node_update_status.record.update_status == "TERMINATING"\n)): 'dict object' has no attribute 'record'"}
...ignoring

PLAY RECAP ***************************************************************************************************************************************************************
veb120a-01.lab.local : ok=13 changed=1 unreachable=0 failed=0 skipped=2 rescued=0 ignored=1

To Reproduce
Calling this role: https://github.com/ddemlow/ansible_edge_playbooks/blob/master/roles/hypercore_version/tasks/main.yml

Expected behavior

Update monitoring should continue through the entire cluster update, even while the node reboots.

Screenshots

If applicable, add screenshots to help explain your problem.

System Info (please complete the following information):

  • OS: [e.g. iOS]
  • HyperCore Version: 9.2
  • Ansible Version:
  • Collection Version current main

Additional context

Add any other context about the problem here.

CI tests on github missed one bug

PR #40 at commit 99f438f - CI tests passed, see https://github.com/ScaleComputing/HyperCoreAnsibleCollection/actions/runs/3248895324/jobs/5330654368.

Same CI tests on gitlab.xlab.si failed - https://gitlab.xlab.si/scale-ansible-collection/HyperCoreAnsibleCollection/-/pipelines/37700 and https://gitlab.xlab.si/scale-ansible-collection/HyperCoreAnsibleCollection/-/jobs/89096

Why? What is wrong with the GitHub CI setup?
Possible cause: the log contains "WARNING: Unable to determine context for the following test targets, they will be run on the target host: inventory, vm_git_issues".

Also, maybe we are interacting with a different HC3 version?

Raise exception within module

Check every module for exception handling.
We should not be checking string error messages; if the formatting is different it won't work.
cluster_shutdown; SMTP; oidc_config

changing tiering_priority on running VM shuts off VM and doesn't turn it back on in one pass

Changing tiering_priority on a running VM shuts off the VM (not necessary, as this operation can be done live) and doesn't turn it back on in one pass; a second playbook pass does turn it back on.

scale_computing.hypercore.vm_disk:
  vm_name: demo1
  items:
    # - name: CentOS-Stream-9-latest-x86_64-dvd1.iso
    #   disk_slot: 0
    #   type: ide_cdrom
    - disk_slot: 0
      type: virtio_disk
      size: "{{ '300 GB' | human_to_bytes }}"
      tiering_priority_factor: 1
  state: present

Unit test cleanup

Try to implement a .parametrize solution for the older unit tests, so we can remove most of them and have the input/output for VM, disks, and NICs in one place.
Suggestion: check conftest.py; maybe we can add it there and reuse it everywhere.

vm module fails to remove disk

Describe the bug

https://github.com/ScaleComputing/HyperCoreAnsibleCollection/actions/runs/5383096735/jobs/9769417342#step:9:49
vm module failed to remove disk from existing VM.

VM demo-vm was running, and had one extra disk:

https://10.5.11.201/rest/v1/VirDomain/8c6196be-ddb5-4357-9783-50869dc60969
[{"uuid":"8c6196be-ddb5-4357-9783-50869dc60969","nodeUUID":"3dcb0c96-f013-4ccc-b639-33605ea78c44","name":"demo-vm","description":"demo-vm","operatingSystem":"os_other","state":"RUNNING","desiredDisposition":"RUNNING","console":{"type":"VNC","ip":"10.5.11.201","port":5902,"keymap":"en-us"},"mem":1073741824,"numVCPU":2,"blockDevs":[{"uuid":"98d5727b-fef7-4c96-8634-6abd6d5ff6b7","virDomainUUID":"8c6196be-ddb5-4357-9783-50869dc60969","type":"VIRTIO_DISK","cacheMode":"WRITETHROUGH","capacity":10737418240,"allocation":0,"physical":0,"shareUUID":"","path":"scribe/98d5727b-fef7-4c96-8634-6abd6d5ff6b7","slot":0,"name":"","disableSnapshotting":false,"tieringPriorityFactor":0,"mountPoints":[],"createdTimestamp":1681294172,"readOnly":false},{"uuid":"3aad4f9d-71d0-4be7-bd1d-d0ad593d4789","virDomainUUID":"8c6196be-ddb5-4357-9783-50869dc60969","type":"VIRTIO_DISK","cacheMode":"NONE","capacity":645922816,"allocation":645922816,"physical":0,"shareUUID":"","path":"scribe/3aad4f9d-71d0-4be7-bd1d-d0ad593d4789","slot":1,"name":"","disableSnapshotting":false,"tieringPriorityFactor":0,"mountPoints":[],"createdTimestamp":1685248572,"readOnly":false}],"netDevs":[{"uuid":"7b111f0c-77a1-4840-b425-63046e63f995","virDomainUUID":"8c6196be-ddb5-4357-9783-50869dc60969","type":"VIRTIO","macAddress":"7C:4C:58:6B:36:32","vlan":10,"connected":true,"ipv4Addresses":[]}],"stats":[],"created":0,"modified":0,"latestTaskTag":{"taskTag":"36199","progressPercent":0,"state":"ERROR","formattedDescription":"Delete block device %@ for Virtual Machine %@","descriptionParameters":["3aad4f9d","demo-vm"],"formattedMessage":"Unable to delete block device from VM '%@': Still in use","messageParameters":["demo-vm"],"objectUUID":"8c6196be-ddb5-4357-9783-50869dc60969","created":1687925214,"modified":1687925276,"completed":1687925276,"sessionID":"d4fa7269-caa0-4a0a-b5d8-85d8601e93c4","nodeUUIDs":["3dcb0c96-f013-4ccc-b639-33605ea78c44"]},"tags":"Xlab","bootDevices":["98d5727b-fef7-4c96-8634-6abd6d5ff6b7"],"uiState":"RUNNING","snapUUIDs":[],"snapshotSerialNumber":0,"replicationUUIDs":[],"sourceVirDomainUUID":"","snapshotListSerialNumber":0,"snapshotScheduleUUID":"","machineType":"scale-7.2","cpuType":"clusterBaseline-7.3","snapshotAllocationBlocks":0,"guestAgentState":"UNAVAILABLE","lastSeenRunningOnNodeUUID":"3dcb0c96-f013-4ccc-b639-33605ea78c44","isTransient":false,"affinityStrategy":{"strictAffinity":false,"preferredNodeUUID":"3dcb0c96-f013-4ccc-b639-33605ea78c44","backupNodeUUID":""},"vsdUUIDsToDelete":[],"cloudInitData":{"userData":"","metaData":""}}]

https://10.5.11.201/rest/v1/TaskTag/36199
[{"taskTag":"36199","progressPercent":0,"state":"ERROR","formattedDescription":"Delete block device %@ for Virtual Machine %@","descriptionParameters":["3aad4f9d","demo-vm"],"formattedMessage":"Unable to delete block device from VM '%@': Still in use","messageParameters":["demo-vm"],"objectUUID":"8c6196be-ddb5-4357-9783-50869dc60969","created":1687925214,"modified":1687925276,"completed":1687925276,"sessionID":"d4fa7269-caa0-4a0a-b5d8-85d8601e93c4","nodeUUIDs":["3dcb0c96-f013-4ccc-b639-33605ea78c44"]}]

To Reproduce
Steps to reproduce the behavior:

  1. Run prepare-examples.yml to create a new demo-vm
  2. Add extra disk to demo-vm
  3. Start demo-vm
  4. Run prepare-examples.yml again
  5. See error

Expected behavior

The vm module should shut down demo-vm and remove the extra disk.

Also, the error message should include details: taskTag, formattedDescription, formattedMessage, etc.

Screenshots

If applicable, add screenshots to help explain your problem.

System Info (please complete the following information):

  • HyperCore Version: 9.2.13
  • Ansible Version: 1.15.1
  • Collection Version: branch testing-1.3.0 @ 97b654b

Additional context

vm_disk does not change tiering_priority on first run

My console output

(.venv) justin_cinkelj@jcpc:~/devel/scale-ansible-collection/ansible_collections/scale_computing/hypercore$ ansible-playbook -i localhost, examples/dd_a.yml -v
Using /home/justin_cinkelj/devel/scale-ansible-collection/ansible_collections/scale_computing/hypercore/ansible.cfg as config file
[WARNING]: running playbook inside collection scale_computing.hypercore

PLAY [Example iso_info module] *********************************************************************************************************************************

TASK [Clone vm security - if not present] **********************************************************************************************************************
changed: [localhost] => changed=true 
  ansible_facts:
    discovered_interpreter_python: /usr/bin/python3
  msg: Virtual machine - ubuntu20_04 - cloning complete to - security-xlab-test.

TASK [Security Vm disk desired configuration] ******************************************************************************************************************
changed: [localhost] => changed=true 
  record:
  - cache_mode: none
    disable_snapshotting: false
    disk_slot: 0
    iso_name: ''
    mount_points: []
    read_only: false
    size: 0
    tiering_priority_factor: 0
    type: ide_cdrom
    uuid: 51bb4342-a963-429b-889c-d708304ca43d
    vm_uuid: a9a5dbbc-d96b-48ff-986a-3aaea22e4e42
  - cache_mode: none
    disable_snapshotting: false
    disk_slot: 1
    iso_name: cloud-init-a9a5dbbc.iso
    mount_points: []
    read_only: false
    size: 1048576
    tiering_priority_factor: 0
    type: ide_cdrom
    uuid: 76804ec8-4346-4435-a55e-7559627acbe5
    vm_uuid: a9a5dbbc-d96b-48ff-986a-3aaea22e4e42
  - cache_mode: none
    disable_snapshotting: false
    disk_slot: 0
    iso_name: ''
    mount_points: []
    read_only: false
    size: 53687091200
    tiering_priority_factor: 4
    type: virtio_disk
    uuid: 01c49aa4-d303-440c-9f07-929751a484fb
    vm_uuid: a9a5dbbc-d96b-48ff-986a-3aaea22e4e42
  - cache_mode: none
    disable_snapshotting: false
    disk_slot: 1
    iso_name: ''
    mount_points: []
    read_only: false
    size: 107374182400
    tiering_priority_factor: 1
    type: virtio_disk
    uuid: 2ab0308c-7818-42d6-9c7a-b8f8fe2fc3f8
    vm_uuid: a9a5dbbc-d96b-48ff-986a-3aaea22e4e42
  vm_rebooted: false

TASK [Security Vm desired configuration and state] *************************************************************************************************************
changed: [localhost] => changed=true 
  vm_rebooted: false

PLAY RECAP *****************************************************************************************************************************************************
localhost                  : ok=3    changed=3    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   

(.venv) justin_cinkelj@jcpc:~/devel/scale-ansible-collection/ansible_collections/scale_computing/hypercore$ 
(.venv) justin_cinkelj@jcpc:~/devel/scale-ansible-collection/ansible_collections/scale_computing/hypercore$ 
(.venv) justin_cinkelj@jcpc:~/devel/scale-ansible-collection/ansible_collections/scale_computing/hypercore$ 
(.venv) justin_cinkelj@jcpc:~/devel/scale-ansible-collection/ansible_collections/scale_computing/hypercore$ ansible-playbook -i localhost, examples/dd_a.yml -v
Using /home/justin_cinkelj/devel/scale-ansible-collection/ansible_collections/scale_computing/hypercore/ansible.cfg as config file
[WARNING]: running playbook inside collection scale_computing.hypercore

PLAY [Example iso_info module] *********************************************************************************************************************************

TASK [Clone vm security - if not present] **********************************************************************************************************************
ok: [localhost] => changed=false 
  ansible_facts:
    discovered_interpreter_python: /usr/bin/python3
  msg: Virtual machine security-xlab-test already exists.

TASK [Security Vm disk desired configuration] ******************************************************************************************************************
changed: [localhost] => changed=true 
  record:
  - cache_mode: none
    disable_snapshotting: false
    disk_slot: 0
    iso_name: ''
    mount_points: []
    read_only: false
    size: 0
    tiering_priority_factor: 0
    type: ide_cdrom
    uuid: 51bb4342-a963-429b-889c-d708304ca43d
    vm_uuid: a9a5dbbc-d96b-48ff-986a-3aaea22e4e42
  - cache_mode: none
    disable_snapshotting: false
    disk_slot: 1
    iso_name: cloud-init-a9a5dbbc.iso
    mount_points: []
    read_only: false
    size: 1048576
    tiering_priority_factor: 0
    type: ide_cdrom
    uuid: 76804ec8-4346-4435-a55e-7559627acbe5
    vm_uuid: a9a5dbbc-d96b-48ff-986a-3aaea22e4e42
  - cache_mode: none
    disable_snapshotting: false
    disk_slot: 0
    iso_name: ''
    mount_points: []
    read_only: false
    size: 53687091200
    tiering_priority_factor: 4
    type: virtio_disk
    uuid: 01c49aa4-d303-440c-9f07-929751a484fb
    vm_uuid: a9a5dbbc-d96b-48ff-986a-3aaea22e4e42
  - cache_mode: none
    disable_snapshotting: false
    disk_slot: 1
    iso_name: ''
    mount_points: []
    read_only: false
    size: 107374182400
    tiering_priority_factor: 1
    type: virtio_disk
    uuid: 2ab0308c-7818-42d6-9c7a-b8f8fe2fc3f8
    vm_uuid: a9a5dbbc-d96b-48ff-986a-3aaea22e4e42
  vm_rebooted: false

TASK [Security Vm desired configuration and state] *************************************************************************************************************
ok: [localhost] => changed=false 
  vm_rebooted: false

PLAY RECAP *****************************************************************************************************************************************************
localhost                  : ok=3    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   

The playbook examples/dd_a.yml:

---
- name: Example module
  hosts: localhost
  connection: local
  gather_facts: false
  environment:
    MY_VAR: my_value
    # - SC_HOST: https://1.2.3.4
    # - SC_USERNAME: admin
    # - SC_PASSWORD: todo
  vars:
    site_name: xlab-test

  tasks:
  # ------------------------------------------------------
  # begin security vm configurations
    - name: Clone vm security - if not present
      scale_computing.hypercore.vm_clone:
        vm_name: security-{{ site_name }}
        tags:
          - xlab-demo
          - ansible
          - cloudinit
        source_vm_name: ubuntu20_04
        cloud_init:
          user_data: |
            #cloud-config
            password: "password"
            chpasswd: { expire: False }
            ssh_pwauth: True
            apt: {sources: {docker.list: {source: 'deb [arch=amd64] https://download.docker.com/linux/ubuntu $RELEASE stable', keyid: 9DC858229FC7DD38854AE2D88D81803C0EBFCD88}}}
            packages: [qemu-guest-agent, docker-ce, docker-ce-cli, docker-compose, unzip]
            bootcmd:
              - [ sh, -c, 'sudo echo GRUB_CMDLINE_LINUX="nomodeset" >> /etc/default/grub' ]
              - [ sh, -c, 'sudo echo GRUB_GFXPAYLOAD_LINUX="1024x768" >> /etc/default/grub' ]
              - [ sh, -c, 'sudo echo GRUB_DISABLE_LINUX_UUID=true >> /etc/default/grub' ]
              - [ sh, -c, 'sudo update-grub' ]
            runcmd:
              - [ systemctl, restart, --no-block, qemu-guest-agent ]
              - [ curl -s https://api.sc-platform.sc-platform.avassa.net/install | sudo sh -s -- -y -c  ]
            write_files:
            # configure docker daemon to be accessible remotely via TCP on socket 2375
            - content: |
                [Service]
                ExecStart=
                ExecStart=/usr/bin/dockerd -H unix:// -H tcp://0.0.0.0:2375
              path: /etc/systemd/system/docker.service.d/options.conf
          meta_data: |
            dsmode: local
            local-hostname: "security-{{ site_name }}"
      register: security
    #   notify:
    #     - pharmacy-created

    # - name: Flush handlers  #notifies handlers right away instead of at end of playbook
    #   meta: flush_handlers

    - name: Security Vm disk desired configuration
      scale_computing.hypercore.vm_disk:
        vm_name: security-{{ site_name }}
        items:
          - disk_slot: 0
            type: virtio_disk
            size: "{{ '50 GB' | human_to_bytes }}"
            tiering_priority_factor: 4
          - disk_slot: 1
            type: virtio_disk
            size: "{{ '100 GB' | human_to_bytes }}"
            tiering_priority_factor: 1
        state: present

    - name: Security Vm desired configuration and state
      scale_computing.hypercore.vm_params:
        vm_name: security-{{ site_name }}
        memory: "{{ '1 GB' | human_to_bytes }}"
        description: security server for {{ site_name }}
        tags:
          - xlab-demo
          - ansible
          - security
          - "{{ site_name }}"
        vcpu: 2
        power_state: start

On the 3rd run, the task "Security Vm disk desired configuration" does report changed=false as expected.

And original comment from Dave:

[12:33 PM](https://scalecomputing.slack.com/archives/C03NDHAJWEA/p1664793213941949)
issue above ^ has something to do with having the second disk … if I remove it everything is changed in one pass
  [12:49 PM](https://scalecomputing.slack.com/archives/C03NDHAJWEA/p1664794142801699)
well - seems to be issue only if I have second disk AND setting tiering_priority_factor which I know we are doing some work on… maybe add a test setting that on a second disk?

:lady_beetle: Bug: failure to delete alert emails if there are multiples with the same address

Describe the bug

The collection is unable to delete an alert email if a duplicate is present.

To Reproduce
Steps to reproduce the behavior:

  1. Start with a cluster containing multiple duplicate alert email targets
  2. Attempt to delete one using the email_alert module
  3. Delete fails

Expected behavior

The collection should be able to delete an alert email when there are duplicates

Screenshots

Example with duplicate emails
[screenshot]

Attempting to delete the email address with the following playbook results in a failure

---
- name: Delete alert email
  hosts: all
  connection: local
  gather_facts: false
  environment:
    SC_HOST: "https://{{ inventory_hostname }}"
    SC_USERNAME: "{{ sc_username }}"
    SC_PASSWORD: "{{ sc_password }}"

  tasks:
    - name: Delete alert email
      scale_computing.hypercore.email_alert:
        state: absent
        email: [email protected]

Failure message:
[screenshot]

System Info (please complete the following information):

  • OS: Pop!_OS 22.04
  • HyperCore Version: 9.2.22+
  • Ansible Version: 2.14.1
  • Collection Version 1.3.0

Additional context

VM power state not working

There is an issue where the VM power state is sometimes not set correctly to start when certain fields are being set or changed, such as the number of CPUs.

Sample playbook/task:

- name: retail-edge workload deployment playbook
  hosts: region1
  connection: ansible.builtin.local
  gather_facts: False
  tasks:
  # - name: Get cluster VM info
  #   scale_computing.hypercore.vm_info:
  #     cluster_instance:
  #       host: "https://{{inventory_hostname}}"
  #       username: "{{scale_user}}"
  #       password: "{{scale_pass}}"
  #   register: vm_info

  # - name: output the vm_info request results
  #   debug:
  #     var: vm_info   
      
  - name: ubuntu20_04 template - Ubuntu 20.04 - import if not present
    scale_computing.hypercore.vm_import:
      cluster_instance:
        host: "https://{{inventory_hostname}}"
        username: "{{scale_user}}"
        password: "{{scale_pass}}"
      vm_name: ubuntu20_04
      smb:
        server: "{{smbserver}}"
        path: "{{smbpath}}"
        username: "{{smbusername}}"
        password: "{{smbpassword}}"
#    ignore_errors: yes
    register: ubuntu20_04

  - name: protect ubuntu20_04 template from powering on
    scale_computing.hypercore.vm:
      cluster_instance:
        host: "https://{{inventory_hostname}}"
        username: "{{scale_user}}"
        password: "{{scale_pass}}"
      vm_name: ubuntu20_04      
      memory: "{{ '1 GB'|human_to_bytes}}"
      description: "{{inventory_hostname}}"
      tags:
        - ansible
        - template
        - demo
      vcpu: 1 # filed issue that this can't be set to zero here
      state: present
      power_state: stop
      disks:
      - type: virtio_disk
        disk_slot: 0
        size: "{{ '200 GB' | human_to_bytes }}"
      nics:
      - vlan: 0
        type: virtio
      # boot_devices:
      # - type: virtio_disk
      #  disk_slot: 0      
#could set node affinity to no nodes as well

  - name: clone vm demo1 - if not present
    scale_computing.hypercore.vm_clone:
      cluster_instance:   # question - is there a way to define this once per playbook vs. every task?
        host: "https://{{inventory_hostname}}"
        username: "{{scale_user}}"
        password: "{{scale_pass}}"
      vm_name: demo1
      tags:
        - demo
        - ansible
      source_vm_name: ubuntu20_04
      cloud_init:
        user_data: |
          #cloud-config
          apt_update: true
          apt_upgrade: true
          password: "{{scale_pass}}"
          chpasswd: { expire: False }
          ssh_pwauth: True
        meta_data: |
          dsmode: local
          network-interfaces: |
            auto lo
            iface lo inet loopback 
            iface ens3 inet static
              address 192.168.1.200
              netmask 255.255.255.0
              gateway 192.168.1.1
              dns-nameservers 8.8.8.8
          local-hostname: demo1 

#one issue to address - the cloud init clone above attaches an iso image with the cloud init info ... named first 8 char of vm uuid.iso ... not sure how to keep that attached when starting below?

  - name: clone vm demo2 - if not present
    scale_computing.hypercore.vm_clone:
      cluster_instance:
        host: "https://{{inventory_hostname}}"
        username: "{{scale_user}}"
        password: "{{scale_pass}}"
      vm_name: demo2
      tags:
        - demo
        - ansible      
      source_vm_name: ubuntu20_04
      cloud_init:
        user_data: |
          #cloud-config
          apt_update: true
          apt_upgrade: true
          password: "{{scale_pass}}"
          chpasswd: { expire: False }
          ssh_pwauth: True
        meta_data: |
          dsmode: local
          network-interfaces: |
            auto lo
            iface lo inet loopback 
            iface ens3 inet static
              address 192.168.1.201
              netmask 255.255.255.0
              gateway 192.168.1.1
              dns-nameservers 8.8.8.8
          local-hostname: demo2
    
  - name: verify vm demo1 desired configuration  
    scale_computing.hypercore.vm:
      cluster_instance:
        host: "https://{{inventory_hostname}}"
        username: "{{scale_user}}"
        password: "{{scale_pass}}"
      vm_name: demo1
      memory: "{{ '1 GB'|human_to_bytes}}"
      description: "{{inventory_hostname}}"
      vcpu: 2
      state: present
      power_state: start   #possible bug - right now the VM doesn't start first time this playbook executes ... it makes some config changes but doesn't start - if you run again it starts
      disks:
      # - type: ide_cdrom  #see comment above - how do I get the cloud init iso with correct name to attach before starting?
      #   disk_slot: 0
      #   iso_name: cloud-init-3534544e.iso
      - type: virtio_disk
        disk_slot: 0
        size: "{{ '200 GB' | human_to_bytes }}"
      nics:
      - vlan: 0
        type: virtio
      boot_devices:
      - type: virtio_disk
        disk_slot: 0

  - name: verify vm demo2 desired configuration
    scale_computing.hypercore.vm:
      cluster_instance:
        host: "https://{{inventory_hostname}}"
        username: "{{scale_user}}"
        password: "{{scale_pass}}"
      vm_name: demo2
      memory: "{{ '1 GB'|human_to_bytes}}"
      description: "{{inventory_hostname}}"
      vcpu: 2
      state: present
      power_state: start
      disks:
      - type: virtio_disk
        disk_slot: 0
        size: "{{ '200 GB' | human_to_bytes }}"
      nics:
      - vlan: 0
        type: virtio
      boot_devices:
      - type: virtio_disk
        disk_slot: 0

:rocket: Feature request: vm_info should provide data to identify replication target VMs

Is your feature request related to a problem? Please describe.

I would like to be able to identify which VMs are replication targets, in order to provide automation like cloning/starting for disaster recovery testing and declaration.

Also, replication target VMs can't be powered on without first cloning them to another name/UUID.

(In the VirDomain API these can be identified by the presence of sourceVirDomainUUID, which is the UUID of the VM on the replication source cluster; its name would be the same as the VM name on the target cluster.)

Describe the solution you'd like

Exposing sourceVirDomainUUID itself could be one option, or perhaps just expose it as a bool (replication_target: true), or both.

Describe alternatives you've considered

The current workaround would be to use the api module and look for sourceVirDomainUUID.
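
A rough sketch of that workaround, assuming the api module accepts action/endpoint as shown here and returns the raw VirDomain list under record (verify the exact parameter and return names against the module docs):

- name: List all VirDomain records via the raw API
  scale_computing.hypercore.api:
    action: get
    endpoint: /rest/v1/VirDomain
  register: virdomains

- name: Show replication target VMs (non-empty sourceVirDomainUUID)
  ansible.builtin.debug:
    msg: "{{ virdomains.record | selectattr('sourceVirDomainUUID', 'ne', '') | map(attribute='name') | list }}"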

Additional context

Add any other context or screenshots about the feature request here.

example of replication target VMs in UI
[screenshot]

:lady_beetle: Bug: ISO upload

Describe the bug

Uploading an ISO image fails (the ISO must not already be on the server) with msg: 'Received invalid JSON response: b''success'''

To Reproduce

# remove the ISO 
# main at b3706ce75c1d430372cdccd1ddc8730e26086402
ansible-playbook -i localhost, examples/iso.yml -e iso_remove_old_image=true
TASK [(Optionally) remove existing ISO TinyCore-current.iso from HyperCore] **********************************************************************************
changed: [localhost]

TASK [Upload ISO TinyCore-current.iso to HyperCore] **********************************************************************************************************
fatal: [localhost]: FAILED! => changed=false 
  msg: 'Received invalid JSON response: b''success'''

Bisection - main@ aeb4eab works, main@ b71efe3 does not.

Expected behavior

ISO should be uploaded

System Info (please complete the following information):

  • OS: Fedora 36
  • HyperCore Version: 9.2.13 at https://10.5.11.50/
  • Ansible Version: 2.13.4
  • Collection Version - main branch

Additional context

Failure is at

# plugins/modules/iso.py
     with open(module.params["source"], "rb") as source_file:
         rest_client.put_record(
             endpoint="/rest/v1/ISO/%s/data" % iso_uuid,

Something like https://github.com/ScaleComputing/HyperCoreAnsibleCollection/tree/bug-iso might be a fix.
From memory, ISO (and virtual disks in near future) are the only objects where we upload binary data.
I'm not sure if we can have something nicer than that commit. What I really do not like is that:

  • we return a TaskTag with invalid values inside. Maybe None would be better?
  • json.JSONDecodeError was not intercepted; we got json.decoder.JSONDecodeError instead. This might depend on the Ansible version. Better to use except (json.JSONDecodeError, json.decoder.JSONDecodeError):?

:rocket: Feature request: `vm_snapshot` module should support duplicate label names

Is your feature request related to a problem? Please describe.

The vm_snapshot module does not currently support creating a VM snapshot with a label that already exists.

Describe the solution you'd like

As a user, I should be able to create a VM snapshot with the vm_snapshot module using a label even if a snapshot with that label already exists on the target VM.

Describe alternatives you've considered

This can be done with the API module.

vm module (possibly others) can't set vcpu to 0

To prevent a "template" VM such as a cloud image from being accidentally powered on and booted (rendering it no longer a good template), a workaround has been to set vcpu to 0 using the API, and then, after cloning from the template, set the desired number of vCPUs on the resulting VM. I attempted to use this approach in a playbook, and setting vcpu to 0 fails.
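
A hedged sketch of that API-level workaround, assuming the api module supports a patch action against /rest/v1/VirDomain/{uuid} and that vm_info returns records with a uuid field (the numVCPU field name comes from the REST output quoted elsewhere in this tracker):

- name: Look up the template VM
  scale_computing.hypercore.vm_info:
    vm_name: ubuntu20_04
  register: template_vm

- name: Set vcpu to 0 via the raw API so the template cannot be booted (workaround)
  scale_computing.hypercore.api:
    action: patch
    endpoint: "/rest/v1/VirDomain/{{ template_vm.records.0.uuid }}"
    data:
      numVCPU: 0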

enlarging virtual disk is triggering VM restart cycle

Virtual disks can be enlarged while VMs are running in the HyperCore UI and REST API. Collection version 0.1.0 first powers the VM off to enlarge the disk, then powers it back on, but this operation can be done live (as can virtual disk create/delete and tiering priority changes).
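
A minimal repro sketch, reusing the vm_disk example style from elsewhere in this tracker; growing an already-present virtio disk on a running VM should not require the power cycle described above:

- name: Grow an existing virtio disk on a running VM (expected to be a live operation)
  scale_computing.hypercore.vm_disk:
    vm_name: demo1
    items:
      - disk_slot: 0
        type: virtio_disk
        size: "{{ '400 GB' | human_to_bytes }}"
    state: present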

:lady_beetle: Bug: error in scale_computing.hypercore.version_update_single_node : Wait until VMs shutdown

Describe the bug

A clear and concise description of what the bug is.

To Reproduce

The playbook is pretty simple (the full playbook can be provided):

- name: Update HyperCore single-node system to a desired version
  include_role:
    name: scale_computing.hypercore.version_update_single_node
  vars:
    version_update_single_node_desired_version: "{{ hypercore_desired_version }}"

TASK [scale_computing.hypercore.version_update_single_node : Wait until VMs shutdown] ************************************************************************************************************************
task path: /Users/davedemlow/ansible_collections/scale_computing/hypercore/roles/version_update_single_node/tasks/shutdown_vms.yml:14
Monday 31 July 2023 12:39:50 -0400 (0:00:00.039) 0:00:10.780 ***********
fatal: [192.168.1.246.nip.io]: FAILED! => {
"msg": "Unexpected templating type error occurred on ({{ range(0, (version_update_single_node_shutdown_wait_time / 10.0) | round(0, 'ceil') | int) | list }}): unsupported operand type(s) for /: 'str' and 'float'"
}

I am testing against a single node running 9.2.17.211525 from current "main"

Expected behavior

A clear and concise description of what you expected to happen.

Screenshots

If applicable, add screenshots to help explain your problem.

System Info (please complete the following information):

  • OS: [e.g. iOS]
  • HyperCore Version:
  • Ansible Version:
  • Collection Version [e.g. 22]

Additional context

Add any other context about the problem here.

:rocket: Feature request: add &name value to support_tunnel modules

Is your feature request related to a problem? Please describe.

Beginning in HyperCore 9.2.16 and 9.1.24, changes to support tunnels are logged to the cluster log along with the logged-in user name where applicable.

Although Ansible uses an endpoint that does not require auth, it would be desirable to allow the module to pass a user name that would be noted in the cluster log for support tunnel opens and closes. This can be done by adding &name="{{ name }}" to the end of the URI, for example https://clusterip/support-api/close?&name=dave and support-api/open?code=3112&name=dave

Example of cluster logging:

[screenshot]

Describe the solution you'd like

provide an optional parameter to pass a name
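
A hypothetical sketch of the proposed usage, assuming code/state are the module's existing parameters; the name parameter does not exist on support_tunnel yet and is shown only for illustration (the code value comes from the example URI above):

- name: Open support tunnel and record who opened it (proposed usage)
  scale_computing.hypercore.support_tunnel:
    state: present
    code: 3112
    name: dave  # hypothetical parameter, appended as &name= to the support-api URI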

Describe alternatives you've considered

Additional context

Add any other context or screenshots about the feature request here.

I have also tried including &name= on older versions; it appears it is simply ignored there but still takes the desired action, so I don't believe any version check would be required.

:flying_saucer: Release version v1.3.0

The new release version is v1.3.0

Release steps

Please complete these steps in order.

  • Update the top-level README.md with links to new modules (run ./docs/helpers/generate_readme_fragment.py)
  • Update the version number in galaxy.yml e.g. 1.1.1
  • Update the changelog with antsibull-changelog e.g. antsibull-changelog release --version 1.1.1
  • Tag the commit with the version number prefixed with 'v' e.g. v1.1.1
  • Deploy new documentation using gh-pages - "Deploy static content to Pages"

Update CI to test HyperCore 9.3.x

HyperCore version 9.3.1 has been released for restricted availability. The CI testing matrix needs to be updated to include this new version.

vm_disk does not report changed=true when ISO is detached

From .../ansible_collections/scale_computing/hypercore/tests/integration/targets/vm_disk/tasks/main.yml:

    - name: Detach ISO image from the disk
      scale_computing.hypercore.vm_disk:
        vm_name: vm-integration-test-disks
        items:
          - disk_slot: 0
            type: ide_cdrom
        state: absent
      register: result
    - ansible.builtin.assert:
        that:
          - result is succeeded
          # - result is changed
          - result.record.0.iso_name == ""

The result should be changed, but it is not.

socket.timeout: The read operation timed out when API response takes too long - impacting integration test

Seen randomly on "busy" clusters with more VMs; most common on import/export operations, which have some additional steps (network access to the SMB server).

from internal scaleUI.log
2022/12/13 15:21:59.907 info rest.js:87 | Path: /VirDomainSnapshotSchedule, Method: GET, Request Body: {}, Status: 200 Response: 0.819s
2022/12/13 15:22:11.625 info rest.js:87 | Path: /VirDomain/37fa1108-126e-4d71-a981-5076b80c1819/export, Method: POST, Request Body: {"target":{"pathURI":"smb://remotedc;administrator:[email protected]/data/integration-test-vm-export","definitionFileName":"my_file.xml"},"template":{}}, Status: 200 Response: 11.711s

This results in the following failure from the Ansible perspective (the export did actually happen):

TASK [vm_export : Export XLAB-export-test-integration to SMB] *****************************************************************************************************************************
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: socket.timeout: The read operation timed out
fatal: [testhost]: FAILED! => {"changed": false, "module_stderr": "Traceback (most recent call last):\n File "", line 121, in \n File "", line 113, in _ansiballz_main\n File "", line 61, in invoke_module\n File "/usr/lib/python3.8/runpy.py", line 207, in run_module\n return _run_module_code(code, init_globals, run_name, mod_spec)\n File "/usr/lib/python3.8/runpy.py", line 97, in _run_module_code\n _run_code(code, mod_globals, init_globals,\n File "/usr/lib/python3.8/runpy.py", line 87, in _run_code\n exec(code, run_globals)\n File "/tmp/ansible_scale_computing.hypercore.vm_export_payload_mp5b3alq/ansible_scale_computing.hypercore.vm_export_payload.zip/ansible_collections/scale_computing/hypercore/plugins/modules/vm_export.py", line 165, in \n File "/tmp/ansible_scale_computing.hypercore.vm_export_payload_mp5b3alq/ansible_scale_computing.hypercore.vm_export_payload.zip/ansible_collections/scale_computing/hypercore/plugins/modules/vm_export.py", line 158, in main\n File "/tmp/ansible_scale_computing.hypercore.vm_export_payload_mp5b3alq/ansible_scale_computing.hypercore.vm_export_payload.zip/ansible_collections/scale_computing/hypercore/plugins/modules/vm_export.py", line 98, in run\n File "/tmp/ansible_scale_computing.hypercore.vm_export_payload_mp5b3alq/ansible_scale_computing.hypercore.vm_export_payload.zip/ansible_collections/scale_computing/hypercore/plugins/module_utils/vm.py", line 512, in export_vm\n File "/tmp/ansible_scale_computing.hypercore.vm_export_payload_mp5b3alq/ansible_scale_computing.hypercore.vm_export_payload.zip/ansible_collections/scale_computing/hypercore/plugins/module_utils/rest_client.py", line 53, in create_record\n File "/tmp/ansible_scale_computing.hypercore.vm_export_payload_mp5b3alq/ansible_scale_computing.hypercore.vm_export_payload.zip/ansible_collections/scale_computing/hypercore/plugins/module_utils/client.py", line 143, in post\n File "/tmp/ansible_scale_computing.hypercore.vm_export_payload_mp5b3alq/ansible_scale_computing.hypercore.vm_export_payload.zip/ansible_collections/scale_computing/hypercore/plugins/module_utils/client.py", line 134, in request\n File "/tmp/ansible_scale_computing.hypercore.vm_export_payload_mp5b3alq/ansible_scale_computing.hypercore.vm_export_payload.zip/ansible_collections/scale_computing/hypercore/plugins/module_utils/client.py", line 84, in _request\n File "/tmp/ansible_scale_computing.hypercore.vm_export_payload_mp5b3alq/ansible_scale_computing.hypercore.vm_export_payload.zip/ansible/module_utils/urls.py", line 1446, in open\n File "/usr/lib/python3.8/urllib/request.py", line 222, in urlopen\n return opener.open(url, data, timeout)\n File "/usr/lib/python3.8/urllib/request.py", line 525, in open\n response = self._open(req, data)\n File "/usr/lib/python3.8/urllib/request.py", line 542, in _open\n result = self._call_chain(self.handle_open, protocol, protocol +\n File "/usr/lib/python3.8/urllib/request.py", line 502, in _call_chain\n result = func(*args)\n File "/tmp/ansible_scale_computing.hypercore.vm_export_payload_mp5b3alq/ansible_scale_computing.hypercore.vm_export_payload.zip/ansible/module_utils/urls.py", line 582, in https_open\n File "/usr/lib/python3.8/urllib/request.py", line 1358, in do_open\n r = h.getresponse()\n File "/usr/lib/python3.8/http/client.py", line 1348, in getresponse\n response.begin()\n File "/usr/lib/python3.8/http/client.py", line 316, in begin\n version, status, reason = self._read_status()\n File "/usr/lib/python3.8/http/client.py", line 277, in _read_status\n line = 
str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")\n File "/usr/lib/python3.8/socket.py", line 669, in readinto\n return self._sock.recv_into(b)\n File "/usr/lib/python3.8/ssl.py", line 1241, in recv_into\n return self.read(nbytes, buffer)\n File "/usr/lib/python3.8/ssl.py", line 1099, in read\n return self._sslobj.read(len, buffer)\nsocket.timeout: The read operation timed out\n", "module_stdout": "", "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error", "rc": 1}

PLAY RECAP ********************************************************************************************************************************************************************************
testhost : ok=5 changed=1 unreachable=0 failed=1 skipped=0 rescued=0 ignored=0

NOTICE: To resume at this test target, use the option: --start-at vm_export
NOTICE: To resume after this test target, use the option: --start-at vm_git_issues
ERROR: Command "ansible-playbook vm_export-7k9oh07a.yml -i inventory" returned exit status 2.

This time it worked: 7.9 seconds (again from scaleUI.log).
2022/12/13 15:27:17.020 info rest.js:87 | Path: /VirDomain/10baeb4f-aa6f-49ef-a09d-05421cb1f8ea/export, Method: POST, Request Body: {"target":{"pathURI":"smb://remotedc;administrator:[email protected]/data/integration-test-vm-export","definitionFileName":"my_file.xml"},"template":{}}, Status: 200 Response: 7.903s

Create / delete named VM snapshot

As a user, I would like to be able to take and delete a named VM snapshot via Ansible.

I may, for example, want to take a snapshot before applying VM changes, either at the HyperCore level or inside the VM (such as applying updates), and then later delete the snapshot after confirming the operation was successful.

(See also issue #10, which would allow cloning from specific named snapshots.)
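
A sketch of the desired workflow using the vm_snapshot module mentioned earlier in this tracker, assuming vm_name/label/state parameters (verify the names against the module docs):

- name: Take a named snapshot before applying updates
  scale_computing.hypercore.vm_snapshot:
    vm_name: demo1
    label: pre-update
    state: present

- name: Delete the snapshot after the update is confirmed successful
  scale_computing.hypercore.vm_snapshot:
    vm_name: demo1
    label: pre-update
    state: absent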

clone VM from arbitrary snapshot

As a user, I configure VM replication. Then I have multiple snapshots available on the remote replication cluster.

The vm_clone module currently creates the new VM from one of the available snapshots (the latest one, I believe).

The vm_clone module should support cloning an arbitrary fully replicated snapshot. The user should be able to select the snapshot.
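
A hypothetical illustration of snapshot selection on vm_clone; the source_snapshot_label parameter is invented here purely to show the idea:

- name: Clone a DR test VM from a specific replicated snapshot (proposed usage)
  scale_computing.hypercore.vm_clone:
    vm_name: demo1-dr-test
    source_vm_name: demo1
    source_snapshot_label: pre-update  # hypothetical parameter selecting which snapshot to clone from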

:lady_beetle: Bug: version_update_single_node role isn't waiting for the update to actually start and concludes it's done

Describe the bug

A clear and concise description of what the bug is.

To Reproduce

https://github.com/ScaleComputing/HyperCoreAnsibleCollection/blob/main/roles/version_update_single_node/tasks/main.yml
The role needs to wait for the update to actually start before the initial status check.

There are probably better solutions available, but adding a 60-second wait was sufficient in my case (others might need a longer wait, or a more advanced comparison to tell that the update actually started, or at least attempted to start and then terminated).

----------------- UPDATE --------------------

- name: Update single-node system
  scale_computing.hypercore.version_update:
    icos_version: "{{ version_update_single_node_desired_version }}"
  register: version_update_single_node_update_result

- name: wait 60 seconds
  wait_for:
    timeout: 60

- name: Check update status
  ansible.builtin.include_tasks: update_status_check.yml

- name: Show update result
  ansible.builtin.debug:
    var: version_update_single_node_update_result

Expected behavior

A clear and concise description of what you expected to happen.

Screenshots

If applicable, add screenshots to help explain your problem.

System Info (please complete the following information):

  • OS: [e.g. iOS]
  • HyperCore Version:
  • Ansible Version:
  • Collection Version [e.g. 22]

Additional context

Add any other context about the problem here.

US07 - OIDC Connect Config

Create, list, update and delete OIDC connect config.
Single playbook example to create Azure AD config and set up hypercore cluster.

  • OIDC action module
  • OIDC info module
  • Add documentation fragment
  • Add Example playbook (create Azure AD config and set up cluster)
  • Skipping DELETE action, since it's not implemented on API side + blank values don't work (NOT NULL CONSTRAINT)

add support for VM machine types

I will need to clarify the machine type options available in 9.1: BIOS for sure (and it is the default), but I believe UEFI is also available even in 9.1 (this is also exposed in the UI as boot type when creating a VM).

And when 9.2 is released (and in the latest internal build, 9.2.5), vTPM+UEFI will also become available.

From the 9.2 REST API docs: machineType (string) is the Scale "hardware" version. Possible values include:

scale-7.2          | BIOS
scale-8.10         | UEFI
scale-uefi-tpm-9.2 | vTPM+UEFI (Experimental)

There may be some related changes beyond just setting the machine type, such as the NVRAM virtual disk used to store UEFI config; possibly the same for vTPM.
[screenshot]

disks: type: ide_cdrom - size should not be required

Since the user won't know the size of the uploaded ISO, and it can't be changed, requiring it is just a complication.

Further, it appears that for now you can just use any arbitrary size other than 0.

:lady_beetle: Bug: hypercore 9.3 new machine types cause error

Describe the bug

fatal: [192.168.0.230]: FAILED! => {"ansible_facts": {"discovered_interpreter_python": "/usr/local/bin/python3.10"}, "changed": false, "msg": "Virtual machine: ubtpmcompat-nogpu has an invalid Machine type: scale-uefi-tpm-compatible-9.3."}
fatal: [vlb04a-01.lab.local]: FAILED! => {"ansible_facts": {"discovered_interpreter_python": "/usr/local/bin/python3.10"}, "changed": false, "msg": "Virtual machine: davetestuefi has an invalid Machine type: scale-uefi-9.3."}

To Reproduce
Tested against an internal 9.3.0 release with this machine type; simply listing VMs throws the error:

- name: List all VMs
  scale_computing.hypercore.vm_info:
  register: all_vms

valid 9.3 machine types
Firmware Version: 9.3.0.212125 ()
Current Logged In Users:
root pts/0 2023-09-12 13:18 (10.100.23.241)
[192.168.18.35 (1a5fb91d) ~ 13:18:50]# sc vmmachinetypes show
scale-bios-9.3
scale-7.2
scale-5.4
scale-8.10
scale-uefi-9.3
scale-uefi-tpm-compatible-9.3
scale-bios-lsi-9.2
scale-uefi-tpm-9.3
scale-6.4
scale-uefi-tpm-9.2

Expected behavior

List virtual machines as-is. If possible, allowing new/unexpected machine types, at least for viewing, may be more flexible going forward.

Screenshots

If applicable, add screenshots to help explain your problem.

System Info (please complete the following information):

  • OS: [e.g. iOS]
  • HyperCore Version: 9.3.0
  • Ansible Version:
  • Collection Version [e.g. 22]

Additional context

Add any other context about the problem here.

integration test or ci test actions should cleanup replication on target cluster

Deleting a replicated VM by design does NOT delete the VM on the target cluster, so CI testing accumulates VMs on the target cluster.

We need to create some automation to clean that up.

It would be safe to assume that HyperCore cluster credentials are the same on the replication target cluster, and that it is accessible from the CI runner so that the replication target VM could be cleaned up automatically. Given that, we should be able to get everything needed to delete the VM on the replication target cluster.

track VM replication progress

As a user, I configure VM replication. Then I expect I can recover VM from replica.

In reality, HC3 can take a long time (minutes) before the 1st snapshot is fully replicated.
Enhancement: snapshot_info or some similar module should be able to report actual replication progress, maybe:

  • how many snapshots are not yet replicated
  • what the age of the most recent fully replicated snapshot is.
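
A hypothetical usage sketch of what this could enable, assuming a snapshot info module exposed a per-snapshot replication status field; neither the field name nor the behaviour below exists in the collection today:

- name: Wait until at least one snapshot is fully replicated (hypothetical replication_status field)
  scale_computing.hypercore.vm_snapshot_info:
    vm_name: demo-vm
  register: snaps
  until: snaps.records | selectattr('replication_status', 'defined') | selectattr('replication_status', 'eq', 'complete') | list | length > 0
  retries: 30
  delay: 60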

:rocket: Feature request: shutdown vm's with specific tags before cluster update

Is your feature request related to a problem? Please describe.

This is somewhat of a corner case for now (in some cases testing potential future capabilities), but I have clusters with specific VMs that can't live-migrate to other nodes in the cluster, causing cluster rolling updates to fail if those VMs are running. Examples include VMs running certain nested virtualization stacks (like nested HyperCore nodes) or using device passthrough such as GPU or USB. It could even apply when strict affinity allows a VM to run only on a specific node, whether due to licensing restrictions or because failover is not required or desired, for example a Kubernetes node in a multi-node Kubernetes cluster.

Describe the solution you'd like

Some way to automate the shutdown of those VMs before the rest of the cluster update is initiated.

Describe alternatives you've considered

Add a task or role (either standalone or integrated into the existing update roles) that loops through running VMs matching a specified tag (or possibly tags?), goes through the shutdown process (initiate ACPI shutdown, wait for a timeout, force off if needed), and then continues; see the sketch below.
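
A rough sketch of that loop, assuming vm_info reports tags and power_state per record and that vm_params drives the ACPI-then-force shutdown via force_reboot and shutdown_timeout; the tag name is made up:

- name: Find all VMs
  scale_computing.hypercore.vm_info:
  register: all_vms

- name: Shut down running VMs tagged "no-livemigrate" before starting the cluster update
  scale_computing.hypercore.vm_params:
    vm_name: "{{ item.vm_name }}"
    power_state: shutdown       # ACPI shutdown first ...
    force_reboot: true          # ... force off if the guest does not power down in time
    shutdown_timeout: 300
  loop: "{{ all_vms.records | selectattr('power_state', 'eq', 'started') | selectattr('tags', 'contains', 'no-livemigrate') | list }}"
  loop_control:
    label: "{{ item.vm_name }}"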

Additional context

really just looking for help / best practice for including this in existing update roles / examples or creating new example

snapshot_schedule module is not idempotent

Reported by @ddemlow

I added the following task … the schedule gets created and appears fine, but shows changed every time I run the playbook. Any ideas?

  - name: Setup snapshot schedule
    scale_computing.hypercore.snapshot_schedule:
      name: snap-daily-midnight
      state: present
      recurrences:
        - name: daily-midnight
          frequency: "FREQ=DAILY;INTERVAL=1"  # RFC-2445
          start: "2010-01-01 00:00:00"
          local_retention: "{{ 7*24*60*60 }}"  # 7 days, unit seconds
          remote_retention: "{{ 1*24*60*60 }}" # optional, None or 0 means same as local_retention.

-vvv

TASK [Setup snapshot schedule] ***********************************************************************************************************************
task path: /Users/davedemlow/ansible_collections/Edge_Demo_Playbook.yml:107
changed: [192.168.1.242] => changed=true 
  diff:
    after:
      name: snap-daily-midnight
      recurrences:
      - frequency: FREQ=DAILY;INTERVAL=1
        local_retention: 604800
        name: daily-midnight
        remote_retention: 86400
        replication: true
        start: '2010-01-01 00:00:00'
        uuid: df188982-627f-4a06-a671-64c9d4ca974b
      uuid: 82a9995d-7d8d-4161-8063-241934f2bccc
    before:
      name: snap-daily-midnight
      recurrences:
      - frequency: FREQ=DAILY;INTERVAL=1
        local_retention: 604800
        name: daily-midnight
        remote_retention: 86400
        replication: true
        start: '2010-01-01 00:00:00'
        uuid: 9dad1c64-57ac-4900-b124-1aecacb72dfd
      uuid: 82a9995d-7d8d-4161-8063-241934f2bccc
  invocation:
    module_args:
      cluster_instance:
        host: https://192.168.1.242/
        password: VALUE_SPECIFIED_IN_NO_LOG_PARAMETER
        timeout: null
        username: VALUE_SPECIFIED_IN_NO_LOG_PARAMETER
      name: snap-daily-midnight
      recurrences:
      - frequency: FREQ=DAILY;INTERVAL=1
        local_retention: 604800
        name: daily-midnight
        remote_retention: 86400
        start: '2010-01-01 00:00:00'
      state: present
  record:
  - name: snap-daily-midnight
    recurrences:
    - frequency: FREQ=DAILY;INTERVAL=1
      local_retention: 604800
      name: daily-midnight
      remote_retention: 86400
      replication: true
      start: '2010-01-01 00:00:00'
      uuid: df188982-627f-4a06-a671-64c9d4ca974b
    uuid: 82a9995d-7d8d-4161-8063-241934f2bccc

Seems like the integration test for snapshot_schedule lacks an idempotence test (note that the diff above differs only in the recurrence uuid, which suggests the recurrence is recreated on every run). We need to test 2 cases:

  • whole snapshot schedule is not present, module creates it on first invocation, module does not modify snapshot schedule a second time
  • whole snapshot schedule is present, module adds (or removes) a recurrence rule on first invocation, module does not modify snapshot schedule a second time
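
A minimal sketch of the missing idempotence check for the first case, reusing the schedule from this report; the final assert is the part the integration test currently lacks (a YAML anchor keeps the two invocations identical):

- name: Create snapshot schedule (first run, expect changed)
  scale_computing.hypercore.snapshot_schedule: &schedule
    name: snap-daily-midnight
    state: present
    recurrences:
      - name: daily-midnight
        frequency: "FREQ=DAILY;INTERVAL=1"
        start: "2010-01-01 00:00:00"
        local_retention: "{{ 7*24*60*60 }}"
  register: first_run

- name: Create snapshot schedule again (expect no change)
  scale_computing.hypercore.snapshot_schedule: *schedule
  register: second_run

- name: Assert the second run reported no change
  ansible.builtin.assert:
    that:
      - first_run is changed
      - second_run is not changed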

integration test vm_clone : set XLAB-vm_clone-CI_test-running to power_state running - fatal

TASK [vm_clone : set XLAB-vm_clone-CI_test-running to power_state running] *****
fatal: [testhost]: FAILED! => {"changed": false, "msg": "There was a problem during this task execution."}

This is on a 9.2.8 cluster (it runs fine against 9.1.18).

The issue appears to be that memory is set to an invalid size: the playbook sets mem: 512100100, which the UI displays as
image

vm start fails

image

If I change it in the UI to a round number like 488 MiB, the VM starts fine.
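
One possible fix for the test data, sketched under the assumption that vm_params can update the memory of the test VM; human_to_bytes yields a clean power-of-two value instead of the odd 512100100:

- name: Set XLAB-vm_clone-CI_test-running memory to a size HyperCore 9.2 accepts
  scale_computing.hypercore.vm_params:
    vm_name: XLAB-vm_clone-CI_test-running
    memory: "{{ '512 MB' | human_to_bytes }}"   # 536870912 bytes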

vm_replication_info module fails if both snapshot schedule and replication are unconfigured.

To reproduce, create a fresh VM; the snapshot schedule and replication will both be unconfigured. Then invoke vm_replication_info for it; it fails with:

(.venv) [justin_cinkelj@jcdell hypercore]$ ansible-playbook -i localhost, -e vm_name=test-tiering examples/vm_replication_info.yml
[WARNING]: running playbook inside collection scale_computing.hypercore

PLAY [Show replication settings for a specific VM] *******************************************************************************************************************************************************************************************

TASK [Get replication info for VM test-tiering] **********************************************************************************************************************************************************************************************
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: IndexError: list index out of range
fatal: [localhost]: FAILED! => changed=false 
  ansible_facts:
    discovered_interpreter_python: /usr/bin/python3
  module_stderr: |-
    Traceback (most recent call last):
      File "/home/justin_cinkelj/.ansible/tmp/ansible-tmp-1664568208.2079709-14900-75182818547833/AnsiballZ_vm_replication_info.py", line 107, in <module>
        _ansiballz_main()
      File "/home/justin_cinkelj/.ansible/tmp/ansible-tmp-1664568208.2079709-14900-75182818547833/AnsiballZ_vm_replication_info.py", line 99, in _ansiballz_main
        invoke_module(zipped_mod, temp_path, ANSIBALLZ_PARAMS)
      File "/home/justin_cinkelj/.ansible/tmp/ansible-tmp-1664568208.2079709-14900-75182818547833/AnsiballZ_vm_replication_info.py", line 47, in invoke_module
        runpy.run_module(mod_name='ansible_collections.scale_computing.hypercore.plugins.modules.vm_replication_info', init_globals=dict(_module_fqn='ansible_collections.scale_computing.hypercore.plugins.modules.vm_replication_info', _modlib_path=modlib_path),
      File "/usr/lib64/python3.10/runpy.py", line 209, in run_module
        return _run_module_code(code, init_globals, run_name, mod_spec)
      File "/usr/lib64/python3.10/runpy.py", line 96, in _run_module_code
        _run_code(code, mod_globals, init_globals,
      File "/usr/lib64/python3.10/runpy.py", line 86, in _run_code
        exec(code, run_globals)
      File "/tmp/ansible_scale_computing.hypercore.vm_replication_info_payload_mzqi50e6/ansible_scale_computing.hypercore.vm_replication_info_payload.zip/ansible_collections/scale_computing/hypercore/plugins/modules/vm_replication_info.py", line 110, in <module>
      File "/tmp/ansible_scale_computing.hypercore.vm_replication_info_payload_mzqi50e6/ansible_scale_computing.hypercore.vm_replication_info_payload.zip/ansible_collections/scale_computing/hypercore/plugins/modules/vm_replication_info.py", line 103, in main
      File "/tmp/ansible_scale_computing.hypercore.vm_replication_info_payload_mzqi50e6/ansible_scale_computing.hypercore.vm_replication_info_payload.zip/ansible_collections/scale_computing/hypercore/plugins/modules/vm_replication_info.py", line 76, in run
    IndexError: list index out of range
  module_stdout: ''
  msg: |-
    MODULE FAILURE
    See stdout/stderr for the exact error
  rc: 1

A VM with both snapshot schedule and replication configured was not a problem.
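
For reference, the failing call reduces to a single task like the one below; it only fails while the VM has neither a snapshot schedule nor replication configured:

- name: Get replication info for a freshly created VM
  scale_computing.hypercore.vm_replication_info:
    vm_name: test-tiering
  register: replication_info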
