csmart / ansible-role-virt-infra
Define and manage guests and networks on a KVM host with Ansible
License: GNU General Public License v3.0
Hello,
I get this error when I run the command "./run.sh --limit kvmhost,simple"
my simple.yml:
TASK [ansible-role-virt-infra : Define VM] *****************************************************************************
task path: /appli/virt-infra-ansible/roles/ansible-role-virt-infra/tasks/virt-create.yml:3
skipping: [localhost] => {"changed": false, "skip_reason": "Conditional result was False"}
FAILED - RETRYING: Define VM (10 retries left).
FAILED - RETRYING: Define VM (9 retries left).
FAILED - RETRYING: Define VM (8 retries left).
FAILED - RETRYING: Define VM (7 retries left).
FAILED - RETRYING: Define VM (6 retries left).
FAILED - RETRYING: Define VM (5 retries left).
FAILED - RETRYING: Define VM (4 retries left).
FAILED - RETRYING: Define VM (3 retries left).
FAILED - RETRYING: Define VM (2 retries left).
FAILED - RETRYING: Define VM (1 retries left).
fatal: [simple-centos-8-1 -> localhost]: FAILED! => {"attempts": 10, "changed": true, "cmd": "set -o pipefail && virt-install --import --connect qemu:///system --cpu host-passthrough --controller type=scsi,model=virtio-scsi,index=0 --disk /opt/kvm/images/simple-centos-8-1-boot.qcow2,serial=boot,boot_order=1,format=qcow2,bus=scsi --disk /opt/kvm/images/simple-centos-8-1-cloudinit.iso,device=cdrom,bus=scsi,format=iso --channel unix,target_type=virtio,name=org.qemu.guest_agent.0 --graphics spice --machine q35 --name simple-centos-8-1 --network network=default,mac=52:54:00:74:46:75,model=virtio --noreboot --noautoconsole # --events on_poweroff=preserve,on_reboot=restart --os-type linux # --memory 1024,maxmemory=1024 --memory 2048 --rng /dev/urandom --serial pty --sound none --vcpus 1,maxvcpus=1 --virt-type kvm\n", "delta": "0:00:01.059934", "end": "2022-02-13 21:02:16.343805", "msg": "non-zero return code", "rc": 1, "start": "2022-02-13 21:02:15.283871", "stderr": "ERROR \n--memory amount in MiB is required", "stderr_lines": ["ERROR ", "--memory amount in MiB is required"], "stdout": "", "stdout_lines": []}
Regards
It would be good to be able to define specific network settings where necessary and have cloud-init configure them. This probably means enabling support for network config v2 in cloud-init.
In the inventory, one could add additional configuration settings to a VM's network. We already have mac, so we could add others like IP address, etc.
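As a rough illustration of the idea above, the following Python sketch turns per-NIC inventory settings into a cloud-init network config v2 structure. The input keys `mac` and `ip`, the `nicN` identifiers, and the function name are hypothetical and are not part of the role; the sketch only handles IPv4-style static addressing and falls back to DHCP when no address is given.

```python
import ipaddress
import json


def network_config_v2(nics):
    """Build a cloud-init network config v2 dict from a list of NIC settings.

    Each NIC is a dict with a "mac" key and an optional "ip" key in CIDR
    notation. Both key names are assumptions made for this sketch.
    """
    ethernets = {}
    for index, nic in enumerate(nics):
        # Match on the MAC address so the entry does not depend on the
        # interface name the guest OS happens to choose.
        entry = {"match": {"macaddress": nic["mac"].lower()}}
        if "ip" in nic:
            # ip_interface raises ValueError on a malformed address or prefix.
            address = ipaddress.ip_interface(nic["ip"])
            entry["addresses"] = [address.with_prefixlen]
            entry["dhcp4"] = False
        else:
            entry["dhcp4"] = True
        ethernets[f"nic{index}"] = entry
    return {"version": 2, "ethernets": ethernets}


example = network_config_v2(
    [
        {"mac": "52:54:00:74:46:75", "ip": "192.168.122.10/24"},
        {"mac": "52:54:00:AA:BB:CC"},
    ]
)
# JSON is valid YAML, so this output could be written to a network-config file.
print(json.dumps(example, indent=2))
```

Matching on the MAC address keeps the config independent of interface naming, which differs between distros.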
When using --limit to target a hypervisor other than the first one in the kvmhost list in the inventory, the validation check fails.
Rather than setting an arbitrary default of 20G, set the root disk to the size of the source disk.
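One possible way to derive the default is to read the source image's virtual size from the JSON that `qemu-img info --output=json` prints. The sketch below only parses such output; the sample JSON is made up for illustration, and the function name and rounding to whole GiB are assumptions, not the role's behaviour.

```python
import json

GIB = 1024 ** 3


def root_disk_size_gib(qemu_img_info_json):
    """Return the image's virtual size in whole GiB, rounded up."""
    info = json.loads(qemu_img_info_json)
    size_bytes = int(info["virtual-size"])
    # Integer ceiling division avoids floating point rounding surprises.
    return (size_bytes + GIB - 1) // GIB


# Made-up sample of what `qemu-img info --output=json IMAGE` could return.
sample = '{"virtual-size": 10737418240, "format": "qcow2"}'
print(f"{root_disk_size_gib(sample)}G")
```

Rounding up means the root disk is never smaller than the source image.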
Default path is PATH=/sbin:/bin:/usr/sbin:/usr/bin and so vbmc commands fail.
failed: [localhost] (item=testing-opensuse-15) => {"ansible_loop_var": "item", "attempts": 10, "changed": true, "cmd": "vbmc add testing-opensuse-15 --libvirt-uri qemu:///system --port 62331 --username admin --password password\n", "delta": "0:00:00.001564", "end": "2020-08-16 15:48:12.140335", "item": "testing-opensuse-15", "msg": "non-zero return code", "rc": 127, "start": "2020-08-16 15:48:12.138771", "stderr": "/bin/bash: vbmc: command not found", "stderr_lines": ["/bin/bash: vbmc: command not found"], "stdout": "", "stdout_lines": []}
We might need to get the path of vbmc and call it directly, or modify the path when running vbmc commands.
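The first option (find the full path of vbmc and call it directly) could look roughly like this in Python. The extra directories searched here (`~/.local/bin` and `/usr/local/bin`) are guesses at where vbmc might be installed, and `find_command` is a hypothetical helper.

```python
import os
import shutil


def find_command(name, extra_dirs=()):
    """Return the full path of an executable, searching PATH plus extra_dirs.

    Returns None when the command cannot be found anywhere.
    """
    base_path = os.environ.get("PATH", os.defpath)
    search_path = os.pathsep.join([base_path, *extra_dirs])
    return shutil.which(name, path=search_path)


# Assumed install locations; adjust to wherever vbmc actually lives.
candidates = [os.path.expanduser("~/.local/bin"), "/usr/local/bin"]
vbmc_path = find_command("vbmc", extra_dirs=candidates)
print(vbmc_path if vbmc_path else "vbmc not found")
```

Calling the resolved absolute path avoids depending on the restricted default PATH at all.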
It'd be good if we can specify a range of VLANs with OVS, as we might want a trunk so that any number of VMs can be on that network and send tagged traffic.
If we have a base image that already has cloud-init and qemu-guest-agent installed, there's no need to run that task. I've also noticed that openSUSE often fails this step even though the packages are already installed, apparently when it tries to refresh the repository cache in order to update them. Add a way to skip these on a per-VM basis.
Failed to cache repo (1).", "Warning: Skipping repository 'Update Repository (Non-Oss)' because of the above error.", "Some of the repositories have not been refreshed because of an error.", "Loading repository data...", "Reading installed packages...", "'cloud-init' is already installed.", "No update candidate for 'cloud-init-19.4-lp151.2.18.1.x86_64'. The highest available version is already installed.", "'qemu-guest-agent' is already installed.", "No update candidate for 'qemu-guest-agent-3.1.1.1-lp151.7.12.1.x86_64'. The highest available version is already installed.", "Resolving package dependencies...", "", "Nothing to do.",
Some guests don't need a working boot disk. Guests such as those for OpenStack TripleO also need their first NIC to be controlled by the undercloud, which will set them to PXE boot as part of the introspection process. Thus, this should support a way to have skeleton guests, where we can ensure they are defined and started OK, but don't need to validate that they boot and get their IP address, etc.
Currently we only create or delete networks, but if we want to create one and it already exists, then we could maybe use the modify state instead, to be able to make changes to the networks on the fly with Ansible.
When setting up Ubuntu guests (tested with 20.04), the guest's hostname is not set properly:
ansible 4.4.0 (core 2.11.4)
tests.yml :
tests:
  hosts:
    # simple-centos-7-[0:2]:
    #   ansible_python_interpreter: /usr/bin/python
    test-[0:1]:
      ansible_python_interpreter: /usr/bin/python
      virt_infra_state: running
      virt_infra_distro: ubuntu
      virt_infra_distro_image: ubuntu-20.04-server-cloudimg-amd64.img
  vars:
running tests.yml:
ansible-playbook virt-infra.yml --limit kvmhost,tests
/etc/hosts is correctly modified, but /etc/hostname should also be modified (ref: https://askubuntu.com/a/1343976).
By the way, great and impressive work!
vbmc lets you add a port > 65536 but you can't start it... so add a simple validation check
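A minimal sketch of such a check, in Python. It assumes the usual TCP/UDP port range of 1 to 65535; the function name is hypothetical and how the role would surface the error is not decided here.

```python
def validate_port(port):
    """Return the port unchanged if it is a usable port number (1-65535)."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(port, bool) or not isinstance(port, int):
        raise TypeError(f"port must be an integer, got {type(port).__name__}")
    if not 1 <= port <= 65535:
        raise ValueError(f"port {port} is outside the valid range 1-65535")
    return port


for candidate in (62331, 65536):
    try:
        print(candidate, "ok" if validate_port(candidate) else "")
    except ValueError as err:
        print(candidate, "rejected:", err)
```

Running the check before `vbmc add` would catch the bad value up front instead of at start time.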
Information I'd like to see per VM:
1. VBMC IP, port and credentials
2. Redfish URL, and credentials
3. all MAC addresses in order of interfaces, possibly labeled eth0, eth1, etc. (the real names will depend on the OS)
4. storage information, disk and size
Hi!
Any chance of supporting hard disk images in raw format, in addition to qcow?
When creating new ~/.ssh/config and ~/.ssh/known_hosts, the files are created with permission 0755.
Permission should be 0600.
$ ls -l
total 32
-rw------- 1 wyntre wyntre 1195 May 4 16:31 authorized_keys
-rwxr-xr-x 1 wyntre wyntre 717 May 5 20:09 config
-rw------- 1 wyntre wyntre 1679 Mar 19 08:46 id_rsa
-rw-r--r-- 1 wyntre wyntre 395 Mar 19 08:46 id_rsa.pub
-rwxr-xr-x 1 wyntre wyntre 15792 May 5 20:09 known_hosts
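The intended behaviour can be sketched in Python as follows: create the file with mode 0600 if it is missing, and tighten the mode if it already exists, without touching its contents. This is only an illustration of the desired outcome, not the role's task; `ensure_private_file` is a hypothetical helper, and the demo runs in a temporary directory rather than a real ~/.ssh.

```python
import os
import stat
import tempfile


def ensure_private_file(path):
    """Make sure path exists with mode 0600, preserving existing content."""
    try:
        # O_EXCL makes creation fail if the file is already there, so an
        # existing file is never truncated or overwritten.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
        os.close(fd)
    except FileExistsError:
        pass
    # chmod covers both pre-existing files and a permissive or strict umask.
    os.chmod(path, 0o600)


with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("config", "known_hosts"):
        target = os.path.join(demo_dir, name)
        ensure_private_file(target)
        print(name, oct(stat.S_IMODE(os.stat(target).st_mode)))
```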
Make it possible to have a source when creating data disks, this can be useful for sharing sets of data between VMs.
VMs that have nvram defined (outside of this role, for example sushy tools) fail to be undefined. Simply add another check to remove nvram for any VM before undefine to fix this.
Variables virt_infra_architecture and virt_infra_domainname are checked and defined only in the file defaults-set.yml.
Are they used somewhere else later in the pipeline, or can this file be safely removed?
Loading distro-specific vars, e.g. vars/centos.yml, overrides any existing vars that have been set on the command line or as a part of the host inventory. For example, virt_infra_security_driver. Ideally these should only be loaded if a hostvar doesn't exist.
[16:25 csmart ~]$ virt-install
ERROR
--os-variant/--osinfo OS name is required, but no value was
set or detected.
This is now a fatal error. Specifying an OS name is required
for modern, performant, and secure virtual machine defaults.
You can see a full list of possible OS name values with:
virt-install --osinfo list
If your Linux distro is not listed, try one of generic values
such as: linux2020, linux2018, linux2016
If you just need to get the old behavior back, you can use:
--osinfo detect=on,require=off
Or export VIRTINSTALL_OSINFO_DISABLE_REQUIRE=1
Hi,
I'm trying to use the role to set up KVM plus some VMs (as it is intended).
I'm using a remote host as the target, which is connected to as user "ansible", while I am logged in locally as user "USER".
ansible_user: ansible
$ whoami
USER
Add Host to SSH config fails
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: PermissionError: [Errno 13] Permission denied: b'/home/ansible/.ansible/tmp/ansible-moduletmp-1610370013.9305496-fk8n7156/tmp048j2a7t' -> b'/home/USER/.ssh/config'
failed: [remotehost -> x.x.x.x] (item=example-ubuntu-focal) => {"ansible_loop_var": "item", "changed": false, "item": "example-ubuntu-focal", "msg": "The destination directory (/home/USER/.ssh) is not writable by the current user. Error was: [Errno 13] Permission denied: b'/home/USER/.ssh/.ansible_tmpgqbw_lcnconfig'"}
This permission issue affects roles/ansible-role-virt-infra/tasks/wait.yml (ssh-fingerprint) and roles/ansible-role-virt-infra/tasks/hosts_add.yml (ssh-key addition).
This seems to be easily fixable by changing become: false to become: true for the add host task. While this is easy for me, I am not quite sure whether the result is wanted, or whether the lookup has to be changed to use the "ansible_user" and the home directory of that user.
When adding the key to the known hosts, the owner must also be set.
Clearing a list appears to be a jinja2 >= 2.11 feature, so it fails on other distros.
TASK [ansible-role-virt-infra : Define VM] ************************************************************************************************************************************************************************
skipping: [localhost]
fatal: [ceph-0]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'list object' has no attribute 'clear'\n\nThe error appears to be in '/home/csmart/virt-infra-ansible/roles/ansible-role-virt-infra/tasks/virt-create.yml': line 3, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n# Define the VM, unless it doesn't yet exist and is set to \"undefined\"\n- name: Define VM\n ^ here\n"}
Find another way to do this, perhaps pop in a loop or something.
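The "pop in a loop" idea, sketched in plain Python rather than in the role's Jinja2 template. It empties the list in place using only `pop()`, so it does not rely on `list.clear()`. Whether an equivalent loop is convenient to express in the template itself has not been checked here.

```python
def clear_by_pop(items):
    """Empty a list in place without calling list.clear()."""
    while items:
        items.pop()
    return items


disks = ["boot", "data", "cloudinit"]
alias = disks  # a second reference to the same list object
clear_by_pop(disks)
print(disks, alias)  # both names see the emptied list
```

Because the list is emptied in place, every reference to it observes the change, which matches what `clear()` would have done.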
On newer versions of virt-install (e.g. Fedora 36):
Starting install...
ERROR unsupported configuration: qemu driver doesn't support the 'preserve' action for 'on_reboot'/'on_poweroff'
SSH fails to start on CentOS Stream 8 VMs. This seems to be because of two things:
The sshd-keygen.target is not run because cloud-init is enabled:
[root@swift-01 ~]# cat /etc/systemd/system/sshd-keygen\@.service.d/disable-sshd-keygen-if-cloud-init-active.conf
# In some cloud-init enabled images the sshd-keygen template service may race
# with cloud-init during boot causing issues with host key generation. This
# drop-in config adds a condition to sshd-keygen@.service if it exists and
# prevents the sshd-keygen units from running *if* cloud-init is going to run.
#
[Unit]
ConditionPathExists=!/run/systemd/generator.early/multi-user.target.wants/cloud-init.target
But cloud-init isn't actually creating the host keys because of a config change:
[root@swift-01 ~]# diff -Nurd /etc/cloud/cloud.cfg /etc/cloud/cloud.cfg.rpmnew
--- /etc/cloud/cloud.cfg 2021-06-03 18:10:45.162000000 +1000
+++ /etc/cloud/cloud.cfg.rpmnew 2022-04-30 17:06:20.000000000 +1000
@@ -7,7 +7,7 @@
mount_default_fields: [~, ~, 'auto', 'defaults,nofail,x-systemd.requires=cloud-init.service', '0', '2']
resize_rootfs_tmp: /dev
ssh_deletekeys: 1
-ssh_genkeytypes: ~
+ssh_genkeytypes: ['rsa', 'ecdsa', 'ed25519']
syslog_fix_perms: ~
disable_vmware_customization: false
@@ -54,7 +54,7 @@
system_info:
default_user:
- name: centos
+ name: cloud-user
lock_passwd: true
gecos: Cloud User
groups: [adm, systemd-journal]
This is because cloud-init was actually updated as a part of the disk prep step that makes sure it is installed:
https://github.com/csmart/ansible-role-virt-infra/blob/master/tasks/disk-create.yml#L202
Therefore, we need to either get smarter about installing cloud-init and not install the latest version of it (but it's probably good to install the latest version), or we need to make sure that the config wasn't changed since the initial RPM was installed (so that the config is overwritten and .rpmnew is never created), or we need to add a post-install task to run commands in the disk (so that users can manage rpmconf)... Something like that.
Not all VMs might want their disks to be backed by the base image, so support an option to copy the image instead. This will also help with support for RAW in #38
I prefer to not create these entries, since I have an internal DNS server that takes care of DHCP and will take care of resolving the hostnames appropriately when the guest comes online.
I will add two flags to make this optional (default to true, as is the current behavior)
Specify a disk as shared so that it can be attached to multiple VMs. This means that the shared option needs to be set (to enable correct SELinux labeling), and also that the disk name must not be based on the inventory hostname, as it would then not be the same across VMs.
Hi again :-),
I have an inventory with multiple kvmhosts:
kvmhost:
hosts:
host0:
[...]
host1:
[...]
They are also members of another inventory file, for the "which" (following the guidelines at https://docs.ansible.com/ansible/latest/user_guide/intro_patterns.html#intro-patterns):
prod:
hosts:
host1:
dev:
hosts:
host0:
This results in errors if I use the role for the prod environment, since there are a ton of
hostvars[groups['kvmhost'][0]]
lookups throughout the role and kvmhost[0] is host0, which does not exist in prod.
This also invalidates the checks for any multi-kvmhost scenario.
I think there should be only very small issues when removing most of the lookups and just using the named facts you register.
Since there are a whopping 140 occurrences of that pattern, I am not sure whether the role was ever intended to handle multiple KVM hosts, and whether a PR implementing this would be welcome.
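To make the problem concrete, here is a small Python sketch of the selection logic that a fixed `[0]` index lacks: pick the first member of the kvmhost group that is actually part of the current run. The function name is hypothetical, and the host lists simply mirror the inventory in this report; in a playbook the second list would presumably come from something like Ansible's `ansible_play_hosts`.

```python
def pick_kvmhost(kvmhost_group, play_hosts):
    """Return the first KVM host that is also targeted by the current run."""
    targeted = set(play_hosts)
    for host in kvmhost_group:
        if host in targeted:
            return host
    raise LookupError("no host from the kvmhost group is in the current run")


kvmhost_group = ["host0", "host1"]

# Run limited to prod, where only host1 is present.
print(pick_kvmhost(kvmhost_group, ["host1"]))

# A fixed index would have selected host0 here, which is not in the run.
print(kvmhost_group[0])
```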
As part of adding support for IPv6, it is also desirable to add support for forward mode="route", since IPv6 does not require NAT and, in that case, the user will probably be interested in forward mode="route".