ibm / ibm-spectrum-scale-install-infra

Spectrum Scale Installation and Configuration

License: Apache License 2.0

Python 94.92% Jinja 3.66% Shell 1.42%
spectrum-scale gpfs ansible ansible-role ibm-spectrum-scale ibm-storage-scale storage-scale

ibm-spectrum-scale-install-infra's Introduction

Important: You are viewing the main branch of this repository. If you've previously used the master branch in your own playbooks then you will need to make some changes in order to switch to the main branch. See MIGRATING.md for details.


IBM Storage Scale (GPFS) Deployment using Ansible Roles

Ansible project with multiple roles for installing and configuring IBM Storage Scale (GPFS) software defined storage.

Table of Contents

Features

Minimal tested infrastructure configurations

  • Pre-built infrastructure (using a static inventory file)
  • Dynamic inventory file

OS support

  • Support for RHEL 7 on x86_64, PPC64 and PPC64LE
  • Support for RHEL 8 on x86_64 and PPC64LE
  • Support for Ubuntu 20 on x86_64 and PPC64LE
  • Support for SLES 15 on x86_64 and PPC64LE

Common prerequisites

  • Disable SELinux (scale_prepare_disable_selinux: true; default: false)
  • Disable firewall (scale_prepare_disable_firewall: true; default: false); see the example after this list
  • Install and start NTP
  • Create /etc/hosts mappings
  • Open firewall ports
  • Generate SSH keys
  • User must set up base OS repositories
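
The optional preparation steps above are controlled through variables. A minimal sketch of opting in via group variables (both toggles default to false, as noted in the list):

# group_vars/all:
---
scale_prepare_disable_selinux: true
scale_prepare_disable_firewall: true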

Core IBM Storage Scale prerequisites

  • Install yum-utils package
  • Install gcc-c++, kernel-devel, make
  • Install elfutils, elfutils-devel (RHEL 8 specific)

Core IBM Storage Scale Cluster features

  • Install core IBM Storage Scale packages on Linux nodes
  • Install IBM Storage Scale license package on Linux nodes
  • Compile or install pre-compiled Linux kernel extension (mmbuildgpl)
  • Configure client and server license
  • Assign default quorum nodes (maximum 7) if the user has not defined them in the inventory
  • Assign default manager nodes (all nodes act as manager nodes) if the user has not defined them in the inventory
  • Create new cluster (mmcrcluster -N /var/mmfs/tmp/NodeFile -C {{ scale_cluster_clustername }})
  • Create cluster with profiles
  • Create cluster with daemon and admin network
  • Add new node into existing cluster
  • Configure node classes
  • Define configuration parameters based on node classes
  • Configure NSDs and file system (see the sketch after this list)
  • Configure NSDs without file system
  • Add NSDs
  • Add disks to existing file system
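
The NSD and file system items above are driven by the scale_storage variable. A minimal sketch, reusing variable names that appear in the group_vars examples later in this document (all values are illustrative):

# group_vars/all:
---
scale_storage:
  - filesystem: gpfs01
    defaultMountPoint: /mnt/gpfs01
    disks:
      - device: /dev/sdb
        nsd: nsd_1
        servers: scale01
        usage: dataAndMetadata
        pool: system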

IBM Storage Scale Management GUI features

  • Install IBM Storage Scale management GUI packages on designated GUI nodes (see the snippet after this list)
  • A maximum of 3 GUI nodes can be configured
  • Install performance monitoring sensor packages on all Linux nodes
  • Install performance monitoring collector on all designated GUI nodes
  • Configure performance monitoring and collectors
  • Configure HA federated mode collectors
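
GUI nodes are designated per host. A minimal sketch using a host variable (the scale_cluster_gui name is taken from the issue discussion further below):

# host_vars/scale01:
---
scale_cluster_gui: true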

IBM Storage Scale Call Home features

  • Install IBM Storage Scale Call Home packages on all cluster nodes
  • Configure Call Home

IBM Storage Scale CES (SMB and NFS) Protocol supported features

  • Install IBM Storage Scale SMB or NFS on selected cluster nodes (5.0.5.2 and above)
  • Install IBM Storage Scale Object on selected cluster nodes (5.1.1.0 and above)
  • CES IPV4 or IPV6 support
  • CES interface mode support

Minimal tested Versions

The following Ansible versions are tested:

The following IBM Storage Scale versions are tested:

  • 5.0.4.0 and above
  • 5.0.5.2 and above for CES (SMB and NFS)
  • 5.1.1.0 and above for CES (Object)
  • Refer to the Release Notes for details

Specific OS requirements:

  • For CES (SMB/NFS) on SLES15: Python 3 is required.
  • For CES (Object): Red Hat Enterprise Linux 8.x is required.

Prerequisites

Users need a basic understanding of Ansible concepts to follow these instructions. Refer to the Ansible User Guide if this is new to you.

  • Install Ansible on any machine (control node)

    $ curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
    $ python get-pip.py
    $ pip3 install ansible==2.9

    Refer to the Ansible Installation Guide for detailed installation instructions.

    Note that Python 3 is required for certain functionality of this project to work. Ansible should automatically detect and use Python 3 on managed machines, refer to the Ansible documentation for details and workarounds.
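
    If the wrong interpreter is picked up on a managed node, you can pin it explicitly. A minimal sketch using a group variable (the interpreter path is an assumption and depends on your distribution):

    # group_vars/all:
    ansible_python_interpreter: /usr/bin/python3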

  • Download IBM Storage Scale packages

  • Create password-less SSH keys between all nodes in the cluster

    A prerequisite for installing IBM Storage Scale is that password-less SSH must be configured among all nodes in the cluster. Password-less SSH must be configured and verified with the FQDN, hostname, and IP address of every node to every node.

    Example:

    $ ssh-keygen
    $ ssh-copy-id -oStrictHostKeyChecking=no node1.gpfs.net
    $ ssh-copy-id -oStrictHostKeyChecking=no node1
    $ ssh-copy-id -oStrictHostKeyChecking=no <IP address of node1>

    Repeat this process for all nodes to themselves and to all other nodes.

Installation Instructions

  • Create project directory on Ansible control node

    The preferred way of accessing the roles provided by this project is by placing them inside the collections/ansible_collections/ibm/spectrum_scale directory of your project, adjacent to your Ansible playbook. Simply clone the repository to the correct path:

    $ mkdir my_project
    $ cd my_project
    $ git clone -b main https://github.com/IBM/ibm-spectrum-scale-install-infra.git collections/ansible_collections/ibm/spectrum_scale

    Be sure to clone the project under the correct subdirectory:

    my_project/
    ├── collections/
    │   └── ansible_collections/
    │       └── ibm/
    │           └── spectrum_scale/
    │               └── ...
    ├── hosts
    └── playbook.yml
  • Create Ansible inventory

    Define IBM Storage Scale nodes in the Ansible inventory (e.g. hosts) in the following format:

    # hosts:
    [cluster01]
    scale01  scale_cluster_quorum=true   scale_cluster_manager=true
    scale02  scale_cluster_quorum=true   scale_cluster_manager=true
    scale03  scale_cluster_quorum=true   scale_cluster_manager=false
    scale04  scale_cluster_quorum=false  scale_cluster_manager=false
    scale05  scale_cluster_quorum=false  scale_cluster_manager=false

    The above is just a minimal example. It defines Ansible variables directly in the inventory. There are other ways to define variables, such as host variables and group variables.

    Numerous variables are available which can be defined in either way to customize the behavior of the roles. Refer to VARIABLES.md for a full list of all supported configuration options.
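
    For example, the inline definitions above could equally be placed in a host variables file. A minimal sketch, equivalent to the scale01 line in the inventory above:

    # host_vars/scale01:
    scale_cluster_quorum: true
    scale_cluster_manager: true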

  • Create Ansible playbook

    The basic Ansible playbook (e.g. playbook.yml) looks as follows:

    # playbook.yml:
    ---
    - hosts: cluster01
      collections:
        - ibm.spectrum_scale
      vars:
        - scale_install_localpkg_path: /path/to/Spectrum_Scale_Standard-5.0.4.0-x86_64-Linux-install
      roles:
        - core_prepare
        - core_install
        - core_configure
        - core_verify

    Again, this is just a minimal example. There are different installation methods available, each offering a specific set of options.

    Refer to VARIABLES.md for a full list of all supported configuration options.

  • Run the playbook to install and configure the IBM Storage Scale cluster

    • Using the ansible-playbook command:

      $ ansible-playbook -i hosts playbook.yml
    • Using the automation script:

      $ cd samples/
      $ ./ansible.sh

      Note: An advantage of using the automation script is that it generates log files in the /tmp directory, named with the date and time of the run.

  • Playbook execution screen

    Playbook execution starts here:

    $ ./ansible.sh
    Running #### ansible-playbook -i hosts playbook.yml
    
    PLAY #### [cluster01]
    **********************************************************************************************************
    
    TASK #### [Gathering Facts]
    **********************************************************************************************************
    ok: [scale01]
    ok: [scale02]
    ok: [scale03]
    ok: [scale04]
    ok: [scale05]
    
    TASK [common : check | Check Spectrum Scale version]
    *********************************************************************************************************
    ok: [scale01]
    ok: [scale02]
    ok: [scale03]
    ok: [scale04]
    ok: [scale05]
    
    ...

    Playbook recap:

    #### PLAY RECAP
    ***************************************************************************************************************
    scale01                 : ok=0   changed=65    unreachable=0    failed=0    skipped=0   rescued=0    ignored=0
    scale02                 : ok=0   changed=59    unreachable=0    failed=0    skipped=0   rescued=0    ignored=0
    scale03                 : ok=0   changed=59    unreachable=0    failed=0    skipped=0   rescued=0    ignored=0
    scale04                 : ok=0   changed=59    unreachable=0    failed=0    skipped=0   rescued=0    ignored=0
    scale05                 : ok=0   changed=59    unreachable=0    failed=0    skipped=0   rescued=0    ignored=0

Optional Role Variables

Users can define variables to override default values and customize behavior of the roles. Refer to VARIABLES.md for a full list of all supported configuration options.

Additional functionality can be enabled by defining further variables. Browse the examples in the samples/ directory to learn how to enable it.

Available Roles

The following roles are available for you to reuse when assembling your own playbook:

  • Core GPFS (roles/core_*)
  • GUI (roles/gui_*)
  • SMB (roles/smb_*)
  • NFS (roles/nfs_*)
  • Object (roles/obj_*)
  • HDFS (roles/hdfs_*)
  • Call Home (roles/callhome_*)
  • File Audit Logging (roles/fal_*)
  • ...

Note that Core GPFS is the only mandatory role; all other roles are optional. Each of the optional roles requires additional configuration variables. Browse the examples in the samples/ directory to learn how to configure them.

Cluster Membership

All hosts in the play are configured as nodes in the same IBM Storage Scale cluster. If you want to add hosts to an existing cluster then add at least one node from that existing cluster to the play.

You can create multiple clusters by running multiple plays. Note that you will need to reload the inventory to clear dynamic groups added by the IBM Storage Scale roles:

- name: Create one cluster
  hosts: cluster01
  roles: ...

- name: Refresh inventory to clear dynamic groups
  hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    - meta: refresh_inventory

- name: Create another cluster
  hosts: cluster02
  roles: ...

Limitations

The roles in this project can (currently) be used to create new clusters or extend existing clusters. Similarly, new file systems can be created or extended. But this project does not remove existing nodes, disks, file systems, or node classes. This is intentional, and it is also the reason why the roles cannot be used, for example, to change the file system pool of a disk: changing the pool requires removing the disk from the file system and re-adding it, which is not currently in the scope of this project.

Furthermore, upgrades are not currently in the scope of this project. IBM Storage Scale supports rolling online upgrades (by taking down one node at a time), but this requires careful planning and monitoring and might require manual intervention in case of unforeseen problems.

Troubleshooting

The roles in this project store configuration files in /var/mmfs/tmp on the first host in the play. These files are kept to determine whether definitions have changed since the previous run, and whether it is necessary to run certain IBM Storage Scale commands again. When experiencing problems, you can simply delete these configuration files from /var/mmfs/tmp to clear the cache; all definitions will be re-applied (and the cache re-generated) on the next run. The downside is that the next run may take longer than expected, as it might re-run IBM Storage Scale commands unnecessarily.
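
A minimal sketch of such a cleanup as a play, assuming the cache lives under /var/mmfs/tmp on the first host in the play; the file name patterns are assumptions for illustration, so review what is actually cached there before deleting anything:

# cleanup.yml:
---
- hosts: cluster01[0]
  tasks:
    - name: Find cached cluster definition files
      find:
        paths: /var/mmfs/tmp
        patterns:
          - "NodeFile*"
          - "StanzaFile*"
          - "ChangeFile*"
      register: scale_cache

    - name: Delete cached files to force re-application of all definitions
      file:
        path: "{{ item.path }}"
        state: absent
      loop: "{{ scale_cache.files }}"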

Reporting Issues and Feedback

Please use the issue tracker to ask questions, report bugs and request features.

Contributing Code

We welcome contributions to this project, see CONTRIBUTING.md for more details.

Disclaimer

Please note: all roles / playbooks / modules / resources in this repository are released for use "AS IS" without any warranties of any kind, including, but not limited to their installation, use, or performance. We are not responsible for any damage or charges or data loss incurred with their use. You are responsible for reviewing and testing any scripts you run thoroughly before use in any production environment. This content is subject to change without notice.

Copyright and License

Copyright IBM Corporation, released under the terms of the Apache License 2.0.

ibm-spectrum-scale-install-infra's People

Contributors

acch, azaelrguez, benformosa, christop1964, dheren-git, janfrode, mamuthiah, olemyk, rajan-mis, sasikeda, shubh1410, stevemar, sujeetkjha, troppens, whowutwut


ibm-spectrum-scale-install-infra's Issues

Single role to call that will call other roles in the playbook, to ensure consistency

Right now the playbook.yml file has a list of roles that need to be added or removed by the user, meaning we are giving execution control to the end user, and I think that may result in a lot of user error.

  • What happens if the end user does not execute the roles in the right order? Is that OK? For example, running GUI roles before core roles (GPFS will not be installed, so I think that's NOT OK).

There also seems to be logic in the hosts file that relies on the playbook calling the roles. From https://github.com/IBM/ibm-spectrum-scale-install-infra/blob/master/hosts:

# hosts:
[cluster01]
host-vm1 scale_cluster_quorum=true   scale_cluster_manager=true scale_cluster_gui=false
host-vm3 scale_cluster_quorum=true   scale_cluster_manager=true scale_cluster_gui=false

If we wanted the playbook to run GUI configuration, we have to set scale_cluster_gui=true, which makes sense and is easy, but the user also has to add:

     - gui/node
     - gui/cluster
     - gui/postcheck

Or nothing will happen. This seems like potential for user error. (A sketch of one possible mitigation follows.)
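
A sketch of one way to keep a fixed role order while still letting scale_cluster_gui control the GUI pieces; the role paths follow the ones quoted in this issue and are illustrative, not the project's actual solution:

# playbook.yml (sketch):
---
- hosts: cluster01
  roles:
    - core/precheck
    - core/node
    - core/cluster
    - { role: gui/node,      when: scale_cluster_gui | default(false) | bool }
    - { role: gui/cluster,   when: scale_cluster_gui | default(false) | bool }
    - { role: gui/postcheck, when: scale_cluster_gui | default(false) | bool }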

sshd key configuration, selinux, firewalld, should not be true by default

Currently in roles/core/precheck/tasks/prepare.yml we have a few defaults that are set to true out of the box. I think the default behavior should be false, and only if the customer/admin wants us to do those pieces should they set it to true or override the defaults.

I believe the reason it should be false is that we should not make security worse by default, for example by disabling the firewall and/or SELinux. We should only do those things if requested. SSH keys may also already be configured, so generating keys should only be done in environments where we know we need them (if requested).

Add check memory requirements (Daemon failed to initialize fast condvar rc -1)

The Ansible tasks and a manual start of mmfsd fail if a node has insufficient memory. The Spectrum Scale FAQ says that at least 2 GB of memory is required, yet in virtual environments the default memory size of VMs is quite often smaller, in particular in proof-of-concept and demo environments. This might be particularly difficult for a new user who tries the Developer Edition on a laptop with limited resources. I therefore recommend adding a check to the Ansible roles and ideally also improving the output of mmstartup.

Doing a lot of proofs of concept in virtual environments, I have come across many variations of low-memory symptoms. Here is an example on a VM with only 256 MB:

TASK [spectrum_scale_core/cluster : cluster | Start daemons] ******************************************************************************************************************************************************
changed: [spectrumscale]

RUNNING HANDLER [spectrum_scale_core/cluster : wait-daemon-active] ************************************************************************************************************************************************
FAILED - RETRYING: wait-daemon-active (10 retries left).
FAILED - RETRYING: wait-daemon-active (9 retries left).
FAILED - RETRYING: wait-daemon-active (8 retries left).
FAILED - RETRYING: wait-daemon-active (7 retries left).
FAILED - RETRYING: wait-daemon-active (6 retries left).
FAILED - RETRYING: wait-daemon-active (5 retries left).
FAILED - RETRYING: wait-daemon-active (4 retries left).
FAILED - RETRYING: wait-daemon-active (3 retries left).
FAILED - RETRYING: wait-daemon-active (2 retries left).
FAILED - RETRYING: wait-daemon-active (1 retries left).
fatal: [spectrumscale]: FAILED! => {"attempts": 10, "changed": false, "cmd": "/usr/lpp/mmfs/bin/mmgetstate -N localhost -Y | grep -v HEADER | cut -d ':' -f 9", "delta": "0:00:01.551573", "end": "2020-04-17 22:35:05.195691", "rc": 0, "start": "2020-04-17 22:35:03.644118", "stderr": "", "stderr_lines": [], "stdout": "down", "stdout_lines": ["down"]}

NO MORE HOSTS LEFT ************************************************************************************************************************************************************************************************

PLAY RECAP ********************************************************************************************************************************************************************************************************
spectrumscale              : ok=81   changed=22   unreachable=0    failed=1    skipped=45   rescued=0    ignored=0

[root@origin ansible]# 

So, let's check on the target node:

[root@origin ansible]# ssh spectrumscale
Last login: Fri Apr 17 22:35:03 2020 from 10.1.1.10

[root@spectrumscale ~]# mmgetstate -a

 Node number  Node name        GPFS state
-------------------------------------------
       1      spectrumscale    down

[root@spectrumscale ~]# mmstartup
Fri Apr 17 22:35:38 CEST 2020: mmstartup: Starting GPFS ...

[root@spectrumscale ~]# mmgetstate -a

 Node number  Node name        GPFS state
-------------------------------------------
       1      spectrumscale    down

[root@spectrumscale ~]#

There are hints in mmfs.log:

2020-04-17_22:29:24.629+0200: [E] Daemon failed to initialize fast condvar rc -1

And in /var/log/messages:

Apr 17 20:35:49 spectrumscale kernel: [E] GPFS cxiInitFastCondvar: kmalloc failed allocating MAX_GPFS_THREADS 16384 * sizeof(FastCondvarThread_t) 64 = 1048576 bytes for FastCondvarThread_t
Apr 17 20:35:49 spectrumscale mmfs[15350]: [E] Daemon failed to initialize fast condvar rc -1
Apr 17 20:35:50 spectrumscale mmremote[15384]: Shutting down!

For quick evaluations you may want to work with VMs that have less than 2 GB of memory. Therefore it would be good to add a warning (not an error) to the Ansible role if the memory is below 2 GB, and to enhance mmstartup to issue an error if the allocation of memory fails.

Nodes may need more than 2 GB, depending on the configuration of the pagepool for instance, so 2 GB is not sufficient for many environments. However, a check for 2 GB is better than no check at all. The formula for the check can be refined later.
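
A minimal sketch of such a check (not existing code in the roles), assuming gathered facts and warning rather than failing below 2 GB:

- name: check | Warn if node memory is below 2 GB
  debug:
    msg: >-
      Node {{ inventory_hostname }} reports only {{ ansible_memtotal_mb }} MB of memory;
      at least 2 GB is recommended, and mmfsd may fail to start with less.
  when: ansible_memtotal_mb | int < 2048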

explore error message suppression and only output 'mm' command failures

Looking to improve error messaging usability and problem determination by suppressing extra error messages and only posting the 'mm' command output.

Example in current Ansible run:

changed: [node-vm1] => (item={'diff': [], 'dest': '/var/tmp/StanzaFile.new.gpfs300', 'src': '/root/.ansible/tmp/ansible-tmp-1585001588.5356421-65289086087216/source', 'md5sum': '25534cf1001db2ac025b49d2c05a1332', 'checksum': 'bb2265de9b4dc8d6abbceadf832aea081427a9dd', 'changed': True, 'uid': 0, 'gid': 0, 'owner': 'root', 'group': 'root', 'mode': '0644', 'state': 'file', 'secontext': 'unconfined_u:object_r:admin_home_t:s0', 'size': 262, 'invocation': {'module_args': {'src': '/root/.ansible/tmp/ansible-tmp-1585001588.5356421-65289086087216/source', 'dest': '/var/tmp/StanzaFile.new.gpfs300', 'mode': None, 'follow': False, '_original_basename': 'StanzaFile.j2', 'checksum': 'bb2265de9b4dc8d6abbceadf832aea081427a9dd', 'backup': False, 'force': True, 'content': None, 'validate': None, 'directory_mode': None, 'remote_src': None, 'local_follow': None, 'owner': None, 'group': None, 'seuser': None, 'serole': None, 'selevel': None, 'setype': None, 'attributes': None, 'regexp': None, 'delimiter': None, 'unsafe_writes': None}}, 'failed': False, 'item': 'gpfs300', 'ansible_loop_var': 'item'})
failed: [node-vm1] (item={'diff': {'before': {'path': '/var/tmp/StanzaFile.new.gpfs2000'}, 'after': {'path': '/var/tmp/StanzaFile.new.gpfs2000'}}, 'path': '/var/tmp/StanzaFile.new.gpfs2000', 'changed': False, 'uid': 0, 'gid': 0, 'owner': 'root', 'group': 'root', 'mode': '0644', 'state': 'file', 'secontext': 'unconfined_u:object_r:admin_home_t:s0', 'size': 131, 'invocation': {'module_args': {'mode': None, 'follow': False, 'dest': '/var/tmp/StanzaFile.new.gpfs2000', '_original_basename': 'StanzaFile.j2', 'recurse': False, 'state': 'file', 'path': '/var/tmp/StanzaFile.new.gpfs2000', 'force': False, 'modification_time_format': '%Y%m%d%H%M.%S', 'access_time_format': '%Y%m%d%H%M.%S', '_diff_peek': None, 'src': None, 'modification_time': None, 'access_time': None, 'owner': None, 'group': None, 'seuser': None, 'serole': None, 'selevel': None, 'setype': None, 'attributes': None, 'content': None, 'backup': None, 'remote_src': None, 'regexp': None, 'delimiter': None, 'directory_mode': None, 'unsafe_writes': None}}, 'checksum': '7c5d84d86a6b23e28f29975eccd970850559265d', 'dest': '/var/tmp/StanzaFile.new.gpfs2000', 'failed': False, 'item': 'gpfs2000', 'ansible_loop_var': 'item'}) => {"ansible_loop_var": "item", "changed": true, "cmd": ["/usr/lpp/mmfs/bin/mmcrfs", "gpfs2000", "-F", "/var/tmp/StanzaFile.new.gpfs2000", "-B", "4M", "-m", "2", "-r", "2", "-n", "16", "-A", "yes", "-T", "/ibm/gpfs2000"], "delta": "0:00:02.411239", "end": "2020-03-23 15:13:26.068223", "item": {"ansible_loop_var": "item", "changed": false, "checksum": "7c5d84d86a6b23e28f29975eccd970850559265d", "dest": "/var/tmp/StanzaFile.new.gpfs2000", "diff": {"after": {"path": "/var/tmp/StanzaFile.new.gpfs2000"}, "before": {"path": "/var/tmp/StanzaFile.new.gpfs2000"}}, "failed": false, "gid": 0, "group": "root", "invocation": {"module_args": {"_diff_peek": null, "_original_basename": "StanzaFile.j2", "access_time": null, "access_time_format": "%Y%m%d%H%M.%S", "attributes": null, "backup": null, "content": null, "delimiter": null, "dest": "/var/tmp/StanzaFile.new.gpfs2000", "directory_mode": null, "follow": false, "force": false, "group": null, "mode": null, "modification_time": null, "modification_time_format": "%Y%m%d%H%M.%S", "owner": null, "path": "/var/tmp/StanzaFile.new.gpfs2000", "recurse": false, "regexp": null, "remote_src": null, "selevel": null, "serole": null, "setype": null, "seuser": null, "src": null, "state": "file", "unsafe_writes": null}}, "item": "gpfs2000", "mode": "0644", "owner": "root", "path": "/var/tmp/StanzaFile.new.gpfs2000", "secontext": "unconfined_u:object_r:admin_home_t:s0", "size": 131, "state": "file", "uid": 0}, "msg": "non-zero return code", "rc": 22, "start": "2020-03-23 15:13:23.656984", "stderr": "Incompatible parameters: Unable to create file system.\nChange one or more of the following as suggested and try again:\n    increase the number of failure groups\n    decrease the value for -r\nmmcrfs: tscrfs failed.  Cannot create gpfs2000\nmmcrfs: Command failed. Examine previous error messages to determine cause.", "stderr_lines": ["Incompatible parameters: Unable to create file system.", "Change one or more of the following as suggested and try again:", "    increase the number of failure groups", "    decrease the value for -r", "mmcrfs: tscrfs failed.  Cannot create gpfs2000", "mmcrfs: Command failed. 
Examine previous error messages to determine cause."], "stdout": "\nThe following disks of gpfs2000 will be formatted on node node-vm2:\n    nsd_2001: size 76800 MB\nFormatting file system ...\nDisks up to size 785.99 GB can be added to storage pool system.", "stdout_lines": ["", "The following disks of gpfs2000 will be formatted on node node-vm2:", "    nsd_2001: size 76800 MB", "Formatting file system ...", "Disks up to size 785.99 GB can be added to storage pool system."]}

NO MORE HOSTS LEFT ****************************************************************************************************************************************

PLAY RECAP ************************************************************************************************************************************************
node-vm1               : ok=78   changed=4    unreachable=0    failed=1    skipped=54   rescued=0    ignored=0
node-vm2               : ok=56   changed=0    unreachable=0    failed=0    skipped=32   rescued=0    ignored=0
node-vm3               : ok=56   changed=0    unreachable=0    failed=0    skipped=32   rescued=0    ignored=0
node-vm4               : ok=56   changed=0    unreachable=0    failed=0    skipped=32   rescued=0    ignored=0

Example of what just the 'mm' command failure output would be:
[root@node-vm1 group_vars]# /usr/lpp/mmfs/bin/mmcrfs gpfs2000 -F /var/tmp/StanzaFile.new.gpfs2000 -B 4M -m 2 -r 2 -n 16 -A yes -T /ibm/gpfs2000
The following disks of gpfs2000 will be formatted on node node-vm2:
    nsd_2001: size 76800 MB
Formatting file system ...
Disks up to size 785.99 GB can be added to storage pool system.
Incompatible parameters: Unable to create file system.
Change one or more of the following as suggested and try again:
    increase the number of failure groups
    decrease the value for -r
mmcrfs: tscrfs failed.  Cannot create gpfs2000
mmcrfs: Command failed. Examine previous error messages to determine cause.

nsd and filesystem inventory validation

A typo in all.yml went undetected and wound up building an undesired NSD name, later resulting in an NSD creation conflict. Inventory validation would improve usability and prevent user error from leading to failures.

     - device: /dev/sdh
        nsd: nsd_2000
        servers: node-vm3,node-vm4
        failureGroup: 2
        usage: dataAndMetadata
        pool: system
        device: /dev/sdi
        nsd: nsd_2001
        servers: node-vm3,node-vm4
        failureGroup: 4
        usage: dataAndMetadata
        pool: system

It will work if you change this to add a - in front of the second device:

     - device: /dev/sdh
        nsd: nsd_2000
        servers: node-vm3,node-vm4
        failureGroup: 2
        usage: dataAndMetadata
        pool: system
     -  device: /dev/sdi
        nsd: nsd_2001
        servers: node-vm3,node-vm4
        failureGroup: 4
        usage: dataAndMetadata
        pool: system

wait-daemon-active - dict object' has no attribute 'stdout'

While running a playbook I hit the error below. Re-running the playbook completed without issues.

I have no further logs and I have faced this issue only once. Therefore I suspect that there might be a race condition.

RUNNING HANDLER [spectrum_scale_core/cluster : wait-daemon-active] ************************************************************************************************************************************************
fatal: [spectrumscale]: FAILED! => {"msg": "The conditional check 'state.stdout == 'active'' failed. The error was: error while evaluating conditional (state.stdout == 'active'): 'dict object' has no attribute 'stdout'"}

Refine install prereqs for building GPL module.

There are three tasks which install prerequisites to build the GPL module. Two of them are skipped. This looks strange and buggy. I would suggest either eliminating two of the tasks or providing additional information so it becomes clear that these are three different prerequisites.

TASK [spectrum_scale_core/node : build | Install prereqs for building GPL module from source] *********************************************************************************************************************
skipping: [spectrumscale]

TASK [spectrum_scale_core/node : build | Install prereqs for building GPL module from source] *********************************************************************************************************************
changed: [spectrumscale]

TASK [spectrum_scale_core/node : build | Install prereqs for building GPL from source] ****************************************************************************************************************************
skipping: [spectrumscale]

Directory based installation method should copy and consider any newly added rpms for installation.

Currently I was intentionally not copying any RPMs if the directory already exists, but we do not need this validation. Ansible will automatically check this and will only copy if there is a change in the user-defined directory.

TASK [core/node : install | Copy installation package to node] ******************************************
ok: [test-vm3]
changed: [test-vm1]

@mrolyat @danandar FYI. I have fixed this on my local system; it is a minor fix that is required.

Check the configuration for required variables and fail the playbook early on with a good message

For required variables, could we check the configuration as we start running the playbooks and fail with a message if any are missing?

I do see this comment:

The filesystem parameter is mandatory, and the servers and device parameters are mandatory for each of the file system's disks. All other file system and disk parameters are optional. Hence, a minimal file system configuration would look like this:

I am currently using this repo in multiple internal deployments for automation, where I clone down the repo and branch. Right now we have no official tags, but we are changing things pretty drastically. In order to maintain some stability, I clone against my forked copy. But as I move master up to track upstream, I potentially break things.

I just tested a branch based on a more recent commit, and so I hit code changes that required servers to be defined. The error I hit is:

fatal: [worker4]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'dict object' has no attribute 'servers'\n\nThe error appears to be in '/root/ibm-spectrum-scale-install-infra/roles/core/cluster/tasks/storage.yml': line 46, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: storage | Find defined NSDs\n ^ here\n"}

So while this is OK for me, to go into that play and try to figure out what is missing, I had to add some debug to make sure it was servers that was missing. I believe it would drastically improve usability if we had a check up front for the required values and stopped the playbook with an informative message.

And I fully understand that it's documented in the README, and we could consider this "user error", but unless we have a changelog and release info, most customers are not going to continuously read the README and/or diff the README file to see what has changed between one commit and another.
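
A minimal sketch of the kind of up-front check being requested here (not existing code in the roles), failing fast with a readable message when a disk definition is missing its servers parameter:

- name: precheck | Assert that every defined disk has servers
  assert:
    that:
      - item.1.servers is defined
    fail_msg: >-
      Disk {{ item.1.device | default('<no device>') }} in file system
      {{ item.0.filesystem | default('<no filesystem>') }} does not define
      'servers'. Refer to the README for the required parameters.
  loop: "{{ scale_storage | default([]) | subelements('disks', skip_missing=True) }}"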

Commands should not hang indefinitely - add retry and timeout

Upon attempting to re-run ansible.sh to add a new node (RHEL 7.7) to an existing cluster (4 RHEL 8.1 nodes), something prevented the "Start daemons" command from completing, and it hung for over an hour with no failure and no apparent retry or re-issue of the command.

Would be ideal if:

  1. user screen saw what command was being issued (to better troubleshoot or understand what is hanging)
  2. user screen saw a retry attempt (i.e. retrying 1 out of 5 times)
  3. user screen saw a retry interval (i.e. retrying 1 out of 5 times, waiting 30 more seconds).
  4. user could configure their own retry interval if desired (i.e. willing to wait 10 times with a 30 second wait interval vs a default setting provided by the playbook).

This hang "may" have been result of ssh key eschange reported in issue #52
The key exchange was performed from another login but the playbook still never timed out or continued to retry the command.

skipping: [shockjaw-vm1]
skipping: [shockjaw-vm3]
skipping: [shockjaw-vm2]
skipping: [shockjaw-vm4]
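
A sketch of what a configurable retry for the daemon wait might look like; the scale_daemon_wait_retries and scale_daemon_wait_delay variable names are hypothetical, and the 'active' match is a simplification of the existing mmgetstate parsing:

- name: cluster | Wait for the GPFS daemon to become active
  command: /usr/lpp/mmfs/bin/mmgetstate -N localhost -Y
  register: daemon_state
  until: "'active' in daemon_state.stdout"
  retries: "{{ scale_daemon_wait_retries | default(10) }}"
  delay: "{{ scale_daemon_wait_delay | default(30) }}"
  changed_when: false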

Running the playbook on master branch, on CentOS 7.6, fails at Zimon task

I'm not doing anything with Zimon, but it goes into that role from the playbook.yml definitions, which is fine.

Failed here: - name: install | Find gpfs.collector (gpfs.collector) RPM

But after some digging, the cause of this is that zimon_url is not defined. The set_fact right above it is skipped because of the missing check for CentOS.

https://github.com/IBM/ibm-spectrum-scale-install-infra/blob/master/roles/zimon/node/tasks/install_local_pkg.yml#L100

We seem to have other mentions of CentOS

$ git grep CentOS
roles/core/node/tasks/build.yml:        ansible_distribution == 'CentOS' or

Quick fix:

(molecule.venv) [root@autogen-centos76-x-master ibm-spectrum-scale-install-infra]# git diff roles/zimon/node/tasks/install_local_pkg.yml
diff --git a/roles/zimon/node/tasks/install_local_pkg.yml b/roles/zimon/node/tasks/install_local_pkg.yml
index 6e10489..59722c3 100644
--- a/roles/zimon/node/tasks/install_local_pkg.yml
+++ b/roles/zimon/node/tasks/install_local_pkg.yml
@@ -100,7 +100,9 @@
 - name: install | zimon path
   set_fact:
     zimon_url: 'zimon_rpms/rhel7/'
-  when: ansible_distribution == 'RedHat' and ansible_distribution_major_version == '7'
+  when:
+    - ansible_distribution == 'RedHat' or ansible_distribution == 'CentOS'
+    - ansible_distribution_major_version == '7'

 - name: install | zimon path
   set_fact:

With this patch, it does run to completion. I will try the dev branch now.

precheck ssh config (ssh_args) in ansible.cfg - help prevent intermittent hangs

Some internal testing has revealed intermittent hangs when running playbooks. (currently only observed on RHEL7.6)

Small enhancement request: during the install prechecks, check whether ssh_args in ansible.cfg is set to a value that has proved to help some node instances avoid the intermittent hangs.

I was able to debug your hang issue and found a problem with SSH connectivity.

I found one Ansible link, used its suggestion, and was able to install successfully on your cluster by changing one parameter in the ansible.cfg file.
 
[https://stackoverflow.com/questions/51675831/ansible-stops-connecting-to-the-host-via-ssh]

I believe the connection may be dropping due to the lack of output from your play.
Add the following to your ssh_args in ansible.cfg:
-o ServerAliveInterval=50

I am able to create the cluster successfully in your environment using the latest Ansible code.
 

GPFS cluster information
========================
  GPFS cluster name:         gpfs1.local
  GPFS cluster id:           6993155638008093391
  GPFS UID domain:           gpfs1.local
  Remote shell command:      /usr/bin/ssh
  Remote file copy command:  /usr/bin/scp
  Repository type:           CCR
GPFS cluster configuration servers:
-----------------------------------
  Primary server:    node-51.localnet.com (not in use)
  Secondary server:  (none)
 Node  Daemon node name       IP address   Admin node name        Designation
------------------------------------------------------------------------------
   1   node-51.localnet.com  10.0.100.51  node-51.localnet.com  quorum-manager-perfmon
   2   node-52.localnet.com  10.0.100.52  node-52.localnet.com  quorum-manager-perfmon
   3   node-53.localnet.com  10.0.100.53  node-53.localnet.com  quorum-manager-perfmon
   4   node-54.localnet.com  10.0.100.54  node-54.localnet.com  quorum-manager-perfmon
 

What is purpose of var/scale_clusterdefinition.json?

I stumbled over the file var/scale_clusterdefinition.json and I am wondering if this is a leftover that can be removed.

I successfully installed and configured a single-node Spectrum Scale cluster by customizing the files hosts and group_vars/all. The content in var/scale_clusterdefinition.json describes a different cluster configuration and does not have any impact on my installation.

My suggestion would be to either remove this file or to provide proper documentation on its purpose and usage.

My hosts (customized by me):

# hosts:
[cluster]
10.1.1.20  scale_cluster_quorum=true scale_cluster_manager=true scale_cluster_gui=false

My group_vars/all (customized by me):

# group_vars/all:
---
scale_storage:
  - filesystem: gpfs
    automaticMountOption: true
    disks:
      - device: /dev/sdb
        servers: 10.1.1.20

Content in var/scale_clusterdefinition.json (as in GitHub. Unchanged):

{
  "node_details": [
    {
      "fqdn" : host-vm1,
      "ip_address" : 192.168.100.101,
      "is_nsd_server": True,
      "is_quorum_node" : True,
      "is_manager_node" : True,
      "is_gui_server" : False,
    },
    {
      "fqdn" : host-vm2,
      "ip_address" : 192.168.100.102,
      "is_nsd_server": True,
      "is_quorum_node" : True,
      "is_manager_node" : True,
      "is_gui_server" : False,
    },
  ],
  "scale_storage":[
    {
      "filesystem": "fs1",
      "blockSize": 4M,
      "defaultDataReplicas": 1,
      "defaultMountPoint": "/mnt/fs1",
      "disks": [
       {
        "device": "/dev/sdd",
        "nsd": "nsd1",
        "servers": "host-vm1"
       },
       {
        "device": "/dev/sdf",
        "nsd": "nsd2",
        "servers": "host-vm1,host-vm2"
       }
      ]
    }
  ]
}

NSD creation being skipped when using host_vars...

Need some help... I'm back on the current dev branch (also tried master). Using host_vars, where I have disks on the workers, NSD creation seems to be skipped.

Is there something wrong with my config? I do think this was working in the past. How do I best debug this?

Here's my host_vars file:

# ls -ltr host_vars/
total 8
-rw-r--r-- 1 root root   0 Apr 16 20:07 autogen-hostvars-rhels77-x-master
-rw-r--r-- 1 root root 580 Apr 16 20:07 autogen-hostvars-rhels77-x-worker1
-rw-r--r-- 1 root root 580 Apr 16 20:07 autogen-hostvars-rhels77-x-worker2

I tried it both with the content of this file as-is and with the indicated lines removed; no change, the NSDs are still skipped.

# cat host_vars/autogen-hostvars-rhels77-x-worker1
scale_storage:
  - filesystem: gpfs01
    blockSize: 4M
    numNodes: 16
    automaticMountOption: true
    defaultMountPoint: /mnt/gpfs01
    # force overwrite the NSDs since we probably did not clean up prior
    overwriteNSDs: true
    disks:
      - device: /dev/vdb
-        failureGroup: "2"
-        usage: "dataAndMetadata"
        servers: "autogen-hostvars-rhels77-x-worker1"
      - device: /dev/vdc
        nsd: "autogen_hostvars_rhels77_x_worker1_nsd_vdc"
-        failureGroup: "2"
-        usage: "dataAndMetadata"
        servers: "autogen-hostvars-rhels77-x-worker1"

But then I see the tasks being skipped....

TASK [core/cluster : storage | Prepare StanzaFile(s) for NSD creation] ***************************************************
changed: [autogen-hostvars-rhels77-x-master] => (item=gpfs01)

TASK [core/cluster : storage | Accept server license for NSD servers] ****************************************************
skipping: [autogen-hostvars-rhels77-x-master]

TASK [core/cluster : storage | Create new NSDs] **************************************************************************
skipping: [autogen-hostvars-rhels77-x-master] => (item={u'changed': True, u'uid': 0, u'dest': u'/var/tmp/StanzaFile.new.gpfs01', u'owner': u'root', 'diff': [], u'size': 1, u'src': u'/root/.ansible/tmp/ansible-tmp-1587093686.15-28107-223240455255342/source', 'ansible_loop_var': u'item', u'group': u'root', 'item': u'gpfs01', u'checksum': u'adc83b19e793491b1c6ea0fd8b46cd9f32e592fc', u'md5sum': u'68b329da9893e34099c7d8ad5cb9c940', 'failed': False, u'state': u'file', u'gid': 0, u'mode': u'0644', u'invocation': {u'module_args': {u'directory_mode': None, u'force': True, u'remote_src': None, u'dest': u'/var/tmp/StanzaFile.new.gpfs01', u'selevel': None, u'_original_basename': u'StanzaFile.j2', u'delimiter': None, u'regexp': None, u'owner': None, u'follow': False, u'validate': None, u'local_follow': None, u'src': u'/root/.ansible/tmp/ansible-tmp-1587093686.15-28107-223240455255342/source', u'group': None, u'unsafe_writes': None, u'checksum': u'adc83b19e793491b1c6ea0fd8b46cd9f32e592fc', u'seuser': None, u'serole': None, u'content': None, u'setype': None, u'mode': None, u'attributes': None, u'backup': False}}})

TASK [core/cluster : storage | Prepare StanzaFile(s) for filesystem creation] ********************************************

TASK [core/cluster : storage | Consolidate defined filesystem parameters] ************************************************
ok: [autogen-hostvars-rhels77-x-master] => (item={u'numNodes': 16, u'overwriteNSDs': True, u'blockSize': u'4M', u'disks': [{u'device': u'/dev/vdb', u'usage': u'dataAndMetadata', u'failureGroup': u'2', u'servers': u'autogen-hostvars-rhels77-x-worker1'}, {u'device': u'/dev/vdc', u'usage': u'dataAndMetadata', u'failureGroup': u'2', u'nsd': u'autogen_hostvars_rhels77_x_worker1_nsd_vdc', u'servers': u'autogen-hostvars-rhels77-x-worker1'}], u'filesystem': u'gpfs01', u'defaultMountPoint': u'/mnt/gpfs01', u'automaticMountOption': True})
ok: [autogen-hostvars-rhels77-x-master] => (item={u'numNodes': 16, u'overwriteNSDs': True, u'blockSize': u'4M', u'disks': [{u'device': u'/dev/vdb', u'usage': u'dataAndMetadata', u'failureGroup': u'2', u'servers': u'autogen-hostvars-rhels77-x-worker2'}, {u'device': u'/dev/vdc', u'usage': u'dataAndMetadata', u'failureGroup': u'2', u'nsd': u'autogen_hostvars_rhels77_x_worker2_nsd_vdc', u'servers': u'autogen-hostvars-rhels77-x-worker2'}], u'filesystem': u'gpfs01', u'defaultMountPoint': u'/mnt/gpfs01', u'automaticMountOption': True})

TASK [core/cluster : storage | Prepare StanzaFile(s) for NSD creation] ***************************************************
changed: [autogen-hostvars-rhels77-x-master] => (item=gpfs01)

TASK [core/cluster : storage | Accept server license for NSD servers] ****************************************************
skipping: [autogen-hostvars-rhels77-x-master]

TASK [core/cluster : storage | Create new NSDs] **************************************************************************
skipping: [autogen-hostvars-rhels77-x-master] => (item={u'changed': True, u'uid': 0, u'dest': u'/var/tmp/StanzaFile.new.gpfs01', u'owner': u'root', 'diff': [], u'size': 1, u'src': u'/root/.ansible/tmp/ansible-tmp-1587093686.15-28107-223240455255342/source', 'ansible_loop_var': u'item', u'group': u'root', 'item': u'gpfs01', u'checksum': u'adc83b19e793491b1c6ea0fd8b46cd9f32e592fc', u'md5sum': u'68b329da9893e34099c7d8ad5cb9c940', 'failed': False, u'state': u'file', u'gid': 0, u'mode': u'0644', u'invocation': {u'module_args': {u'directory_mode': None, u'force': True, u'remote_src': None, u'dest': u'/var/tmp/StanzaFile.new.gpfs01', u'selevel': None, u'_original_basename': u'StanzaFile.j2', u'delimiter': None, u'regexp': None, u'owner': None, u'follow': False, u'validate': None, u'local_follow': None, u'src': u'/root/.ansible/tmp/ansible-tmp-1587093686.15-28107-223240455255342/source', u'group': None, u'unsafe_writes': None, u'checksum': u'adc83b19e793491b1c6ea0fd8b46cd9f32e592fc', u'seuser': None, u'serole': None, u'content': None, u'setype': None, u'mode': None, u'attributes': None, u'backup': False}}})

And then, after the playbook has completed:

 ls -ltr /var/tmp
total 12
drwx------ 3 root root  17 Apr 16 19:27 systemd-private-9beb40d1cb4f4c13823a7eef65a7ea1b-ntpd.service-GMMAbo
-rw-r--r-- 1 root root 161 Apr 16 20:18 ChangeFile
-rw-r--r-- 1 root root   1 Apr 16 20:21 StanzaFile.new.gpfs01
-rw-r--r-- 1 root root   1 Apr 16 20:21 StanzaFile.gpfs01
# cat /var/tmp/StanzaFile.gpfs01

# cat /var/tmp/StanzaFile.new.gpfs01

# mmlsnsd
mmlsnsd: [I] No disks were found.

On another cluster, using group_vars with this config works with no problem:

# cat group_vars/all
scale_storage:
  - filesystem: gpfs01
    overwriteNSDs: true
    disks:
      - device: /dev/vdb
        servers: autogen-groupvars-rhels77-x-worker1
      - device: /dev/vdc
        servers: autogen-groupvars-rhels77-x-worker1
      - device: /dev/vdb
        servers: autogen-groupvars-rhels77-x-worker2
      - device: /dev/vdc
        servers: autogen-groupvars-rhels77-x-worker2

Check that Ansible playbooks run on different LANGUAGE settings

Several field issues were encountered with the existing install toolkit where the server was running in a different language. Opening an issue to run a few tests to ensure Ansible can handle different languages and still run successfully.

This was typically corrected by:

1) AFAIK, the export command is supported in all flavors of Linux (RH, Ubuntu, SUSE) and AIX.

2) Well, I usually use UTF-8 and it works well for most applications. I faced a situation where a client was using en_US and the messages only showed properly when using UTF-8. I found this explanation that guided me to adopt en_US.UTF-8 rather than en_US.

The only difference between en_US and en_US.UTF8 is that the former uses ISO-8859-1 for a character set, while the latter uses UTF-8. Prefer UTF-8. The only difference in these is in what characters they are capable of representing. ISO-8859-1 represents characters common to many Americans (the English alphabet, plus a few letters with accents), whereas UTF-8 encodes all of Unicode, and thus, just about any language you can think of. UTF-8, today, is a defacto standard encoding for text.

https://www.ibm.com/support/knowledgecenter/STXKQY_5.0.4/com.ibm.spectrum.scale.v5r04.doc/bl1ins_preparingtousetoolkit.htm

https://www.ibm.com/support/knowledgecenter/STXKQY_5.0.4/com.ibm.spectrum.scale.v5r04.doc/bl1ins_limitationsofthetoolkit.htm

Non-English languages in client programs such as PuTTY The installation toolkit does not support setting the language in client programs such as PuTTY to any language other than English. Set the language of your client program to English.

Note:

- locale_gen:
    name: en_US.UTF-8
    state: present

- name: set as default locale
  command: localectl set-locale LANG=en_US

README notes/suggestions

I did a document review of the README as someone who isn't intimately familiar with the product:

  • host_vars poorly explained
    • Took me several paragraphs to realize that you define one per node.
    • Why do we need to define this for every node? Seems like a lot of manual work?
  • Maybe host_vars is better as a documented yaml?
  • We shouldn’t need to modify the playbook if we have an inventory file.
  • “Defining the variable scale_version is mandatory” - replace with a table?

"install | Compare checksums" fails, if whitespace in path name

Failing task:

TASK [spectrum_scale_core/node : install | Compare checksums] **********************************************************
fatal: [spectrumscale]: FAILED! => {"msg": "The conditional check 'scale_install_md5_sum.strip().split().0 == scale_install_localpkg.stat.md5' failed. The error was: error while evaluating conditional (scale_install_md5_sum.strip().split().0 == scale_install_localpkg.stat.md5): 'dict object' has no attribute 'md5'"}

Failing playbook:

# spectrumscale.yml
---

- hosts: spectrumscale
  vars:
    - scale_install_localpkg_path: "/mylab/software/Spectrum Scale 5.0.4.1 Developer Edition/Spectrum_Scale_Developer-5.0.4.1-x86_64-Linux-install"
  roles:
    - spectrum_scale_core/precheck
    - spectrum_scale_core/node
    - spectrum_scale_core/cluster

The issue can be mitigated by avoiding spaces in directory names.

Unfortunately the Developer Edition extracts by default to a directory with spaces in the name. Therefore it can be expected that folks who try the Developer Edition will run into this issue.

Validation must be added for localpkg vs. directory path installation

I ran into an issue with package extraction when a directory path was given, and the RPMs were not extracted.

TASK [core/node : install | Copy installation package to node] ******************************************************************************************************************
changed: [snowwraith-vm1]
changed: [snowwraith-vm3]

TASK [core/node : install | Extract installation package] ***********************************************************************************************************************
fatal: [snowwraith-vm1]: FAILED! => {"changed": false, "cmd": "/tmp/scale_rpms --silent", "msg": "[Errno 13] Permission denied", "rc": 13}
fatal: [snowwraith-vm3]: FAILED! => {"changed": false, "cmd": "/tmp/scale_rpms --silent", "msg": "[Errno 13] Permission denied", "rc": 13}

PLAY RECAP **********************************************************************************************************************************************************************
snowwraith-vm1 : ok=22 changed=2 unreachable=0 failed=1 skipped=20 rescued=0 ignored=0
snowwraith-vm3 : ok=17 changed=2 unreachable=0 failed=1 skipped=9 rescued=0 ignored=0

Playbook should provide proper error message if directory doesn't contain gpfs base package

TASK [common : check | If GPFS base package is exists or not] *********************************************************************************************************************
fatal: [scale-52 -> localhost]: FAILED! => {
    "assertion": "stat_result.matched > 0", 
    "changed": false, 
    "evaluated_to": false, 
    "msg": "Unable to determine GPFS version. Ensure the GPFS packages are available in a relative path to the specified directory."
}

Executing the Spectrum Scale install package from /tmp could be blocked if /tmp is mounted with noexec

It is not uncommon to find /tmp mounted with the noexec option for security reasons. We currently copy the self-extracting Spectrum Scale install package to /tmp on every node and execute it from there, which will fail if /tmp is mounted with the noexec option.

To get around this problem, we should copy the package to some other directory on the root file system and extract from there.

If scale_prepare_disable_firewall=FALSE do not check for the firewall package

If the firewall variable is set to FALSE, do not also take the time to check for the firewall packages. This should reduce the overall time for the playbook to run.

scale_prepare_disable_firewall: The firewall can be disabled; it can be either true or false (by default, it is true). << Update to not check whether the firewall package is installed. This does not cause any issues but extends run time.

Running the playbook on the dev branch fails at Create NDSs step with minimal group_vars config

Possibly I have incorrect variable definitions, but based on following the README I'm not sure where the error is.

I'm running the playbook off of the dev branch and reading the README file. Under the install instructions, step 2, I'm going with the minimal vars file....


I have a 3 node cluster

  • master
  • worker1
  • worker2

There are only 2 disks on the workers; there are no disks on the master.

I have created this group_vars/all file:

# cat group_vars/all
scale_storage:
  - filesystem: gpfs01
    overwriteNSDs: true
    disks:
      - device: /dev/vdb
        servers: autogen-centos76-x-worker1,autogen-centos76-x-worker2
      - device: /dev/vdc
        servers: autogen-centos76-x-worker1,autogen-centos76-x-worker2

And running the playbook, it fails at "create NSDs"

TASK [core/cluster : storage | Create new NSDs] *************************************************************
failed: [autogen-centos76-x-master] (item={u'changed': True, u'uid': 0, u'dest': u'/var/tmp/StanzaFile.new.gpfs01', u'owner': u'root', 'diff': [], u'size': 426, u'src': u'/root/.ansible/tmp/ansible-tmp-1584925465.15-242076789828031/source', 'ansible_loop_var': u'item', u'group': u'root', 'item': u'gpfs01', u'checksum': u'4aa55cdfd89693a224d1f0218519f4e5f3c6d91d', u'md5sum': u'1efddd5f6b9317038f51ae5af5ecfc10', 'failed': False, u'state': u'file', u'gid': 0, u'mode': u'0644', u'invocation': {u'module_args': {u'directory_mode': None, u'force': True, u'remote_src': None, u'dest': u'/var/tmp/StanzaFile.new.gpfs01', u'selevel': None, u'_original_basename': u'StanzaFile.j2', u'delimiter': None, u'regexp': None, u'owner': None, u'follow': False, u'validate': None, u'local_follow': None, u'src': u'/root/.ansible/tmp/ansible-tmp-1584925465.15-242076789828031/source', u'group': None, u'unsafe_writes': None, u'checksum': u'4aa55cdfd89693a224d1f0218519f4e5f3c6d91d', u'seuser': None, u'serole': None, u'content': None, u'setype': None, u'mode': None, u'attributes': None, u'backup': False}}}) => {"ansible_loop_var": "item", "changed": true, "cmd": ["/usr/lpp/mmfs/bin/mmcrnsd", "-F", "/var/tmp/StanzaFile.new.gpfs01", "-v", "no"], "delta": "0:00:00.656419", "end": "2020-03-22 18:04:29.807511", "item": {"ansible_loop_var": "item", "changed": true, "checksum": "4aa55cdfd89693a224d1f0218519f4e5f3c6d91d", "dest": "/var/tmp/StanzaFile.new.gpfs01", "diff": [], "failed": false, "gid": 0, "group": "root", "invocation": {"module_args": {"_original_basename": "StanzaFile.j2", "attributes": null, "backup": false, "checksum": "4aa55cdfd89693a224d1f0218519f4e5f3c6d91d", "content": null, "delimiter": null, "dest": "/var/tmp/StanzaFile.new.gpfs01", "directory_mode": null, "follow": false, "force": true, "group": null, "local_follow": null, "mode": null, "owner": null, "regexp": null, "remote_src": null, "selevel": null, "serole": null, "setype": null, "seuser": null, "src": "/root/.ansible/tmp/ansible-tmp-1584925465.15-242076789828031/source", "unsafe_writes": null, "validate": null}}, "item": "gpfs01", "md5sum": "1efddd5f6b9317038f51ae5af5ecfc10", "mode": "0644", "owner": "root", "size": 426, "src": "/root/.ansible/tmp/ansible-tmp-1584925465.15-242076789828031/source", "state": "file", "uid": 0}, "msg": "non-zero return code", "rc": 1, "start": "2020-03-22 18:04:29.151092", "stderr": "mmcrnsd: Name \"nsd_autogen_centos76_x_worker1,autogen_centos76_x_worker2_vdb\" is not allowed.\nIt contains the following invalid special character:  ,\nmmcrnsd: Error found while processing stanza\n    %nsd:\n      device=/dev/vdb\n      nsd=nsd_autogen_centos76_x_worker1,autogen_centos76_x_worker2_vdb\n      servers=autogen-centos76-x-worker1,autogen-centos76-x-worker2\n      usage=dataAndMetadata\n      failureGroup=-1\n      pool=system\nmmcrnsd: Name \"nsd_autogen_centos76_x_worker1,autogen_centos76_x_worker2_sdc\" is not allowed.\nIt contains the following invalid special character:  ,\nmmcrnsd: Error found while processing stanza\n    %nsd:\n      device=/dev/sdc\n      nsd=nsd_autogen_centos76_x_worker1,autogen_centos76_x_worker2_sdc\n      servers=autogen-centos76-x-worker1,autogen-centos76-x-worker2\n      usage=dataAndMetadata\n      failureGroup=-1\n      pool=system\nmmcrnsd: File /var/tmp/StanzaFile.new.gpfs01 does not contain any NSD descriptors or stanzas.\nmmcrnsd: Command failed. 
Examine previous error messages to determine cause.", "stderr_lines": ["mmcrnsd: Name \"nsd_autogen_centos76_x_worker1,autogen_centos76_x_worker2_vdb\" is not allowed.", "It contains the following invalid special character:  ,", "mmcrnsd: Error found while processing stanza", "    %nsd:", "      device=/dev/vdb", "      nsd=nsd_autogen_centos76_x_worker1,autogen_centos76_x_worker2_vdb", "      servers=autogen-centos76-x-worker1,autogen-centos76-x-worker2", "      usage=dataAndMetadata", "      failureGroup=-1", "      pool=system", "mmcrnsd: Name \"nsd_autogen_centos76_x_worker1,autogen_centos76_x_worker2_sdc\" is not allowed.", "It contains the following invalid special character:  ,", "mmcrnsd: Error found while processing stanza", "    %nsd:", "      device=/dev/sdc", "      nsd=nsd_autogen_centos76_x_worker1,autogen_centos76_x_worker2_sdc", "      servers=autogen-centos76-x-worker1,autogen-centos76-x-worker2", "      usage=dataAndMetadata", "      failureGroup=-1", "      pool=system", "mmcrnsd: File /var/tmp/StanzaFile.new.gpfs01 does not contain any NSD descriptors or stanzas.", "mmcrnsd: Command failed. Examine previous error messages to determine cause."], "stdout": "", "stdout_lines": []}

For better readability, I ran the command:

# /usr/lpp/mmfs/bin/mmcrnsd -F /var/tmp/StanzaFile.new.gpfs01 -v no
mmcrnsd: Name "nsd_autogen_centos76_x_worker1,autogen_centos76_x_worker2_vdb" is not allowed.
It contains the following invalid special character:  ,
mmcrnsd: Error found while processing stanza
    %nsd:
      device=/dev/vdb
      nsd=nsd_autogen_centos76_x_worker1,autogen_centos76_x_worker2_vdb
      servers=autogen-centos76-x-worker1,autogen-centos76-x-worker2
      usage=dataAndMetadata
      failureGroup=-1
      pool=system
mmcrnsd: Name "nsd_autogen_centos76_x_worker1,autogen_centos76_x_worker2_vdc" is not allowed.
It contains the following invalid special character:  ,
mmcrnsd: Error found while processing stanza
    %nsd:
      device=/dev/vdc
      nsd=nsd_autogen_centos76_x_worker1,autogen_centos76_x_worker2_vdc
      servers=autogen-centos76-x-worker1,autogen-centos76-x-worker2
      usage=dataAndMetadata
      failureGroup=-1
      pool=system

It's complaining about the comma, and it looks like the generated NSD names are not quite right:

nsd=nsd_autogen_centos76_x_worker1,autogen_centos76_x_worker2_sdc

Any thoughts here?
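
Not a fix, but a minimal sketch of one possible direction, assuming the role currently builds the NSD name from the whole server list: derive the name from the first server only and replace characters that mmcrnsd rejects. Everything below (the standalone playbook, the "disk" variable and its layout) is illustrative, not the role's actual code.

---
# Minimal sketch only: derive a comma-free NSD name from the first server in
# the list and the device basename. "disk" is sample data for illustration.
- hosts: localhost
  gather_facts: false
  vars:
    disk:
      servers: "autogen-centos76-x-worker1,autogen-centos76-x-worker2"
      device: /dev/vdb
  tasks:
    - name: Derive a valid NSD name (sketch)
      set_fact:
        nsd_name: "nsd_{{ (disk.servers.split(',') | first) | replace('-', '_') }}_{{ disk.device | basename }}"

    - name: Show the result
      debug:
        var: nsd_name   # -> nsd_autogen_centos76_x_worker1_vdb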

Pre-check does not find missing package: elfutils-libelf-devel

I am using a minimal OS image. The Ansible roles fail if the package elfutils-libelf-devel is not installed.

Ansible output of failed step:

TASK [spectrum_scale_core/node : build | Compile GPL module] ***********************************************************
fatal: [spectrumscale]: FAILED! => {"changed": true, "cmd": "export LINUX_DISTRIBUTION=REDHAT_AS_LINUX ; /usr/lpp/mmfs/bin/mmbuildgpl --quiet", "delta": "0:00:03.969723", "end": "2020-04-16 21:28:41.310775", "msg": "non-zero return code", "rc": 1, "start": "2020-04-16 21:28:37.341052", "stderr": "Verifying that tools to build the portability layer exist....\ncpp present\ngcc present\ng++ present\nld present\ncd /usr/lpp/mmfs/src/config; /usr/bin/cpp -P def.mk.proto > ./def.mk; exit $? || exit 1\nrm -rf /usr/lpp/mmfs/src/include /usr/lpp/mmfs/src/bin /usr/lpp/mmfs/src/lib\nmkdir /usr/lpp/mmfs/src/include /usr/lpp/mmfs/src/bin /usr/lpp/mmfs/src/lib\nrm -f //usr/lpp/mmfs/src/gpl-linux/gpl_kernel.tmp.ver\ncleaning (/usr/lpp/mmfs/src/ibm-kxi)\nmake[1]: Entering directory '/usr/lpp/mmfs/src/ibm-kxi'\nrm -f trcid.h ibm_kxi.trclst\nrm -f  install.he; \\\n for i in cxiTypes.h cxiSystem.h cxi2gpfs.h cxiVFSStats.h cxiCred.h cxiIOBuffer.h cxiSharedSeg.h cxiMode.h Trace.h cxiMmap.h cxiAtomic.h cxiTSFattr.h cxiAclUser.h cxiLinkList.h cxiDmapi.h LockNames.h lxtrace.h cxiGcryptoDefs.h cxiSynchNames.h cxiMiscNames.h DirIds.h; do \\\n    (set -x; rm -f -r /usr/lpp/mmfs/src/include/cxi/$i) done \n+ rm -f -r /usr/lpp/mmfs/src/include/cxi/cxiTypes.h\n+ rm -f -r /usr/lpp/mmfs/src/include/cxi/cxiSystem.h\n+ rm -f -r /usr/lpp/mmfs/src/include/cxi/cxi2gpfs.h\n+ rm -f -r /usr/lpp/mmfs/src/include/cxi/cxiVFSStats.h\n+ rm -f -r /usr/lpp/mmfs/src/include/cxi/cxiCred.h\n+ rm -f -r /usr/lpp/mmfs/src/include/cxi/cxiIOBuffer.h\n+ rm -f -r /usr/lpp/mmfs/src/include/cxi/cxiSharedSeg.h\n+ rm -f -r /usr/lpp/mmfs/src/include/cxi/cxiMode.h\n+ rm -f -r /usr/lpp/mmfs/src/include/cxi/Trace.h\n+ rm -f -r /usr/lpp/mmfs/src/include/cxi/cxiMmap.h\n+ rm -f -r /usr/lpp/mmfs/src/include/cxi/cxiAtomic.h\n+ rm -f -r /usr/lpp/mmfs/src/include/cxi/cxiTSFattr.h\n+ rm -f -r /usr/lpp/mmfs/src/include/cxi/cxiAclUser.h\n+ rm -f -r /usr/lpp/mmfs/src/include/cxi/cxiLinkList.h\n+ rm -f -r /usr/lpp/mmfs/src/include/cxi/cxiDmapi.h\n+ rm -f -r /usr/lpp/mmfs/src/include/cxi/LockNames.h\n+ rm -f -r /usr/lpp/mmfs/src/include/cxi/lxtrace.h\n+ rm -f -r /usr/lpp/mmfs/src/include/cxi/cxiGcryptoDefs.h\n+ rm -f -r /usr/lpp/mmfs/src/include/cxi/cxiSynchNames.h\n+ rm -f -r /usr/lpp/mmfs/src/include/cxi/cxiMiscNames.h\n+ rm -f -r /usr/lpp/mmfs/src/include/cxi/DirIds.h\nmake[1]: Leaving directory '/usr/lpp/mmfs/src/ibm-kxi'\ncleaning (/usr/lpp/mmfs/src/ibm-linux)\nmake[1]: Entering directory '/usr/lpp/mmfs/src/ibm-linux'\nrm -f install.he; \\\n for i in cxiTypes-plat.h cxiSystem-plat.h cxiIOBuffer-plat.h cxiSharedSeg-plat.h cxiMode-plat.h Trace-plat.h cxiAtomic-plat.h cxiMmap-plat.h cxiVFSStats-plat.h cxiCred-plat.h cxiDmapi-plat.h; do \\\n                (set -x; rm -rf /usr/lpp/mmfs/src/include/cxi/$i) done\n+ rm -rf /usr/lpp/mmfs/src/include/cxi/cxiTypes-plat.h\n+ rm -rf /usr/lpp/mmfs/src/include/cxi/cxiSystem-plat.h\n+ rm -rf /usr/lpp/mmfs/src/include/cxi/cxiIOBuffer-plat.h\n+ rm -rf /usr/lpp/mmfs/src/include/cxi/cxiSharedSeg-plat.h\n+ rm -rf /usr/lpp/mmfs/src/include/cxi/cxiMode-plat.h\n+ rm -rf /usr/lpp/mmfs/src/include/cxi/Trace-plat.h\n+ rm -rf /usr/lpp/mmfs/src/include/cxi/cxiAtomic-plat.h\n+ rm -rf /usr/lpp/mmfs/src/include/cxi/cxiMmap-plat.h\n+ rm -rf /usr/lpp/mmfs/src/include/cxi/cxiVFSStats-plat.h\n+ rm -rf /usr/lpp/mmfs/src/include/cxi/cxiCred-plat.h\n+ rm -rf /usr/lpp/mmfs/src/include/cxi/cxiDmapi-plat.h\nmake[1]: Leaving directory '/usr/lpp/mmfs/src/ibm-linux'\ncleaning (/usr/lpp/mmfs/src/gpl-linux)\nmake[1]: Entering directory 
'/usr/lpp/mmfs/src/gpl-linux'\nPre-kbuild step 1...\n/usr/bin/make -C /usr/src/kernels/4.18.0-147.8.1.el8_1.x86_64 M=/usr/lpp/mmfs/src/gpl-linux clean\nmake[2]: Entering directory '/usr/src/kernels/4.18.0-147.8.1.el8_1.x86_64'\nmake[2]: Leaving directory '/usr/src/kernels/4.18.0-147.8.1.el8_1.x86_64'\nrm -f -f /lib/modules/`cat //usr/lpp/mmfs/src/gpl-linux/gpl_kernel.tmp.ver`/extra/tracedev.ko\nrm -f -f /lib/modules/`cat //usr/lpp/mmfs/src/gpl-linux/gpl_kernel.tmp.ver`/extra/mmfslinux.ko\nrm -f -f /lib/modules/`cat //usr/lpp/mmfs/src/gpl-linux/gpl_kernel.tmp.ver`/extra/mmfs26.ko\nrm -f -f /usr/lpp/mmfs/src/../bin/lxtrace-`cat //usr/lpp/mmfs/src/gpl-linux/gpl_kernel.tmp.ver`\nrm -f -f /usr/lpp/mmfs/src/../bin/kdump-`cat //usr/lpp/mmfs/src/gpl-linux/gpl_kernel.tmp.ver`\nrm -f -f *.o .depends .*.cmd *.ko *.a *.mod.c core *_shipped *map *mod.c.saved *.symvers *.ko.ver ./*.ver install.he\nrm -f -rf .tmp_versions kdump-kern-dwarfs.c\nrm -f -f gpl-linux.trclst kdump lxtrace\nrm -f -rf usr\nmake[1]: Leaving directory '/usr/lpp/mmfs/src/gpl-linux'\nfor i in ibm-kxi ibm-linux gpl-linux ; do \\\n(cd $i; echo  \"installing header files\" \"(`pwd`)\"; \\\n/usr/bin/make DESTDIR=/usr/lpp/mmfs/src  Headers; \\\nexit $?) || exit 1; \\\ndone\ninstalling header files (/usr/lpp/mmfs/src/ibm-kxi)\nmake[1]: Entering directory '/usr/lpp/mmfs/src/ibm-kxi'\nMaking directory /usr/lpp/mmfs/src/include/cxi\n+ /usr/bin/install cxiTypes.h /usr/lpp/mmfs/src/include/cxi/cxiTypes.h\n+ /usr/bin/install cxiSystem.h /usr/lpp/mmfs/src/include/cxi/cxiSystem.h\n+ /usr/bin/install cxi2gpfs.h /usr/lpp/mmfs/src/include/cxi/cxi2gpfs.h\n+ /usr/bin/install cxiVFSStats.h /usr/lpp/mmfs/src/include/cxi/cxiVFSStats.h\n+ /usr/bin/install cxiCred.h /usr/lpp/mmfs/src/include/cxi/cxiCred.h\n+ /usr/bin/install cxiIOBuffer.h /usr/lpp/mmfs/src/include/cxi/cxiIOBuffer.h\n+ /usr/bin/install cxiSharedSeg.h /usr/lpp/mmfs/src/include/cxi/cxiSharedSeg.h\n+ /usr/bin/install cxiMode.h /usr/lpp/mmfs/src/include/cxi/cxiMode.h\n+ /usr/bin/install Trace.h /usr/lpp/mmfs/src/include/cxi/Trace.h\n+ /usr/bin/install cxiMmap.h /usr/lpp/mmfs/src/include/cxi/cxiMmap.h\n+ /usr/bin/install cxiAtomic.h /usr/lpp/mmfs/src/include/cxi/cxiAtomic.h\n+ /usr/bin/install cxiTSFattr.h /usr/lpp/mmfs/src/include/cxi/cxiTSFattr.h\n+ /usr/bin/install cxiAclUser.h /usr/lpp/mmfs/src/include/cxi/cxiAclUser.h\n+ /usr/bin/install cxiLinkList.h /usr/lpp/mmfs/src/include/cxi/cxiLinkList.h\n+ /usr/bin/install cxiDmapi.h /usr/lpp/mmfs/src/include/cxi/cxiDmapi.h\n+ /usr/bin/install LockNames.h /usr/lpp/mmfs/src/include/cxi/LockNames.h\n+ /usr/bin/install lxtrace.h /usr/lpp/mmfs/src/include/cxi/lxtrace.h\n+ /usr/bin/install cxiGcryptoDefs.h /usr/lpp/mmfs/src/include/cxi/cxiGcryptoDefs.h\n+ /usr/bin/install cxiSynchNames.h /usr/lpp/mmfs/src/include/cxi/cxiSynchNames.h\n+ /usr/bin/install cxiMiscNames.h /usr/lpp/mmfs/src/include/cxi/cxiMiscNames.h\n+ /usr/bin/install DirIds.h /usr/lpp/mmfs/src/include/cxi/DirIds.h\ntouch install.he\nmake[1]: Leaving directory '/usr/lpp/mmfs/src/ibm-kxi'\ninstalling header files (/usr/lpp/mmfs/src/ibm-linux)\nmake[1]: Entering directory '/usr/lpp/mmfs/src/ibm-linux'\n+ /usr/bin/install cxiTypes-plat.h /usr/lpp/mmfs/src/include/cxi/cxiTypes-plat.h\n+ /usr/bin/install cxiSystem-plat.h /usr/lpp/mmfs/src/include/cxi/cxiSystem-plat.h\n+ /usr/bin/install cxiIOBuffer-plat.h /usr/lpp/mmfs/src/include/cxi/cxiIOBuffer-plat.h\n+ /usr/bin/install cxiSharedSeg-plat.h /usr/lpp/mmfs/src/include/cxi/cxiSharedSeg-plat.h\n+ /usr/bin/install cxiMode-plat.h 
/usr/lpp/mmfs/src/include/cxi/cxiMode-plat.h\n+ /usr/bin/install Trace-plat.h /usr/lpp/mmfs/src/include/cxi/Trace-plat.h\n+ /usr/bin/install cxiAtomic-plat.h /usr/lpp/mmfs/src/include/cxi/cxiAtomic-plat.h\n+ /usr/bin/install cxiMmap-plat.h /usr/lpp/mmfs/src/include/cxi/cxiMmap-plat.h\n+ /usr/bin/install cxiVFSStats-plat.h /usr/lpp/mmfs/src/include/cxi/cxiVFSStats-plat.h\n+ /usr/bin/install cxiCred-plat.h /usr/lpp/mmfs/src/include/cxi/cxiCred-plat.h\n+ /usr/bin/install cxiDmapi-plat.h /usr/lpp/mmfs/src/include/cxi/cxiDmapi-plat.h\ntouch install.he\nmake[1]: Leaving directory '/usr/lpp/mmfs/src/ibm-linux'\ninstalling header files (/usr/lpp/mmfs/src/gpl-linux)\nmake[1]: Entering directory '/usr/lpp/mmfs/src/gpl-linux'\nMaking directory /usr/lpp/mmfs/src/include/gpl-linux\n+ /usr/bin/install Shark-gpl.h /usr/lpp/mmfs/src/include/gpl-linux/Shark-gpl.h\n+ /usr/bin/install prelinux.h /usr/lpp/mmfs/src/include/gpl-linux/prelinux.h\n+ /usr/bin/install postlinux.h /usr/lpp/mmfs/src/include/gpl-linux/postlinux.h\n+ /usr/bin/install linux2gpfs.h /usr/lpp/mmfs/src/include/gpl-linux/linux2gpfs.h\n+ /usr/bin/install verdep.h /usr/lpp/mmfs/src/include/gpl-linux/verdep.h\n+ /usr/bin/install Logger-gpl.h /usr/lpp/mmfs/src/include/gpl-linux/Logger-gpl.h\n+ /usr/bin/install arch-gpl.h /usr/lpp/mmfs/src/include/gpl-linux/arch-gpl.h\n+ /usr/bin/install oplock.h /usr/lpp/mmfs/src/include/gpl-linux/oplock.h\ntouch install.he\nmake[1]: Leaving directory '/usr/lpp/mmfs/src/gpl-linux'\nmake[1]: Entering directory '/usr/lpp/mmfs/src/gpl-linux'\nPre-kbuild step 1...\nPre-kbuild step 2...\ntouch install.he\nInvoking Kbuild...\n/usr/bin/make -C /usr/src/kernels/4.18.0-147.8.1.el8_1.x86_64 ARCH=x86_64 M=/usr/lpp/mmfs/src/gpl-linux CONFIGDIR=/usr/lpp/mmfs/src/config  ; \\\nif [ $? -ne 0 ]; then \\\n\texit 1;\\\nfi \nmake[2]: Entering directory '/usr/src/kernels/4.18.0-147.8.1.el8_1.x86_64'\nMakefile:977: *** \"Cannot generate ORC metadata for CONFIG_UNWINDER_ORC=y, please install libelf-dev, libelf-devel or elfutils-libelf-devel\".  Stop.\nmake[2]: Leaving directory '/usr/src/kernels/4.18.0-147.8.1.el8_1.x86_64'\nmake[1]: *** [makefile:131: modules] Error 1\nmake[1]: Leaving directory '/usr/lpp/mmfs/src/gpl-linux'\nmake: *** [makefile:148: Modules] Error 1\nmmbuildgpl: Command failed. Examine previous error messages to determine cause.", "stderr_lines": ["Verifying that tools to build the portability layer exist....", "cpp present", "gcc present", "g++ present", "ld present", "cd /usr/lpp/mmfs/src/config; /usr/bin/cpp -P def.mk.proto > ./def.mk; exit $? 
|| exit 1", "rm -rf /usr/lpp/mmfs/src/include /usr/lpp/mmfs/src/bin /usr/lpp/mmfs/src/lib", "mkdir /usr/lpp/mmfs/src/include /usr/lpp/mmfs/src/bin /usr/lpp/mmfs/src/lib", "rm -f //usr/lpp/mmfs/src/gpl-linux/gpl_kernel.tmp.ver", "cleaning (/usr/lpp/mmfs/src/ibm-kxi)", "make[1]: Entering directory '/usr/lpp/mmfs/src/ibm-kxi'", "rm -f trcid.h ibm_kxi.trclst", "rm -f  install.he; \\", " for i in cxiTypes.h cxiSystem.h cxi2gpfs.h cxiVFSStats.h cxiCred.h cxiIOBuffer.h cxiSharedSeg.h cxiMode.h Trace.h cxiMmap.h cxiAtomic.h cxiTSFattr.h cxiAclUser.h cxiLinkList.h cxiDmapi.h LockNames.h lxtrace.h cxiGcryptoDefs.h cxiSynchNames.h cxiMiscNames.h DirIds.h; do \\", "    (set -x; rm -f -r /usr/lpp/mmfs/src/include/cxi/$i) done ", "+ rm -f -r /usr/lpp/mmfs/src/include/cxi/cxiTypes.h", "+ rm -f -r /usr/lpp/mmfs/src/include/cxi/cxiSystem.h", "+ rm -f -r /usr/lpp/mmfs/src/include/cxi/cxi2gpfs.h", "+ rm -f -r /usr/lpp/mmfs/src/include/cxi/cxiVFSStats.h", "+ rm -f -r /usr/lpp/mmfs/src/include/cxi/cxiCred.h", "+ rm -f -r /usr/lpp/mmfs/src/include/cxi/cxiIOBuffer.h", "+ rm -f -r /usr/lpp/mmfs/src/include/cxi/cxiSharedSeg.h", "+ rm -f -r /usr/lpp/mmfs/src/include/cxi/cxiMode.h", "+ rm -f -r /usr/lpp/mmfs/src/include/cxi/Trace.h", "+ rm -f -r /usr/lpp/mmfs/src/include/cxi/cxiMmap.h", "+ rm -f -r /usr/lpp/mmfs/src/include/cxi/cxiAtomic.h", "+ rm -f -r /usr/lpp/mmfs/src/include/cxi/cxiTSFattr.h", "+ rm -f -r /usr/lpp/mmfs/src/include/cxi/cxiAclUser.h", "+ rm -f -r /usr/lpp/mmfs/src/include/cxi/cxiLinkList.h", "+ rm -f -r /usr/lpp/mmfs/src/include/cxi/cxiDmapi.h", "+ rm -f -r /usr/lpp/mmfs/src/include/cxi/LockNames.h", "+ rm -f -r /usr/lpp/mmfs/src/include/cxi/lxtrace.h", "+ rm -f -r /usr/lpp/mmfs/src/include/cxi/cxiGcryptoDefs.h", "+ rm -f -r /usr/lpp/mmfs/src/include/cxi/cxiSynchNames.h", "+ rm -f -r /usr/lpp/mmfs/src/include/cxi/cxiMiscNames.h", "+ rm -f -r /usr/lpp/mmfs/src/include/cxi/DirIds.h", "make[1]: Leaving directory '/usr/lpp/mmfs/src/ibm-kxi'", "cleaning (/usr/lpp/mmfs/src/ibm-linux)", "make[1]: Entering directory '/usr/lpp/mmfs/src/ibm-linux'", "rm -f install.he; \\", " for i in cxiTypes-plat.h cxiSystem-plat.h cxiIOBuffer-plat.h cxiSharedSeg-plat.h cxiMode-plat.h Trace-plat.h cxiAtomic-plat.h cxiMmap-plat.h cxiVFSStats-plat.h cxiCred-plat.h cxiDmapi-plat.h; do \\", "                (set -x; rm -rf /usr/lpp/mmfs/src/include/cxi/$i) done", "+ rm -rf /usr/lpp/mmfs/src/include/cxi/cxiTypes-plat.h", "+ rm -rf /usr/lpp/mmfs/src/include/cxi/cxiSystem-plat.h", "+ rm -rf /usr/lpp/mmfs/src/include/cxi/cxiIOBuffer-plat.h", "+ rm -rf /usr/lpp/mmfs/src/include/cxi/cxiSharedSeg-plat.h", "+ rm -rf /usr/lpp/mmfs/src/include/cxi/cxiMode-plat.h", "+ rm -rf /usr/lpp/mmfs/src/include/cxi/Trace-plat.h", "+ rm -rf /usr/lpp/mmfs/src/include/cxi/cxiAtomic-plat.h", "+ rm -rf /usr/lpp/mmfs/src/include/cxi/cxiMmap-plat.h", "+ rm -rf /usr/lpp/mmfs/src/include/cxi/cxiVFSStats-plat.h", "+ rm -rf /usr/lpp/mmfs/src/include/cxi/cxiCred-plat.h", "+ rm -rf /usr/lpp/mmfs/src/include/cxi/cxiDmapi-plat.h", "make[1]: Leaving directory '/usr/lpp/mmfs/src/ibm-linux'", "cleaning (/usr/lpp/mmfs/src/gpl-linux)", "make[1]: Entering directory '/usr/lpp/mmfs/src/gpl-linux'", "Pre-kbuild step 1...", "/usr/bin/make -C /usr/src/kernels/4.18.0-147.8.1.el8_1.x86_64 M=/usr/lpp/mmfs/src/gpl-linux clean", "make[2]: Entering directory '/usr/src/kernels/4.18.0-147.8.1.el8_1.x86_64'", "make[2]: Leaving directory '/usr/src/kernels/4.18.0-147.8.1.el8_1.x86_64'", "rm -f -f /lib/modules/`cat //usr/lpp/mmfs/src/gpl-linux/gpl_kernel.tmp.ver`/extra/tracedev.ko", 
"rm -f -f /lib/modules/`cat //usr/lpp/mmfs/src/gpl-linux/gpl_kernel.tmp.ver`/extra/mmfslinux.ko", "rm -f -f /lib/modules/`cat //usr/lpp/mmfs/src/gpl-linux/gpl_kernel.tmp.ver`/extra/mmfs26.ko", "rm -f -f /usr/lpp/mmfs/src/../bin/lxtrace-`cat //usr/lpp/mmfs/src/gpl-linux/gpl_kernel.tmp.ver`", "rm -f -f /usr/lpp/mmfs/src/../bin/kdump-`cat //usr/lpp/mmfs/src/gpl-linux/gpl_kernel.tmp.ver`", "rm -f -f *.o .depends .*.cmd *.ko *.a *.mod.c core *_shipped *map *mod.c.saved *.symvers *.ko.ver ./*.ver install.he", "rm -f -rf .tmp_versions kdump-kern-dwarfs.c", "rm -f -f gpl-linux.trclst kdump lxtrace", "rm -f -rf usr", "make[1]: Leaving directory '/usr/lpp/mmfs/src/gpl-linux'", "for i in ibm-kxi ibm-linux gpl-linux ; do \\", "(cd $i; echo  \"installing header files\" \"(`pwd`)\"; \\", "/usr/bin/make DESTDIR=/usr/lpp/mmfs/src  Headers; \\", "exit $?) || exit 1; \\", "done", "installing header files (/usr/lpp/mmfs/src/ibm-kxi)", "make[1]: Entering directory '/usr/lpp/mmfs/src/ibm-kxi'", "Making directory /usr/lpp/mmfs/src/include/cxi", "+ /usr/bin/install cxiTypes.h /usr/lpp/mmfs/src/include/cxi/cxiTypes.h", "+ /usr/bin/install cxiSystem.h /usr/lpp/mmfs/src/include/cxi/cxiSystem.h", "+ /usr/bin/install cxi2gpfs.h /usr/lpp/mmfs/src/include/cxi/cxi2gpfs.h", "+ /usr/bin/install cxiVFSStats.h /usr/lpp/mmfs/src/include/cxi/cxiVFSStats.h", "+ /usr/bin/install cxiCred.h /usr/lpp/mmfs/src/include/cxi/cxiCred.h", "+ /usr/bin/install cxiIOBuffer.h /usr/lpp/mmfs/src/include/cxi/cxiIOBuffer.h", "+ /usr/bin/install cxiSharedSeg.h /usr/lpp/mmfs/src/include/cxi/cxiSharedSeg.h", "+ /usr/bin/install cxiMode.h /usr/lpp/mmfs/src/include/cxi/cxiMode.h", "+ /usr/bin/install Trace.h /usr/lpp/mmfs/src/include/cxi/Trace.h", "+ /usr/bin/install cxiMmap.h /usr/lpp/mmfs/src/include/cxi/cxiMmap.h", "+ /usr/bin/install cxiAtomic.h /usr/lpp/mmfs/src/include/cxi/cxiAtomic.h", "+ /usr/bin/install cxiTSFattr.h /usr/lpp/mmfs/src/include/cxi/cxiTSFattr.h", "+ /usr/bin/install cxiAclUser.h /usr/lpp/mmfs/src/include/cxi/cxiAclUser.h", "+ /usr/bin/install cxiLinkList.h /usr/lpp/mmfs/src/include/cxi/cxiLinkList.h", "+ /usr/bin/install cxiDmapi.h /usr/lpp/mmfs/src/include/cxi/cxiDmapi.h", "+ /usr/bin/install LockNames.h /usr/lpp/mmfs/src/include/cxi/LockNames.h", "+ /usr/bin/install lxtrace.h /usr/lpp/mmfs/src/include/cxi/lxtrace.h", "+ /usr/bin/install cxiGcryptoDefs.h /usr/lpp/mmfs/src/include/cxi/cxiGcryptoDefs.h", "+ /usr/bin/install cxiSynchNames.h /usr/lpp/mmfs/src/include/cxi/cxiSynchNames.h", "+ /usr/bin/install cxiMiscNames.h /usr/lpp/mmfs/src/include/cxi/cxiMiscNames.h", "+ /usr/bin/install DirIds.h /usr/lpp/mmfs/src/include/cxi/DirIds.h", "touch install.he", "make[1]: Leaving directory '/usr/lpp/mmfs/src/ibm-kxi'", "installing header files (/usr/lpp/mmfs/src/ibm-linux)", "make[1]: Entering directory '/usr/lpp/mmfs/src/ibm-linux'", "+ /usr/bin/install cxiTypes-plat.h /usr/lpp/mmfs/src/include/cxi/cxiTypes-plat.h", "+ /usr/bin/install cxiSystem-plat.h /usr/lpp/mmfs/src/include/cxi/cxiSystem-plat.h", "+ /usr/bin/install cxiIOBuffer-plat.h /usr/lpp/mmfs/src/include/cxi/cxiIOBuffer-plat.h", "+ /usr/bin/install cxiSharedSeg-plat.h /usr/lpp/mmfs/src/include/cxi/cxiSharedSeg-plat.h", "+ /usr/bin/install cxiMode-plat.h /usr/lpp/mmfs/src/include/cxi/cxiMode-plat.h", "+ /usr/bin/install Trace-plat.h /usr/lpp/mmfs/src/include/cxi/Trace-plat.h", "+ /usr/bin/install cxiAtomic-plat.h /usr/lpp/mmfs/src/include/cxi/cxiAtomic-plat.h", "+ /usr/bin/install cxiMmap-plat.h /usr/lpp/mmfs/src/include/cxi/cxiMmap-plat.h", "+ /usr/bin/install 
cxiVFSStats-plat.h /usr/lpp/mmfs/src/include/cxi/cxiVFSStats-plat.h", "+ /usr/bin/install cxiCred-plat.h /usr/lpp/mmfs/src/include/cxi/cxiCred-plat.h", "+ /usr/bin/install cxiDmapi-plat.h /usr/lpp/mmfs/src/include/cxi/cxiDmapi-plat.h", "touch install.he", "make[1]: Leaving directory '/usr/lpp/mmfs/src/ibm-linux'", "installing header files (/usr/lpp/mmfs/src/gpl-linux)", "make[1]: Entering directory '/usr/lpp/mmfs/src/gpl-linux'", "Making directory /usr/lpp/mmfs/src/include/gpl-linux", "+ /usr/bin/install Shark-gpl.h /usr/lpp/mmfs/src/include/gpl-linux/Shark-gpl.h", "+ /usr/bin/install prelinux.h /usr/lpp/mmfs/src/include/gpl-linux/prelinux.h", "+ /usr/bin/install postlinux.h /usr/lpp/mmfs/src/include/gpl-linux/postlinux.h", "+ /usr/bin/install linux2gpfs.h /usr/lpp/mmfs/src/include/gpl-linux/linux2gpfs.h", "+ /usr/bin/install verdep.h /usr/lpp/mmfs/src/include/gpl-linux/verdep.h", "+ /usr/bin/install Logger-gpl.h /usr/lpp/mmfs/src/include/gpl-linux/Logger-gpl.h", "+ /usr/bin/install arch-gpl.h /usr/lpp/mmfs/src/include/gpl-linux/arch-gpl.h", "+ /usr/bin/install oplock.h /usr/lpp/mmfs/src/include/gpl-linux/oplock.h", "touch install.he", "make[1]: Leaving directory '/usr/lpp/mmfs/src/gpl-linux'", "make[1]: Entering directory '/usr/lpp/mmfs/src/gpl-linux'", "Pre-kbuild step 1...", "Pre-kbuild step 2...", "touch install.he", "Invoking Kbuild...", "/usr/bin/make -C /usr/src/kernels/4.18.0-147.8.1.el8_1.x86_64 ARCH=x86_64 M=/usr/lpp/mmfs/src/gpl-linux CONFIGDIR=/usr/lpp/mmfs/src/config  ; \\", "if [ $? -ne 0 ]; then \\", "\texit 1;\\", "fi ", "make[2]: Entering directory '/usr/src/kernels/4.18.0-147.8.1.el8_1.x86_64'", "Makefile:977: *** \"Cannot generate ORC metadata for CONFIG_UNWINDER_ORC=y, please install libelf-dev, libelf-devel or elfutils-libelf-devel\".  Stop.", "make[2]: Leaving directory '/usr/src/kernels/4.18.0-147.8.1.el8_1.x86_64'", "make[1]: *** [makefile:131: modules] Error 1", "make[1]: Leaving directory '/usr/lpp/mmfs/src/gpl-linux'", "make: *** [makefile:148: Modules] Error 1", "mmbuildgpl: Command failed. 
Examine previous error messages to determine cause."], "stdout": "--------------------------------------------------------\nmmbuildgpl: Building GPL (5.0.4.1) module begins at Thu Apr 16 21:28:37 CEST 2020.\n--------------------------------------------------------\nVerifying Kernel Header...\n  kernel version = 41800147 (418000147008001, 4.18.0-147.8.1.el8_1.x86_64, 4.18.0-147.8.1) \n  module include dir = /lib/modules/4.18.0-147.8.1.el8_1.x86_64/build/include \n  module build dir   = /lib/modules/4.18.0-147.8.1.el8_1.x86_64/build \n  kernel source dir  = /usr/src/linux-4.18.0-147.8.1.el8_1.x86_64/include \n  Found valid kernel header file under /usr/src/kernels/4.18.0-147.8.1.el8_1.x86_64/include\nVerifying Compiler...\n  make is present at /bin/make\n  cpp is present at /bin/cpp\n  gcc is present at /bin/gcc\n  g++ is present at /bin/g++\n  ld is present at /bin/ld\nVerifying Additional System Headers...\n  Verifying kernel-headers is installed ...\n    Command: /bin/rpm -q kernel-headers  \n    The required package kernel-headers is installed\nmake World ...\n--------------------------------------------------------\nmmbuildgpl: Building GPL module failed at Thu Apr 16 21:28:41 CEST 2020.\n--------------------------------------------------------", "stdout_lines": ["--------------------------------------------------------", "mmbuildgpl: Building GPL (5.0.4.1) module begins at Thu Apr 16 21:28:37 CEST 2020.", "--------------------------------------------------------", "Verifying Kernel Header...", "  kernel version = 41800147 (418000147008001, 4.18.0-147.8.1.el8_1.x86_64, 4.18.0-147.8.1) ", "  module include dir = /lib/modules/4.18.0-147.8.1.el8_1.x86_64/build/include ", "  module build dir   = /lib/modules/4.18.0-147.8.1.el8_1.x86_64/build ", "  kernel source dir  = /usr/src/linux-4.18.0-147.8.1.el8_1.x86_64/include ", "  Found valid kernel header file under /usr/src/kernels/4.18.0-147.8.1.el8_1.x86_64/include", "Verifying Compiler...", "  make is present at /bin/make", "  cpp is present at /bin/cpp", "  gcc is present at /bin/gcc", "  g++ is present at /bin/g++", "  ld is present at /bin/ld", "Verifying Additional System Headers...", "  Verifying kernel-headers is installed ...", "    Command: /bin/rpm -q kernel-headers  ", "    The required package kernel-headers is installed", "make World ...", "--------------------------------------------------------", "mmbuildgpl: Building GPL module failed at Thu Apr 16 21:28:41 CEST 2020.", "--------------------------------------------------------"]}

PLAY RECAP *************************************************************************************************************

Manual run of mmbuildgpl:

[root@origin ansible]# ssh spectrumscale
Last login: Thu Apr 16 21:28:37 2020 from 10.1.1.10
[root@spectrumscale ~]# mmbuildgpl
--------------------------------------------------------
mmbuildgpl: Building GPL (5.0.4.1) module begins at Thu Apr 16 21:29:33 CEST 2020.
--------------------------------------------------------
Verifying Kernel Header...
  kernel version = 41800147 (418000147008001, 4.18.0-147.8.1.el8_1.x86_64, 4.18.0-147.8.1)
  module include dir = /lib/modules/4.18.0-147.8.1.el8_1.x86_64/build/include
  module build dir   = /lib/modules/4.18.0-147.8.1.el8_1.x86_64/build
  kernel source dir  = /usr/src/linux-4.18.0-147.8.1.el8_1.x86_64/include
  Found valid kernel header file under /usr/src/kernels/4.18.0-147.8.1.el8_1.x86_64/include
Verifying Compiler...
  make is present at /bin/make
  cpp is present at /bin/cpp
  gcc is present at /bin/gcc
  g++ is present at /bin/g++
  ld is present at /bin/ld
Verifying Additional System Headers...
  Verifying kernel-headers is installed ...
    Command: /bin/rpm -q kernel-headers
    The required package kernel-headers is installed
make World ...
Verifying that tools to build the portability layer exist....
cpp present
gcc present
g++ present
ld present
cd /usr/lpp/mmfs/src/config; /usr/bin/cpp -P def.mk.proto > ./def.mk; exit $? || exit 1
rm -rf /usr/lpp/mmfs/src/include /usr/lpp/mmfs/src/bin /usr/lpp/mmfs/src/lib
mkdir /usr/lpp/mmfs/src/include /usr/lpp/mmfs/src/bin /usr/lpp/mmfs/src/lib
rm -f //usr/lpp/mmfs/src/gpl-linux/gpl_kernel.tmp.ver
cleaning (/usr/lpp/mmfs/src/ibm-kxi)
make[1]: Entering directory '/usr/lpp/mmfs/src/ibm-kxi'
rm -f trcid.h ibm_kxi.trclst
rm -f  install.he; \
 for i in cxiTypes.h cxiSystem.h cxi2gpfs.h cxiVFSStats.h cxiCred.h cxiIOBuffer.h cxiSharedSeg.h cxiMode.h Trace.h cxiMmap.h cxiAtomic.h cxiTSFattr.h cxiAclUser.h cxiLinkList.h cxiDmapi.h LockNames.h lxtrace.h cxiGcryptoDefs.h cxiSynchNames.h cxiMiscNames.h DirIds.h; do \
    (set -x; rm -f -r /usr/lpp/mmfs/src/include/cxi/$i) done
+ rm -f -r /usr/lpp/mmfs/src/include/cxi/cxiTypes.h
+ rm -f -r /usr/lpp/mmfs/src/include/cxi/cxiSystem.h
+ rm -f -r /usr/lpp/mmfs/src/include/cxi/cxi2gpfs.h
+ rm -f -r /usr/lpp/mmfs/src/include/cxi/cxiVFSStats.h
+ rm -f -r /usr/lpp/mmfs/src/include/cxi/cxiCred.h
+ rm -f -r /usr/lpp/mmfs/src/include/cxi/cxiIOBuffer.h
+ rm -f -r /usr/lpp/mmfs/src/include/cxi/cxiSharedSeg.h
+ rm -f -r /usr/lpp/mmfs/src/include/cxi/cxiMode.h
+ rm -f -r /usr/lpp/mmfs/src/include/cxi/Trace.h
+ rm -f -r /usr/lpp/mmfs/src/include/cxi/cxiMmap.h
+ rm -f -r /usr/lpp/mmfs/src/include/cxi/cxiAtomic.h
+ rm -f -r /usr/lpp/mmfs/src/include/cxi/cxiTSFattr.h
+ rm -f -r /usr/lpp/mmfs/src/include/cxi/cxiAclUser.h
+ rm -f -r /usr/lpp/mmfs/src/include/cxi/cxiLinkList.h
+ rm -f -r /usr/lpp/mmfs/src/include/cxi/cxiDmapi.h
+ rm -f -r /usr/lpp/mmfs/src/include/cxi/LockNames.h
+ rm -f -r /usr/lpp/mmfs/src/include/cxi/lxtrace.h
+ rm -f -r /usr/lpp/mmfs/src/include/cxi/cxiGcryptoDefs.h
+ rm -f -r /usr/lpp/mmfs/src/include/cxi/cxiSynchNames.h
+ rm -f -r /usr/lpp/mmfs/src/include/cxi/cxiMiscNames.h
+ rm -f -r /usr/lpp/mmfs/src/include/cxi/DirIds.h
make[1]: Leaving directory '/usr/lpp/mmfs/src/ibm-kxi'
cleaning (/usr/lpp/mmfs/src/ibm-linux)
make[1]: Entering directory '/usr/lpp/mmfs/src/ibm-linux'
rm -f install.he; \
 for i in cxiTypes-plat.h cxiSystem-plat.h cxiIOBuffer-plat.h cxiSharedSeg-plat.h cxiMode-plat.h Trace-plat.h cxiAtomic-plat.h cxiMmap-plat.h cxiVFSStats-plat.h cxiCred-plat.h cxiDmapi-plat.h; do \
                (set -x; rm -rf /usr/lpp/mmfs/src/include/cxi/$i) done
+ rm -rf /usr/lpp/mmfs/src/include/cxi/cxiTypes-plat.h
+ rm -rf /usr/lpp/mmfs/src/include/cxi/cxiSystem-plat.h
+ rm -rf /usr/lpp/mmfs/src/include/cxi/cxiIOBuffer-plat.h
+ rm -rf /usr/lpp/mmfs/src/include/cxi/cxiSharedSeg-plat.h
+ rm -rf /usr/lpp/mmfs/src/include/cxi/cxiMode-plat.h
+ rm -rf /usr/lpp/mmfs/src/include/cxi/Trace-plat.h
+ rm -rf /usr/lpp/mmfs/src/include/cxi/cxiAtomic-plat.h
+ rm -rf /usr/lpp/mmfs/src/include/cxi/cxiMmap-plat.h
+ rm -rf /usr/lpp/mmfs/src/include/cxi/cxiVFSStats-plat.h
+ rm -rf /usr/lpp/mmfs/src/include/cxi/cxiCred-plat.h
+ rm -rf /usr/lpp/mmfs/src/include/cxi/cxiDmapi-plat.h
make[1]: Leaving directory '/usr/lpp/mmfs/src/ibm-linux'
cleaning (/usr/lpp/mmfs/src/gpl-linux)
make[1]: Entering directory '/usr/lpp/mmfs/src/gpl-linux'
Pre-kbuild step 1...
/usr/bin/make -C /usr/src/kernels/4.18.0-147.8.1.el8_1.x86_64 M=/usr/lpp/mmfs/src/gpl-linux clean
make[2]: Entering directory '/usr/src/kernels/4.18.0-147.8.1.el8_1.x86_64'
make[2]: Leaving directory '/usr/src/kernels/4.18.0-147.8.1.el8_1.x86_64'
rm -f -f /lib/modules/`cat //usr/lpp/mmfs/src/gpl-linux/gpl_kernel.tmp.ver`/extra/tracedev.ko
rm -f -f /lib/modules/`cat //usr/lpp/mmfs/src/gpl-linux/gpl_kernel.tmp.ver`/extra/mmfslinux.ko
rm -f -f /lib/modules/`cat //usr/lpp/mmfs/src/gpl-linux/gpl_kernel.tmp.ver`/extra/mmfs26.ko
rm -f -f /usr/lpp/mmfs/src/../bin/lxtrace-`cat //usr/lpp/mmfs/src/gpl-linux/gpl_kernel.tmp.ver`
rm -f -f /usr/lpp/mmfs/src/../bin/kdump-`cat //usr/lpp/mmfs/src/gpl-linux/gpl_kernel.tmp.ver`
rm -f -f *.o .depends .*.cmd *.ko *.a *.mod.c core *_shipped *map *mod.c.saved *.symvers *.ko.ver ./*.ver install.he
rm -f -rf .tmp_versions kdump-kern-dwarfs.c
rm -f -f gpl-linux.trclst kdump lxtrace
rm -f -rf usr
make[1]: Leaving directory '/usr/lpp/mmfs/src/gpl-linux'
for i in ibm-kxi ibm-linux gpl-linux ; do \
(cd $i; echo  "installing header files" "(`pwd`)"; \
/usr/bin/make DESTDIR=/usr/lpp/mmfs/src  Headers; \
exit $?) || exit 1; \
done
installing header files (/usr/lpp/mmfs/src/ibm-kxi)
make[1]: Entering directory '/usr/lpp/mmfs/src/ibm-kxi'
Making directory /usr/lpp/mmfs/src/include/cxi
+ /usr/bin/install cxiTypes.h /usr/lpp/mmfs/src/include/cxi/cxiTypes.h
+ /usr/bin/install cxiSystem.h /usr/lpp/mmfs/src/include/cxi/cxiSystem.h
+ /usr/bin/install cxi2gpfs.h /usr/lpp/mmfs/src/include/cxi/cxi2gpfs.h
+ /usr/bin/install cxiVFSStats.h /usr/lpp/mmfs/src/include/cxi/cxiVFSStats.h
+ /usr/bin/install cxiCred.h /usr/lpp/mmfs/src/include/cxi/cxiCred.h
+ /usr/bin/install cxiIOBuffer.h /usr/lpp/mmfs/src/include/cxi/cxiIOBuffer.h
+ /usr/bin/install cxiSharedSeg.h /usr/lpp/mmfs/src/include/cxi/cxiSharedSeg.h
+ /usr/bin/install cxiMode.h /usr/lpp/mmfs/src/include/cxi/cxiMode.h
+ /usr/bin/install Trace.h /usr/lpp/mmfs/src/include/cxi/Trace.h
+ /usr/bin/install cxiMmap.h /usr/lpp/mmfs/src/include/cxi/cxiMmap.h
+ /usr/bin/install cxiAtomic.h /usr/lpp/mmfs/src/include/cxi/cxiAtomic.h
+ /usr/bin/install cxiTSFattr.h /usr/lpp/mmfs/src/include/cxi/cxiTSFattr.h
+ /usr/bin/install cxiAclUser.h /usr/lpp/mmfs/src/include/cxi/cxiAclUser.h
+ /usr/bin/install cxiLinkList.h /usr/lpp/mmfs/src/include/cxi/cxiLinkList.h
+ /usr/bin/install cxiDmapi.h /usr/lpp/mmfs/src/include/cxi/cxiDmapi.h
+ /usr/bin/install LockNames.h /usr/lpp/mmfs/src/include/cxi/LockNames.h
+ /usr/bin/install lxtrace.h /usr/lpp/mmfs/src/include/cxi/lxtrace.h
+ /usr/bin/install cxiGcryptoDefs.h /usr/lpp/mmfs/src/include/cxi/cxiGcryptoDefs.h
+ /usr/bin/install cxiSynchNames.h /usr/lpp/mmfs/src/include/cxi/cxiSynchNames.h
+ /usr/bin/install cxiMiscNames.h /usr/lpp/mmfs/src/include/cxi/cxiMiscNames.h
+ /usr/bin/install DirIds.h /usr/lpp/mmfs/src/include/cxi/DirIds.h
touch install.he
make[1]: Leaving directory '/usr/lpp/mmfs/src/ibm-kxi'
installing header files (/usr/lpp/mmfs/src/ibm-linux)
make[1]: Entering directory '/usr/lpp/mmfs/src/ibm-linux'
+ /usr/bin/install cxiTypes-plat.h /usr/lpp/mmfs/src/include/cxi/cxiTypes-plat.h
+ /usr/bin/install cxiSystem-plat.h /usr/lpp/mmfs/src/include/cxi/cxiSystem-plat.h
+ /usr/bin/install cxiIOBuffer-plat.h /usr/lpp/mmfs/src/include/cxi/cxiIOBuffer-plat.h
+ /usr/bin/install cxiSharedSeg-plat.h /usr/lpp/mmfs/src/include/cxi/cxiSharedSeg-plat.h
+ /usr/bin/install cxiMode-plat.h /usr/lpp/mmfs/src/include/cxi/cxiMode-plat.h
+ /usr/bin/install Trace-plat.h /usr/lpp/mmfs/src/include/cxi/Trace-plat.h
+ /usr/bin/install cxiAtomic-plat.h /usr/lpp/mmfs/src/include/cxi/cxiAtomic-plat.h
+ /usr/bin/install cxiMmap-plat.h /usr/lpp/mmfs/src/include/cxi/cxiMmap-plat.h
+ /usr/bin/install cxiVFSStats-plat.h /usr/lpp/mmfs/src/include/cxi/cxiVFSStats-plat.h
+ /usr/bin/install cxiCred-plat.h /usr/lpp/mmfs/src/include/cxi/cxiCred-plat.h
+ /usr/bin/install cxiDmapi-plat.h /usr/lpp/mmfs/src/include/cxi/cxiDmapi-plat.h
touch install.he
make[1]: Leaving directory '/usr/lpp/mmfs/src/ibm-linux'
installing header files (/usr/lpp/mmfs/src/gpl-linux)
make[1]: Entering directory '/usr/lpp/mmfs/src/gpl-linux'
Making directory /usr/lpp/mmfs/src/include/gpl-linux
+ /usr/bin/install Shark-gpl.h /usr/lpp/mmfs/src/include/gpl-linux/Shark-gpl.h
+ /usr/bin/install prelinux.h /usr/lpp/mmfs/src/include/gpl-linux/prelinux.h
+ /usr/bin/install postlinux.h /usr/lpp/mmfs/src/include/gpl-linux/postlinux.h
+ /usr/bin/install linux2gpfs.h /usr/lpp/mmfs/src/include/gpl-linux/linux2gpfs.h
+ /usr/bin/install verdep.h /usr/lpp/mmfs/src/include/gpl-linux/verdep.h
+ /usr/bin/install Logger-gpl.h /usr/lpp/mmfs/src/include/gpl-linux/Logger-gpl.h
+ /usr/bin/install arch-gpl.h /usr/lpp/mmfs/src/include/gpl-linux/arch-gpl.h
+ /usr/bin/install oplock.h /usr/lpp/mmfs/src/include/gpl-linux/oplock.h
touch install.he
make[1]: Leaving directory '/usr/lpp/mmfs/src/gpl-linux'
make[1]: Entering directory '/usr/lpp/mmfs/src/gpl-linux'
Pre-kbuild step 1...
Pre-kbuild step 2...
touch install.he
Invoking Kbuild...
/usr/bin/make -C /usr/src/kernels/4.18.0-147.8.1.el8_1.x86_64 ARCH=x86_64 M=/usr/lpp/mmfs/src/gpl-linux CONFIGDIR=/usr/lpp/mmfs/src/config  ; \
if [ $? -ne 0 ]; then \
        exit 1;\
fi
make[2]: Entering directory '/usr/src/kernels/4.18.0-147.8.1.el8_1.x86_64'
Makefile:977: *** "Cannot generate ORC metadata for CONFIG_UNWINDER_ORC=y, please install libelf-dev, libelf-devel or elfutils-libelf-devel".  Stop.
make[2]: Leaving directory '/usr/src/kernels/4.18.0-147.8.1.el8_1.x86_64'
make[1]: *** [makefile:131: modules] Error 1
make[1]: Leaving directory '/usr/lpp/mmfs/src/gpl-linux'
make: *** [makefile:148: Modules] Error 1
--------------------------------------------------------
mmbuildgpl: Building GPL module failed at Thu Apr 16 21:29:37 CEST 2020.
--------------------------------------------------------
mmbuildgpl: Command failed. Examine previous error messages to determine cause.
[root@spectrumscale ~]#

Installation of missing RPM:

[root@spectrumscale ~]# yum install elfutils-libelf-devel
Last metadata expiration check: 2:48:13 ago on Thu 16 Apr 2020 06:42:52 PM CEST.
Dependencies resolved.
========================================================================================================================
 Package                              Arch                  Version                         Repository             Size
========================================================================================================================
Installing:
 elfutils-libelf-devel                x86_64                0.176-5.el8                     BaseOS                 54 k
Upgrading:
 elfutils-libelf                      x86_64                0.176-5.el8                     BaseOS                211 k
 elfutils-libs                        x86_64                0.176-5.el8                     BaseOS                321 k
Installing dependencies:
 zlib-devel                           x86_64                1.2.11-10.el8                   BaseOS                 56 k

Transaction Summary
========================================================================================================================
Install  2 Packages
Upgrade  2 Packages

Total download size: 643 k
Is this ok [y/N]: y
Downloading Packages:
(1/4): elfutils-libelf-devel-0.176-5.el8.x86_64.rpm                                      42 kB/s |  54 kB     00:01
(2/4): zlib-devel-1.2.11-10.el8.x86_64.rpm                                               44 kB/s |  56 kB     00:01
(3/4): elfutils-libelf-0.176-5.el8.x86_64.rpm                                           125 kB/s | 211 kB     00:01
(4/4): elfutils-libs-0.176-5.el8.x86_64.rpm                                             302 kB/s | 321 kB     00:01
------------------------------------------------------------------------------------------------------------------------
Total                                                                                   178 kB/s | 643 kB     00:03
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
  Preparing        :                                                                                                1/1
  Upgrading        : elfutils-libelf-0.176-5.el8.x86_64                                                             1/6
  Installing       : zlib-devel-1.2.11-10.el8.x86_64                                                                2/6
  Installing       : elfutils-libelf-devel-0.176-5.el8.x86_64                                                       3/6
  Upgrading        : elfutils-libs-0.176-5.el8.x86_64                                                               4/6
  Cleanup          : elfutils-libs-0.174-6.el8.x86_64                                                               5/6
  Cleanup          : elfutils-libelf-0.174-6.el8.x86_64                                                             6/6
  Running scriptlet: elfutils-libelf-0.174-6.el8.x86_64                                                             6/6
  Verifying        : elfutils-libelf-devel-0.176-5.el8.x86_64                                                       1/6
  Verifying        : zlib-devel-1.2.11-10.el8.x86_64                                                                2/6
  Verifying        : elfutils-libelf-0.176-5.el8.x86_64                                                             3/6
  Verifying        : elfutils-libelf-0.174-6.el8.x86_64                                                             4/6
  Verifying        : elfutils-libs-0.176-5.el8.x86_64                                                               5/6
  Verifying        : elfutils-libs-0.174-6.el8.x86_64                                                               6/6

Upgraded:
  elfutils-libelf-0.176-5.el8.x86_64                          elfutils-libs-0.176-5.el8.x86_64

Installed:
  elfutils-libelf-devel-0.176-5.el8.x86_64                        zlib-devel-1.2.11-10.el8.x86_64

Complete!
[root@spectrumscale ~]#

After that, the Ansible role proceeds as expected.
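
As a possible workaround until the pre-check covers this package, a small pre-task can ensure it is present on RHEL/CentOS 8 hosts before mmbuildgpl runs. This is a minimal sketch, not part of the role:

---
# Sketch only: install elfutils-libelf-devel (needed by the kernel Makefile
# for ORC metadata) on RHEL/CentOS 8 hosts before compiling the GPL module.
- hosts: all
  become: true
  tasks:
    - name: Ensure elfutils-libelf-devel is installed (RHEL 8)
      package:
        name: elfutils-libelf-devel
        state: present
      when:
        - ansible_os_family == 'RedHat'
        - ansible_distribution_major_version == '8'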

Configure performance monitoring in federated mode if > 1 node in the cluster

Performance monitoring needs to be configured according to best practice: in redundant, federated mode with multiple collector nodes.

4-node cluster
1 node marked as MGMT GUI (which flags it as a pmcollector)

# hosts:
[cluster01]
node-vm1 scale_cluster_quorum=true   scale_cluster_manager=true scale_cluster_gui=true
node-vm3 scale_cluster_quorum=true   scale_cluster_manager=true scale_cluster_gui=false
node-vm2 scale_cluster_quorum=true   scale_cluster_manager=true scale_cluster_gui=false
node-vm4 scale_cluster_quorum=false   scale_cluster_manager=false scale_cluster_gui=false

[root@node-vm1 ibm-spectrum-scale-install-infra]# mmdsh -f1 -N all rpm -qa | grep pmcollector
node-vm1.tuc.stglabs.ibm.com:  gpfs.gss.pmcollector-5.0.5-0.el8.x86_64
[root@node-vm1 ibm-spectrum-scale-install-infra]# mmdsh -f1 -N all rpm -qa | grep pmsensors
node-vm1.tuc.stglabs.ibm.com:  gpfs.gss.pmsensors-5.0.5-0.el8.x86_64
node-vm2.tuc.stglabs.ibm.com:  gpfs.gss.pmsensors-5.0.5-0.el8.x86_64
node-vm3.tuc.stglabs.ibm.com:  gpfs.gss.pmsensors-5.0.5-0.el8.x86_64
node-vm4.tuc.stglabs.ibm.com:  gpfs.gss.pmsensors-5.0.5-0.el8.x86_64

Only one node is set up as a collector, when there should be two nodes for redundancy:

[root@node-vm1 ibm-spectrum-scale-install-infra]# /usr/lpp/mmfs/bin/mmperfmon config show | grep -i col
colCandidates = "node-vm1.tuc.stglabs.ibm.com"
colRedundancy = 1
collectors = {
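
As an interim workaround, and assuming the role installs a collector on every node flagged as a GUI node (up to three GUI nodes are supported), a second node can be flagged in the inventory so that two collectors get installed. Having the role then federate them and raise colRedundancy accordingly is exactly what this issue asks for. Sketch of the adjusted hosts file:

# hosts (sketch): flag a second node as a GUI/collector node for redundancy
[cluster01]
node-vm1 scale_cluster_quorum=true   scale_cluster_manager=true scale_cluster_gui=true
node-vm2 scale_cluster_quorum=true   scale_cluster_manager=true scale_cluster_gui=true
node-vm3 scale_cluster_quorum=true   scale_cluster_manager=true scale_cluster_gui=false
node-vm4 scale_cluster_quorum=false  scale_cluster_manager=false scale_cluster_gui=false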

Update to the single directory method in the README

Additional updates to README in issue #50

Add which RPMs are required vs. optional.
Additionally, consider documenting that the directory path identified by the admin in scale_install_directory_pkg_path is also copied to the /usr/lpp/mmfs/code_version path for installation.


    Installation from (existing) YUM repository (scale_install_repository_url)
    Installation from remote installation package (scale_install_remotepkg_path)
    Installation from local installation package (scale_install_localpkg_path)
    Installation from single directory package path (scale_install_directory_pkg_path)

    Important: If you are using the single directory installation method (scale_install_directory_pkg_path), you need to keep all required GPFS RPMs in a single user-provided directory.
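
For reference, selecting the single directory method boils down to a single variable; a minimal sketch (the path is an example only):

# group_vars/all (sketch): keep every required GPFS RPM in one directory and
# point the role at it. The path below is illustrative.
scale_install_directory_pkg_path: /opt/IBM/gpfs_rpms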

The regex_replace call fails when NSDs are not defined

On the master branch, when I run without any NSDs defined, I hit this:

TASK [core/cluster : storage | Find defined NSDs] ******************************
fatal: [worker0]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'dict object' has no attribute 'nsd'\n\nThe error appears to be in '/root/ibm-spectrum-scale-install-infra/roles/core/cluster/tasks/storage.yml': line 46, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: storage | Find defined NSDs\n  ^ here\n"}
fatal: [worker1]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'dict object' has no attribute 'nsd'\n\nThe error appears to be in '/root/ibm-spectrum-scale-install-infra/roles/core/cluster/tasks/storage.yml': line 46, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: storage | Find defined NSDs\n  ^ here\n"}
fatal: [worker2]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'dict object' has no attribute 'nsd'\n\nThe error appears to be in '/root/ibm-spectrum-scale-install-infra/roles/core/cluster/tasks/storage.yml': line 46, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: storage | Find defined NSDs\n  ^ here\n"}
fatal: [worker3]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'dict object' has no attribute 'nsd'\n\nThe error appears to be in '/root/ibm-spectrum-scale-install-infra/roles/core/cluster/tasks/storage.yml': line 46, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: storage | Find defined NSDs\n  ^ here\n"}

This seems to be caused by https://github.com/IBM/ibm-spectrum-scale-install-infra/blob/master/roles/core/cluster/tasks/storage.yml#L49, a change I made before we pushed the code into the public space.

I will submit a PR to delete that.
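
Until that change lands, a defensive pattern is to make the lookup tolerant of missing NSD definitions, for example by selecting only items that actually define an nsd key. This is a sketch only; the variable name and structure of scale_storage are assumed here, not the role's exact layout:

# Sketch only: skip items without an "nsd" key instead of failing on them.
- name: storage | Find defined NSDs (guarded sketch)
  set_fact:
    defined_nsds: "{{ scale_storage | default([]) | selectattr('nsd', 'defined') | map(attribute='nsd') | list }}"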

mmlsmount command behaves differently after successful deployment.

Scenario:

  1. Deployment completed successfully.
  2. Our playbook runs the command mmmount <filesystem_name> -a, which mounts the file system on all nodes in the cluster.
  3. At first, mmlsmount showed the file systems as mounted on all nodes.
  4. Then the output started changing:
     File system fs1 is mounted on 2 nodes.
     File system fs2 is mounted on 1 nodes.

We will have to investigate what the exact issue is here.

Discrepancy in logging between FQDN and short name in tasks

Recreation steps:

First I deployed a single-node cluster (vm1) with core GPFS functionality, then added vm3 for NSDs along with the GUI and zimon features. On the second run I observed a discrepancy between the FQDN and the short name being used for nodes.

TASK [precheck : configure | check gui node] ************************************************************************************************************************************
ok: [snowwraith-vm3] => (item=snowwraith-vm1)
skipping: [snowwraith-vm3] => (item=snowwraith-vm3)
skipping: [snowwraith-vm3] => (item=snowwraith-vm1.tuc.stglabs.ibm.com)

TASK [precheck : cluster | check if gui is enabled] *****************************************************************************************************************************
ok: [snowwraith-vm3] => (item=snowwraith-vm1)
ok: [snowwraith-vm3] => (item=snowwraith-vm3)
skipping: [snowwraith-vm3] => (item=snowwraith-vm1.tuc.stglabs.ibm.com)

.
.
.
.

TASK [precheck : debug] *********************************************************************************************************************************************************
ok: [snowwraith-vm3] => {
    "msg": "set -o pipefail && mmlsnodeclass GUI_MGMT_SERVERS -Y |grep -v HEADER | cut -d ':' -f 10"
}




TASK [node : include_tasks] *****************************************************************************************************************************************************
included: /root/ibm-spectrum-scale-install-infra/roles/gui/node/tasks/install_local_pkg.yml for snowwraith-vm1, snowwraith-vm3

TASK [node : install | Stat local installation package] *************************************************************************************************************************
ok: [snowwraith-vm1 -> localhost]

TASK [node : install | Check local installation package] ************************************************************************************************************************
ok: [snowwraith-vm1 -> localhost] => {
    "changed": false,
    "msg": "All assertions passed"
}


TASK [node : install | Stat extracted packages] *********************************************************************************************************************************
ok: [snowwraith-vm3]
ok: [snowwraith-vm1]

TASK [node : install | Stat temporary directory] ********************************************************************************************************************************
skipping: [snowwraith-vm1]
skipping: [snowwraith-vm3]

TASK [node : install | Check temporary directory] *******************************************************************************************************************************
skipping: [snowwraith-vm1]
skipping: [snowwraith-vm3]

TASK [node : install | Copy installation package to node] ***********************************************************************************************************************
skipping: [snowwraith-vm1]
skipping: [snowwraith-vm3]

TASK [node : install | Extract installation package] ****************************************************************************************************************************
ok: [snowwraith-vm1]
ok: [snowwraith-vm3]

TASK [node : install | Stat extracted packages] *********************************************************************************************************************************
ok: [snowwraith-vm1]
ok: [snowwraith-vm3]

TASK [node : install | Check extracted packages] ********************************************************************************************************************************
ok: [snowwraith-vm1] => {
    "changed": false,
    "msg": "All assertions passed"
}
ok: [snowwraith-vm3] => {
    "changed": false,
    "msg": "All assertions passed"
}

TASK [node : install | Delete installation package from node] *******************************************************************************************************************
ok: [snowwraith-vm1]
ok: [snowwraith-vm3]



TASK [zimon/cluster : configure | Find zimon collector nodes] *******************************************************************************************************************
ok: [snowwraith-vm1] => (item=snowwraith-vm1)
skipping: [snowwraith-vm1] => (item=snowwraith-vm3)
skipping: [snowwraith-vm1] => (item=snowwraith-vm1.tuc.stglabs.ibm.com)

TASK [zimon/cluster : cluster | check if zimon is enabled] **********************************************************************************************************************
ok: [snowwraith-vm1] => (item=snowwraith-vm1)
ok: [snowwraith-vm1] => (item=snowwraith-vm3)
skipping: [snowwraith-vm1] => (item=snowwraith-vm1.tuc.stglabs.ibm.com)


TASK [gui/cluster : SNMP | Find collector nodes] ********************************************************************************************************************************
skipping: [snowwraith-vm1] => (item=snowwraith-vm1)
skipping: [snowwraith-vm1] => (item=snowwraith-vm3)
skipping: [snowwraith-vm1] => (item=snowwraith-vm1.tuc.stglabs.ibm.com)
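
One way to make such checks tolerant of the mismatch (a sketch only, not the role's current logic) is to normalize every name to its short hostname before comparing; candidate_nodes below is an illustrative variable name, not the role's API:

# Sketch only: treat "snowwraith-vm1" and "snowwraith-vm1.tuc.stglabs.ibm.com"
# as the same node by comparing short hostnames.
- name: precheck | Check GUI node (normalized comparison sketch)
  debug:
    msg: "{{ item }} refers to this host"
  loop: "{{ candidate_nodes | default([]) }}"
  when: (item.split('.') | first) == (inventory_hostname.split('.') | first)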


Provide a way to set NSD usage via scale_clusterdefinition.json

mmcrnsd stanza file format:

%nsd: device=DiskName
  nsd=NsdName
  servers=ServerList
  usage={dataOnly | metadataOnly | dataAndMetadata | descOnly | localCache}
  failureGroup=FailureGroup
  pool=StoragePool

The cloud architecture requires creating "descOnly" NSDs for failure group balancing, and NVMe local drives require "localCache". Currently, scale_clusterdefinition.json lacks a parameter to distinguish NSD usage.
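
For illustration, the two variants the cloud architecture needs would look like this in the stanza format above (device, server, NSD, and failure group values are hypothetical):

%nsd: device=/dev/vdd
  nsd=nsd_desc_1
  servers=cloud-node3
  usage=descOnly
  failureGroup=3

%nsd: device=/dev/nvme0n1
  nsd=nsd_cache_1
  servers=cloud-node1
  usage=localCache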
