
k8s-storage-tests's Introduction

Storage Validation Tool for IBM Cloud Paks

Kubernetes has gained a lot of momentum, and storage vendors now support various container orchestration platforms through CSI drivers and other mechanisms.

It has become essential for platform administrators to quickly validate a storage platform for their modernized workloads on IBM Cloud Paks and check its readiness level.

This Ansible Playbook helps functionally validate storage on ReadWriteOnce and ReadWriteMany volumes. Note that these tests cover readiness only and are meant to be a precursor to a full-blown test with actual Cloud Pak workloads.

Note: if the tests in this storage readiness project are successful, it is strongly recommended that you go on to run the performance tests provided by the companion project at https://github.com/IBM/k8s-storage-perf. They will give you a good assessment of the performance of the particular storage.

The following tests are performed:

  • Dynamic provisioning of a volume
  • Mounting volume from a node
  • Sequential Read Write Consistency from single and multiple nodes
  • Parallel Read Write Consistency from single and multiple nodes
  • Parallel Read Write Consistency across multiple threads
  • File Permissions on mounted volumes
  • Accessibility based on POSIX compliance Group ID Permissions
  • SubPath test for volumes
  • File Locking test

Prerequisites

  • Ensure you have Python 3.6 or later and pip 21.1.3 or later installed

    python --version

    pip --version

    NB: if your Python interpreter is installed as python3, python37, or another Python 3 executable, you can create a symlink named python with this command

    ln -s -f /usr/bin/python3 /usr/bin/python
    
    # or, depending on where Python 3 is installed
    
    ln -s -f /usr/local/bin/python3 /usr/local/bin/python
    

    NB: if pip is not available or is an older version, run the command below to upgrade it, and then check its version again. If the pip command still can't be found afterwards, add /usr/local/bin to your PATH environment variable.

    python -m pip install --upgrade pip

  • Install Ansible 2.10.5 or later

    pip install ansible==2.10.5

  • Install ansible k8s modules

    pip install openshift

    ansible-galaxy collection install operator_sdk.util

    ansible-galaxy collection install kubernetes.core

    NB: the openshift package installation requires PyYAML >= 5.4.1. If the existing PyYAML is an older version, PyYAML's installation will fail. To overcome this issue, manually delete the existing PyYAML package as below (adjust the paths in the commands to match your host environment):

    rm -rf /usr/lib64/python3.6/site-packages/yaml
    rm -f  /usr/lib64/python3.6/site-packages/PyYAML-*
    
  • Install OpenShift Client 4.6 or later based on your OS.

  • Access, with cluster admin privileges, to an OpenShift cluster (at least 3 compute nodes) set up with RWX and RWO storage classes.
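
The version minimums listed above can also be checked programmatically. The following is a minimal sketch, not part of this repository: `parse_version` and `meets_minimum` are hypothetical helper names, and the sketch assumes plain dotted version strings such as those printed by `python --version` and `pip --version`. It compares versions numerically, so 3.10 is correctly treated as newer than 3.6.

```python
import sys
from itertools import takewhile


def parse_version(text):
    """Turn a dotted version string such as '21.1.3' into a tuple of ints."""
    parts = []
    for piece in text.strip().split("."):
        # Keep only the leading digits of each piece; stop at the first
        # piece that does not start with a digit.
        digits = "".join(takewhile(str.isdigit, piece))
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(found, minimum):
    """Return True when the found version is at least the minimum version."""
    return parse_version(found) >= parse_version(minimum)


if __name__ == "__main__":
    current = "%d.%d.%d" % sys.version_info[:3]
    status = "ok" if meets_minimum(current, "3.6") else "too old"
    print("Python", current, status)
```

The same helper can be applied to the pip and Ansible version strings, for example `meets_minimum("21.1.3", "21.1.3")`.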

Setup

  • Clone this git repo to your client

    git clone https://github.com/IBM/k8s-storage-tests.git

  • Update the params.yml file with your OCP URL and Credentials

     ocp_url: https://<required>:6443
     ocp_username: <required>
     ocp_password: <required>
     ocp_token: <required if user/password not available>
    
  • Update the params.yml file for the required storage parameters

    storageClass_ReadWriteOnce: <required>
    storageClass_ReadWriteMany: <required>
    storage_validation_namespace: <required>
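
Before running the playbook, it can help to confirm that no placeholder was left in params.yml. The sketch below is an illustration only and is not shipped with this repository: `find_problems` and `is_set` are hypothetical names, and the rule that either a username and password or a token must be present is inferred from the comment on `ocp_token` above.

```python
REQUIRED_KEYS = (
    "ocp_url",
    "storageClass_ReadWriteOnce",
    "storageClass_ReadWriteMany",
    "storage_validation_namespace",
)


def is_set(params, key):
    """A value counts as set when it is non-empty and not a <required...> placeholder."""
    value = str(params.get(key) or "").strip()
    return bool(value) and "<required" not in value


def find_problems(params):
    """Return a list of human-readable problems; an empty list means the file looks complete."""
    problems = [f"{key} is not set" for key in REQUIRED_KEYS if not is_set(params, key)]
    has_creds = is_set(params, "ocp_username") and is_set(params, "ocp_password")
    if not (has_creds or is_set(params, "ocp_token")):
        problems.append("set ocp_username and ocp_password, or ocp_token")
    return problems
```

The `params` dictionary could be obtained by loading params.yml with PyYAML's `yaml.safe_load`; the sketch takes a plain dictionary so that it has no dependencies of its own.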
    

Running the Playbook

  • From the root of this repository, run:

    ansible-playbook main.yml --extra-vars "@./params.yml" | tee output.log

If the playbook fails to run due to an SSL verification error, you can disable verification by setting this environment variable before running the playbook:

export K8S_AUTH_VERIFY_SSL=no

Running the Playbook with the Container

Environment Setup

export dockerexe=podman # or docker
export container_name=k8s-storage-test
export docker_image=icr.io/cpopen/cpd/k8s-storage-test:v1.0.0
export NAMESPACE=<storage_validation_namespace> # used by the cleanup alias below

alias k8s_storage_test_exec="${dockerexe} exec ${container_name}"
alias run_k8s_storage_test="k8s_storage_test_exec ansible-playbook main.yml --extra-vars \"@/tmp/work-dir/params.yml\" | tee output.log"
alias run_k8s_storage_test_cleanup="k8s_storage_test_exec cleanup.sh -n ${NAMESPACE} -d"

Start the Container

mkdir -p /tmp/k8s_storage_test/work-dir
cp ./params.yml /tmp/k8s_storage_test/work-dir/params.yml

${dockerexe} pull ${docker_image}
${dockerexe} run --name ${container_name} -d -v /tmp/k8s_storage_test/work-dir:/tmp/work-dir ${docker_image}

Run the Playbook

run_k8s_storage_test

Optional: Clean Up the Cluster

run_k8s_storage_test_cleanup

[INFO ] running clean up for namespace storage-validation-1 and the namespace will be deleted
[INFO ] please run the following command in a terminal that has access to the cluster to clean up after the ansible playbooks

oc get job -n storage-validation-1 -o name | xargs -I % -n 1 oc delete % -n storage-validation-1 && \
oc get pvc -n storage-validation-1 -o name | xargs -I % -n 1 oc delete % -n storage-validation-1 && \
oc get cm -n storage-validation-1 -o name | xargs -I % -n 1 oc delete % -n storage-validation-1 && \
oc delete ns storage-validation-1 --ignore-not-found && \
oc delete scc zz-fsgroup-scc --ignore-not-found

[INFO ] cleanup script finished with no errors

Verifying your results

Regardless of whether you run the Playbook or use the Container, on a successful run, you should see the following output:

 ######################## MOUNT TESTS PASSED FOR ReadWriteOnce Volume  #################################
 ######################## MOUNT TESTS PASSED FOR ReadWriteMany Volume  #################################
 ######################## SEQUENTIAL READ WRITE TEST PASSED FOR ReadWriteOnce Volume ###################
 ######################## SEQUENTIAL READ WRITE TEST PASSED FOR ReadWriteMany Volume ###################
 ######################## SINGLE THREAD PARALLEL READ WRITE TEST PASSED for ReadWriteOnce ##############
 ######################## SINGLE THREAD PARALLEL READ WRITE TEST PASSED for ReadWriteMany ##############
 ######################## MULTI NODE PARALLEL READ WRTIE TEST PASSED FOR ReadWriteOnce #################
 ######################## MULTI NODE PARALLEL READ WRTIE TEST PASSED FOR ReadWriteMany #################
 ######################## FILE UID TEST PASSED FOR ReadWriteMany Volume ################################
 ######################## FILE PERMISSIONS TEST PASSED FOR ReadWriteMany Volume ########################
 ######################## FILE PERMISSIONS TEST PASSED FOR ReadWriteOnce Volume ########################
 ######################## SUB PATH TEST PASSED FOR ReadWriteMany Volume ################################
 ######################### FILE LOCK TESTS PASSED FOR ReadWriteMany Volume #############################
 PLAY RECAP *********************************************************************
 localhost                  : ok=109  changed=42   unreachable=0    failed=0    skipped=7    rescued=0    ignored=0   
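
Since the run is captured in output.log, the result lines and the PLAY RECAP counters can be collected with a short script instead of being read by eye. This is a rough sketch under assumptions: `summarize` is a hypothetical helper, and it relies only on the line shapes shown on this page (banner lines containing PASSED, messages ending in FAILED, and `key=value` pairs after PLAY RECAP).

```python
import re

RECAP_FIELD = re.compile(r"(\w+)=(\d+)")


def summarize(log_text):
    """Collect PASSED lines, FAILED lines, and the PLAY RECAP counters from a playbook log."""
    passed, failed, recap = [], [], {}
    in_recap = False
    for line in log_text.splitlines():
        if line.lstrip().startswith("PLAY RECAP"):
            in_recap = True
            continue
        if in_recap:
            # Lines after PLAY RECAP look like "localhost : ok=109 changed=42 ...".
            recap.update((key, int(value)) for key, value in RECAP_FIELD.findall(line))
            continue
        # Drop the optional '"msg":' prefix and the surrounding '#' banner.
        text = line.replace('"msg":', "").strip(' #"')
        if "PASSED" in text:
            passed.append(text)
        elif "FAILED" in text:
            failed.append(text)
    return passed, failed, recap
```

A run can then be treated as successful when `failed` is empty and `recap.get("failed") == 0`, for example after calling `summarize(open("output.log").read())`.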

Clean-up Resources

Delete the Kubernetes namespace that you created in Setup. You can run these commands to clean up the resources in the namespace:

oc delete job $(oc get jobs -n <storage_validation_namespace> --no-headers | awk '{ print $1 }') -n <storage_validation_namespace>
oc delete cm $(oc get cm -n <storage_validation_namespace> --no-headers | awk '{ print $1 }') -n <storage_validation_namespace>
oc delete pvc $(oc get pvc -n <storage_validation_namespace> --no-headers | awk '{ print $1 }') -n <storage_validation_namespace>
oc delete scc zz-fsgroup-scc

OR

oc delete project <storage_validation_namespace>
oc delete scc zz-fsgroup-scc

k8s-storage-tests's People

Contributors

imgbotapp, jrhee12, pamandrejko, parthakom2, shankarpentyala07, stevemar

k8s-storage-tests's Issues

Unable to run test-suite on python3.9

  • When attempting to run the test suite with Python 3.9, the following error was encountered:
(.venv) alberto@thinkpad $ ansible-playbook main.yml --extra-vars "@./params.yml" | tee output.log
ERROR! Unexpected Exception, this is probably a bug: Failed to detect selinux python bindings at ['/usr/local/lib64/python3.9/site-packages', '/usr/local/lib/python3.9/site-packages', '/usr/lib64/python3.9/site-packages', '/usr/lib/python3.9/site-packages']
the full traceback was:

Traceback (most recent call last):
  File "/home/alberto/k8s-storage-tests/.venv/bin/ansible-playbook", line 92, in <module>
    mycli = getattr(__import__("ansible.cli.%s" % sub, fromlist=[myclass]), myclass)
  File "/home/alberto/k8s-storage-tests/.venv/lib64/python3.9/site-packages/ansible/cli/__init__.py", line 24, in <module>
    from ansible.parsing.dataloader import DataLoader
  File "/home/alberto/k8s-storage-tests/.venv/lib64/python3.9/site-packages/ansible/parsing/dataloader.py", line 17, in <module>
    from ansible.module_utils.basic import is_executable
  File "/home/alberto/k8s-storage-tests/.venv/lib64/python3.9/site-packages/ansible/module_utils/basic.py", line 77, in <module>
    import selinux
  File "/home/alberto/k8s-storage-tests/.venv/lib64/python3.9/site-packages/selinux/__init__.py", line 106, in <module>
    check_system_sitepackages()
  File "/home/alberto/k8s-storage-tests/.venv/lib64/python3.9/site-packages/selinux/__init__.py", line 102, in check_system_sitepackages
    raise Exception(
Exception: Failed to detect selinux python bindings at ['/usr/local/lib64/python3.9/site-packages', '/usr/local/lib/python3.9/site-packages', '/usr/lib64/python3.9/site-packages', '/usr/lib/python3.9/site-packages']
(.venv) alberto@thinkpad $ python -V
Python 3.9.16
  • The error did not occur with version 3.6:
(.venv36) alberto@thinkpad $ ansible-playbook main.yml --extra-vars "@./params.yml" | tee output.log
/home/alberto/k8s-storage-tests/.venv36/lib64/python3.6/site-packages/ansible/parsing/vault/__init__.py:44: CryptographyDeprecationWarning: Python 3.6 is no longer supported by the Python core team. Therefore, support for it is deprecated in cryptography. The next release of cryptography will remove support for Python 3.6.
  from cryptography.exceptions import InvalidSignature
[WARNING]: provided hosts list is empty, only localhost is available. Note that
the implicit localhost does not match 'all'

PLAY [localhost] ***************************************************************

TASK [ocp login using creds] ***************************************************
skipping: [localhost]

TASK [ocp login using token] ***************************************************
changed: [localhost]

TASK [debug] *******************************************************************
skipping: [localhost]

TASK [debug] *******************************************************************
ok: [localhost] => {
    "login_token.stdout_lines": [
        "WARNING: Using insecure TLS client config. Setting this option is not supported!",
        "",
        "Logged into \"https://api.ocp-d.cpst-lab.ibm.com:6443\" as \"kube:admin\" using the token provided.",
        "",
        "You have access to 83 projects, the list has been suppressed. You can list all projects with 'oc projects'",
        "",
        "Using project \"default\"."
    ]
}
. . .
(.venv36) alberto@thinkpad $ python -V
Python 3.6.8
  • Nor did it occur with version 3.10:
(.venv310) alberto@thinkpad $ ansible-playbook main.yml --extra-vars "@./params.yml" | tee output.log
[WARNING]: provided hosts list is empty, only localhost is available. Note that
the implicit localhost does not match 'all'

PLAY [localhost] ***************************************************************

TASK [ocp login using creds] ***************************************************
skipping: [localhost]

TASK [ocp login using token] ***************************************************
changed: [localhost]

TASK [debug] *******************************************************************
skipping: [localhost]

TASK [debug] *******************************************************************
ok: [localhost] => {
    "login_token.stdout_lines": [
        "WARNING: Using insecure TLS client config. Setting this option is not supported!",
        "",
        "Logged into \"https://api.ocp-d.cpst-lab.ibm.com:6443\" as \"kube:admin\" using the token provided.",
        "",
        "You have access to 84 projects, the list has been suppressed. You can list all projects with 'oc projects'",
        "",
        "Using project \"default\"."
    ]
}

TASK [Storage Readiness] *******************************************************
[WARNING]: Collection kubernetes.core does not support Ansible version 2.10.17

TASK [storage-readiness : Create namespace storage-tests if not present] *******

. . . 
(.venv310) alberto@thinkpad $ python -V
Python 3.10.9

Some tests also fail with supported storage

I tried this tool on two of my clusters. VPC ROKS with ODF didn't pass all the tests (RWO BLOCK ocs-storagecluster-ceph-rbd and RWX FILE ocs-storagecluster-cephfs):

    "msg": "######################## MOUNT TESTS PASSED FOR ReadWriteOnce Volume  #################################"
    "msg": "######################## MOUNT TESTS PASSED FOR ReadWriteMany Volume  #################################"
    "msg": "######################## SEQUENTIAL READ WRITE TEST PASSED FOR ReadWriteOnce Volume #################################"
    "msg": "######################## SEQUENTIAL READ WRITE TEST PASSED FOR ReadWriteMany Volume #################################"
    "msg": "######################## SINGLE THREAD PARALLEL READ WRITE TEST PASSED for ReadWriteOnce #################################"
    "msg": "######################## SINGLE THREAD PARALLEL READ WRITE TEST PASSED for ReadWriteMany #################################"
    "msg": "######################## MULTI NODE PARALLEL READ WRTIE TEST PASSED FOR ReadWriteOnce #################################"
    "msg": "######################## MULTI NODE PARALLEL READ WRTIE TEST PASSED FOR ReadWriteMany #################################"
    "msg": "######################## FILE PERMISSIONS TEST PASSED FOR ReadWriteMany Volume #################################"
    "msg": "######################## SUB PATH TEST PASSED FOR ReadWriteMany Volume #################################"
    "msg": "######################## FILE LOCK TESTS PASSED FOR ReadWriteMany Volume #################################"
    "msg": "File UID permission tests for ReadWriteMany - FAILED"
    "msg": "FSGroup GID permission tests for ReadWriteOnce - FAILED"

On the other hand, a simple NFS using managed-nfs-storage on my RHEL connected to OCP on VMware fails only one test:

    "msg": "######################## MOUNT TESTS PASSED FOR ReadWriteOnce Volume  #################################"
    "msg": "######################## MOUNT TESTS PASSED FOR ReadWriteMany Volume  #################################"
    "msg": "######################## SEQUENTIAL READ WRITE TEST PASSED FOR ReadWriteOnce Volume #################################"
    "msg": "######################## SEQUENTIAL READ WRITE TEST PASSED FOR ReadWriteMany Volume #################################"
    "msg": "######################## SINGLE THREAD PARALLEL READ WRITE TEST PASSED for ReadWriteOnce #################################"
    "msg": "######################## SINGLE THREAD PARALLEL READ WRITE TEST PASSED for ReadWriteMany #################################"
    "msg": "######################## MULTI NODE PARALLEL READ WRTIE TEST PASSED FOR ReadWriteOnce #################################"
    "msg": "######################## MULTI NODE PARALLEL READ WRTIE TEST PASSED FOR ReadWriteMany #################################"
    "msg": "######################## FILE PERMISSIONS TEST PASSED FOR ReadWriteMany Volume #################################"
    "msg": "######################## FILE PERMISSIONS TEST PASSED FOR ReadWriteOnce Volume #################################"
    "msg": "######################## SUB PATH TEST PASSED FOR ReadWriteMany Volume #################################"
    "msg": "######################## FILE LOCK TESTS PASSED FOR ReadWriteMany Volume #################################"
    "msg": "File UID permission tests for ReadWriteMany - FAILED"

I must say these are unexpected results. I expected ODF to pass everything. Is there an issue with the verification tool or with my ODF?
I saw at https://www.ibm.com/docs/en/cloud-paks/cp-data/4.6.x?topic=requirements-storage#compute-requirements__control-plane-persistent-stg that managed-nfs-storage is also acceptable but still fails one test. Same question as for ODF.
You can also reach me on internal IBM Slack @jdusek

Testing to Capture Multiple UIDs Writing to File System

Main Issue: If two processes with different non-root UIDs write to an EFS mount, the file ownership of one of the UIDs is not maintained. As a result, only one UID is ever recorded in the file system. This has been observed during the EFS CSI driver certification process for Db2.

Request: The EFS CSI driver needs to support two user IDs writing to a file system. The request is to include a test that checks whether multiple UIDs are maintained in this scenario.
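
To make the request concrete, the check such a test might perform can be sketched as follows. This is only an illustration under assumptions, not an existing test in this repository: `ownership_mismatches` is a hypothetical name, and the sketch presumes that each file was already written on the mounted volume by a process running under a known UID (for example, pods with different `runAsUser` values).

```python
import os
import tempfile


def ownership_mismatches(expected):
    """Compare recorded file owners against the UIDs that wrote the files.

    `expected` maps a file path to the UID of the process that created it.
    Returns {path: (expected_uid, actual_uid)} for every file whose owner differs.
    """
    mismatches = {}
    for path, uid in expected.items():
        actual = os.stat(path).st_uid
        if actual != uid:
            mismatches[path] = (uid, actual)
    return mismatches


if __name__ == "__main__":
    # Local demonstration with a temporary file owned by the current user.
    with tempfile.NamedTemporaryFile() as handle:
        print(ownership_mismatches({handle.name: os.geteuid()}))
```

On storage that preserves ownership, the function returns an empty dictionary. The behavior described in this issue would show up as an entry for the file whose UID was replaced.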

Add the section "Pulling and loading the required image in airgap environment"

To support the airgap environment, the following images should be pulled and pushed to the private image registry.

    icr.io/cpopen/cpd/k8s-storage-test:v1.0.0
    quay.io/ibm-cp4d-public/storage-gcc:1.0
    quay.io/ibm-cp4d-public/storage-gcc:1.0-amd64
    quay.io/ibm-cp4d-public/storage-util:1.0
    quay.io/ibm-cp4d-public/storage-util:1.0-amd64
    quay.io/centos/amd64:latest
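
As an illustration of what such a section could contain, the pull, tag, and push commands for the images above can be generated as sketched below. The helper names and the private registry host `registry.example.com:5000` are hypothetical. The sketch only prints commands for `podman` (or `docker`) and does not cover how the playbook would be pointed at the mirrored images.

```python
def mirror_target(image, registry):
    """Re-home an image reference under a private registry, keeping its repository path and tag."""
    _source_registry, _, remainder = image.partition("/")
    return f"{registry.rstrip('/')}/{remainder}"


def mirror_commands(images, registry, tool="podman"):
    """Build the pull, tag, and push commands needed to copy each image to the private registry."""
    commands = []
    for image in images:
        target = mirror_target(image, registry)
        commands.append(f"{tool} pull {image}")
        commands.append(f"{tool} tag {image} {target}")
        commands.append(f"{tool} push {target}")
    return commands


if __name__ == "__main__":
    images = [
        "icr.io/cpopen/cpd/k8s-storage-test:v1.0.0",
        "quay.io/ibm-cp4d-public/storage-gcc:1.0",
    ]
    for command in mirror_commands(images, "registry.example.com:5000"):
        print(command)
```

The pull commands would run on a host with internet access and the push commands on a host that can reach the private registry.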

Can we add a section like this? https://github.com/IBM/k8s-storage-perf#pulling-and-loading-the-required-image-in-airgap-environment

Thanks!

Storage validation tests are failing on OpenShift 4.10 on ARO with ODF (self-managed) storage

My ARO clusters:
pk6-aro
url: https://console-openshift-console.apps.oeq8pl8k.eastus.aroapp.io/
Contact me for the credentials

pk7-aro
url: https://console-openshift-console.apps.jce5djuj.eastus.aroapp.io/k8s/cluster/storageclasses
az aro list-credentials \
--name pk7-aro \
--resource-group cpbu-sdlc-rg
Contact me for the credentials

Storage Validation tests for ODF storage classes are failing with:

 run_k8s_storage_test

PLAY [localhost] ***************************************************************

TASK [ocp login using creds] ***************************************************
changed: [localhost]

TASK [ocp login using token] ***************************************************
skipping: [localhost]

TASK [debug] *******************************************************************
ok: [localhost] => {
    "login_creds.stdout_lines": [
        "Login successful.",
        "",
        "You have access to 70 projects, the list has been suppressed. You can list all projects with 'oc projects'",
        "",
        "Using project \"default\".",
        "Welcome! See 'oc help' to get started."
    ]
}

TASK [debug] *******************************************************************
skipping: [localhost]

TASK [Storage Readiness] *******************************************************

TASK [storage-readiness : Create namespace k8s-validation if not present] ******
ok: [localhost]

TASK [storage-readiness : Run simple mount test for ReadWriteOnce] *************
included: /opt/ansible/roles/storage-readiness/tasks/mount-test.yaml for localhost

TASK [storage-readiness : Test mount ReadWriteOnce volumes for readiness] ******
failed: [localhost] (item={'name': 'create-volume.yaml.j2'}) => {"ansible_loop_var": "item", "changed": false, "error": 422, "item": {"name": "create-volume.yaml.j2"}, "msg": "Failed to patch object: b'{\"kind\":\"Status\",\"apiVersion\":\"v1\",\"metadata\":{},\"status\":\"Failure\",\"message\":\"PersistentVolumeClaim \\\\\"readiness-readwriteonce\\\\\" is invalid: spec: Forbidden: spec is immutable after creation except resources.requests for bound claims\\\\n\\xc2\\xa0\\xc2\\xa0core.PersistentVolumeClaimSpec{\\\\n\\xc2\\xa0\\xc2\\xa0\\\\t... // 2 identical fields\\\\n\\xc2\\xa0\\xc2\\xa0\\\\tResources:        {Requests: {s\\\\\"storage\\\\\": {i: {...}, s: \\\\\"1Gi\\\\\", Format: \\\\\"BinarySI\\\\\"}}},\\\\n\\xc2\\xa0\\xc2\\xa0\\\\tVolumeName:       \\\\\"\\\\\",\\\\n-\\xc2\\xa0\\\\tStorageClassName: \\\\u0026\\\\\"ocs-storagecluster-ceph-rbd\\\\\",\\\\n+\\xc2\\xa0\\\\tStorageClassName: \\\\u0026\\\\\"rook-ceph-block\\\\\",\\\\n\\xc2\\xa0\\xc2\\xa0\\\\tVolumeMode:       \\\\u0026\\\\\"Filesystem\\\\\",\\\\n\\xc2\\xa0\\xc2\\xa0\\\\tDataSource:       nil,\\\\n\\xc2\\xa0\\xc2\\xa0\\\\tDataSourceRef:    nil,\\\\n\\xc2\\xa0\\xc2\\xa0}\\\\n\",\"reason\":\"Invalid\",\"details\":{\"name\":\"readiness-readwriteonce\",\"kind\":\"PersistentVolumeClaim\",\"causes\":[{\"reason\":\"FieldValueForbidden\",\"message\":\"Forbidden: spec is immutable after creation except resources.requests for bound claims\\\\n\\xc2\\xa0\\xc2\\xa0core.PersistentVolumeClaimSpec{\\\\n\\xc2\\xa0\\xc2\\xa0\\\\t... 
// 2 identical fields\\\\n\\xc2\\xa0\\xc2\\xa0\\\\tResources:        {Requests: {s\\\\\"storage\\\\\": {i: {...}, s: \\\\\"1Gi\\\\\", Format: \\\\\"BinarySI\\\\\"}}},\\\\n\\xc2\\xa0\\xc2\\xa0\\\\tVolumeName:       \\\\\"\\\\\",\\\\n-\\xc2\\xa0\\\\tStorageClassName: \\\\u0026\\\\\"ocs-storagecluster-ceph-rbd\\\\\",\\\\n+\\xc2\\xa0\\\\tStorageClassName: \\\\u0026\\\\\"rook-ceph-block\\\\\",\\\\n\\xc2\\xa0\\xc2\\xa0\\\\tVolumeMode:       \\\\u0026\\\\\"Filesystem\\\\\",\\\\n\\xc2\\xa0\\xc2\\xa0\\\\tDataSource:       nil,\\\\n\\xc2\\xa0\\xc2\\xa0\\\\tDataSourceRef:    nil,\\\\n\\xc2\\xa0\\xc2\\xa0}\\\\n\",\"field\":\"spec\"}]},\"code\":422}\\n'", "reason": "Unprocessable Entity", "status": 422}
ok: [localhost] => (item={'name': 'mount-job.yaml.j2'})

PLAY RECAP *********************************************************************
localhost                  : ok=4    changed=1    unreachable=0    failed=1    skipped=2    rescued=0    ignored=0   

params.yml
params.yml.zip
