cloud-bulldozer / benchmark-operator

The Chuck Norris of cloud benchmarks

License: Apache License 2.0

Dockerfile 0.20% Shell 17.98% Python 14.69% Smarty 0.54% Jinja 64.00% Makefile 2.58%
kubernetes kubernetes-operator openshift openshift-operator performance-testing

benchmark-operator's People

Contributors

aakarshg, acalhounrh, akrzos, amitsagtani97, asispatra, avilir, bengland2, chentex, dry923, dustinblack, ebattat, ekuric, hughnhan, jeniferh, jtaleric, keesturam, martineg, mkarg75, mohit-sheth, morenod, mukrishn, mulbc, paigerube14, robertkrawitz, rsevilla87, sarahbx, shekharberry, smalleni, svetsa-rh, vishnuchalla


benchmark-operator's Issues

Changes in main.yaml under fio-bench/tasks not reflected

Hi,

I was trying to tweak main.yaml to understand how fio-bench works, but any change made to main.yaml is not reflected once I deploy the CR.
For example:
Even if I just change the name of the ConfigMap to fio-test_shekhar and redeploy the CR, the ConfigMap is still created with the name fio-test.

- name: Generate fio test
  k8s:
    definition:
      apiVersion: v1
      kind: ConfigMap
      metadata:
        name: fio-test_shekhar
        namespace: '{{ meta.namespace }}'
      data:
        fiojob: "{{ lookup('template', 'job.fio.seq_write') }}"
  when: fio.clients > 0

oc get configmap
NAME                      DATA   AGE
benchmark-operator-lock   0      3h
fio-test                  1      1h

Am I missing something??

[pgbench] pin_node is NOT optional in ripsaw_v1alpha1_pgbench_cr.yaml

Leaving pin_node blank caused the client pod to not be created. Adding a value such as pin_node: "ip-10-0-134-143" in ripsaw_v1alpha1_pgbench_cr.yaml fixed the problem, and the test ran as expected after this change.

$ oc get benchmark
NAME                TYPE      AGE
pgbench-benchmark   pgbench   56s
[ec2-user@ip-172-31-14-128 ripsaw-dustin]$ oc describe benchmark pgbench-benchmark
Name:         pgbench-benchmark
Namespace:    ripsaw
Labels:       <none>
Annotations:  <none>
API Version:  ripsaw.cloudbulldozer.io/v1alpha1
Kind:         Benchmark
Metadata:
  Creation Timestamp:  2019-07-08T23:39:49Z
  Generation:          1
  Resource Version:    91783
  Self Link:           /apis/ripsaw.cloudbulldozer.io/v1alpha1/namespaces/ripsaw/benchmarks/pgbench-benchmark
  UID:                 ad4debce-a1d9-11e9-acf0-0268146ce15c
Spec:
  Workload:
    Args:
      Clients:
        4
        8
      cmd_flags:  
      Databases:
        db_name:       sampledb
        Host:          172.30.86.214
        Password:      wTRfg5vxpmtfkYKA
        pin_node:      <nil>
        Port:          <nil>
        User:          user8LH
      init_cmd_flags:  
      run_time:        300
      Samples:         2
      scaling_factor:  30
      Threads:         4
      Timeout:         5
      Transactions:    <nil>
    Name:              pgbench
Status:
  Conditions:
    Last Transition Time:  2019-07-08T23:40:53Z
    Message:               Running reconciliation
    Reason:                Running
    Status:                False
    Type:                  Running
    Ansible Result:
      Changed:             2
      Completion:          2019-07-08T23:40:56.93001
      Failures:            1
      Ok:                  5
      Skipped:             6
    Last Transition Time:  2019-07-08T23:40:57Z
    Message:               An unhandled exception occurred while running the lookup plugin 'template'. Error was a <class 'ansible.errors.AnsibleError'>, original message: Unexpected templating type error occurred on (---
kind: Job
apiVersion: batch/v1
metadata:
  name: '{{ meta.name }}-pgbench-client-{{ item.0|int + 1 }}'
  namespace: '{{ operator_namespace }}'
spec:
  ttlSecondsAfterFinished: 600
  template:
    metadata:
      labels:
        app: pgbench-client
    spec:
      containers:
      - name: benchmark
        image: "quay.io/cloud-bulldozer/pgbench:latest"
        command: ["/bin/sh", "-c"]
        args:
          - "export PGPASSWORD='{{ item.1.password }}';
             export pgbench_auth='-h {{ item.1.host }} -p {% if item.1.port is defined and item.1.port|int > 0 %} {{ item.1.port }} {% else %} {{ db_port }} {% endif %} -U {{ item.1.user }}';
             echo 'Init Database {{ item.1.host }}/{{ item.1.db_name }}';
             pgbench $pgbench_auth -i -s {{ pgbench.scaling_factor }} {{ pgbench.init_cmd_flags }} {{ item.1.db_name }};
             if [ $? -eq 0 ]; then
               echo 'Waiting for start signal...';
               redis-cli -h {{ bo.resources[0].status.podIP }} lpush pgb_client_ready {{ item.0|int }};
               while true; do
                 if [[ $(redis-cli -h {{ bo.resources[0].status.podIP }} get pgb_start) =~ 'true' ]]; then
                   echo 'GO!';
                 {% for clients in pgbench.clients %}
                   echo '';
                   echo 'Running PGBench with {{ clients }} clients on database {{ item.1.host }}/{{ item.1.db_name }}';
                   for i in `seq 1 {{ pgbench.samples|int }}`; do
                     echo \"Begin test sample $i of {{ pgbench.samples }}...\";
                     pgbench $pgbench_auth -c {{ clients }} -j {{ pgbench.threads }} {% if pgbench.transactions is defined and pgbench.transactions|int > 0 %} -t {{ pgbench.transactions }} {% elif pgbench.run_time is defined and pgbench.run_time|int > 0 %} -T {{ pgbench.run_time }} {% endif %} -s {{ pgbench.scaling_factor }} {{ pgbench.cmd_flags }} {{ item.1.db_name }};
                   done;
                 {% endfor %}
                 else
                   continue;
                 fi;
                 break;
               done;
             fi"
      restartPolicy: OnFailure
{% if item.1.pin_node is defined and item.1.pin_node|length and item.1.pin_node is not sameas "" %}
      nodeSelector:
        kubernetes.io/hostname: '{{ item.1.pin_node }}'
{% endif %}
): object of type 'NoneType' has no len()
    Reason:  Failed
    Status:  True
    Type:    Failure
Events:      <none>
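For reference, the failure comes from evaluating item.1.pin_node|length while pin_node is null. A minimal sketch of a more defensive guard, assuming pin_node should stay optional, would be:

{% if item.1.pin_node is defined and item.1.pin_node is not none and item.1.pin_node | length %}
      nodeSelector:
        kubernetes.io/hostname: '{{ item.1.pin_node }}'
{% endif %}

With that, an empty or missing pin_node would simply skip the nodeSelector instead of failing the templating.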

Suggestion : Remove infrastructure from Ripsaw

Currently Ripsaw attempts to manage other operators, which has been a struggle.

Why did we have Ripsaw launch other operators?

We wanted to have the application be managed via the Operator framework for testing the workload, and to provide an end to end solution.

What is wrong with this approach?

It causes complexity that we continuously have to work around, and it leads to broken CI, which slows down our ability to accept PRs.

What can we do to replace/remove this dependency for infrastructure?

Instead of Ripsaw managing other operators, we could have our CI/Testing use statefulsets to deploy applications like mongodb, and we simply update our YCSB CR to point at the mongodb deployed with statefulsets.
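As an illustration of that approach, a minimal sketch of a StatefulSet-based mongodb deployment for CI (the names and image tag are assumptions, not an existing manifest in this repo) might look like:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mongo
  namespace: ripsaw
spec:
  serviceName: mongo
  replicas: 1
  selector:
    matchLabels:
      app: mongo
  template:
    metadata:
      labels:
        app: mongo
    spec:
      containers:
      - name: mongo
        image: mongo:4.0        # image tag is an assumption
        ports:
        - containerPort: 27017

The YCSB CR would then simply point at the service created alongside this StatefulSet.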

Allow for customizing CI tests based on environment and version

This is particularly important for infra roles. It may not be possible to create one test CR for a role that could be expected to pass on all of K8s upstream, OCP 3.0, and OCP 4.0. If the CI system is going to automate testing of multiple environment types, then we need the ability to create test CRs that are applicable to the specifics of the environment.

Consider restructuring deploy/

There is a handy k8s shortcut that allows you to kubectl/oc create -f <directory> and it will load all of the yaml files in that directory. The deployment instructions could be simplified by leveraging this shortcut, but the current deploy/ structure is problematic for this since the CRD needs to be loaded before the operator. If the CRD were in the deploy/ directory, then setup would be as simple as kubectl/oc create -f deploy and the operator would be up-and-running.

If the existing structure follows some established standard, then we could at least simplify the deployment instructions to say:

# oc create -f deploy/crds/bench_v1alpha1_bench_crd.yaml
# oc create -f deploy

CI wait_clean function doesn't actually do anything

In the CI test.sh script, we call the cleanup_resources function followed by the wait_clean function.

...
source tests/common.sh

trap cleanup_resources EXIT

wait_clean
...

The wait_clean function effectively checks up to 30 times for the operator pod to no longer be running.

function wait_clean {
  for i in {1..30}; do
    if [ `kubectl get pods --namespace ripsaw | grep bench | wc -l` -ge 1 ]; then
      sleep 5
    else
      break
    fi
  done
}

However, the cleanup_resources function does not include commands to stop the operator pod.

function cleanup_resources {
  echo "Exiting after cleanup of resources"
  kubectl delete -f resources/crds/ripsaw_v1alpha1_ripsaw_crd.yaml
  kubectl delete -f deploy
  marketplace_cleanup
}

This results in the wait_clean function simply running through its 30 iterations, doing nothing.

I'm not entirely sure of the intention, so I don't want to propose a solution, but I will point out that the cleanup_operator_resources function is defined and does include the commands to delete the operator pod, but this function is not actually used anywhere.

function cleanup_operator_resources {
  delete_operator
  cleanup_resources
  wait_clean
}
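For illustration only (the issue author deliberately does not propose a fix), one possible wiring would be to trap the function that actually deletes the operator, so the wait has something to wait for:

# cleanup_operator_resources deletes the operator pod before waiting
trap cleanup_operator_resources EXIT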

fio-d: roles should include commands to drop caches

fio best practice is (generally) to sync and drop caches with every iteration/sample. This is usually done with:

sync; echo 3 > /proc/sys/vm/drop_caches

We could add this as a bash command to each test sequence, or we could include it directly in the fio job file with:

exec_prerun=sync; echo 3 > /proc/sys/vm/drop_caches

Might be worthwhile to make this optional in the CR for the off chance someone wants to test without this.
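A minimal sketch of making this optional via the CR, assuming a hypothetical drop_caches key in the fio args and keeping the line in the templated jobfile:

{% if fio.drop_caches is defined and fio.drop_caches %}
exec_prerun=sync; echo 3 > /proc/sys/vm/drop_caches
{% endif %}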

RFE: Collect cluster metadata before triggering a workload

We should look at running stockpile (or another tool) to collect cluster metadata before we trigger the workload. Instead of having a harness run stockpile/other_tool to collect data, I'd propose that we trigger the playbook/role from within the operator itself.
Possible cons of this would be:

  • Longer runs for the operator, as it'll have to finish collecting metadata.
  • Requires stockpile/other.tool to be closely integrated with the operator.
  • Also requires a processing/indexing tool to be able to parse the benchmark results within the operator logic and index the results.

fio-d: Implement servers as a list to loop through

Pushing fio to find system limits usually involves ramping up the parallelism of worker jobs. You can do this via threads per worker (which should also be implemented as a list; separate issue), but it is often valuable, particularly for a distributed storage SUT, to increase the total number of workers in order to get past any per-worker bottlenecks and properly saturate the storage system.

Implementing the servers key in the CR as a list would allow us to iterate through server counts as an outer loop to samples. This is a bit complicated with the current implementation of the fio-d role as we first run an ansible task to spin up the number of workers as fio server pods before executing the fio job task. So we would need to be able to spin up servers, run a job, increment the number of servers, and then run the subsequent job from the server list.

One option is to use the highest value from the servers list and spin up a number of server pods once equal to that maximum value, that way only the fio job task needs to be looped. The downside is that the server pods unused for the lower-value jobs will still be consuming resources in the k8s system and could potentially skew the results.

Another option may be to implement this loop at the playbook.yaml level, similar to the original proposed method for implementing the samples feature.
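For reference, the proposed CR shape would look roughly like this (a sketch only; servers is currently a single integer):

spec:
  workload:
    args:
      samples: 3
      servers:        # iterate through 2, 4, then 8 fio server pods
        - 2
        - 4
        - 8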

Discussion: pin_server vs pin_node or other key name in CR

We currently use the key pin_server across a few roles to identify the k8s node to which a pod or pods will be pinned with a nodeSelector in the template. I think the name of this key is a little confusing -- since the value you pass to pin_server is the name of a k8s node to which you are pinning, on the surface I would argue that it should be called pin_node instead (which is what I used in the pgbench role).

However, in the uperf and iperf roles, there is also a pin_client key, so it is more clear that the key name is in the context of what you are pinning, not where you are pinning it.

The keys might be clearer in all conditions as something like pin_server_to_node and pin_client_to_node, but perhaps I'm overthinking it and we should just address this in documentation. To me, it is valuable to make the usage of the CR structure as self-explanatory as possible in order to enhance usability, which will better drive adoption and limit the pings we get to answer how-to questions.
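For comparison, the two naming options side by side (a sketch, not keys that exist in every role today):

# current, mixed naming
pin_server: "worker-0"           # k8s node the server pod is pinned to
pin_client: "worker-1"

# more explicit alternative
pin_server_to_node: "worker-0"
pin_client_to_node: "worker-1"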

fio-d: List of small issues/enhancements

  • PVCs should have unique names between runs. Currently these are claim-{{ item }}, which can at the very least cause some annoyances when deleting and then re-applying the CR, as the operator will trigger creation of a PVC with the same name as one that is still being deleted (see the sketch after this list).

  • Implement size- vs. time-based runs. Currently the CR and templates allow for a time-based run in the jobfile only. A nice-to-have feature would be an option to run based on file size.

    • PR #205 -- Decision was to implement all writes as size-based and all reads as time-based.
  • Workers/threads should be a list to loop through.

  • File size should be a list to loop through.

    • Cancelling as it's probably not important anymore
  • Merge jobfile templates into one. I don't see a good reason to have multiple templates. We should simply be adjusting values in the templates based on the type of job and the specific parameters provided in the CR.

  • Provide jobname in a better way. Seems like a weird thing to name explicitly as a value in the CR. Maybe jobname can just match job in the CR.

  • Merge fio and fio-d into one role.

  • Implement some way of balancing server pods across nodes.

  • Need to consider how a RWX PV would be tested.

  • Get rid of the redundant pin boolean. Simply providing a pin host (or not) should be sufficient. Also... does pinning even make sense in a distributed test? The way this is implemented means all server pods would go to the single pin host.

  • The filesize in the CR should be passed with the scale suffix, just as the fio command would accept it (e.g., 2g instead of 2)

  • The jobname in the CR is redundant to the job and can be removed
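On the PVC naming point above, a minimal sketch of making claim names unique per run; run_id here is a hypothetical per-run identifier (e.g. a truncated UUID generated by the role):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: 'claim-{{ item }}-{{ run_id }}'
  namespace: '{{ operator_namespace }}'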

Collect and display more detail for CI

  • CI should move on to running the remaining workloads even if one fails
  • CI should display which workloads passed and which failed
  • Collect the logs of a particular run from the tmp/ansible-operator/runner/benchmark.example.com/v1alpha1/Benchmark/ripsaw/example-benchmark/artifacts/latest/stdout file in the operator pod (see the sketch after this list)
  • Upload these logs somewhere so the user can access them later
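For the log-collection item, a sketch of pulling that file out of the operator pod; the pod name is a placeholder, and the path is assumed to live under /tmp in the operator image:

# Dump the latest Ansible runner stdout from the operator pod
kubectl exec -n ripsaw <operator-pod> -- \
  cat /tmp/ansible-operator/runner/benchmark.example.com/v1alpha1/Benchmark/ripsaw/example-benchmark/artifacts/latest/stdout \
  > benchmark-run-stdout.log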

Propose move to gerrithub workflow

Github workflow has the following issues:

  • Working on other users' PRs has been significantly difficult, leading to long PR cycles.
  • Another major issue is the need for squashing commits.
  • There is no real way to look at differences between commits if they were squashed or amended.

I propose that we move to the Gerrithub workflow, as it addresses the above-mentioned problems and also provides additional benefits, such as:

  • Neat UI to facilitate voting on patchsets/reviews submitted.
  • Better integration with Jenkins.
  • It is even possible to run jobs that are triggered only on merge. This would be crucial for building operator images and pushing them to Quay.

The only drawback I see is that some of us will need to adapt to a new workflow.

Simplify repo structure and installation process

I still feel like we have some structural issues that make ripsaw harder to consume than it should be. Exactly what the best practices are to define and structure operator resources seems to be up for debate, but I would recommend we do something similar to rook.io where the operator and all of its k8s resource dependencies are deployed via 2 yaml files -- common.yaml (which houses the namespace definition, CRD, and all RBAC) and operator.yaml. I might even suggest that we simply converge all of this into one operator.yaml file.

I think it is also a bit confusing the way the CR files are named and placed in the structure. I would suggest creating an examples/ directory at the repo root and simplifying the naming convention of the files to just <workload_name>.yaml, <infra_name>.yaml, and <workload_name>-<infra_name>.yaml, as appropriate.

Mesh mode for uperf

A baseline test that I've commonly used is to fully saturate the network connections between nodes in order to reveal any bottlenecks that may affect higher-layer tests. I'd like to see us add a "mesh" mode to the uperf test in which we intelligently determine the number of schedulable workers and run tests between all nodes simultaneously.

Multiple UPerf tests w/ single CR

We want to have a single UPerf CR that iterates through multiple UPerf scenarios, for example:

apiVersion: benchmark.example.com/v1alpha1
kind: Benchmark
metadata:
  name: uperf-benchmark
  namespace: ripsaw
spec:
  workload:
    # cleanup: true
    name: uperf
    args:
      hostnetwork: true 
      pin: true 
      pin_server: "master-0"
      pin_client: "master-1"
      rerun: 1 
      pair: 1
      protos: 
        - tcp
      test_type: stream
      nthr: 2
      sizes: 
        - 16384
        - 1024
      runtime: 30

The above test will iterate through 16384 and 1024 message sizes.

There are two approaches I see, one of which I have implemented and tested.

First Option

Build UPerf XML to iterate through the tests.

However, the output isn't obvious; for example:

TX worklist success  Sent workorder
Handshake phase 2 with 192.168.111.20 done
Completed handshake phase 2
Starting 2 threads running profile:ripsaw-test ...   0.00 seconds
TX command [UPERF_CMD_NEXT_TXN, 0] to 192.168.111.20
Txn1          0 /   0.00(s) =            0           0op/s 
Txn1          0 /   1.00(s) =            0           2op/s 

TX command [UPERF_CMD_NEXT_TXN, 1] to 192.168.111.20
Txn2          0 /   0.00(s) =            0           0op/s 
Txn2     2.75GB /   1.00(s) =    23.57Gb/s      179816op/s 
Txn2     5.49GB /   2.00(s) =    23.55Gb/s      179686op/s 
Txn2     8.23GB /   3.00(s) =    23.54Gb/s      179623op/s 
Txn2    10.97GB /   4.00(s) =    23.54Gb/s      179597op/s 
Txn2    13.71GB /   5.00(s) =    23.54Gb/s      179592op/s 
Txn2    16.46GB /   6.01(s) =    23.54Gb/s      179575op/s 
Txn2    19.20GB /   7.01(s) =    23.54Gb/s      179566op/s 
Txn2    21.94GB /   8.01(s) =    23.53Gb/s      179551op/s 
Txn2    24.68GB /   9.01(s) =    23.53Gb/s      179549op/s 
Txn2    27.39GB /  10.01(s) =    23.50Gb/s      179310op/s 
Txn2    30.13GB /  11.01(s) =    23.51Gb/s      179335op/s 
Txn2    32.87GB /  12.01(s) =    23.51Gb/s      179357op/s 
Txn2    35.61GB /  13.01(s) =    23.51Gb/s      179362op/s 
Txn2    38.36GB /  14.01(s) =    23.51Gb/s      179383op/s 
Txn2    41.10GB /  15.01(s) =    23.51Gb/s      179392op/s 
Txn2    43.83GB /  16.02(s) =    23.51Gb/s      179359op/s 
Txn2    46.58GB /  17.02(s) =    23.51Gb/s      179373op/s 
Txn2    49.32GB /  18.02(s) =    23.51Gb/s      179378op/s 
Txn2    52.06GB /  19.02(s) =    23.51Gb/s      179391op/s 
Txn2    54.80GB /  20.02(s) =    23.51Gb/s      179397op/s 
Txn2    57.54GB /  21.02(s) =    23.51Gb/s      179400op/s 
Txn2    60.28GB /  22.02(s) =    23.52Gb/s      179409op/s 
Txn2    63.03GB /  23.02(s) =    23.52Gb/s      179417op/s 
Txn2    65.77GB /  24.02(s) =    23.52Gb/s      179418op/s 
Txn2    68.51GB /  25.02(s) =    23.52Gb/s      179424op/s 
Txn2    71.25GB /  26.02(s) =    23.52Gb/s      179429op/s 
Txn2    73.99GB /  27.03(s) =    23.52Gb/s      179428op/s 
Txn2    76.70GB /  28.03(s) =    23.51Gb/s      179357op/s 
Txn2    79.45GB /  29.03(s) =    23.51Gb/s      179365op/s 
Sending signal SIGUSR2 to 140611952076544
Sending signal SIGUSR2 to 140611943683840
called out
Txn2    82.19GB /  30.23(s) =    23.35Gb/s      178180op/s 

TX command [UPERF_CMD_NEXT_TXN, 2] to 192.168.111.20
Txn3          0 /   0.00(s) =            0           0op/s 
Txn3          0 /   1.00(s) =            0           2op/s 

TX command [UPERF_CMD_NEXT_TXN, 3] to 192.168.111.20
Txn4          0 /   0.00(s) =            0           0op/s 
Txn4          0 /   1.00(s) =            0           2op/s 

TX command [UPERF_CMD_NEXT_TXN, 4] to 192.168.111.20
Txn5          0 /   0.00(s) =            0           0op/s 
Txn5     1.61GB /   1.00(s) =    13.80Gb/s     1685114op/s 
Txn5     3.23GB /   2.00(s) =    13.85Gb/s     1690938op/s 
Txn5     4.81GB /   3.00(s) =    13.77Gb/s     1681184op/s 
Txn5     6.47GB /   4.00(s) =    13.88Gb/s     1694944op/s 
Txn5     8.09GB /   5.00(s) =    13.89Gb/s     1695700op/s 
Txn5     9.73GB /   6.01(s) =    13.91Gb/s     1698501op/s 
Txn5    11.32GB /   7.01(s) =    13.88Gb/s     1694143op/s 
Txn5    12.90GB /   8.01(s) =    13.84Gb/s     1689175op/s 
Txn5    14.52GB /   9.01(s) =    13.85Gb/s     1690120op/s 
Txn5    16.14GB /  10.01(s) =    13.85Gb/s     1690853op/s 
Txn5    17.76GB /  11.01(s) =    13.86Gb/s     1691405op/s 
Txn5    19.41GB /  12.01(s) =    13.88Gb/s     1694451op/s 
Txn5    21.04GB /  13.01(s) =    13.89Gb/s     1695563op/s 
Txn5    22.67GB /  14.01(s) =    13.90Gb/s     1696544op/s 
Txn5    24.28GB /  15.01(s) =    13.89Gb/s     1695637op/s 
Txn5    25.91GB /  16.02(s) =    13.89Gb/s     1696059op/s 
Txn5    27.53GB /  17.02(s) =    13.90Gb/s     1696469op/s 
Txn5    29.18GB /  18.02(s) =    13.91Gb/s     1698319op/s 
Txn5    30.82GB /  19.02(s) =    13.92Gb/s     1699048op/s 
Txn5    32.46GB /  20.02(s) =    13.93Gb/s     1700361op/s 
Txn5    34.10GB /  21.02(s) =    13.93Gb/s     1700785op/s 
Txn5    35.68GB /  22.02(s) =    13.92Gb/s     1699024op/s 
Txn5    37.29GB /  23.02(s) =    13.91Gb/s     1698154op/s 
Txn5    38.93GB /  24.02(s) =    13.92Gb/s     1699317op/s 
Txn5    40.57GB /  25.03(s) =    13.93Gb/s     1699966op/s 
Txn5    42.21GB /  26.03(s) =    13.93Gb/s     1700781op/s 
Txn5    43.86GB /  27.03(s) =    13.94Gb/s     1701440op/s 
Txn5    45.46GB /  28.03(s) =    13.93Gb/s     1700898op/s 
Txn5    47.10GB /  29.03(s) =    13.94Gb/s     1701272op/s 
Sending signal SIGUSR2 to 140611952076544
Sending signal SIGUSR2 to 140611943683840

It isn't obvious to the reader which transaction corresponds to which test (I know which is which, but to a user this isn't clear).

Option 2

Instead of having UPerf iterate through the tests, create a unique client for each workload.

I personally prefer Option 2, but I want to get feedback before I go down the route of implementing it...
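A rough sketch of what Option 2 could look like on the Ansible side; the task and template names are illustrative, not existing files:

- name: Start one uperf client job per message size
  k8s:
    definition: "{{ lookup('template', 'uperf-client.yml.j2') }}"
  loop: "{{ uperf.sizes }}"
  loop_control:
    loop_var: msg_size

Each generated job would then run a single, clearly labeled scenario, so the per-test output is unambiguous.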

RFE: Central collection point for results

Add a feature to tag workload results and upload them to a central repository (likely an object store). Could be configurable as a private repo, or could default to a public one where we can collect broad result sets for analysis.

Do we need a Security Context Constraint for benchmark-operator/ripsaw?

While I was working on the smallfile operator, I faced a lot of issues due to the container running without root privileges, so the solution we came up with was to create an SCC and an associated SA, then allow the pods to run as root. The difference was that I was not using a PVC to develop the smallfile operator; I don't think we would really see that issue when working with a PVC mounted on a mount path. Hence, I am creating this issue to have a discussion and reach a majority-based opinion.
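If we do go the SCC route, the usual wiring is a dedicated service account bound to an SCC; the service account name below is an assumption:

# Allow pods using this service account to run with any UID (including root)
oc create serviceaccount benchmark -n ripsaw
oc adm policy add-scc-to-user anyuid -z benchmark -n ripsaw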

Investigate YCSB default Benchmark Settings

Hi,

Joe asked me to create an issue for this. Many of the default YCSB benchmark configurations use a zipfian distribution for reads which will greatly favor reads from cache rather than from disk. It would be worth creating several additional benchmark configurations that also test random read distributions in addition to zipfian.
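For example, a variant of a core workload could simply switch the request distribution away from the cache-friendly zipfian default (a sketch based on standard YCSB workload properties):

# same read/update mix as workloada, but with uniformly random reads
readproportion=0.5
updateproportion=0.5
requestdistribution=uniform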

Be opinionated about our namespace

We are facing a few issues related to namespaces and contexts. Ultimately, it makes sense for all resources created by the benchmark operator to exist in a single namespace. It seems that other operator projects are explicit about this -- defining the namespace directly in their deployment and config files.

We have begun hard-coding the 'benchmark' namespace in deploy files, but not for the operator.yaml file (so currently the operator itself will deploy to $current_context), and we rely on {{ meta.namespace }} across the roles.

It seems sub-optimal to scatter the hard-coding of the 'benchmark' namespace across the many files where this context needs to be set. What other options do we have?

Discussion: Creation/handling of additional resources required

There can be cases where the infra/workloads require additional resources to be created, such as RBAC objects. Since these are not required for the operator itself, it wouldn't make sense to mandate that the user create them. There are two ways to approach this problem:

  • Ask the user to create the necessary resources after deploying the operator and before applying the CR. This puts the burden of managing the resource lifecycle on the user and doesn't require the operator to do anything more. The issue with this is relying on the user to make the appropriate changes, and it requires more user involvement.
  • Have the operator create the resources on the fly, i.e. if the user applies a CR whose workload/infra requires additional resources such as RBAC, the operator creates them as needed. The issue with this is that it requires the operator to delete these RBAC objects based on conditions (requiring significantly more logic), and if the user somehow deletes the operator, that cleanup would be futile.

What do you think is the best way to move forward?

fio result structure and meaning

A discussion with Alex Calhoun got me thinking about the fio-result.json produced by ripsaw fio-bench, specifically the "All clients" section. Does this section mean anything? All fio benchmarks I've used in the past have just given you throughput numbers that were the sum of the individual fio processes (pods in your case) running at the same time on the same workload. But ripsaw is running fio jobs serially (one at a time) for rw={read,write,randread,randwrite} rather than in parallel. So how can ripsaw get a valid result by adding up throughputs for jobs running at different times? I don't think it can. Also, the latency numbers in there are not useful, because you can't calculate system-wide percentiles from per-process percentiles.

We could separate out the JSON elements for different workloads (run at different times) and aggregate those results in a meaningful way. Perhaps it would be easier to just run each fio workload as a separate fio client job, with output files in a separate subdirectory. As a result, "All clients" now has meaning because all fio pods in fio-result.json were running at the same time on the same workload, so throughput aggregation now makes sense. The separation of workload results into separate directories would make it easier for anything/anyone else that is analyzing those results. For example, they could be easily worked on by ACT (Alex's elastic search injector) or a browbench scribe, or whatever equivalent data analysis tool is being used, in parallel with the test run. Each directory would have the exact parameters and inputs fed to fio, + all the fio logs and JSON that was output by fio as a result. This is similar to what pbench-fio does in its directory structure also.

Finally, as both CBT and pbench-fio do, we could add a layer to this directory structure for multiple samples of the same workload (combination of fio parameters) so that %deviation can be calculated from it. This allows for ripsaw to be extended to run multiple samples of the same data point, which would be vital for demonstrating that results are accurate. At this point, ripsaw output would become much more similar to CBT and pbench-fio in directory structure and analysis code overlap could be increased. For example, pbench-fio has a directory structure that indicates workload at top and sample at the next level, such as:

/var/lib/pbench-agent/fio__2019.05.24T14.25.38/1-read-4KiB/sample1/

All this would lead to a natural reuse of existing analysis tools, such as grafana dashboards, with ripsaw. HTH -ben

uperf workload should default to testing between nodes

Right now, unless you explicitly pin the clients and servers, uperf may (and often does) schedule the client and server component pods onto the same host. This leads to network testing only of the loopback of the host, which I do not think is generally useful.

I believe we should implement anti-affinity for the associated client and server pods so that by default they run on different nodes and therefore test real network connections.
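A minimal sketch of the anti-affinity that could go into the client pod template; the label name is an assumption:

      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: uperf-bench-server   # keep the client off any node already running a server pod
            topologyKey: kubernetes.io/hostname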

RFE: Manage our CR

I think we should manage our own CR, allowing us to set our own status with k8s_status. Different states that I think make sense:

1. Staging - Infra being built
2. Init - Workload is loading data (optional)
3. Running - Workload is running
4. Complete - Workload completed
5. (Optional) Failed - Workload failed (possibly with multiple failed states to help debug).
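For reference, setting such a state from within a role would look roughly like this with the k8s_status module (the status field name is an assumption):

- name: Mark benchmark as running
  k8s_status:
    api_version: ripsaw.cloudbulldozer.io/v1alpha1
    kind: Benchmark
    name: '{{ meta.name }}'
    namespace: '{{ meta.namespace }}'
    status:
      state: Running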

RFE: Allow full flexibility to run fio jobs

With the fio jobfile templated in roles/fio-bench/templates/job.fio.j2, we either have to restrict fio parameters to a subset that we are prepared to support, or we have to template out every possible fio parameter.

In order to support full flexibility of fio jobs, it might make more sense to pull the job file out as something that is user-provided as part of the CR or an include of the CR. Off hand, it seems like there would still need to be an amount of templating, so I'm not immediately sure how to import a raw standard-formatted fio jobfile.

Another option proposed is to split fio into two different types/roles -- One would be a basic "fio-simple-bench" in which the jobfile would remain templated in the role with limited parameters provided to the user to adjust in the CR. The second would be a "fio-user-bench" in which the fio jobfile would be provided as part of the CR outside of the role.
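A sketch of the "fio-user-bench" idea, with the raw jobfile embedded in the CR; the jobfile key is hypothetical:

spec:
  workload:
    name: fio
    args:
      jobfile: |
        [global]
        directory=/mnt/pvc
        bs=4k
        runtime=60
        time_based=1
        [randwrite-job]
        rw=randwrite
        size=1g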

Add figlet to CI systems

Just a simple hack to help with CI log output readability. I've added to each test script:

figlet $(basename $0)

Which will output something like this at the beginning of each script:

 _            _       _                         _       _     
| |_ ___  ___| |_    | |__  _   _  _____      _| |  ___| |__  
| __/ _ \/ __| __|   | '_ \| | | |/ _ \ \ /\ / / | / __| '_ \ 
| ||  __/\__ \ |_    | |_) | |_| | (_) \ V  V /| |_\__ \ | | |
 \__\___||___/\__|___|_.__/ \__, |\___/ \_/\_/ |_(_)___/_| |_|
                |_____|     |___/

But these commands are disabled right now because figlet is not available on the CI systems.
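Until figlet is available on the CI systems, a guarded call would keep the scripts working either way (a sketch):

# Fall back to a plain banner when figlet is not installed
if command -v figlet >/dev/null 2>&1; then
  figlet "$(basename "$0")"
else
  echo "===== $(basename "$0") ====="
fi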

RFE : New CR / Playbook structure

Right now our Custom resource looks like

spec:
  uperf:
    # To disable uperf, set pairs to 0
    pair: 1
    proto: tcp
    test_type: stream
    nthr: 2
    size: 16384
    runtime: 10
  fio:
    # To disable fio, set clients to 0
    clients: 0
    jobname: test-write
    bs: 4k
    iodepth: 4
    runtime: 57
    rw: write
    filesize: 1

We only have a limited definition. I would like to move to a list

spec:
  infra:
    couchbase:
      servers: 1
  workloads:
    - uperf:
        # To disable uperf, set pairs to 0
        pair: 1
        ...
    - fio:
        # To disable fio, set clients to 0
        clients: 0
        jobname: test-write
    - ycsb:
        ...

Which would allow us to have a pipeline of workloads that we iterate through.

Also, defining the infrastructure (ie databases we will run against) allows us to kick off infrastructure and run many different workloads against it, vs just always relying on YCSB...

Sanity check of the CR applied

Identify incorrect/invalid CRs, essentially validating the CR parameters before starting to run the benchmark and hitting errors. For example, if a user provides a storage class that's not available on the cluster, an error message should be displayed without entering the logic of the particular benchmark, instead of hitting errors while creating the pods/PVs needed for that benchmark.
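A sketch of what such a pre-flight check could look like inside a role; the module choice and variable names are assumptions about how it would be wired:

- name: Verify the requested storage class exists
  k8s_info:
    api_version: storage.k8s.io/v1
    kind: StorageClass
    name: "{{ workload_args.storageclass }}"
  register: sc_check

- name: Fail early with a clear message
  fail:
    msg: "StorageClass {{ workload_args.storageclass }} was not found on this cluster"
  when: sc_check.resources | length == 0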

RFE : Jinja template workloads

Right now we are defining the job/pods in the tasks.

We should consider Jinja templates for two reasons:

  1. Customizations passed by users (i.e., pass a PV or not -- see the sketch below)
  2. Virtual machines are treated differently
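For the PV case, a sketch of what that conditional templating could look like in a job template; the variable names are illustrative:

{% if workload_args.storageclass is defined %}
        volumeMounts:
        - name: data-volume
          mountPath: /mnt/pvc
{% endif %}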

Add creation of minishift/minikube environment for CI

Per aakarshg, I've rolled together a simple setup of the minikube/minishift environments for CI. This gets triggered near the start of the test.sh script, using blackknight's setup/install playbooks depending on the NODE_LABELS.

Current work is here: https://github.com/dry923/ripsaw/blob/miniinstall/tests/start_mini.sh

A few outstanding questions:

  • Should we always install fresh, or just check whether it is already installed/running?
  • Similarly, should we uninstall upon completion?
  • If we find a running minishift/minikube, should we kill it and redeploy, leave it and continue, or error out?
  • Do we need to tweak any of the default CPU/memory/storage settings from blackknight's install defaults?
