ib-sriov-cni's People

Contributors

adrianchiris, almaslennikov, bn222, dependabot[bot], dmytrolinkin, e0ne, lgtm-com[bot], mmduh-483, moshe010, rollandf, schseba, zeeke, zshi-redhat

ib-sriov-cni's Issues

GUID not restored to "F"s on pod deletion

The current cmdDel behavior is to restore the VF's GUID from the cache. For some reason the GUID isn't restored, and on subsequent Adds and Dels the GUID that ib-sriov-cni itself set becomes the "cached" value.
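
A minimal sketch of the save-and-restore flow described above, written in Python purely for illustration. The `FakeVF` class, the in-memory `cache` dict, and the function names are hypothetical and do not come from the plugin; the all-"F" value stands for the unset GUID mentioned in the title.

```python
UNSET_GUID = "ff:ff:ff:ff:ff:ff:ff:ff"  # the all-"F" value from the issue title


class FakeVF:
    """Hypothetical stand-in for a VF; it only tracks a GUID."""

    def __init__(self, guid=UNSET_GUID):
        self.guid = guid


def cmd_add(vf, container_id, new_guid, cache):
    # Remember the GUID the VF had before the plugin touched it.
    cache[container_id] = vf.guid
    vf.guid = new_guid


def cmd_del(vf, container_id, cache):
    # Restore the saved GUID; fall back to the unset value if nothing was cached.
    vf.guid = cache.pop(container_id, UNSET_GUID)


cache = {}
vf = FakeVF()
cmd_add(vf, "pod-a", "02:00:00:00:00:00:00:01", cache)
cmd_del(vf, "pod-a", cache)
print(vf.guid)  # ff:ff:ff:ff:ff:ff:ff:ff
```

In this model, the reported symptom would correspond to `cmd_del` leaving `vf.guid` unchanged, so that the next `cmd_add` caches a GUID the plugin assigned earlier instead of the original one.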

Release ib-sriov-cni v1.0.3

There is a fix merged since the last release that we would like to incorporate in the upcoming network-operator release.

So, we are requesting a new release of ib-sriov-cni.

deprecate "guid" CNI Arg

The process of standardizing the InfiniBand GUID as a CNI runtime config parameter has been completed.

  • Relevant specs were updated (CNI, multi-net-spec)
  • Support was added in relevant projects

PR Mellanox/ib-kubernetes#70 was the last missing piece.
We should consider deprecating the "guid" CNI config so that only the infinibandGUID CNI runtime config is used.

Work items:

  • Update documentation - mark guid CNI config as deprecated
  • Remove related code after <enter X time units here>

Note: multus-cni supports infiniband-guid network-attachment attribute as of V3.6
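
As a rough illustration of the runtime-config path, the sketch below builds a pod networks annotation value that carries the GUID through the `infiniband-guid` attribute named in the note. The helper name, the network name, and the GUID value are made up for the example. The network's CNI config would presumably also need to declare the infinibandGUID capability; that is an assumption not confirmed on this page.

```python
import json


def networks_annotation(network_name, guid):
    """Build a value for the k8s.v1.cni.cncf.io/networks pod annotation."""
    selection = [{"name": network_name, "infiniband-guid": guid}]
    return json.dumps(selection)


# Both arguments are example values, not taken from a real cluster.
print(networks_annotation("ib-sriov-network", "02:00:00:00:00:00:00:01"))
```

The printed string would go into the pod's `k8s.v1.cni.cncf.io/networks` annotation in place of a plain network name.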

enable ib-sriov-cni for native IB device

If an RDMA application is written to communicate via the native IB device instead of IPoIB, should we enable ib-sriov-cni in that case to configure the VF GUID? If not, what is the recommended way to configure the VF GUID?

LICENCE update

I noticed there is a branch called licence. Is there any plan to merge the licence change into the master branch?

mellanox SRIOV demo pod cannot be created

I tried to create a pod with an SRIOV net device (Mellanox IB), but the pod is stuck in ContainerCreating. I configured 4 VFs on the IB interface of the host, and I am running the device plugin pod and the Multus CNI meta-plugin, but the SRIOV demo pod shows the error below.

multus

./multus-daemonset-thick-plugin.yml:125: image: ghcr.io/k8snetworkplumbingwg/multus-cni:v3.9.2-thick-amd64

ERROR

n-MacBookPro:~/20-k8s-rdma-sriov/ib-sriov-cni/deployment/examples$ kubectl describe po my-test-pod-fnjk7
Name:         my-test-pod-fnjk7
Namespace:    default
Priority:     0
Node:         s-113-2-35/10.113.2.35
Start Time:   Tue, 22 Nov 2022 20:22:33 +0800
Labels:       <none>
Annotations:  cni.projectcalico.org/containerID: 848157aeb2b3549aa8e2fce419c8353989ecb98ad62b1c6513f46423492f6cfd
              cni.projectcalico.org/podIP:
              cni.projectcalico.org/podIPs:
              k8s.v1.cni.cncf.io/networks: [{"name": "ib-sriov-network"}]
Status:       Pending
IP:
IPs:          <none>
Containers:
  my-test-ctr:
    Container ID:
    Image:         mellanox/rping-test
    Image ID:
    Port:          <none>
    Host Port:     <none>
    Command:
      sh
      -c
      sleep 1000000

    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Limits:
      mellanox.com/mlnx_sriov_rdma_ib:  1
    Requests:
      mellanox.com/mlnx_sriov_rdma_ib:  1
    Environment:                        <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-2clfq (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  kube-api-access-2clfq:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason                  Age                From               Message
  ----     ------                  ----               ----               -------
  Normal   Scheduled               21s                default-scheduler  Successfully assigned default/my-test-pod-fnjk7 to s-113-2-35
  Normal   AddedInterface          21s                multus             Add eth0 [10.42.0.21/32] from k8s-pod-network
  Warning  FailedCreatePodSandBox  21s                kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "ef4b067661534edfacd217cb1ea3cb1b2cdd44f65ffc1067a59091a2ae6490be" network for pod "my-test-pod-fnjk7": networkPlugin cni failed to set up pod "my-test-pod-fnjk7_default" network: [default/my-test-pod-fnjk7/:sriov-network]: error adding container to network "sriov-network": infiniBand SRI-OV CNI failed to configure VF "VF ib4 GUID is not valid", failed to clean up sandbox container "ef4b067661534edfacd217cb1ea3cb1b2cdd44f65ffc1067a59091a2ae6490be" network for pod "my-test-pod-fnjk7": networkPlugin cni failed to teardown pod "my-test-pod-fnjk7_default" network: delegateDel: error invoking DelegateDel - "ib-sriov": error in getting result from DelNetwork: error reading cached NetConf in /var/lib/cni/ib-sriov with name ef4b067661534edfacd217cb1ea3cb1b2cdd44f65ffc1067a59091a2ae6490be-net1]
  Normal   AddedInterface          20s                multus             Add eth0 [10.42.0.22/32] from k8s-pod-network
  Normal   AddedInterface          19s                multus             Add eth0 [10.42.0.23/32] from k8s-pod-network
  Warning  FailedCreatePodSandBox  19s                kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "3573ba2407bcf6bacb171e5e8b32980ff549a59de1bd8b119d89f6304ae69b7c" network for pod "my-test-pod-fnjk7": networkPlugin cni failed to set up pod "my-test-pod-fnjk7_default" network: [default/my-test-pod-fnjk7/:sriov-network]: error adding container to network "sriov-network": infiniBand SRI-OV CNI failed to configure VF "VF ib4 GUID is not valid", failed to clean up sandbox container "3573ba2407bcf6bacb171e5e8b32980ff549a59de1bd8b119d89f6304ae69b7c" network for pod "my-test-pod-fnjk7": networkPlugin cni failed to teardown pod "my-test-pod-fnjk7_default" network: delegateDel: error invoking DelegateDel - "ib-sriov": error in getting result from DelNetwork: error reading cached NetConf in /var/lib/cni/ib-sriov with name 3573ba2407bcf6bacb171e5e8b32980ff549a59de1bd8b119d89f6304ae69b7c-net1]
  Warning  FailedCreatePodSandBox  18s                kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "0cdbf8cb322a3156d88f04a52c2bea0fc51511ffa6d21b4db9aa4ae44dc858e2" network for pod "my-test-pod-fnjk7": networkPlugin cni failed to set up pod "my-test-pod-fnjk7_default" network: [default/my-test-pod-fnjk7/:sriov-network]: error adding container to network "sriov-network": infiniBand SRI-OV CNI failed to configure VF "VF ib4 GUID is not valid", failed to clean up sandbox container "0cdbf8cb322a3156d88f04a52c2bea0fc51511ffa6d21b4db9aa4ae44dc858e2" network for pod "my-test-pod-fnjk7": networkPlugin cni failed to teardown pod "my-test-pod-fnjk7_default" network: delegateDel: error invoking DelegateDel - "ib-sriov": error in getting result from DelNetwork: error reading cached NetConf in /var/lib/cni/ib-sriov with name 0cdbf8cb322a3156d88f04a52c2bea0fc51511ffa6d21b4db9aa4ae44dc858e2-net1]
  Normal   AddedInterface          18s                multus             Add eth0 [10.42.0.24/32] from k8s-pod-network
  Warning  FailedCreatePodSandBox  17s                kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "93bbd85125dc93d15558f34aa2693d13781db6d38905925814151160ef405dc9" network for pod "my-test-pod-fnjk7": networkPlugin cni failed to set up pod "my-test-pod-fnjk7_default" network: [default/my-test-pod-fnjk7/:sriov-network]: error adding container to network "sriov-network": infiniBand SRI-OV CNI failed to configure VF "VF ib4 GUID is not valid", failed to clean up sandbox container "93bbd85125dc93d15558f34aa2693d13781db6d38905925814151160ef405dc9" network for pod "my-test-pod-fnjk7": networkPlugin cni failed to teardown pod "my-test-pod-fnjk7_default" network: delegateDel: error invoking DelegateDel - "ib-sriov": error in getting result from DelNetwork: error reading cached NetConf in /var/lib/cni/ib-sriov with name 93bbd85125dc93d15558f34aa2693d13781db6d38905925814151160ef405dc9-net1]
  Normal   AddedInterface          17s                multus             Add eth0 [10.42.0.25/32] from k8s-pod-network
  Warning  FailedCreatePodSandBox  16s                kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "8ea7b9cda5014ae0e8a3f335903e83c542156c4ec8de84c80a627ef3c3473cb1" network for pod "my-test-pod-fnjk7": networkPlugin cni failed to set up pod "my-test-pod-fnjk7_default" network: [default/my-test-pod-fnjk7/:sriov-network]: error adding container to network "sriov-network": infiniBand SRI-OV CNI failed to configure VF "VF ib4 GUID is not valid", failed to clean up sandbox container "8ea7b9cda5014ae0e8a3f335903e83c542156c4ec8de84c80a627ef3c3473cb1" network for pod "my-test-pod-fnjk7": networkPlugin cni failed to teardown pod "my-test-pod-fnjk7_default" network: delegateDel: error invoking DelegateDel - "ib-sriov": error in getting result from DelNetwork: error reading cached NetConf in /var/lib/cni/ib-sriov with name 8ea7b9cda5014ae0e8a3f335903e83c542156c4ec8de84c80a627ef3c3473cb1-net1]
  Normal   AddedInterface          16s                multus             Add eth0 [10.42.0.26/32] from k8s-pod-network
  Warning  FailedCreatePodSandBox  15s                kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "922f59df03433b78b31201f685867ac475fcb96c5b4791eecd642fe87b5ae365" network for pod "my-test-pod-fnjk7": networkPlugin cni failed to set up pod "my-test-pod-fnjk7_default" network: [default/my-test-pod-fnjk7/:sriov-network]: error adding container to network "sriov-network": infiniBand SRI-OV CNI failed to configure VF "VF ib4 GUID is not valid", failed to clean up sandbox container "922f59df03433b78b31201f685867ac475fcb96c5b4791eecd642fe87b5ae365" network for pod "my-test-pod-fnjk7": networkPlugin cni failed to teardown pod "my-test-pod-fnjk7_default" network: delegateDel: error invoking DelegateDel - "ib-sriov": error in getting result from DelNetwork: error reading cached NetConf in /var/lib/cni/ib-sriov with name 922f59df03433b78b31201f685867ac475fcb96c5b4791eecd642fe87b5ae365-net1]
  Normal   AddedInterface          15s                multus             Add eth0 [10.42.0.27/32] from k8s-pod-network
  Normal   AddedInterface          14s                multus             Add eth0 [10.42.0.28/32] from k8s-pod-network
  Warning  FailedCreatePodSandBox  14s                kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "68c5c26e73706571b562dfa035e6b53e848f7cc18c85b8a3995f0a2a3c338b97" network for pod "my-test-pod-fnjk7": networkPlugin cni failed to set up pod "my-test-pod-fnjk7_default" network: [default/my-test-pod-fnjk7/:sriov-network]: error adding container to network "sriov-network": infiniBand SRI-OV CNI failed to configure VF "VF ib4 GUID is not valid", failed to clean up sandbox container "68c5c26e73706571b562dfa035e6b53e848f7cc18c85b8a3995f0a2a3c338b97" network for pod "my-test-pod-fnjk7": networkPlugin cni failed to teardown pod "my-test-pod-fnjk7_default" network: delegateDel: error invoking DelegateDel - "ib-sriov": error in getting result from DelNetwork: error reading cached NetConf in /var/lib/cni/ib-sriov with name 68c5c26e73706571b562dfa035e6b53e848f7cc18c85b8a3995f0a2a3c338b97-net1]
  Warning  FailedCreatePodSandBox  13s                kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "60fda0e94bf41698460e2406a00d6443299a9b176da7ed8004f39adfc2bb16e0" network for pod "my-test-pod-fnjk7": networkPlugin cni failed to set up pod "my-test-pod-fnjk7_default" network: [default/my-test-pod-fnjk7/:sriov-network]: error adding container to network "sriov-network": infiniBand SRI-OV CNI failed to configure VF "VF ib4 GUID is not valid", failed to clean up sandbox container "60fda0e94bf41698460e2406a00d6443299a9b176da7ed8004f39adfc2bb16e0" network for pod "my-test-pod-fnjk7": networkPlugin cni failed to teardown pod "my-test-pod-fnjk7_default" network: delegateDel: error invoking DelegateDel - "ib-sriov": error in getting result from DelNetwork: error reading cached NetConf in /var/lib/cni/ib-sriov with name 60fda0e94bf41698460e2406a00d6443299a9b176da7ed8004f39adfc2bb16e0-net1]
  Normal   AddedInterface          12s                multus             Add eth0 [10.42.0.29/32] from k8s-pod-network
  Warning  FailedCreatePodSandBox  12s                kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "777d1178ca6d8681b1f0f43780fb357c0dce74a6905c94337c2f07ef9a5c9c36" network for pod "my-test-pod-fnjk7": networkPlugin cni failed to set up pod "my-test-pod-fnjk7_default" network: [default/my-test-pod-fnjk7/:sriov-network]: error adding container to network "sriov-network": infiniBand SRI-OV CNI failed to configure VF "VF ib4 GUID is not valid", failed to clean up sandbox container "777d1178ca6d8681b1f0f43780fb357c0dce74a6905c94337c2f07ef9a5c9c36" network for pod "my-test-pod-fnjk7": networkPlugin cni failed to teardown pod "my-test-pod-fnjk7_default" network: delegateDel: error invoking DelegateDel - "ib-sriov": error in getting result from DelNetwork: error reading cached NetConf in /var/lib/cni/ib-sriov with name 777d1178ca6d8681b1f0f43780fb357c0dce74a6905c94337c2f07ef9a5c9c36-net1]
  Normal   AddedInterface          11s                multus             Add eth0 [10.42.0.30/32] from k8s-pod-network

The device plugin can detect the SRIOV net devices on the host (node s-113-2-35 in my experiment), as the following output shows:

-MacBookPro:~/20-k8s-rdma-sriov/multus-cni/deployments$ kubectl get node s-113-2-35 -o json | jq '.status.allocatable'
{
  "cpu": "128",
  "ephemeral-storage": "5169411933432",
  "hugepages-1Gi": "0",
  "hugepages-2Mi": "0",
  "mellanox.com/mlnx_sriov_rdma_ib": "4",
  "memory": "528110968Ki",
  "pods": "110"
}

NAD

apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: ib-sriov-network
  annotations:
    k8s.v1.cni.cncf.io/resourceName: mellanox.com/mlnx_sriov_rdma_ib
spec:
  config: '{
  "type": "ib-sriov",
  "cniVersion": "0.3.1",
  "name": "sriov-network",
  "ipam": {
    "type": "host-local",
    "subnet": "192.168.217.0/24",
    "routes": [{
      "dst": "0.0.0.0/0"
    }],
    "gateway": "192.168.217.1"
  }
}'

sriov device plugin configmap

apiVersion: v1
kind: ConfigMap
metadata:
  name: sriovdp-config
  namespace: kube-system
data:
  config.json: |
    {
        "resourceList": [{
                "resourcePrefix": "mellanox.com",
                "resourceName": "mlnx_sriov_rdma_ib",
                "selectors": {
                    "isRdma": true,
                    "vendors": ["15b3"],
                    "devices": ["101c"],
                    "drivers": ["mlx5_core"]
                }
            }
        ]
    }

sriov device plugin

n-MacBookPro:~/20-k8s-rdma-sriov/multus-cni/deployments$ kubectl -n kube-system logs kube-sriov-device-plugin-amd64-bpwlk
I1122 11:59:59.507695       1 manager.go:51] Using Kubelet Plugin Registry Mode
I1122 11:59:59.508691       1 main.go:44] resource manager reading configs
I1122 11:59:59.508739       1 manager.go:79] raw ResourceList: {
    "resourceList": [{
            "resourcePrefix": "mellanox.com",
            "resourceName": "mlnx_sriov_rdma_ib",
            "selectors": {
                "isRdma": true,
                "vendors": ["15b3"],
                "devices": ["101c"],
                "drivers": ["mlx5_core"]
            }
        }
    ]
}
I1122 11:59:59.508875       1 factory.go:166] net device selector for resource mlnx_sriov_rdma_ib is &{DeviceSelectors:{Vendors:[15b3] Devices:[101c] Drivers:[mlx5_core] PciAddresses:[]} PfNames:[] RootDevices:[] LinkTypes:[] DDPProfiles:[] IsRdma:true NeedVhostNet:false}
I1122 11:59:59.508902       1 manager.go:99] unmarshalled ResourceList: [{ResourcePrefix:mellanox.com ResourceName:mlnx_sriov_rdma_ib DeviceType:netDevice Selectors:0xc00000cd38 SelectorObj:0xc000375380}]
I1122 11:59:59.508960       1 manager.go:200] validating resource name "mellanox.com/mlnx_sriov_rdma_ib"
I1122 11:59:59.508968       1 main.go:60] Discovering host devices
I1122 11:59:59.589424       1 netDeviceProvider.go:84] netdevice AddTargetDevices(): device found: 0000:c2:00.0 02              Intel Corporation    Ethernet Controller X710 for 10GbE SFP+
I1122 11:59:59.589938       1 netDeviceProvider.go:84] netdevice AddTargetDevices(): device found: 0000:c2:00.1 02              Intel Corporation    Ethernet Controller X710 for 10GbE SFP+
I1122 11:59:59.590256       1 netDeviceProvider.go:84] netdevice AddTargetDevices(): device found: 0000:c3:00.0 02              Mellanox Technolo... MT28908 Family [ConnectX-6]
I1122 11:59:59.591462       1 netDeviceProvider.go:84] netdevice AddTargetDevices(): device found: 0000:c3:00.1 02              Mellanox Technolo... MT28908 Family [ConnectX-6]
I1122 11:59:59.591704       1 netDeviceProvider.go:84] netdevice AddTargetDevices(): device found: 0000:c3:00.2 02              Mellanox Technolo... MT28908 Family [ConnectX-6 Virtual Fu...
I1122 11:59:59.591894       1 netDeviceProvider.go:84] netdevice AddTargetDevices(): device found: 0000:c3:00.3 02              Mellanox Technolo... MT28908 Family [ConnectX-6 Virtual Fu...
I1122 11:59:59.592053       1 netDeviceProvider.go:84] netdevice AddTargetDevices(): device found: 0000:c3:00.4 02              Mellanox Technolo... MT28908 Family [ConnectX-6 Virtual Fu...
I1122 11:59:59.592203       1 netDeviceProvider.go:84] netdevice AddTargetDevices(): device found: 0000:c3:00.5 02              Mellanox Technolo... MT28908 Family [ConnectX-6 Virtual Fu...
I1122 11:59:59.592383       1 accelDeviceProvider.go:82] accelerator AddTargetDevices(): device found: 0000:01:00.0     12              unknown              unknown
I1122 11:59:59.592392       1 accelDeviceProvider.go:82] accelerator AddTargetDevices(): device found: 0000:22:00.0     12              unknown              unknown
I1122 11:59:59.592397       1 accelDeviceProvider.go:82] accelerator AddTargetDevices(): device found: 0000:41:00.0     12              unknown              unknown
I1122 11:59:59.592403       1 accelDeviceProvider.go:82] accelerator AddTargetDevices(): device found: 0000:61:00.0     12              unknown              unknown
I1122 11:59:59.592407       1 accelDeviceProvider.go:82] accelerator AddTargetDevices(): device found: 0000:81:00.0     12              unknown              unknown
I1122 11:59:59.592412       1 accelDeviceProvider.go:82] accelerator AddTargetDevices(): device found: 0000:a1:00.0     12              unknown              unknown
I1122 11:59:59.592417       1 accelDeviceProvider.go:82] accelerator AddTargetDevices(): device found: 0000:c1:00.0     12              unknown              unknown
I1122 11:59:59.592421       1 accelDeviceProvider.go:82] accelerator AddTargetDevices(): device found: 0000:e1:00.0     12              unknown              unknown
I1122 11:59:59.592429       1 main.go:66] Initializing resource servers
I1122 11:59:59.592731       1 manager.go:105] number of config: 1
I1122 11:59:59.592739       1 manager.go:109]
I1122 11:59:59.592742       1 manager.go:110] Creating new ResourcePool: mlnx_sriov_rdma_ib
I1122 11:59:59.592746       1 manager.go:111] DeviceType: netDevice
W1122 11:59:59.592779       1 pciNetDevice.go:55] RDMA resources for 0000:c2:00.0 not found. Are RDMA modules loaded?
I1122 11:59:59.593104       1 utils.go:71] Devlink query for eswitch mode is not supported for device 0000:c2:00.0. error getting devlink device attributes for net device 0000:c2:00.0 no such device
W1122 11:59:59.593215       1 pciNetDevice.go:55] RDMA resources for 0000:c2:00.1 not found. Are RDMA modules loaded?
I1122 11:59:59.593362       1 utils.go:71] Devlink query for eswitch mode is not supported for device 0000:c2:00.1. error getting devlink device attributes for net device 0000:c2:00.1 no such device
I1122 11:59:59.594005       1 utils.go:71] Devlink query for eswitch mode is not supported for device 0000:c3:00.1. <nil>
I1122 11:59:59.596385       1 utils.go:71] Devlink query for eswitch mode is not supported for device 0000:c3:00.2. <nil>
I1122 11:59:59.597465       1 utils.go:71] Devlink query for eswitch mode is not supported for device 0000:c3:00.3. <nil>
I1122 11:59:59.598273       1 utils.go:71] Devlink query for eswitch mode is not supported for device 0000:c3:00.4. <nil>
I1122 11:59:59.599262       1 utils.go:71] Devlink query for eswitch mode is not supported for device 0000:c3:00.5. <nil>
I1122 11:59:59.599408       1 factory.go:106] device added: [pciAddr: 0000:c3:00.2, vendor: 15b3, device: 101c, driver: mlx5_core]
I1122 11:59:59.599417       1 factory.go:106] device added: [pciAddr: 0000:c3:00.3, vendor: 15b3, device: 101c, driver: mlx5_core]
I1122 11:59:59.599423       1 factory.go:106] device added: [pciAddr: 0000:c3:00.4, vendor: 15b3, device: 101c, driver: mlx5_core]
I1122 11:59:59.599428       1 factory.go:106] device added: [pciAddr: 0000:c3:00.5, vendor: 15b3, device: 101c, driver: mlx5_core]
I1122 11:59:59.599446       1 manager.go:139] New resource server is created for mlnx_sriov_rdma_ib ResourcePool
I1122 11:59:59.599454       1 main.go:72] Starting all servers...
I1122 11:59:59.599885       1 server.go:199] starting mlnx_sriov_rdma_ib device plugin endpoint at: mellanox.com_mlnx_sriov_rdma_ib.sock
I1122 11:59:59.602783       1 server.go:226] mlnx_sriov_rdma_ib device plugin endpoint started serving
I1122 11:59:59.602805       1 main.go:77] All servers started.
I1122 11:59:59.602811       1 main.go:78] Listening for term signals
I1122 12:00:00.175755       1 server.go:110] Plugin: mellanox.com_mlnx_sriov_rdma_ib.sock gets registered successfully at Kubelet
I1122 12:00:00.175875       1 server.go:134] ListAndWatch(mlnx_sriov_rdma_ib) invoked
I1122 12:00:00.175890       1 server.go:142] ListAndWatch(mlnx_sriov_rdma_ib): send devices &ListAndWatchResponse{Devices:[]*Device{&Device{ID:0000:c3:00.4,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:1,},},},},&Device{ID:0000:c3:00.5,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:1,},},},},&Device{ID:0000:c3:00.2,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:1,},},},},&Device{ID:0000:c3:00.3,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:1,},},},},},}
I1122 12:04:42.983933       1 server.go:119] Allocate() called with &AllocateRequest{ContainerRequests:[]*ContainerAllocateRequest{&ContainerAllocateRequest{DevicesIDs:[0000:c3:00.3],},},}
I1122 12:04:42.984024       1 netResourcePool.go:51] GetDeviceSpecs(): for devices: [0000:c3:00.3]
I1122 12:04:42.984044       1 pool_stub.go:97] GetEnvs(): for devices: [0000:c3:00.3]
I1122 12:04:42.984052       1 pool_stub.go:113] GetMounts(): for devices: [0000:c3:00.3]
I1122 12:04:42.984059       1 server.go:128] AllocateResponse send: &AllocateResponse{ContainerResponses:[]*ContainerAllocateResponse{&ContainerAllocateResponse{Envs:map[string]string{PCIDEVICE_MELLANOX_COM_MLNX_SRIOV_RDMA_IB: 0000:c3:00.3,},Mounts:[]*Mount{},Devices:[]*DeviceSpec{&DeviceSpec{ContainerPath:/dev/infiniband/issm3,HostPath:/dev/infiniband/issm3,Permissions:rwm,},&DeviceSpec{ContainerPath:/dev/infiniband/umad3,HostPath:/dev/infiniband/umad3,Permissions:rwm,},&DeviceSpec{ContainerPath:/dev/infiniband/uverbs3,HostPath:/dev/infiniband/uverbs3,Permissions:rwm,},&DeviceSpec{ContainerPath:/dev/infiniband/rdma_cm,HostPath:/dev/infiniband/rdma_cm,Permissions:rwm,},},Annotations:map[string]string{},},},}
I1122 12:22:33.340229       1 server.go:119] Allocate() called with &AllocateRequest{ContainerRequests:[]*ContainerAllocateRequest{&ContainerAllocateRequest{DevicesIDs:[0000:c3:00.4],},},}
I1122 12:22:33.340326       1 netResourcePool.go:51] GetDeviceSpecs(): for devices: [0000:c3:00.4]
I1122 12:22:33.340347       1 pool_stub.go:97] GetEnvs(): for devices: [0000:c3:00.4]
I1122 12:22:33.340355       1 pool_stub.go:113] GetMounts(): for devices: [0000:c3:00.4]
I1122 12:22:33.340362       1 server.go:128] AllocateResponse send: &AllocateResponse{ContainerResponses:[]*ContainerAllocateResponse{&ContainerAllocateResponse{Envs:map[string]string{PCIDEVICE_MELLANOX_COM_MLNX_SRIOV_RDMA_IB: 0000:c3:00.4,},Mounts:[]*Mount{},Devices:[]*DeviceSpec{&DeviceSpec{ContainerPath:/dev/infiniband/issm4,HostPath:/dev/infiniband/issm4,Permissions:rwm,},&DeviceSpec{ContainerPath:/dev/infiniband/umad4,HostPath:/dev/infiniband/umad4,Permissions:rwm,},&DeviceSpec{ContainerPath:/dev/infiniband/uverbs4,HostPath:/dev/infiniband/uverbs4,Permissions:rwm,},&DeviceSpec{ContainerPath:/dev/infiniband/rdma_cm,HostPath:/dev/infiniband/rdma_cm,Permissions:rwm,},},Annotations:map[string]string{},},},}
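
The pod events above fail with `VF ib4 GUID is not valid`. The page does not say which check produces that message. The sketch below assumes, purely for illustration, that a GUID counts as invalid when it is malformed, all zeros, or all "F"s; `is_valid_guid` is a hypothetical helper, not the plugin's own function.

```python
import re

# Eight colon-separated hex bytes, e.g. 02:00:00:00:00:00:00:01
GUID_RE = re.compile(r"^([0-9a-fA-F]{2}:){7}[0-9a-fA-F]{2}$")


def is_valid_guid(guid):
    """Return True for a well-formed GUID that is neither all zeros nor all F's."""
    if not GUID_RE.match(guid):
        return False
    digits = guid.replace(":", "").lower()
    return digits not in ("0" * 16, "f" * 16)


for candidate in ("02:00:00:00:00:00:00:01",
                  "00:00:00:00:00:00:00:00",
                  "ff:ff:ff:ff:ff:ff:ff:ff"):
    print(candidate, is_valid_guid(candidate))
```

Under that assumption, a VF that has never been assigned a GUID would be rejected at Add time, which would be consistent with the failure shown here.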

decouple ib-sriov-cni from ib-kubernetes

ib-sriov-cni should be able to run without ib-kubernetes being deployed for simple cases, such as an InfiniBand network in the default partition.

Currently ib-sriov-cni checks for the existence of the mellanox.infiniband.app annotation in CNI-Args and returns immediately if it is not present. This assumes ib-kubernetes is deployed whenever ib-sriov-cni is used, which is not always the case.
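
The difference between the current and the requested behavior can be sketched as a single decision. This is a simplified Python model, not the plugin's code: the `cni_args` dict shape and the `require_annotation` switch are hypothetical, and only the annotation key comes from the description above.

```python
IB_ANNOTATION = "mellanox.infiniband.app"  # key named in the issue


def should_configure(cni_args, require_annotation):
    """Decide whether the plugin goes on to configure the VF.

    cni_args: dict of CNI args (hypothetical shape).
    require_annotation: True models the current behavior,
    False models the behavior requested in this issue.
    """
    if IB_ANNOTATION in cni_args:
        return True
    return not require_annotation


print(should_configure({}, require_annotation=True))   # False: returns early today
print(should_configure({}, require_annotation=False))  # True: proceeds without ib-kubernetes
```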

Is VFIO supported?

Hi,
I tried to use QEMU VFIO to assign the VFs of the IB interface to a VM in KubeVirt, but the VFIO resource cannot be detected or advertised in KubeVirt. Does the ib-sriov-cni daemonset support VFIO resources? Any comments and suggestions are appreciated!
