k8snetworkplumbingwg / ib-sriov-cni
InfiniBand SR-IOV CNI
License: Other
The current cmdDel behavior is to restore the VF's GUID from the cache. For some reason the GUID isn't restored, and on subsequent Adds and Dels the GUID that ib-sriov-cni set becomes the "cached" one.
There is a security fix merged since the last release. It would be good to make a new release so that it can be used in the SR-IOV Network Operator.
Let's make GHCR (GitHub Container Registry) our official registry for IB SR-IOV CNI images, as we already do for other projects inside the K8S Network Plumbing WG: https://github.com/orgs/k8snetworkplumbingwg/packages.
This will allow us to publish new images to the registry automatically, and users will have an official public repo with new versions.
There is a fix merged since the last release that we would like to incorporate in the upcoming network-operator release.
So we are requesting a new release of ib-sriov-cni.
The process of standardizing the InfiniBand GUID as a CNI runtime config parameter has been completed.
PR Mellanox/ib-kubernetes#70 was the last missing piece.
We should consider deprecating the "guid" CNI config so that only the infinibandGUID CNI runtime config is used.
Work items:
- Mark the guid CNI config as deprecated

Note: multus-cni supports the infiniband-guid network-attachment attribute as of v3.6.
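The deprecation described above could look roughly like the following sketch, which prefers the infinibandGUID runtime config and falls back to the old "guid" key with a warning. It is an illustration in Python rather than the plugin's own code. The function name resolve_guid and the exact placement of the keys in the config JSON (a top-level "guid" and a "runtimeConfig" object) are assumptions made for the example.

```python
import json
import logging
import re

log = logging.getLogger("ib-sriov-sketch")

# Eight colon-separated hex octets, e.g. 02:00:00:00:00:00:00:0a
GUID_RE = re.compile(r"^([0-9a-fA-F]{2}:){7}[0-9a-fA-F]{2}$")


def resolve_guid(net_conf_json):
    """Return the GUID requested for the VF, or None if none was requested.

    Assumed layout (hypothetical for this sketch): the runtime value lives
    under "runtimeConfig" -> "infinibandGUID", the deprecated one under "guid".
    """
    conf = json.loads(net_conf_json)
    runtime_guid = conf.get("runtimeConfig", {}).get("infinibandGUID")
    legacy_guid = conf.get("guid")

    if runtime_guid:
        guid = runtime_guid  # the runtime config always wins
    elif legacy_guid:
        log.warning('"guid" CNI config is deprecated; use infinibandGUID')
        guid = legacy_guid
    else:
        return None

    if not GUID_RE.match(guid):
        raise ValueError(f"malformed GUID: {guid!r}")
    return guid.lower()
```

With this ordering, a config that still carries "guid" keeps working during the deprecation period, while any value supplied through the runtime config takes precedence.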
If an RDMA application is written to communicate via the native IB device instead of IPoIB, should we enable ib-sriov-cni in that case to configure the VF GUID? If not, what is the recommended way to configure the VF GUID?
I noticed there is a branch called licence. Is there any plan to merge the licence change into the master branch?
I tried to create a pod with an SR-IOV net device (e.g. Mellanox IB), but the pod is stuck in ContainerCreating. I configured 4 VFs on the IB interface of the host, and I am running the device plugin pod and the Multus CNI meta-plugin, but the SR-IOV demo pod shows an error.
./multus-daemonset-thick-plugin.yml:125: image: ghcr.io/k8snetworkplumbingwg/multus-cni:v3.9.2-thick-amd64
n-MacBookPro:~/20-k8s-rdma-sriov/ib-sriov-cni/deployment/examples$ kubectl describe po my-test-pod-fnjk7
Name: my-test-pod-fnjk7
Namespace: default
Priority: 0
Node: s-113-2-35/10.113.2.35
Start Time: Tue, 22 Nov 2022 20:22:33 +0800
Labels: <none>
Annotations: cni.projectcalico.org/containerID: 848157aeb2b3549aa8e2fce419c8353989ecb98ad62b1c6513f46423492f6cfd
cni.projectcalico.org/podIP:
cni.projectcalico.org/podIPs:
k8s.v1.cni.cncf.io/networks: [{"name": "ib-sriov-network"}]
Status: Pending
IP:
IPs: <none>
Containers:
my-test-ctr:
Container ID:
Image: mellanox/rping-test
Image ID:
Port: <none>
Host Port: <none>
Command:
sh
-c
sleep 1000000
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Limits:
mellanox.com/mlnx_sriov_rdma_ib: 1
Requests:
mellanox.com/mlnx_sriov_rdma_ib: 1
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-2clfq (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
kube-api-access-2clfq:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 21s default-scheduler Successfully assigned default/my-test-pod-fnjk7 to s-113-2-35
Normal AddedInterface 21s multus Add eth0 [10.42.0.21/32] from k8s-pod-network
Warning FailedCreatePodSandBox 21s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "ef4b067661534edfacd217cb1ea3cb1b2cdd44f65ffc1067a59091a2ae6490be" network for pod "my-test-pod-fnjk7": networkPlugin cni failed to set up pod "my-test-pod-fnjk7_default" network: [default/my-test-pod-fnjk7/:sriov-network]: error adding container to network "sriov-network": infiniBand SRI-OV CNI failed to configure VF "VF ib4 GUID is not valid", failed to clean up sandbox container "ef4b067661534edfacd217cb1ea3cb1b2cdd44f65ffc1067a59091a2ae6490be" network for pod "my-test-pod-fnjk7": networkPlugin cni failed to teardown pod "my-test-pod-fnjk7_default" network: delegateDel: error invoking DelegateDel - "ib-sriov": error in getting result from DelNetwork: error reading cached NetConf in /var/lib/cni/ib-sriov with name ef4b067661534edfacd217cb1ea3cb1b2cdd44f65ffc1067a59091a2ae6490be-net1]
Normal AddedInterface 20s multus Add eth0 [10.42.0.22/32] from k8s-pod-network
Normal AddedInterface 19s multus Add eth0 [10.42.0.23/32] from k8s-pod-network
Warning FailedCreatePodSandBox 19s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "3573ba2407bcf6bacb171e5e8b32980ff549a59de1bd8b119d89f6304ae69b7c" network for pod "my-test-pod-fnjk7": networkPlugin cni failed to set up pod "my-test-pod-fnjk7_default" network: [default/my-test-pod-fnjk7/:sriov-network]: error adding container to network "sriov-network": infiniBand SRI-OV CNI failed to configure VF "VF ib4 GUID is not valid", failed to clean up sandbox container "3573ba2407bcf6bacb171e5e8b32980ff549a59de1bd8b119d89f6304ae69b7c" network for pod "my-test-pod-fnjk7": networkPlugin cni failed to teardown pod "my-test-pod-fnjk7_default" network: delegateDel: error invoking DelegateDel - "ib-sriov": error in getting result from DelNetwork: error reading cached NetConf in /var/lib/cni/ib-sriov with name 3573ba2407bcf6bacb171e5e8b32980ff549a59de1bd8b119d89f6304ae69b7c-net1]
Warning FailedCreatePodSandBox 18s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "0cdbf8cb322a3156d88f04a52c2bea0fc51511ffa6d21b4db9aa4ae44dc858e2" network for pod "my-test-pod-fnjk7": networkPlugin cni failed to set up pod "my-test-pod-fnjk7_default" network: [default/my-test-pod-fnjk7/:sriov-network]: error adding container to network "sriov-network": infiniBand SRI-OV CNI failed to configure VF "VF ib4 GUID is not valid", failed to clean up sandbox container "0cdbf8cb322a3156d88f04a52c2bea0fc51511ffa6d21b4db9aa4ae44dc858e2" network for pod "my-test-pod-fnjk7": networkPlugin cni failed to teardown pod "my-test-pod-fnjk7_default" network: delegateDel: error invoking DelegateDel - "ib-sriov": error in getting result from DelNetwork: error reading cached NetConf in /var/lib/cni/ib-sriov with name 0cdbf8cb322a3156d88f04a52c2bea0fc51511ffa6d21b4db9aa4ae44dc858e2-net1]
Normal AddedInterface 18s multus Add eth0 [10.42.0.24/32] from k8s-pod-network
Warning FailedCreatePodSandBox 17s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "93bbd85125dc93d15558f34aa2693d13781db6d38905925814151160ef405dc9" network for pod "my-test-pod-fnjk7": networkPlugin cni failed to set up pod "my-test-pod-fnjk7_default" network: [default/my-test-pod-fnjk7/:sriov-network]: error adding container to network "sriov-network": infiniBand SRI-OV CNI failed to configure VF "VF ib4 GUID is not valid", failed to clean up sandbox container "93bbd85125dc93d15558f34aa2693d13781db6d38905925814151160ef405dc9" network for pod "my-test-pod-fnjk7": networkPlugin cni failed to teardown pod "my-test-pod-fnjk7_default" network: delegateDel: error invoking DelegateDel - "ib-sriov": error in getting result from DelNetwork: error reading cached NetConf in /var/lib/cni/ib-sriov with name 93bbd85125dc93d15558f34aa2693d13781db6d38905925814151160ef405dc9-net1]
Normal AddedInterface 17s multus Add eth0 [10.42.0.25/32] from k8s-pod-network
Warning FailedCreatePodSandBox 16s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "8ea7b9cda5014ae0e8a3f335903e83c542156c4ec8de84c80a627ef3c3473cb1" network for pod "my-test-pod-fnjk7": networkPlugin cni failed to set up pod "my-test-pod-fnjk7_default" network: [default/my-test-pod-fnjk7/:sriov-network]: error adding container to network "sriov-network": infiniBand SRI-OV CNI failed to configure VF "VF ib4 GUID is not valid", failed to clean up sandbox container "8ea7b9cda5014ae0e8a3f335903e83c542156c4ec8de84c80a627ef3c3473cb1" network for pod "my-test-pod-fnjk7": networkPlugin cni failed to teardown pod "my-test-pod-fnjk7_default" network: delegateDel: error invoking DelegateDel - "ib-sriov": error in getting result from DelNetwork: error reading cached NetConf in /var/lib/cni/ib-sriov with name 8ea7b9cda5014ae0e8a3f335903e83c542156c4ec8de84c80a627ef3c3473cb1-net1]
Normal AddedInterface 16s multus Add eth0 [10.42.0.26/32] from k8s-pod-network
Warning FailedCreatePodSandBox 15s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "922f59df03433b78b31201f685867ac475fcb96c5b4791eecd642fe87b5ae365" network for pod "my-test-pod-fnjk7": networkPlugin cni failed to set up pod "my-test-pod-fnjk7_default" network: [default/my-test-pod-fnjk7/:sriov-network]: error adding container to network "sriov-network": infiniBand SRI-OV CNI failed to configure VF "VF ib4 GUID is not valid", failed to clean up sandbox container "922f59df03433b78b31201f685867ac475fcb96c5b4791eecd642fe87b5ae365" network for pod "my-test-pod-fnjk7": networkPlugin cni failed to teardown pod "my-test-pod-fnjk7_default" network: delegateDel: error invoking DelegateDel - "ib-sriov": error in getting result from DelNetwork: error reading cached NetConf in /var/lib/cni/ib-sriov with name 922f59df03433b78b31201f685867ac475fcb96c5b4791eecd642fe87b5ae365-net1]
Normal AddedInterface 15s multus Add eth0 [10.42.0.27/32] from k8s-pod-network
Normal AddedInterface 14s multus Add eth0 [10.42.0.28/32] from k8s-pod-network
Warning FailedCreatePodSandBox 14s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "68c5c26e73706571b562dfa035e6b53e848f7cc18c85b8a3995f0a2a3c338b97" network for pod "my-test-pod-fnjk7": networkPlugin cni failed to set up pod "my-test-pod-fnjk7_default" network: [default/my-test-pod-fnjk7/:sriov-network]: error adding container to network "sriov-network": infiniBand SRI-OV CNI failed to configure VF "VF ib4 GUID is not valid", failed to clean up sandbox container "68c5c26e73706571b562dfa035e6b53e848f7cc18c85b8a3995f0a2a3c338b97" network for pod "my-test-pod-fnjk7": networkPlugin cni failed to teardown pod "my-test-pod-fnjk7_default" network: delegateDel: error invoking DelegateDel - "ib-sriov": error in getting result from DelNetwork: error reading cached NetConf in /var/lib/cni/ib-sriov with name 68c5c26e73706571b562dfa035e6b53e848f7cc18c85b8a3995f0a2a3c338b97-net1]
Warning FailedCreatePodSandBox 13s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "60fda0e94bf41698460e2406a00d6443299a9b176da7ed8004f39adfc2bb16e0" network for pod "my-test-pod-fnjk7": networkPlugin cni failed to set up pod "my-test-pod-fnjk7_default" network: [default/my-test-pod-fnjk7/:sriov-network]: error adding container to network "sriov-network": infiniBand SRI-OV CNI failed to configure VF "VF ib4 GUID is not valid", failed to clean up sandbox container "60fda0e94bf41698460e2406a00d6443299a9b176da7ed8004f39adfc2bb16e0" network for pod "my-test-pod-fnjk7": networkPlugin cni failed to teardown pod "my-test-pod-fnjk7_default" network: delegateDel: error invoking DelegateDel - "ib-sriov": error in getting result from DelNetwork: error reading cached NetConf in /var/lib/cni/ib-sriov with name 60fda0e94bf41698460e2406a00d6443299a9b176da7ed8004f39adfc2bb16e0-net1]
Normal AddedInterface 12s multus Add eth0 [10.42.0.29/32] from k8s-pod-network
Warning FailedCreatePodSandBox 12s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "777d1178ca6d8681b1f0f43780fb357c0dce74a6905c94337c2f07ef9a5c9c36" network for pod "my-test-pod-fnjk7": networkPlugin cni failed to set up pod "my-test-pod-fnjk7_default" network: [default/my-test-pod-fnjk7/:sriov-network]: error adding container to network "sriov-network": infiniBand SRI-OV CNI failed to configure VF "VF ib4 GUID is not valid", failed to clean up sandbox container "777d1178ca6d8681b1f0f43780fb357c0dce74a6905c94337c2f07ef9a5c9c36" network for pod "my-test-pod-fnjk7": networkPlugin cni failed to teardown pod "my-test-pod-fnjk7_default" network: delegateDel: error invoking DelegateDel - "ib-sriov": error in getting result from DelNetwork: error reading cached NetConf in /var/lib/cni/ib-sriov with name 777d1178ca6d8681b1f0f43780fb357c0dce74a6905c94337c2f07ef9a5c9c36-net1]
Normal AddedInterface 11s multus Add eth0 [10.42.0.30/32] from k8s-pod-network
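The repeated "VF ib4 GUID is not valid" message suggests that the VF has no usable GUID assigned. As a rough aid for checking this by hand, the sketch below treats an all-zero or all-ones GUID as unset. That rule is an assumption about what "not valid" means here, not something the log states, and the function names are made up for the example.

```python
import string


def guid_to_int(guid):
    """Convert 'xx:xx:xx:xx:xx:xx:xx:xx' into a 64-bit integer."""
    octets = guid.split(":")
    well_formed = len(octets) == 8 and all(
        len(o) == 2 and all(c in string.hexdigits for c in o) for o in octets
    )
    if not well_formed:
        raise ValueError(f"malformed GUID: {guid!r}")
    return int("".join(octets), 16)


def is_usable_guid(guid):
    """Assumption: all-zero and all-ones GUIDs count as 'not valid'."""
    return guid_to_int(guid) not in (0, 0xFFFFFFFFFFFFFFFF)
```

If the GUID reported for the VF on the host fails this check, assigning a GUID to the VF before the pod starts would be the first thing to try.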
The device plugin does detect the SR-IOV net devices on the host (node s-113-2-35 in my experiment), as the following output shows:
-MacBookPro:~/20-k8s-rdma-sriov/multus-cni/deployments$ kubectl get node s-113-2-35 -o json | jq '.status.allocatable'
{
"cpu": "128",
"ephemeral-storage": "5169411933432",
"hugepages-1Gi": "0",
"hugepages-2Mi": "0",
"mellanox.com/mlnx_sriov_rdma_ib": "4",
"memory": "528110968Ki",
"pods": "110"
}
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: ib-sriov-network
  annotations:
    k8s.v1.cni.cncf.io/resourceName: mellanox.com/mlnx_sriov_rdma_ib
spec:
  config: '{
    "type": "ib-sriov",
    "cniVersion": "0.3.1",
    "name": "sriov-network",
    "ipam": {
      "type": "host-local",
      "subnet": "192.168.217.0/24",
      "routes": [{
        "dst": "0.0.0.0/0"
      }],
      "gateway": "192.168.217.1"
    }
  }'
apiVersion: v1
kind: ConfigMap
metadata:
  name: sriovdp-config
  namespace: kube-system
data:
  config.json: |
    {
      "resourceList": [{
          "resourcePrefix": "mellanox.com",
          "resourceName": "mlnx_sriov_rdma_ib",
          "selectors": {
            "isRdma": true,
            "vendors": ["15b3"],
            "devices": ["101c"],
            "drivers": ["mlx5_core"]
          }
        }
      ]
    }
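To make the effect of the selectors in this ConfigMap easier to follow, here is a small sketch of how such a filter could be evaluated against discovered devices. It is an illustration only and not the device plugin's implementation. The "rdma" field, the helper names, and the rule that an empty or missing selector list matches everything are assumptions. The last device entry is hypothetical and exists only to show a non-match.

```python
# Selectors copied from the ConfigMap above.
SELECTORS = {
    "isRdma": True,
    "vendors": ["15b3"],
    "devices": ["101c"],
    "drivers": ["mlx5_core"],
}

DEVICES = [
    {"pciAddr": "0000:c3:00.2", "vendor": "15b3", "device": "101c",
     "driver": "mlx5_core", "rdma": True},
    {"pciAddr": "0000:c3:00.3", "vendor": "15b3", "device": "101c",
     "driver": "mlx5_core", "rdma": True},
    # Hypothetical non-matching device, for illustration only.
    {"pciAddr": "0000:00:00.0", "vendor": "8086", "device": "0000",
     "driver": "example", "rdma": False},
]


def matches(selectors, device):
    """Return True if the device passes every selector that is set."""
    pairs = (("vendors", "vendor"), ("devices", "device"), ("drivers", "driver"))
    for list_key, field in pairs:
        allowed = selectors.get(list_key)
        # Assumption: an empty or missing list places no restriction.
        if allowed and device[field] not in allowed:
            return False
    if selectors.get("isRdma") and not device.get("rdma", False):
        return False
    return True


def select_devices(selectors, devices):
    """Return the PCI addresses of all devices that match the selectors."""
    return [d["pciAddr"] for d in devices if matches(selectors, d)]
```

Calling select_devices(SELECTORS, DEVICES) keeps the two ConnectX-6 VFs and drops the hypothetical third device.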
n-MacBookPro:~/20-k8s-rdma-sriov/multus-cni/deployments$ kubectl -n kube-system logs kube-sriov-device-plugin-amd64-bpwlk
I1122 11:59:59.507695 1 manager.go:51] Using Kubelet Plugin Registry Mode
I1122 11:59:59.508691 1 main.go:44] resource manager reading configs
I1122 11:59:59.508739 1 manager.go:79] raw ResourceList: {
"resourceList": [{
"resourcePrefix": "mellanox.com",
"resourceName": "mlnx_sriov_rdma_ib",
"selectors": {
"isRdma": true,
"vendors": ["15b3"],
"devices": ["101c"],
"drivers": ["mlx5_core"]
}
}
]
}
I1122 11:59:59.508875 1 factory.go:166] net device selector for resource mlnx_sriov_rdma_ib is &{DeviceSelectors:{Vendors:[15b3] Devices:[101c] Drivers:[mlx5_core] PciAddresses:[]} PfNames:[] RootDevices:[] LinkTypes:[] DDPProfiles:[] IsRdma:true NeedVhostNet:false}
I1122 11:59:59.508902 1 manager.go:99] unmarshalled ResourceList: [{ResourcePrefix:mellanox.com ResourceName:mlnx_sriov_rdma_ib DeviceType:netDevice Selectors:0xc00000cd38 SelectorObj:0xc000375380}]
I1122 11:59:59.508960 1 manager.go:200] validating resource name "mellanox.com/mlnx_sriov_rdma_ib"
I1122 11:59:59.508968 1 main.go:60] Discovering host devices
I1122 11:59:59.589424 1 netDeviceProvider.go:84] netdevice AddTargetDevices(): device found: 0000:c2:00.0 02 Intel Corporation Ethernet Controller X710 for 10GbE SFP+
I1122 11:59:59.589938 1 netDeviceProvider.go:84] netdevice AddTargetDevices(): device found: 0000:c2:00.1 02 Intel Corporation Ethernet Controller X710 for 10GbE SFP+
I1122 11:59:59.590256 1 netDeviceProvider.go:84] netdevice AddTargetDevices(): device found: 0000:c3:00.0 02 Mellanox Technolo... MT28908 Family [ConnectX-6]
I1122 11:59:59.591462 1 netDeviceProvider.go:84] netdevice AddTargetDevices(): device found: 0000:c3:00.1 02 Mellanox Technolo... MT28908 Family [ConnectX-6]
I1122 11:59:59.591704 1 netDeviceProvider.go:84] netdevice AddTargetDevices(): device found: 0000:c3:00.2 02 Mellanox Technolo... MT28908 Family [ConnectX-6 Virtual Fu...
I1122 11:59:59.591894 1 netDeviceProvider.go:84] netdevice AddTargetDevices(): device found: 0000:c3:00.3 02 Mellanox Technolo... MT28908 Family [ConnectX-6 Virtual Fu...
I1122 11:59:59.592053 1 netDeviceProvider.go:84] netdevice AddTargetDevices(): device found: 0000:c3:00.4 02 Mellanox Technolo... MT28908 Family [ConnectX-6 Virtual Fu...
I1122 11:59:59.592203 1 netDeviceProvider.go:84] netdevice AddTargetDevices(): device found: 0000:c3:00.5 02 Mellanox Technolo... MT28908 Family [ConnectX-6 Virtual Fu...
I1122 11:59:59.592383 1 accelDeviceProvider.go:82] accelerator AddTargetDevices(): device found: 0000:01:00.0 12 unknown unknown
I1122 11:59:59.592392 1 accelDeviceProvider.go:82] accelerator AddTargetDevices(): device found: 0000:22:00.0 12 unknown unknown
I1122 11:59:59.592397 1 accelDeviceProvider.go:82] accelerator AddTargetDevices(): device found: 0000:41:00.0 12 unknown unknown
I1122 11:59:59.592403 1 accelDeviceProvider.go:82] accelerator AddTargetDevices(): device found: 0000:61:00.0 12 unknown unknown
I1122 11:59:59.592407 1 accelDeviceProvider.go:82] accelerator AddTargetDevices(): device found: 0000:81:00.0 12 unknown unknown
I1122 11:59:59.592412 1 accelDeviceProvider.go:82] accelerator AddTargetDevices(): device found: 0000:a1:00.0 12 unknown unknown
I1122 11:59:59.592417 1 accelDeviceProvider.go:82] accelerator AddTargetDevices(): device found: 0000:c1:00.0 12 unknown unknown
I1122 11:59:59.592421 1 accelDeviceProvider.go:82] accelerator AddTargetDevices(): device found: 0000:e1:00.0 12 unknown unknown
I1122 11:59:59.592429 1 main.go:66] Initializing resource servers
I1122 11:59:59.592731 1 manager.go:105] number of config: 1
I1122 11:59:59.592739 1 manager.go:109]
I1122 11:59:59.592742 1 manager.go:110] Creating new ResourcePool: mlnx_sriov_rdma_ib
I1122 11:59:59.592746 1 manager.go:111] DeviceType: netDevice
W1122 11:59:59.592779 1 pciNetDevice.go:55] RDMA resources for 0000:c2:00.0 not found. Are RDMA modules loaded?
I1122 11:59:59.593104 1 utils.go:71] Devlink query for eswitch mode is not supported for device 0000:c2:00.0. error getting devlink device attributes for net device 0000:c2:00.0 no such device
W1122 11:59:59.593215 1 pciNetDevice.go:55] RDMA resources for 0000:c2:00.1 not found. Are RDMA modules loaded?
I1122 11:59:59.593362 1 utils.go:71] Devlink query for eswitch mode is not supported for device 0000:c2:00.1. error getting devlink device attributes for net device 0000:c2:00.1 no such device
I1122 11:59:59.594005 1 utils.go:71] Devlink query for eswitch mode is not supported for device 0000:c3:00.1. <nil>
I1122 11:59:59.596385 1 utils.go:71] Devlink query for eswitch mode is not supported for device 0000:c3:00.2. <nil>
I1122 11:59:59.597465 1 utils.go:71] Devlink query for eswitch mode is not supported for device 0000:c3:00.3. <nil>
I1122 11:59:59.598273 1 utils.go:71] Devlink query for eswitch mode is not supported for device 0000:c3:00.4. <nil>
I1122 11:59:59.599262 1 utils.go:71] Devlink query for eswitch mode is not supported for device 0000:c3:00.5. <nil>
I1122 11:59:59.599408 1 factory.go:106] device added: [pciAddr: 0000:c3:00.2, vendor: 15b3, device: 101c, driver: mlx5_core]
I1122 11:59:59.599417 1 factory.go:106] device added: [pciAddr: 0000:c3:00.3, vendor: 15b3, device: 101c, driver: mlx5_core]
I1122 11:59:59.599423 1 factory.go:106] device added: [pciAddr: 0000:c3:00.4, vendor: 15b3, device: 101c, driver: mlx5_core]
I1122 11:59:59.599428 1 factory.go:106] device added: [pciAddr: 0000:c3:00.5, vendor: 15b3, device: 101c, driver: mlx5_core]
I1122 11:59:59.599446 1 manager.go:139] New resource server is created for mlnx_sriov_rdma_ib ResourcePool
I1122 11:59:59.599454 1 main.go:72] Starting all servers...
I1122 11:59:59.599885 1 server.go:199] starting mlnx_sriov_rdma_ib device plugin endpoint at: mellanox.com_mlnx_sriov_rdma_ib.sock
I1122 11:59:59.602783 1 server.go:226] mlnx_sriov_rdma_ib device plugin endpoint started serving
I1122 11:59:59.602805 1 main.go:77] All servers started.
I1122 11:59:59.602811 1 main.go:78] Listening for term signals
I1122 12:00:00.175755 1 server.go:110] Plugin: mellanox.com_mlnx_sriov_rdma_ib.sock gets registered successfully at Kubelet
I1122 12:00:00.175875 1 server.go:134] ListAndWatch(mlnx_sriov_rdma_ib) invoked
I1122 12:00:00.175890 1 server.go:142] ListAndWatch(mlnx_sriov_rdma_ib): send devices &ListAndWatchResponse{Devices:[]*Device{&Device{ID:0000:c3:00.4,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:1,},},},},&Device{ID:0000:c3:00.5,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:1,},},},},&Device{ID:0000:c3:00.2,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:1,},},},},&Device{ID:0000:c3:00.3,Health:Healthy,Topology:&TopologyInfo{Nodes:[]*NUMANode{&NUMANode{ID:1,},},},},},}
I1122 12:04:42.983933 1 server.go:119] Allocate() called with &AllocateRequest{ContainerRequests:[]*ContainerAllocateRequest{&ContainerAllocateRequest{DevicesIDs:[0000:c3:00.3],},},}
I1122 12:04:42.984024 1 netResourcePool.go:51] GetDeviceSpecs(): for devices: [0000:c3:00.3]
I1122 12:04:42.984044 1 pool_stub.go:97] GetEnvs(): for devices: [0000:c3:00.3]
I1122 12:04:42.984052 1 pool_stub.go:113] GetMounts(): for devices: [0000:c3:00.3]
I1122 12:04:42.984059 1 server.go:128] AllocateResponse send: &AllocateResponse{ContainerResponses:[]*ContainerAllocateResponse{&ContainerAllocateResponse{Envs:map[string]string{PCIDEVICE_MELLANOX_COM_MLNX_SRIOV_RDMA_IB: 0000:c3:00.3,},Mounts:[]*Mount{},Devices:[]*DeviceSpec{&DeviceSpec{ContainerPath:/dev/infiniband/issm3,HostPath:/dev/infiniband/issm3,Permissions:rwm,},&DeviceSpec{ContainerPath:/dev/infiniband/umad3,HostPath:/dev/infiniband/umad3,Permissions:rwm,},&DeviceSpec{ContainerPath:/dev/infiniband/uverbs3,HostPath:/dev/infiniband/uverbs3,Permissions:rwm,},&DeviceSpec{ContainerPath:/dev/infiniband/rdma_cm,HostPath:/dev/infiniband/rdma_cm,Permissions:rwm,},},Annotations:map[string]string{},},},}
I1122 12:22:33.340229 1 server.go:119] Allocate() called with &AllocateRequest{ContainerRequests:[]*ContainerAllocateRequest{&ContainerAllocateRequest{DevicesIDs:[0000:c3:00.4],},},}
I1122 12:22:33.340326 1 netResourcePool.go:51] GetDeviceSpecs(): for devices: [0000:c3:00.4]
I1122 12:22:33.340347 1 pool_stub.go:97] GetEnvs(): for devices: [0000:c3:00.4]
I1122 12:22:33.340355 1 pool_stub.go:113] GetMounts(): for devices: [0000:c3:00.4]
I1122 12:22:33.340362 1 server.go:128] AllocateResponse send: &AllocateResponse{ContainerResponses:[]*ContainerAllocateResponse{&ContainerAllocateResponse{Envs:map[string]string{PCIDEVICE_MELLANOX_COM_MLNX_SRIOV_RDMA_IB: 0000:c3:00.4,},Mounts:[]*Mount{},Devices:[]*DeviceSpec{&DeviceSpec{ContainerPath:/dev/infiniband/issm4,HostPath:/dev/infiniband/issm4,Permissions:rwm,},&DeviceSpec{ContainerPath:/dev/infiniband/umad4,HostPath:/dev/infiniband/umad4,Permissions:rwm,},&DeviceSpec{ContainerPath:/dev/infiniband/uverbs4,HostPath:/dev/infiniband/uverbs4,Permissions:rwm,},&DeviceSpec{ContainerPath:/dev/infiniband/rdma_cm,HostPath:/dev/infiniband/rdma_cm,Permissions:rwm,},},Annotations:map[string]string{},},},}
ib-sriov-cni should be able to run without ib-kubernetes being deployed for simple cases, such as an InfiniBand network in the default partition.
Currently ib-sriov-cni checks for the existence of the mellanox.infiniband.app annotation in CNI-Args and returns immediately if it is not present. This assumes ib-kubernetes is deployed whenever ib-sriov-cni is used, which is not always the case.
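One possible way to relax the check is sketched below: the annotation is only required when the network asks for a non-default partition. This is a proposal in Python form, not the current plugin logic. The "pkey" config key, the "configured" annotation value, the "args" -> "cni" location of the annotation, and the function name are all assumptions made for the example.

```python
IB_ANNOTATION = "mellanox.infiniband.app"
CONFIGURED = "configured"  # assumed value written by ib-kubernetes


def may_proceed(net_conf):
    """Decide whether the plugin should go on to configure the VF.

    net_conf is the parsed CNI config as a dict (hypothetical layout).
    """
    cni_args = net_conf.get("args", {}).get("cni", {})
    status = cni_args.get(IB_ANNOTATION)

    if status == CONFIGURED:
        return True  # ib-kubernetes has already done its part

    # Assumption: no "pkey" means the default partition, which needs
    # no help from ib-kubernetes, so a missing annotation is acceptable.
    if status is None and not net_conf.get("pkey"):
        return True

    return False
```

Under this rule a plain default-partition network works on a cluster without ib-kubernetes, while a network that sets a partition key still waits for the annotation.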
We should add a section explaining this.
The configuration needs to persist across reboots, which can be done with an ib_core module parameter:
cat /etc/modprobe.d/ib_core.conf
# Set netns to exclusive mode for namespace isolation
options ib_core netns_mode=0
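A quick way to confirm that the persistent setting is in place is to parse the modprobe configuration, as in the sketch below. It only inspects the text of the file and says nothing about the value currently active in the running kernel. The function name is hypothetical.

```python
def ib_core_netns_mode(conf_text):
    """Return the netns_mode value set for ib_core, or None if not set.

    conf_text is the content of a modprobe.d file such as the one above.
    If the option appears more than once, the last occurrence is returned.
    """
    mode = None
    for raw_line in conf_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        parts = line.split()
        if len(parts) >= 3 and parts[0] == "options" and parts[1] == "ib_core":
            for option in parts[2:]:
                key, _, value = option.partition("=")
                if key == "netns_mode":
                    mode = value
    return mode


EXAMPLE = """\
# Set netns to exclusive mode for namespace isolation
options ib_core netns_mode=0
"""
```

For the file shown above, ib_core_netns_mode(EXAMPLE) returns "0", which is the exclusive mode the comment in the file refers to.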
Hi,
I tried to use QEMU VFIO to assign the VFs of the IB interface to a VM in KubeVirt, but the VFIO resource is not detected or advertised in KubeVirt. Does the ib-sriov-cni daemonset support VFIO resources? Any comments and suggestions are appreciated!