kubevirt / device-plugin-manager
Incubating: A framework for writing Kubernetes Device Plugins
License: MIT License
The v1beta1 plugin interface has some optional members that the current device-plugin-manager implementation does not support. My plugin implements these, and I intend to use PreStartContainer to set up my devices before handing them off to the container.
func (p *MikeDevicePlugin) GetDevicePluginOptions(context.Context, *pluginapi.Empty) (*pluginapi.DevicePluginOptions, error) {
	glog.Infof("GetDevicePluginOptions")
	return &pluginapi.DevicePluginOptions{
		PreStartRequired: true,
	}, nil
}

func (p *MikeDevicePlugin) PreStartContainer(context.Context, *pluginapi.PreStartContainerRequest) (*pluginapi.PreStartContainerResponse, error) {
	glog.Infof("PreStartContainer: %s", p.mikedevicename)
	return &pluginapi.PreStartContainerResponse{}, nil
}
However, when device-plugin-manager registers through pluginapi, it does not call GetDevicePluginOptions or pass the resulting options in the registration request. As a result, my plugin's PreStartContainer function is never called.
I've fixed this in a fork by calling GetDevicePluginOptions in plugin.go and passing the result as part of the RegisterRequest, which now results in calls to PreStartContainer.
options, err := dpi.DevicePluginImpl.GetDevicePluginOptions(context.Background(), &pluginapi.Empty{})
if err != nil {
	// Assumes the enclosing function returns an error.
	return err
}
reqt := &pluginapi.RegisterRequest{
	Version:      pluginapi.Version,
	Endpoint:     path.Base(dpi.Socket),
	ResourceName: dpi.ResourceName,
	Options:      options,
}
I'd like this change to be considered and can prepare a PR if there are no objections.
Starting with Kubernetes 1.25, the kubelet connects synchronously to the plugin's gRPC service as soon as the plugin registers, so the plugin must already be serving before it registers.
This is tracked in Kubernetes issue kubernetes/kubernetes#112395.
I observe a constant 10-second delay in the serve function (pkg/dpm/plugin.go). The function uses GetServiceInfo to poll the gRPC server for services. In our setups, GetServiceInfo always returns a single map element after the call to RegisterDevicePluginServer. My assumption is that if len(services) > 1 should be if len(services) >= 1 instead.
The Device Plugin API was moved from k8s.io/kubernetes/pkg/kubelet/apis/deviceplugin/v1beta1 to k8s.io/kubelet/pkg/apis/deviceplugin/v1beta1, which causes problems. It would be great to have a new release with the correction.
in: handleNew() in manager.go
What happened:
I got a panic in my application when using this code:
manager := dpm.NewManager(lister)
manager.Run()
StackTrace:
panic: runtime error: invalid memory address or nil pointer dereference
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x1108918]
goroutine 49 [running]:
github.com/fsnotify/fsnotify.(*Watcher).Close(0x4000d039d8?)
/go/pkg/mod/github.com/fsnotify/[email protected]/backend_inotify.go:305 +0x18
panic({0x148e3c0?, 0x281dee0?})
/usr/local/go/src/runtime/panic.go:914 +0x218
github.com/fsnotify/fsnotify.(*Watcher).isClosed(...)
/go/pkg/mod/github.com/fsnotify/[email protected]/backend_inotify.go:296
github.com/fsnotify/fsnotify.(*Watcher).AddWith(0x0, {0x1739526, 0x20}, {0x0, 0x0, 0x14d2840?})
/go/pkg/mod/github.com/fsnotify/[email protected]/backend_inotify.go:372 +0x2c
github.com/fsnotify/fsnotify.(*Watcher).Add(...)
/go/pkg/mod/github.com/fsnotify/[email protected]/backend_inotify.go:362
github.com/kubevirt/device-plugin-manager/pkg/dpm.(*Manager).Run(0x4000b9e1f0)
/go/pkg/mod/github.com/kubevirt/[email protected]/pkg/dpm/manager.go:55 +0x200
main.startDeviceManager(0x0?)
/go/src/github.com/keyval-dev/odigos/odiglet/cmd/main.go:110 +0x48c
created by main.main in goroutine 1
/go/src/github.com/keyval-dev/odigos/odiglet/cmd/main.go:50 +0x2bc
After investigating, I found that the Manager's Run() function attempts to create a new watcher but ignores the error returned by this call.
fsWatcher, _ := fsnotify.NewWatcher()
defer fsWatcher.Close()
fsWatcher.Add(pluginapi.DevicePluginPath)
In my case, the call fails and fsWatcher is nil, which causes the above panic.
How to reproduce it (as minimally and precisely as possible):
To reproduce, open a terminal on the Kubernetes node and run:
sudo sysctl -w fs.inotify.max_user_instances=1
Then create a manager and run it; it should panic as shown above.
Additional context:
I think the Run() function should return an error when fsnotify.NewWatcher() fails.
Environment:
It would be nice if the version in master that references k8s.io/kubernetes/pkg/kubelet/apis/deviceplugin/v1beta1 could be tagged. My dependency manager pulls v1.9.2 by default, which is using device plugins v1alpha1.
Should the latest tag be 1.19.2 instead of 0.19.2?