device-plugin-manager's Issues

Support for v1beta1 PluginInterface GetDevicePluginOptions

The v1beta1 plugin interface has some optional members that the current device-plugin-manager implementation does not support. My plugin implements them, and I intend to use PreStartContainer to set up my devices before handing them off to the container.

func (p *MikeDevicePlugin) GetDevicePluginOptions(context.Context, *pluginapi.Empty) (*pluginapi.DevicePluginOptions, error) {
    glog.Infof("GetDevicePluginOptions")
    return &pluginapi.DevicePluginOptions {
        PreStartRequired: true,
    }, nil
}

func (p *MikeDevicePlugin) PreStartContainer(context.Context, *pluginapi.PreStartContainerRequest) (*pluginapi.PreStartContainerResponse, error) {
    glog.Infof("PreStartContainer: %s", p.mikedevicename)
    return &pluginapi.PreStartContainerResponse{}, nil
}

However, when device-plugin-manager registers with the kubelet, it does not call GetDevicePluginOptions or pass the resulting options in the registration request. As a result, the kubelet never calls my plugin's PreStartContainer function.

I've fixed this in a fork by calling GetDevicePluginOptions in plugin.go and passing it as part of the RegisterRequest which now results in calls to PreStartContainer.

	options, err := dpi.DevicePluginImpl.GetDevicePluginOptions(context.Background(), &pluginapi.Empty{})
	if err != nil {
		return err
	}
	// Pass the plugin's options so the kubelet knows PreStartContainer is required.
	reqt := &pluginapi.RegisterRequest{
		Version:      pluginapi.Version,
		Endpoint:     path.Base(dpi.Socket),
		ResourceName: dpi.ResourceName,
		Options:      options,
	}

I'd like this change to be considered and can prepare a PR if there are no objections.

10 Seconds delay upon plugin start

I observe a constant 10-second delay in the serve function (pkg/dpm/plugin.go). The function uses GetServiceInfo to poll the gRPC server for registered services. In our setups, GetServiceInfo always returns a map with a single element after the call to RegisterDevicePluginServer, so the check never succeeds and the loop runs until it gives up. I assume the condition if len(services) > 1 should be if len(services) >= 1 instead.

Device Plugin API was moved

The Device Plugin API was moved from k8s.io/kubernetes/pkg/kubelet/apis/deviceplugin/v1beta1 to k8s.io/kubelet/pkg/apis/deviceplugin/v1beta1, which causes problems. It would be great to have a new release with the correction.

error from fsnotify.NewWatcher() not checked, causing panic

What happened:
I got a panic in my application when using this code:

	manager := dpm.NewManager(lister)
	manager.Run()

StackTrace:

panic: runtime error: invalid memory address or nil pointer dereference
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x1108918]

goroutine 49 [running]:
github.com/fsnotify/fsnotify.(*Watcher).Close(0x4000d039d8?)
	/go/pkg/mod/github.com/fsnotify/[email protected]/backend_inotify.go:305 +0x18
panic({0x148e3c0?, 0x281dee0?})
	/usr/local/go/src/runtime/panic.go:914 +0x218
github.com/fsnotify/fsnotify.(*Watcher).isClosed(...)
	/go/pkg/mod/github.com/fsnotify/[email protected]/backend_inotify.go:296
github.com/fsnotify/fsnotify.(*Watcher).AddWith(0x0, {0x1739526, 0x20}, {0x0, 0x0, 0x14d2840?})
	/go/pkg/mod/github.com/fsnotify/[email protected]/backend_inotify.go:372 +0x2c
github.com/fsnotify/fsnotify.(*Watcher).Add(...)
	/go/pkg/mod/github.com/fsnotify/[email protected]/backend_inotify.go:362
github.com/kubevirt/device-plugin-manager/pkg/dpm.(*Manager).Run(0x4000b9e1f0)
	/go/pkg/mod/github.com/kubevirt/[email protected]/pkg/dpm/manager.go:55 +0x200
main.startDeviceManager(0x0?)
	/go/src/github.com/keyval-dev/odigos/odiglet/cmd/main.go:110 +0x48c
created by main.main in goroutine 1
	/go/src/github.com/keyval-dev/odigos/odiglet/cmd/main.go:50 +0x2bc

After investigating, I found that the Manager's Run() function attempts to create a new watcher but ignores the error returned by that call.

	fsWatcher, _ := fsnotify.NewWatcher()
	defer fsWatcher.Close()
	fsWatcher.Add(pluginapi.DevicePluginPath)

In my case, the call fails and fsWatcher is nil, which causes the panic above.

How to reproduce it (as minimally and precisely as possible):

To reproduce, open a terminal on the k8s node and run:

sudo sysctl -w fs.inotify.max_user_instances=1

Then try to create a manager; it should panic as above.

Additional context:
I think the Run() function should return an error when fsnotify.NewWatcher() fails.

Environment:

  • github.com/kubevirt/device-plugin-manager v1.19.5 h1:nA9rPpQyWBNyrpqaZe2aW4PZfTBbyOgm16+O9q7FQts=

New tagged version/release for device plugins v1beta1

It would be nice if the version in master that references k8s.io/kubernetes/pkg/kubelet/apis/deviceplugin/v1beta1 could be tagged. My dependency manager pulls v1.9.2 by default, which is using device plugins v1alpha1.
