Coder Social home page Coder Social logo

Comments (5)

elezar avatar elezar commented on June 5, 2024 1

Thanks for digging that up. I don't quite recall why I had that nil check present there. Looking at the existing code, it obviously doesn't make sense.

I have created NVIDIA/go-gpuallocator#23 to address this in go-gpuallocator.

from k8s-device-plugin.

elezar avatar elezar commented on June 5, 2024

Looking at the Filter() call that is returning all devices when required is empty, this should only happen if required is nil:

// Filter filters out the selected devices from the list.
// If the supplied list of uuids is nil, no filtering is performed.
// Note that the specified uuids must exist in the list of devices.
func (d DeviceList) Filter(uuids []string) (DeviceList, error) {
	if uuids == nil {
		return d, nil
	}

	filtered := []*Device{}
	for _, uuid := range uuids {
		for _, device := range d {
			if device.UUID == uuid {
				filtered = append(filtered, device)
				break
			}
		}
		if len(filtered) == 0 || filtered[len(filtered)-1].UUID != uuid {
			return nil, fmt.Errorf("no device with uuid: %v", uuid)
		}
	}

	return filtered, nil
}

Why is required == nil in this case?

Looking at the type definition in k8s:

type ContainerPreferredAllocationRequest struct {
	// List of available deviceIDs from which to choose a preferred allocation
	AvailableDeviceIDs []string `protobuf:"bytes,1,rep,name=available_deviceIDs,json=availableDeviceIDs,proto3" json:"available_deviceIDs,omitempty"`
	// List of deviceIDs that must be included in the preferred allocation
	MustIncludeDeviceIDs []string `protobuf:"bytes,2,rep,name=must_include_deviceIDs,json=mustIncludeDeviceIDs,proto3" json:"must_include_deviceIDs,omitempty"`
	// Number of devices to include in the preferred allocation
	AllocationSize       int32    `protobuf:"varint,3,opt,name=allocation_size,json=allocationSize,proto3" json:"allocation_size,omitempty"`
	XXX_NoUnkeyedLiteral struct{} `json:"-"`
	XXX_sizecache        int32    `json:"-"`
}

We see that the json encoding includes omitempty. Does this affect the protobuf?

@klueska can you think of any reason why we should treat nil and []string{} differently from the perspective of the device plugin?

from k8s-device-plugin.

elezar avatar elezar commented on June 5, 2024

Assuming we could consider required == nil equivalent to required == []string{} we could apply the following diff:

diff --git a/internal/rm/nvml_manager.go b/internal/rm/nvml_manager.go
index 56f05429..a00d41cf 100644
--- a/internal/rm/nvml_manager.go
+++ b/internal/rm/nvml_manager.go
@@ -73,6 +73,9 @@ func NewNVMLResourceManagers(nvmllib nvml.Interface, config *spec.Config) ([]Res
 // GetPreferredAllocation runs an allocation algorithm over the inputs.
 // The algorithm chosen is based both on the incoming set of available devices and various config settings.
 func (r *nvmlResourceManager) GetPreferredAllocation(available, required []string, size int) ([]string, error) {
+	if required == nil {
+		required = []string{}
+	}
 	return r.getPreferredAllocation(available, required, size)
 }

to address this issue?

Would you be able to confirm this?

from k8s-device-plugin.

klueska avatar klueska commented on June 5, 2024

It looks like the Filter() function was added as part of:
https://github.com/NVIDIA/go-gpuallocator/pull/13/files#diff-7a10395c66058f91191f5b9ac49321a6e95a8332ea4098d3438e0e19b2b02fdeR151

The old logic that had this loop was:

+// Create a list of Devices from the specific set of GPU uuids passed in.
+func NewDevicesFrom(uuids []string) ([]*Device, error) {
+   devices, err := NewDevices()
+   if err != nil {
+       return nil, err
+   }
+
+   filtered := []*Device{}
+   for _, uuid := range uuids {
+       for _, device := range devices {
+           if device.UUID == uuid {
+               filtered = append(filtered, device)
+               break
+           }
+       }
+       if len(filtered) == 0 || filtered[len(filtered)-1].UUID != uuid {
+           return nil, fmt.Errorf("no device with uuid: %v", uuid)
+       }
+   }
+
+   return filtered, nil
+}

In general, it's completely reasonable for required to be nil (in fact this is the common case) because the kubelet doesn't want to predecide which GPUs should be included in the list of those returned.

from k8s-device-plugin.

klueska avatar klueska commented on June 5, 2024

That said, I think we could safely change:

	if uuids == nil {
		return d, nil
	}

to

	if len(uuids) == 0 {
		return d, nil
	}

without any issues.

from k8s-device-plugin.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.