A minor query about the image channel number check using `im.shape[0] < 5` (yolov5, closed)

Le0v1n commented on June 17, 2024

A minor query about the image channel number check using `im.shape[0] < 5`

from yolov5.

Comments (5)

glenn-jocher commented on June 17, 2024

Hello! Thanks for your detailed question and for diving deep into the YOLOv5 code! 🌟

The line `if im.shape[0] < 5` that you're referring to checks the shape of the image array. In YOLOv5, images are typically handled in CHW format (Channels, Height, Width) after transformations. This check determines whether the image is in CHW format (common in deep learning frameworks like PyTorch) rather than the HWC format used by OpenCV and PIL. If the first dimension (the channel count in CHW) is less than 5, the image is most likely in CHW format and is transposed to HWC for the operations that follow.

The condition `im.shape[0] < 5` works because typical images have fewer than 5 channels (grayscale has 1, RGB has 3, RGBA has 4), whereas the height of an HWC image is almost always much larger. This makes it a quick way to infer the array layout.
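A minimal sketch of this heuristic, using NumPy only. The helper name `to_hwc` is hypothetical and not part of YOLOv5; it just isolates the `im.shape[0] < 5` check and the transpose that follows it.

```python
import numpy as np


def to_hwc(im: np.ndarray) -> np.ndarray:
    """Return a 3D image in HWC layout, guessing CHW from a small first dimension."""
    if im.shape[0] < 5:  # first dim is 1, 3 or 4: most likely channels, so CHW
        im = im.transpose((1, 2, 0))  # CHW -> HWC
    return im


chw = np.zeros((3, 480, 640), dtype=np.uint8)  # PyTorch-style layout
hwc = np.zeros((480, 640, 3), dtype=np.uint8)  # OpenCV/PIL-style layout

print(to_hwc(chw).shape)  # (480, 640, 3)
print(to_hwc(hwc).shape)  # (480, 640, 3)
```

Both inputs end up in HWC: the first is transposed, the second is left untouched because its first dimension is the height (480).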

Your suggestion `im.shape[0] <= 3` would not be appropriate here, as it would fail to detect 4-channel CHW images such as RGBA. Like the existing check, it would also misclassify an HWC image whose height is 3 or less, which is rare but could theoretically occur.
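To make the difference between the two thresholds concrete, here is a small illustration on a 4-channel array in CHW layout. The variable names are made up for this example.

```python
import numpy as np

rgba_chw = np.zeros((4, 480, 640), dtype=np.uint8)  # 4-channel image in CHW layout

detected_lt5 = rgba_chw.shape[0] < 5   # True: treated as CHW and transposed
detected_le3 = rgba_chw.shape[0] <= 3  # False: would be left in CHW by mistake

print(detected_lt5, detected_le3)  # True False
```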

I hope this clears up the confusion! Let me know if you have any more questions. Happy coding! 😊

Le0v1n commented on June 17, 2024

Thank you very much for your response! @glenn-jocher

Actually, I didn't think of the RGBA image format, and your explanation has given me inspiration. I have another small question. When I use the default training parameters (`python train.py --data coco128.yaml --weights yolov5s.pt --img 640`), the shape of `im` at this point is [H, W, C] instead of [C, H, W]. Here is a screenshot of my debug session: [screenshot not preserved]

At this point, the conditional statement `if im.shape[0] < 5` is actually checking whether H < 5 rather than C < 5. I'm wondering if the code can be changed from `if im.shape[0] < 5` to `if im.ndim < 5`?

```python
# Pre-process
n, ims = (len(ims), list(ims)) if isinstance(ims, (list, tuple)) else (1, [ims])  # number, list of images
shape0, shape1, files = [], [], []  # image and inference shapes, filenames
for i, im in enumerate(ims):
    f = f"image{i}"  # filename
    if isinstance(im, (str, Path)):  # filename or uri
        im, f = Image.open(requests.get(im, stream=True).raw if str(im).startswith("http") else im), im
        im = np.asarray(exif_transpose(im))
    elif isinstance(im, Image.Image):  # PIL Image
        im, f = np.asarray(exif_transpose(im)), getattr(im, "filename", f) or f
    files.append(Path(f).with_suffix(".jpg").name)
    # if im.shape[0] < 5:  # image in CHW
    if im.ndim < 5:  # 💡 This is the modification/change.
        im = im.transpose((1, 2, 0))  # reverse dataloader .transpose(2, 0, 1)
    im = im[..., :3] if im.ndim == 3 else cv2.cvtColor(im, cv2.COLOR_GRAY2BGR)  # enforce 3ch input
    s = im.shape[:2]  # HWC
    shape0.append(s)  # image shape
    g = max(size) / max(s)  # gain
    shape1.append([int(y * g) for y in s])
    ims[i] = im if im.data.contiguous else np.ascontiguousarray(im)  # update
shape1 = [make_divisible(x, self.stride) for x in np.array(shape1).max(0)]  # inf shape
x = [letterbox(im, shape1, auto=False)[0] for im in ims]  # pad
x = np.ascontiguousarray(np.array(x).transpose((0, 3, 1, 2)))  # stack and BHWC to BCHW
x = torch.from_numpy(x).to(p.device).type_as(p) / 255  # uint8 to fp16/32
```

Thank you very much for your patience and response!

glenn-jocher commented on June 17, 2024

Hello again!

I appreciate your follow-up question and the code snippet you've provided. The suggestion to use `if im.ndim < 5` wouldn't address the issue you're encountering. The `.ndim` property returns the number of dimensions of the array, which for a color image is 3 regardless of the axis order (HWC or CHW). The condition would therefore always be true, and every image would be transposed, including those already in HWC.
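A short NumPy illustration of why `ndim` cannot distinguish the two layouts, and of what happens when an array that is already HWC goes through the transpose anyway. The array sizes are arbitrary.

```python
import numpy as np

hwc = np.zeros((480, 640, 3), dtype=np.uint8)
chw = np.zeros((3, 480, 640), dtype=np.uint8)

# ndim is 3 for both layouts, so `ndim < 5` cannot tell them apart
print(hwc.ndim, chw.ndim)  # 3 3

# Transposing an array that is already HWC scrambles its axes
wrong = hwc.transpose((1, 2, 0))
print(wrong.shape)  # (640, 3, 480)
```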

The original intent of `if im.shape[0] < 5` is to detect CHW format, assuming that no image in HWC format has a height of less than 5 pixels, which is a reasonable assumption for the datasets typically used. When `im` is already HWC, the check compares the height against 5, evaluates to false, and the image is left untouched. The check exists to catch inputs that arrive in the layout expected by PyTorch (CHW) rather than HWC.

If you're consistently finding that im is in HWC format at this point in the code, it might be worth investigating earlier in the pipeline to ensure that images are being correctly transformed to CHW format where expected, especially before they are passed to model-related functions that expect this format.

For now, the existing check should suffice in most scenarios, but if you're encountering specific issues with image formats, you might need to add additional checks or transformations based on your particular use case or dataset.
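As one possible shape for such an additional check, here is a sketch that also looks at the last dimension. The function `looks_chw` and its `max_channels` parameter are hypothetical and not part of YOLOv5; they are only an assumption about how a stricter test might be written.

```python
import numpy as np


def looks_chw(im: np.ndarray, max_channels: int = 4) -> bool:
    """Hypothetical stricter check: first dim is channel-sized AND last dim is not."""
    return im.ndim == 3 and im.shape[0] <= max_channels and im.shape[-1] > max_channels


print(looks_chw(np.zeros((3, 480, 640))))  # True: CHW
print(looks_chw(np.zeros((480, 640, 3))))  # False: HWC
print(looks_chw(np.zeros((4, 100, 3))))    # False: a 4-pixel-tall HWC image
```

The trade-off is that a CHW image whose width is 4 pixels or fewer would now be missed, so this moves the ambiguity to a different edge case instead of removing it.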

Thank you for your keen observations, and feel free to reach out if you have more questions! 😊

Le0v1n commented on June 17, 2024

@glenn-jocher Thank you very much for your reply. Using `im.ndim < 5` directly would be too arbitrary, since it ignores the difference between HWC and CHW. I appreciate your reminder.

To be honest, the method you have written is really great and can be applied to the majority of datasets. I suggest adding a comment after this code segment, as without any explanation, others might also find it confusing.

Overall, thank you very much for your reply! 😊

glenn-jocher commented on June 17, 2024

@Le0v1n hello!

Thank you for your understanding and for the suggestion to add a comment for clarity. It's a great idea to help others who might be reviewing the code in the future. I'll pass this feedback along to the team to consider adding a descriptive comment in the next update.

We appreciate your engagement and thoughtful suggestions! If you have any more ideas or questions, feel free to share. Happy coding! 😊
