Comments (5)
Hello! Thanks for your detailed question and for diving deep into the YOLOv5 code! 🌟
The line `if im.shape[0] < 5` that you're referring to is indeed checking the shape of the image tensor. In YOLOv5, images are typically manipulated in CHW format (Channels, Height, Width) after transformations. This check determines whether the image is in CHW format (common in deep learning frameworks like PyTorch) rather than the conventional HWC format used by OpenCV and PIL. If the first dimension (which would be channels in CHW) is less than 5, the image is likely in CHW format and needs to be transposed to HWC for certain operations or visualizations.
The condition `im.shape[0] < 5` is used because typical images have fewer than 5 channels (3 for RGB, 4 for RGBA), while an image's height is almost always much larger. This is a quick way to infer the tensor layout.
Your suggestion `im.shape[0] <= 3` would not be appropriate here, as it would miss 4-channel (RGBA) images in CHW format and leave them untransposed. Either threshold would misclassify an HWC image whose height falls below it, which is rare but could theoretically occur.
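To make the comparison concrete, here is a minimal sketch of the two thresholds applied to shape tuples. The helper name `first_axis_is_channels` and the sample shapes are illustrative assumptions, not part of YOLOv5; with a real array you would pass `im.shape`.

```python
def first_axis_is_channels(shape, threshold):
    """Heuristic: treat the layout as CHW if the first axis is below `threshold`."""
    return shape[0] < threshold


# Illustrative shapes (assumed for this example): a 480x640 image in three layouts.
samples = {
    "RGB, CHW": (3, 480, 640),
    "RGBA, CHW": (4, 480, 640),
    "RGB, HWC": (480, 640, 3),
}

for name, shape in samples.items():
    lt5 = first_axis_is_channels(shape, 5)  # same test as `shape[0] < 5`
    le3 = first_axis_is_channels(shape, 4)  # same test as `shape[0] <= 3`
    print(f"{name:10s} shape={shape}  '< 5': {lt5}  '<= 3': {le3}")
```

Only the RGBA row differs between the two checks: `< 5` flags it as CHW, while `<= 3` does not.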
I hope this clears up the confusion! Let me know if you have any more questions. Happy coding! 😊
from yolov5.
Thank you very much for your response! @glenn-jocher
Actually, I didn't think of the RGBA image format, and your explanation has given me inspiration. I have another small question. When I use the default training parameters (`python train.py --data coco128.yaml --weights yolov5s.pt --img 640`), the shape format of `im` at this point is `[H, W, C]` instead of `[C, H, W]`. Here is a screenshot of my DEBUG:
At this point, the conditional statement in the code, `if im.shape[0] < 5`, is actually checking if `H < 5` rather than `C < 5`. I'm wondering if the code can be modified from `if im.shape[0] < 5` to `if im.ndim < 5`?
    # Pre-process
    n, ims = (len(ims), list(ims)) if isinstance(ims, (list, tuple)) else (1, [ims])  # number, list of images
    shape0, shape1, files = [], [], []  # image and inference shapes, filenames
    for i, im in enumerate(ims):
        f = f"image{i}"  # filename
        if isinstance(im, (str, Path)):  # filename or uri
            im, f = Image.open(requests.get(im, stream=True).raw if str(im).startswith("http") else im), im
            im = np.asarray(exif_transpose(im))
        elif isinstance(im, Image.Image):  # PIL Image
            im, f = np.asarray(exif_transpose(im)), getattr(im, "filename", f) or f
        files.append(Path(f).with_suffix(".jpg").name)
        # if im.shape[0] < 5:  # image in CHW
        if im.ndim < 5:  # 💡 This is the modification/change.
            im = im.transpose((1, 2, 0))  # reverse dataloader .transpose(2, 0, 1)
        im = im[..., :3] if im.ndim == 3 else cv2.cvtColor(im, cv2.COLOR_GRAY2BGR)  # enforce 3ch input
        s = im.shape[:2]  # HWC
        shape0.append(s)  # image shape
        g = max(size) / max(s)  # gain
        shape1.append([int(y * g) for y in s])
        ims[i] = im if im.data.contiguous else np.ascontiguousarray(im)  # update
    shape1 = [make_divisible(x, self.stride) for x in np.array(shape1).max(0)]  # inf shape
    x = [letterbox(im, shape1, auto=False)[0] for im in ims]  # pad
    x = np.ascontiguousarray(np.array(x).transpose((0, 3, 1, 2)))  # stack and BHWC to BCHW
    x = torch.from_numpy(x).to(p.device).type_as(p) / 255  # uint8 to fp16/32
Thank you very much for your patience and response!
Hello again!
I appreciate your follow-up question and the code snippet you've provided. The suggestion to use `if im.ndim < 5` wouldn't quite address the issue you're encountering. The `.ndim` property gives the number of dimensions in the array, which for images will typically be 3 (height, width, channels), regardless of the order (HWC or CHW). The condition `im.ndim < 5` would therefore be true for every image, and images that are already in HWC format would be transposed as well.
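A small sketch of why the number of dimensions cannot separate the two layouts. It works on plain shape tuples so it runs without NumPy; `ndim` and `permute` are hypothetical stand-ins for an array's `.ndim` and for the shape that results from `.transpose(axes)`.

```python
def ndim(shape):
    """Number of axes, which is what `.ndim` reports for an array of this shape."""
    return len(shape)


def permute(shape, axes):
    """Shape that results from transposing an array of `shape` with the given axes."""
    return tuple(shape[a] for a in axes)


hwc = (480, 640, 3)  # illustrative HWC image shape
chw = (3, 480, 640)  # the same image in CHW

# Both layouts have three axes, so `ndim < 5` is true for each of them.
print(ndim(hwc) < 5, ndim(chw) < 5)

# Transposing with (1, 2, 0) restores HWC from CHW ...
print(permute(chw, (1, 2, 0)))
# ... but scrambles an image that was already HWC.
print(permute(hwc, (1, 2, 0)))
```

The last line yields a shape whose axes no longer mean height, width, channels, which is what would happen to every HWC input if the check were always true.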
The original intent of `if im.shape[0] < 5` is to check if the image is in CHW format, assuming that no image height (the first axis in HWC format) would be less than 5 pixels, which is a reasonable assumption for the datasets typically used. This check is specifically designed to catch cases where the image might be in the format expected by PyTorch (CHW) rather than HWC.
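Finding that `im` is in HWC format at this point is expected for images loaded from a filename, URI, or PIL image, since `np.asarray` on a PIL image yields an HWC array. In that case the check is false and no transpose happens. The transpose branch exists for arrays that arrive already in CHW order, for example after a dataloader's `.transpose(2, 0, 1)`. The conversion to CHW for the model happens later in the same block, when the padded images are stacked and transposed from BHWC to BCHW. If you see a layout you don't expect, it is worth investigating earlier in the pipeline to find where the array was produced.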
For now, the existing check should suffice in most scenarios, but if you're encountering specific issues with image formats, you might need to add additional checks or transformations based on your particular use case or dataset.
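One possible form of such an additional check is sketched below. It is a hypothetical helper, not part of YOLOv5: the name `infer_layout`, the `max_channels` parameter, and the returned labels are all assumptions made for illustration. It looks at both the first and the last axis and reports when the shape alone cannot decide.

```python
def infer_layout(shape, max_channels=4):
    """Guess the layout of an image shape: 'HW', 'CHW', 'HWC', or 'ambiguous'."""
    if len(shape) == 2:
        return "HW"  # single-channel image without a channel axis
    if len(shape) != 3:
        raise ValueError(f"expected a 2-D or 3-D image shape, got {shape}")
    first_small = shape[0] <= max_channels
    last_small = shape[2] <= max_channels
    if first_small and not last_small:
        return "CHW"
    if last_small and not first_small:
        return "HWC"
    return "ambiguous"  # e.g. a tiny image, or neither axis looks like channels


for shape in [(3, 480, 640), (480, 640, 3), (480, 640), (3, 640, 3)]:
    print(shape, infer_layout(shape))
```

Compared with the single `shape[0] < 5` test, this trades brevity for an explicit signal in the rare cases where the heuristic cannot tell the layouts apart.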
Thank you for your keen observations, and feel free to reach out if you have more questions! 😊
@glenn-jocher Thank you very much for your reply. If we directly use `im.ndim < 5`, it would be too arbitrary and would overlook the difference between HWC and CHW. I appreciate your reminder.
To be honest, the method you have written is really great and can be applied to the majority of datasets. I suggest adding a comment after this code segment, as without any explanation, others might also find it confusing.
Overall, thank you very much for your reply! 😊
@Le0v1n hello!
Thank you for your understanding and for the suggestion to add a comment for clarity. It's a great idea to help others who might be reviewing the code in the future. I'll pass this feedback along to the team to consider adding a descriptive comment in the next update.
We appreciate your engagement and thoughtful suggestions! If you have any more ideas or questions, feel free to share. Happy coding! 😊