
Comments (11)

glenn-jocher commented on September 9, 2024

@Ashwini4869 hello!

Thank you for reaching out with your question about fine-tuning YOLOv5 using a custom dataloader. It's great to see your interest in leveraging YOLOv5 in a more flexible manner. Here's a step-by-step guide to help you achieve this:

Custom Dataloader Setup

  1. Dataset Preparation:

    • Ensure your dataset is organized with images and corresponding labels. The labels should be in YOLO format (i.e., a text file for each image, with one line per object giving the class index and normalized bounding-box coordinates; a sample label file is shown after the code below).
  2. Custom Dataloader:

    • You can create a custom PyTorch Dataset class to load your images and labels. Here's an example:
    import torch
    from torch.utils.data import Dataset, DataLoader
    from PIL import Image
    import os
    
    class CustomDataset(Dataset):
        def __init__(self, img_dir, label_dir, transform=None):
            self.img_dir = img_dir
            self.label_dir = label_dir
            self.transform = transform
            self.img_files = [f for f in os.listdir(img_dir) if f.endswith('.jpg')]
    
        def __len__(self):
            return len(self.img_files)
    
        def __getitem__(self, idx):
            img_path = os.path.join(self.img_dir, self.img_files[idx])
            label_path = os.path.join(self.label_dir, self.img_files[idx].replace('.jpg', '.txt'))
            
            image = Image.open(img_path).convert("RGB")
            if self.transform:
                image = self.transform(image)
            
            with open(label_path, 'r') as f:
                labels = [[float(x) for x in line.strip().split()] for line in f]
                labels = torch.tensor(labels, dtype=torch.float32)
            
            return image, labels
    
    # Example usage
    dataset = CustomDataset(img_dir='path/to/images', label_dir='path/to/labels')
    dataloader = DataLoader(dataset, batch_size=16, shuffle=True)
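
For reference, each YOLO-format label file contains one line per object: class x_center y_center width height, with all coordinates normalized to [0, 1]. A hypothetical image1.txt with two objects might look like:

0 0.481 0.634 0.212 0.274
2 0.127 0.390 0.086 0.118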

Fine-Tuning the Model

  1. Load YOLOv5 Model:

    • Load the pre-trained YOLOv5 model using PyTorch.
    from yolov5 import YOLOv5
    
    model = YOLOv5('yolov5s.pt')  # Load a pre-trained model
  2. Training Loop:

    • Implement the training loop to compute loss and optimize the model.
    import torch.optim as optim
    
    optimizer = optim.Adam(model.parameters(), lr=0.001)
    criterion = model.loss  # Use YOLOv5's built-in loss function
    
    for epoch in range(num_epochs):
        model.train()
        for images, labels in dataloader:
            optimizer.zero_grad()
            outputs = model(images)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()
            
            print(f"Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}")

Additional Tips

  • Transforms: Apply appropriate transformations to your images (e.g., resizing, normalization) to match the input requirements of YOLOv5 (a minimal example follows this list).
  • Validation: Implement a validation loop to monitor the model's performance on a separate validation set.
  • Hyperparameters: Experiment with different hyperparameters (learning rate, batch size, etc.) to optimize training.
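
For example, a minimal transform pipeline might look like the following (the 640×640 size matches the default input of the YOLOv5 pretrained models; treat this as a sketch, not the full training-time augmentation):

from torchvision import transforms

# Resize to the model's expected input size and convert to a [0, 1] CHW tensor
transform = transforms.Compose([
    transforms.Resize((640, 640)),
    transforms.ToTensor(),
])
dataset = CustomDataset(img_dir='path/to/images', label_dir='path/to/labels', transform=transform)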

For more detailed information, you can refer to the YOLOv5 documentation.

I hope this helps! If you have any further questions, feel free to ask. Happy fine-tuning! πŸš€


glenn-jocher commented on September 9, 2024

Hello @Ashwini4869,

Thank you for your feedback and for pointing out the issues you encountered. Let's address these points to help you fine-tune YOLOv5 with a custom dataloader effectively.

Correcting the Approach

  1. Loading the Model:
    You're right; there isn't a YOLOv5 class directly in the module. Instead, you can load the model using the PyTorch Hub interface. Here's how you can do it:

    import torch
    model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True)
  2. Custom Training Loop:
    Since the DetectMultiBackend class does not have a loss attribute, you need to define the loss computation manually. YOLOv5's training script uses a custom loss function defined in loss.py. You can integrate this into your custom training loop.

Custom Training Loop Example

Here's an updated example to guide you through the process:

import os

import torch
import torch.optim as optim
from PIL import Image
from torch.utils.data import DataLoader
from yolov5.utils.loss import ComputeLoss

# Load the model
model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True)

# Define the custom dataset and dataloader
class CustomDataset(torch.utils.data.Dataset):
    def __init__(self, img_dir, label_dir, transform=None):
        self.img_dir = img_dir
        self.label_dir = label_dir
        self.transform = transform
        self.img_files = [f for f in os.listdir(img_dir) if f.endswith('.jpg')]

    def __len__(self):
        return len(self.img_files)

    def __getitem__(self, idx):
        img_path = os.path.join(self.img_dir, self.img_files[idx])
        label_path = os.path.join(self.label_dir, self.img_files[idx].replace('.jpg', '.txt'))
        
        image = Image.open(img_path).convert("RGB")
        if self.transform:
            image = self.transform(image)
        
        with open(label_path, 'r') as f:
            labels = [[float(x) for x in line.strip().split()] for line in f]
            labels = torch.tensor(labels, dtype=torch.float32)
        
        return image, labels

dataset = CustomDataset(img_dir='path/to/images', label_dir='path/to/labels')
dataloader = DataLoader(dataset, batch_size=16, shuffle=True)

# Define optimizer and loss function
optimizer = optim.Adam(model.parameters(), lr=0.001)
compute_loss = ComputeLoss(model)  # YOLOv5's custom loss function

# Training loop
num_epochs = 10  # example value; adjust as needed
for epoch in range(num_epochs):
    model.train()
    for images, labels in dataloader:
        optimizer.zero_grad()
        outputs = model(images)
        loss, _ = compute_loss(outputs, labels)
        loss.backward()
        optimizer.step()
        
        print(f"Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}")

Additional Tips

  • Transforms: Ensure you apply the necessary transformations to your images to match the input requirements of YOLOv5.
  • Validation: Implement a validation loop to monitor the model's performance on a separate validation set (a minimal sketch follows this list).
  • Hyperparameters: Experiment with different hyperparameters (learning rate, batch size, etc.) to optimize training.
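
A minimal loss-only validation sketch, assuming a valid_dataloader built the same way as the training dataloader, could look like this (note that model.eval() changes the Detect head's output format, so here the model stays in train mode and gradients are simply disabled):

with torch.no_grad():
    val_loss = 0.0
    for images, labels in valid_dataloader:
        outputs = model(images)
        loss, _ = compute_loss(outputs, labels)
        val_loss += loss.item() * images.size(0)
    val_loss /= len(valid_dataloader.dataset)
print(f"Validation loss: {val_loss:.4f}")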

For more detailed information, you can refer to the YOLOv5 documentation.

I hope this helps! If you have any further questions, feel free to ask. Happy fine-tuning! πŸš€


github-actions commented on September 9, 2024

πŸ‘‹ Hello @Ashwini4869, thank you for your interest in YOLOv5 πŸš€! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.

If this is a πŸ› Bug Report, please provide a minimum reproducible example to help us debug it.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our Tips for Best Training Results.

Requirements

Python>=3.8.0 with all requirements.txt installed including PyTorch>=1.8. To get started:

git clone https://github.com/ultralytics/yolov5  # clone
cd yolov5
pip install -r requirements.txt  # install
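
To verify the installation, a quick sanity check is:

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"  # expect >=1.8, and True on GPU setups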

Environments

YOLOv5 may be run in any of our up-to-date verified environments, with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled.

Status

YOLOv5 CI

If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training, validation, inference, export and benchmarks on macOS, Windows, and Ubuntu every 24 hours and on every commit.

Introducing YOLOv8 πŸš€

We're excited to announce the launch of our latest state-of-the-art (SOTA) object detection model for 2023 - YOLOv8 πŸš€!

Designed to be fast, accurate, and easy to use, YOLOv8 is an ideal choice for a wide range of object detection, image segmentation and image classification tasks. With YOLOv8, you'll be able to quickly and accurately detect objects in real-time, streamline your workflows, and achieve new levels of accuracy in your projects.

Check out our YOLOv8 Docs for details and get started with:

pip install ultralytics
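
A minimal usage sketch (the yolov8n.pt weights name and image path here are placeholders):

from ultralytics import YOLO

model = YOLO('yolov8n.pt')                    # load a pretrained YOLOv8 nano model
results = model.predict('path/to/image.jpg')  # run inference on a single image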


Ashwini4869 commented on September 9, 2024

Well, it didn't work.

  1. There is no class named YOLOv5 in module yolov5.
  2. I tried to do the same after loading the model from Torch Hub, but the DetectMultiBackend class has no loss attribute.

Can you please suggest a working solution?


Ashwini4869 commented on September 9, 2024

I loaded the model using:
model = torch.hub.load('ultralytics/yolov5','yolov5s',autoshape=False, pretrained=True)

However, when I try to define the loss function from yolov5.utils.loss as:
compute_loss = ComputeLoss(model)  # YOLOv5's custom loss function
it throws the following error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[32], line 2
      1 optimizer = optim.Adam(model.parameters(), lr=0.001)
----> 2 compute_loss = ComputeLoss(model)  # YOLOv5's custom loss function

File /mnt/nvme2/ashwini/PersonalTasks/object_detection_with_yolo/.venv/lib/python3.10/site-packages/yolov5/utils/loss.py:97, in ComputeLoss.__init__(self, model, autobalance)
     95 def __init__(self, model, autobalance=False):
     96     device = next(model.parameters()).device  # get model device
---> 97     h = model.hyp  # hyperparameters
     99     # Define criteria
    100     BCEcls = nn.BCEWithLogitsLoss(pos_weight=torch.tensor([h['cls_pw']], device=device))

File /mnt/nvme2/ashwini/PersonalTasks/object_detection_with_yolo/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py:1709, in Module.__getattr__(self, name)
   1707 if name in modules:
   1708     return modules[name]
-> 1709 raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")

AttributeError: 'DetectMultiBackend' object has no attribute 'hyp'

What should I do to resolve this?


glenn-jocher commented on September 9, 2024

Hello @Ashwini4869,

Thank you for providing the detailed error message. It looks like the issue arises because the DetectMultiBackend object does not have the hyp attribute, which is required by the ComputeLoss function.

To resolve this, you can manually set the hyperparameters (hyp) for the model. Here's how you can do it:

Step-by-Step Solution

  1. Load the Model:
    Ensure you load the model correctly using PyTorch Hub.

    import torch
    model = torch.hub.load('ultralytics/yolov5', 'yolov5s', autoshape=False, pretrained=True)
  2. Set Hyperparameters:
    Manually set the hyperparameters for the model. You can use the default hyperparameters from the YOLOv5 repository.

    from yolov5.utils.general import check_yaml
    from yolov5.utils.loss import ComputeLoss
    
    # Load default hyperparameters
    hyp = check_yaml('data/hyp.scratch.yaml')  # or use your custom hyperparameters file
    
    # Manually set the hyperparameters to the model
    model.hyp = hyp
  3. Define the Loss Function:
    Now you can define the loss function using the ComputeLoss class.

    compute_loss = ComputeLoss(model)  # YOLOv5's custom loss function
  4. Training Loop:
    Implement the training loop to compute loss and optimize the model.

    import torch.optim as optim
    from torch.utils.data import DataLoader
    
    # Define optimizer
    optimizer = optim.Adam(model.parameters(), lr=0.001)
    
    # Custom dataset and dataloader (assuming you have already defined these)
    dataset = CustomDataset(img_dir='path/to/images', label_dir='path/to/labels')
    dataloader = DataLoader(dataset, batch_size=16, shuffle=True)
    
    # Training loop
    for epoch in range(num_epochs):
        model.train()
        for images, labels in dataloader:
            optimizer.zero_grad()
            outputs = model(images)
            loss, _ = compute_loss(outputs, labels)
            loss.backward()
            optimizer.step()
            
            print(f"Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}")

Additional Tips

  • Transforms: Ensure you apply the necessary transformations to your images to match the input requirements of YOLOv5.
  • Validation: Implement a validation loop to monitor the model's performance on a separate validation set.
  • Hyperparameters: Experiment with different hyperparameters (learning rate, batch size, etc.) to optimize training.

If you encounter any further issues, please ensure you are using the latest version of the YOLOv5 repository and dependencies. You can update your local repository and dependencies by running:

git pull
pip install -r requirements.txt

I hope this helps! If you have any further questions, feel free to ask. Happy fine-tuning! πŸš€


Ashwini4869 commented on September 9, 2024

Setting model.hyp gives another error because it is assigned as a string, so I set model.hyp by loading the YAML into a dictionary.
That issue is resolved; however, I run into another error.

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[38], line 2
      1 optimizer = optim.Adam(model.parameters(), lr=0.001)
----> 2 compute_loss = ComputeLoss(model)  # YOLOv5's custom loss function

File /mnt/nvme2/ashwini/PersonalTasks/object_detection_with_yolo/.venv/lib/python3.10/site-packages/yolov5/utils/loss.py:111, in ComputeLoss.__init__(self, model, autobalance)
    108 if g > 0:
    109     BCEcls, BCEobj = FocalLoss(BCEcls, g), FocalLoss(BCEobj, g)
--> 111 m = de_parallel(model).model[-1]  # Detect() module
    112 self.balance = {3: [4.0, 1.0, 0.4]}.get(m.nl, [4.0, 1.0, 0.25, 0.06, 0.02])  # P3-P7
    113 self.ssi = list(m.stride).index(16) if autobalance else 0  # stride 16 index

TypeError: 'DetectionModel' object is not subscriptable

Can you please give me a solution? I think you should put together an easy-to-understand guide to training via a custom dataloader, since many people are looking for the same solution. Thank you!


glenn-jocher commented on September 9, 2024

Hello @Ashwini4869,

Thank you for your patience and for providing detailed information about the issues you're encountering. I understand the challenges you're facing, and I'm here to help you resolve them.

Addressing the Error

The error you're seeing, TypeError: 'DetectionModel' object is not subscriptable, indicates that the model object is not being accessed correctly in the ComputeLoss function. This typically happens when the model structure is not as expected by the loss function.

To resolve this, let's ensure that the model's structure is correctly set up for the loss function. Here's a revised approach:

  1. Load the Model:
    Ensure you load the model correctly using PyTorch Hub.

    import torch
    model = torch.hub.load('ultralytics/yolov5', 'yolov5s', autoshape=False, pretrained=True)
  2. Set Hyperparameters:
    As you noted, check_yaml only resolves the path to the hyperparameters file, so load the YAML into a dictionary before assigning it to the model.

    import yaml

    from yolov5.utils.general import check_yaml
    from yolov5.utils.loss import ComputeLoss
    
    # Resolve the default hyperparameters file, then load it into a dict
    hyp_path = check_yaml('data/hyp.scratch.yaml')  # or use your custom hyperparameters file
    with open(hyp_path, errors='ignore') as f:
        hyp = yaml.safe_load(f)
    
    # Manually set the hyperparameters on the model
    model.hyp = hyp
  3. Ensure Model Structure:
    Make sure the model's structure is compatible with the ComputeLoss function.

    from yolov5.models.yolo import Model
    
    # Ensure the model is of type Model
    if not isinstance(model, Model):
        model = Model(cfg='models/yolov5s.yaml', ch=3, nc=80).to(device)
        model.load_state_dict(torch.load('yolov5s.pt')['model'])
        model.hyp = hyp
  4. Define the Loss Function:
    Now you can define the loss function using the ComputeLoss class.

    compute_loss = ComputeLoss(model)  # YOLOv5's custom loss function
  5. Training Loop:
    Implement the training loop to compute loss and optimize the model.

    import torch.optim as optim
    from torch.utils.data import DataLoader
    
    # Define optimizer
    optimizer = optim.Adam(model.parameters(), lr=0.001)
    
    # Custom dataset and dataloader (assuming you have already defined these)
    dataset = CustomDataset(img_dir='path/to/images', label_dir='path/to/labels')
    dataloader = DataLoader(dataset, batch_size=16, shuffle=True)
    
    # Training loop
    for epoch in range(num_epochs):
        model.train()
        for images, labels in dataloader:
            optimizer.zero_grad()
            outputs = model(images)
            loss, _ = compute_loss(outputs, labels)
            loss.backward()
            optimizer.step()
            
            print(f"Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}")

Additional Resources

I understand the need for a more comprehensive guide on training with a custom dataloader. The YOLOv5 community and the Ultralytics team are continuously working on improving documentation and resources. In the meantime, you can refer to the YOLOv5 documentation and the Ultralytics GitHub Discussions for more insights and community support.

If you encounter any further issues, please ensure you are using the latest version of the YOLOv5 repository and dependencies. You can update your local repository and dependencies by running:

git pull
pip install -r requirements.txt

Thank you for your understanding and patience. If you have any further questions, feel free to ask. Happy fine-tuning! πŸš€


Ashwini4869 commented on September 9, 2024

When running this code:

if not isinstance(model,Model):
    model = Model(cfg='yolov5/models/yolov5s.yaml',ch=3,nc=80).to('cuda')
    model.load_state_dict(torch.load('yolov5s.pt')['model'])
    model.hyp = hyp

I get the following error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[14], line 3
      1 if not isinstance(model, Model):
      2     model = Model(cfg='yolov5/models/yolov5s.yaml', ch=3, nc=80).to('cuda')
----> 3     model.load_state_dict(torch.load('yolov5s.pt')['model'])
      4     model.hyp = hyp

File /mnt/nvme2/ashwini/PersonalTasks/object_detection_with_yolo/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py:2140, in Module.load_state_dict(self, state_dict, strict, assign)
   2139 if not isinstance(state_dict, Mapping):
-> 2140     raise TypeError(f"Expected state_dict to be dict-like, got {type(state_dict)}.")
   2142 missing_keys: List[str] = []
   2143 unexpected_keys: List[str] = []

TypeError: Expected state_dict to be dict-like, got <class 'models.yolo.DetectionModel'>.

Can you please test the solution yourself before suggesting it?


Ashwini4869 commented on September 9, 2024

import torch
from torchvision import transforms as transforms
from torch.utils.data import Dataset,DataLoader
import torchvision
from torch.nn.utils.rnn import pad_sequence
from PIL import Image
import os
import matplotlib.pyplot as plt
import torch.optim as optim
# yolov5 imports
from yolov5.utils.loss import ComputeLoss
from yolov5.utils.general import check_yaml
from yolov5.models.yolo import Model
import numpy as np
%matplotlib inline

Custom Dataset Class

class WIDERFaceDataset(Dataset):
    def __init__(self,img_dir, annotations_file,transform=None):
        self.img_dir = img_dir
        self.annotations = self.parse_annotations(annotations_file)
        self.transform = transform
        
    def parse_annotations(self, annotations_file):
        annotations=[]
        with open(annotations_file,'r') as f:
            lines = f.readlines()
            i = 0
            while i < len(lines):
                img_path = lines[i].strip()
                num_boxes = int(lines[i+1].strip())
                boxes = []
                for j in range(max(num_boxes,1)):
                    box = list(map(int, lines[i+2+j].strip().split()))
                    image = Image.open(self.img_dir + img_path)
                    img_width, img_height = image.size
                    x_min, y_min, width, height = box[:4]
                    label = self.convert_to_yolo_label(x_min,y_min, width, height, img_width, img_height)
                    boxes.append(label)
                annotations.append({'image':img_path,'labels':torch.tensor(boxes,dtype=torch.float32)})
                # annotations.append({'image':img_path,'labels':boxes})
                i +=2 + max(num_boxes,1)
        return annotations
    
    def convert_to_yolo_label(self, x_min, y_min, width, height, img_width, img_height):
        norm_center_x= (x_min + width/2)/img_width
        norm_center_y = (y_min + height/2)/img_height
        norm_width = width/img_width
        norm_height = height/img_height
        label=[0,norm_center_x,norm_center_y,norm_width,norm_height]
        return label 
    
    def __len__(self):
        return len(self.annotations)
    
    def __getitem__(self,idx):
        img_path = os.path.join(self.img_dir, self.annotations[idx]['image'])
        labels = self.annotations[idx]['labels']
        
        image = Image.open(img_path).convert('RGB')
        target = torch.zeros((len(labels), 6))
        target[:, 1:] = labels
         
        return self.transform(image), target

Creating Dataset

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Resize((640,640))
])
train_dataset = WIDERFaceDataset(img_dir=r'../Dataset/WIDER_train/images/', annotations_file=r'../Dataset/WIDER_train/wider_face_train_bbx_gt.txt',transform=transform)

valid_dataset = WIDERFaceDataset(img_dir=r"../Dataset/WIDER_val/images/", annotations_file=r'../Dataset/WIDER_val/wider_face_val_bbx_gt.txt',transform=transform)
train_dataset[2][1]

Creating Dataloader

def collate_fn(batch):
    images, labels = zip(*batch)
    
    images = torch.stack(images, 0)

    max_num_boxes = max(label.size(0) for label in labels)
    padded_labels = []
    for label in labels:
        num_boxes = label.size(0)
        padding = (0, 0, 0, max_num_boxes-num_boxes)
        padded_labels.append(torch.nn.functional.pad(label,padding))
    padded_labels = torch.stack(padded_labels, 0 )

    return images, padded_labels
train_dataloader = DataLoader(train_dataset, batch_size=16, shuffle=True, collate_fn= collate_fn)
valid_dataloader = DataLoader(valid_dataset, batch_size=32, shuffle=True, collate_fn= collate_fn)

Visualizing a Batch

def show_batch(sample_batched):
    images_batch, labels_batch = sample_batched
    batch_size = len(images_batch)
    im_size = images_batch.size(2)
    
    grid = torchvision.utils.make_grid(images_batch)
    grid = grid.clip(0,1)
    plt.imshow(grid.numpy().transpose((1,2,0)))
    plt.title("Batch from dataloader")    
for i, sample_batched in enumerate(train_dataloader):
    print(f"Batch {i + 1}")
    images_batch, labels_batch = sample_batched
    print("Image batch shape:", images_batch.size())
    print("Labels batch shape:", [label.size() for label in labels_batch])
    
    if i == 0:  # Display the first batch
        show_batch(sample_batched)
        plt.axis('off')
        plt.show()
    break

Loading model

model = torch.load("yolov5s.pt",map_location='cuda')
model = model['model']
compute_loss = ComputeLoss(model) # YOLOv5's custom loss function
optimizer = optim.Adam(model.parameters(), lr = 0.001)

Hyperparameters

num_epochs = 1

Training/Validation Loop

for epoch in range(num_epochs):
    # Training 
    training_loss = 0.0
    model.train()
    for images, targets in train_dataloader:
        images = images.to('cuda')
        images = images.half()
        targets = targets.to('cuda')
        optimizer.zero_grad()
        outputs = model(images)
        print(outputs[0])
        print(targets)
        loss,_ = compute_loss(outputs, targets)
        loss.backward()
        optimizer.step()
        training_loss +=loss.item() * images.size(0)
    
    training_loss /= len(train_dataloader)
    print(f"Epoch [{epoch+1}/{num_epochs}, Loss: {training_loss:.4f}]")
    

Error:

Cell In[165], line 13
     11 print(outputs[0])
     12 print(targets)
---> 13 loss, _ = compute_loss(outputs, targets)
     14 loss.backward()
     15 optimizer.step()

File /mnt/nvme2/ashwini/PersonalTasks/object_detection_with_yolo/.venv/lib/python3.10/site-packages/yolov5/utils/loss.py:125, in ComputeLoss.__call__(self, p, targets)
    123 lbox = torch.zeros(1, device=self.device)  # box loss
    124 lobj = torch.zeros(1, device=self.device)  # object loss
--> 125 tcls, tbox, indices, anchors = self.build_targets(p, targets)  # targets
    127 # Losses
    128 for i, pi in enumerate(p):  # layer index, layer predictions

File /mnt/nvme2/ashwini/PersonalTasks/object_detection_with_yolo/.venv/lib/python3.10/site-packages/yolov5/utils/loss.py:183, in ComputeLoss.build_targets(self, p, targets)
    181 gain = torch.ones(7, device=self.device)  # normalized to gridspace gain
    182 ai = torch.arange(na, device=self.device).float().view(na, 1).repeat(1, nt)  # same as .repeat_interleave(nt)
--> 183 targets = torch.cat((targets.repeat(na, 1, 1), ai[..., None]), 2)  # append anchor indices
    185 g = 0.5  # bias
    186 off = torch.tensor(
    187     [
    188         [0, 0],
   (...)
    194     ],
    195     device=self.device).float() * g  # offsets

RuntimeError: Sizes of tensors must match except in dimension 2. Expected size 48 but got size 3 for tensor number 1 in the list.

Where is the error now?
@glenn-jocher


glenn-jocher commented on September 9, 2024

Hello @Ashwini4869,

Thank you for sharing the detailed code and error message. It looks like the issue is related to the shape of the targets tensor when passed to the ComputeLoss function. The error message indicates a mismatch in tensor sizes, which usually happens when the target labels are not in the expected format.

Addressing the Error

The ComputeLoss function expects the targets to be in a specific format. Let's ensure that the targets are correctly formatted before passing them to the loss function.

Step-by-Step Solution

  1. Ensure Target Format:
    ComputeLoss expects all of a batch's targets in a single 2D tensor of shape [num_targets, 6], where each row is [image_index, class, x_center, y_center, width, height]. Padding the labels into a 3D [batch_size, num_boxes, 6] tensor is what triggers the size mismatch inside build_targets, so the per-image labels must be concatenated instead (a concrete example of this layout follows the training loop below).

  2. Modify the Dataset Class:
    Ensure that the targets are correctly formatted in the __getitem__ method of your custom dataset class.

    class WIDERFaceDataset(Dataset):
        def __init__(self, img_dir, annotations_file, transform=None):
            self.img_dir = img_dir
            self.annotations = self.parse_annotations(annotations_file)
            self.transform = transform
    
        def parse_annotations(self, annotations_file):
            annotations = []
            with open(annotations_file, 'r') as f:
                lines = f.readlines()
                i = 0
                while i < len(lines):
                    img_path = lines[i].strip()
                    num_boxes = int(lines[i + 1].strip())
                    boxes = []
                    for j in range(max(num_boxes, 1)):
                        box = list(map(int, lines[i + 2 + j].strip().split()))
                        image = Image.open(self.img_dir + img_path)
                        img_width, img_height = image.size
                        x_min, y_min, width, height = box[:4]
                        label = self.convert_to_yolo_label(x_min, y_min, width, height, img_width, img_height)
                        boxes.append(label)
                    annotations.append({'image': img_path, 'labels': torch.tensor(boxes, dtype=torch.float32)})
                    i += 2 + max(num_boxes, 1)
            return annotations
    
        def convert_to_yolo_label(self, x_min, y_min, width, height, img_width, img_height):
            norm_center_x = (x_min + width / 2) / img_width
            norm_center_y = (y_min + height / 2) / img_height
            norm_width = width / img_width
            norm_height = height / img_height
            label = [0, norm_center_x, norm_center_y, norm_width, norm_height]
            return label
    
        def __len__(self):
            return len(self.annotations)
    
        def __getitem__(self, idx):
            img_path = os.path.join(self.img_dir, self.annotations[idx]['image'])
            labels = self.annotations[idx]['labels']
    
            image = Image.open(img_path).convert('RGB')
            if self.transform:
                image = self.transform(image)
    
            target = torch.zeros((len(labels), 6))  # column 0 is reserved for the image index (set in collate_fn)
            target[:, 1:] = labels
    
            return image, target
  3. Modify the Collate Function:
    Replace the padding-based collate with one that writes the image index into column 0 and concatenates all per-image labels into a single [num_targets, 6] tensor (this mirrors LoadImagesAndLabels.collate_fn in the YOLOv5 repository):

    def collate_fn(batch):
        images, labels = zip(*batch)
    
        images = torch.stack(images, 0)
    
        for i, label in enumerate(labels):
            label[:, 0] = i  # write the image index into column 0 for build_targets()
    
        return images, torch.cat(labels, 0)
  4. Training Loop:
    Ensure the training loop correctly processes the images and targets.

    import torch.optim as optim
    from torch.utils.data import DataLoader
    
    # Define optimizer
    optimizer = optim.Adam(model.parameters(), lr=0.001)
    
    # Custom dataset and dataloader
    train_dataset = WIDERFaceDataset(img_dir='path/to/images', annotations_file='path/to/annotations', transform=transform)
    train_dataloader = DataLoader(train_dataset, batch_size=16, shuffle=True, collate_fn=collate_fn)
    
    # Training loop
    for epoch in range(num_epochs):
        model.train()
        training_loss = 0.0
        for images, targets in train_dataloader:
            images = images.to('cuda')
            targets = targets.to('cuda')
            optimizer.zero_grad()
            outputs = model(images)
            loss, _ = compute_loss(outputs, targets)
            loss.backward()
            optimizer.step()
            training_loss += loss.item() * images.size(0)
    
        training_loss /= len(train_dataloader.dataset)
        print(f"Epoch [{epoch+1}/{num_epochs}], Loss: {training_loss:.4f}")

Additional Tips

  • Transforms: Ensure you apply the necessary transformations to your images to match the input requirements of YOLOv5.
  • Validation: Implement a validation loop to monitor the model's performance on a separate validation set.
  • Hyperparameters: Experiment with different hyperparameters (learning rate, batch size, etc.) to optimize training.

If you encounter any further issues, please ensure you are using the latest version of the YOLOv5 repository and dependencies. You can update your local repository and dependencies by running:

git pull
pip install -r requirements.txt

I hope this helps! If you have any further questions, feel free to ask. Happy fine-tuning! πŸš€
