
Comments (11)

glenn-jocher commented on September 9, 2024

@Ashwini4869 hello!

Thank you for reaching out with your question about fine-tuning YOLOv5 using a custom dataloader. It's great to see your interest in leveraging YOLOv5 in a more flexible manner. Here's a step-by-step guide to help you achieve this:

Custom Dataloader Setup

  1. Dataset Preparation:

    • Ensure your dataset is organized with images and corresponding labels. The labels should be in YOLO format (i.e., a text file for each image, with one line per object giving the class index and normalized bounding-box coordinates; a sample label file is shown after the code below).
  2. Custom Dataloader:

    • You can create a custom PyTorch Dataset class to load your images and labels. Here's an example:
    import torch
    from torch.utils.data import Dataset, DataLoader
    from PIL import Image
    import os
    
    class CustomDataset(Dataset):
        def __init__(self, img_dir, label_dir, transform=None):
            self.img_dir = img_dir
            self.label_dir = label_dir
            self.transform = transform
            self.img_files = [f for f in os.listdir(img_dir) if f.endswith('.jpg')]
    
        def __len__(self):
            return len(self.img_files)
    
        def __getitem__(self, idx):
            img_path = os.path.join(self.img_dir, self.img_files[idx])
            label_path = os.path.join(self.label_dir, self.img_files[idx].replace('.jpg', '.txt'))
            
            image = Image.open(img_path).convert("RGB")
            if self.transform:
                image = self.transform(image)
            
            with open(label_path, 'r') as f:
                labels = [[float(x) for x in line.strip().split()] for line in f]
                labels = torch.tensor(labels, dtype=torch.float32)
            
            return image, labels
    
    # Example usage
    dataset = CustomDataset(img_dir='path/to/images', label_dir='path/to/labels')
    dataloader = DataLoader(dataset, batch_size=16, shuffle=True)
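
For reference, each YOLO-format label file contains one line per object: class x_center y_center width height, with all coordinates normalized to [0, 1]. A hypothetical image1.txt with two objects might look like:

0 0.481 0.634 0.212 0.274
2 0.127 0.390 0.086 0.118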

Fine-Tuning the Model

  1. Load YOLOv5 Model:

    • Load the pre-trained YOLOv5 model using PyTorch.
    from yolov5 import YOLOv5
    
    model = YOLOv5('yolov5s.pt')  # Load a pre-trained model
  2. Training Loop:

    • Implement the training loop to compute loss and optimize the model.
    import torch.optim as optim
    
    optimizer = optim.Adam(model.parameters(), lr=0.001)
    criterion = model.loss  # Use YOLOv5's built-in loss function
    
    for epoch in range(num_epochs):
        model.train()
        for images, labels in dataloader:
            optimizer.zero_grad()
            outputs = model(images)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()
            
            print(f"Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}")

Additional Tips

  • Transforms: Apply appropriate transformations to your images (e.g., resizing, normalization) to match the input requirements of YOLOv5 (a minimal example follows this list).
  • Validation: Implement a validation loop to monitor the model's performance on a separate validation set.
  • Hyperparameters: Experiment with different hyperparameters (learning rate, batch size, etc.) to optimize training.
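
For example, a minimal transform pipeline might look like the following (the 640×640 size matches the default input of the YOLOv5 pretrained models; treat this as a sketch, not the full training-time augmentation):

from torchvision import transforms

# Resize to the model's expected input size and convert to a [0, 1] CHW tensor
transform = transforms.Compose([
    transforms.Resize((640, 640)),
    transforms.ToTensor(),
])
dataset = CustomDataset(img_dir='path/to/images', label_dir='path/to/labels', transform=transform)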

For more detailed information, you can refer to the YOLOv5 documentation.

I hope this helps! If you have any further questions, feel free to ask. Happy fine-tuning! πŸš€


glenn-jocher commented on September 9, 2024

Hello @Ashwini4869,

Thank you for your feedback and for pointing out the issues you encountered. Let's address these points to help you fine-tune YOLOv5 with a custom dataloader effectively.

Correcting the Approach

  1. Loading the Model:
    You're right; there isn't a YOLOv5 class directly in the module. Instead, you can load the model using the PyTorch Hub interface. Here's how you can do it:

    import torch
    model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True)
  2. Custom Training Loop:
    Since the DetectMultiBackend class does not have a loss attribute, you need to define the loss computation manually. YOLOv5's training script uses a custom loss function defined in loss.py. You can integrate this into your custom training loop.

Custom Training Loop Example

Here's an updated example to guide you through the process:

import os

import torch
import torch.optim as optim
from PIL import Image
from torch.utils.data import DataLoader
from yolov5.utils.loss import ComputeLoss

# Load the model
model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True)

# Define the custom dataset and dataloader
class CustomDataset(torch.utils.data.Dataset):
    def __init__(self, img_dir, label_dir, transform=None):
        self.img_dir = img_dir
        self.label_dir = label_dir
        self.transform = transform
        self.img_files = [f for f in os.listdir(img_dir) if f.endswith('.jpg')]

    def __len__(self):
        return len(self.img_files)

    def __getitem__(self, idx):
        img_path = os.path.join(self.img_dir, self.img_files[idx])
        label_path = os.path.join(self.label_dir, self.img_files[idx].replace('.jpg', '.txt'))
        
        image = Image.open(img_path).convert("RGB")
        if self.transform:
            image = self.transform(image)
        
        with open(label_path, 'r') as f:
            labels = [[float(x) for x in line.strip().split()] for line in f]
            labels = torch.tensor(labels, dtype=torch.float32)
        
        return image, labels

dataset = CustomDataset(img_dir='path/to/images', label_dir='path/to/labels')
dataloader = DataLoader(dataset, batch_size=16, shuffle=True)

# Define optimizer and loss function
optimizer = optim.Adam(model.parameters(), lr=0.001)
compute_loss = ComputeLoss(model)  # YOLOv5's custom loss function

# Training loop
num_epochs = 10  # example value; adjust as needed
for epoch in range(num_epochs):
    model.train()
    for images, labels in dataloader:
        optimizer.zero_grad()
        outputs = model(images)
        loss, _ = compute_loss(outputs, labels)
        loss.backward()
        optimizer.step()
        
        print(f"Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}")

Additional Tips

  • Transforms: Ensure you apply the necessary transformations to your images to match the input requirements of YOLOv5.
  • Validation: Implement a validation loop to monitor the model's performance on a separate validation set (a minimal sketch follows this list).
  • Hyperparameters: Experiment with different hyperparameters (learning rate, batch size, etc.) to optimize training.
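
A minimal loss-only validation sketch, assuming a valid_dataloader built the same way as the training dataloader, could look like this (note that model.eval() changes the Detect head's output format, so here the model stays in train mode and gradients are simply disabled):

with torch.no_grad():
    val_loss = 0.0
    for images, labels in valid_dataloader:
        outputs = model(images)
        loss, _ = compute_loss(outputs, labels)
        val_loss += loss.item() * images.size(0)
    val_loss /= len(valid_dataloader.dataset)
print(f"Validation loss: {val_loss:.4f}")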

For more detailed information, you can refer to the YOLOv5 documentation.

I hope this helps! If you have any further questions, feel free to ask. Happy fine-tuning! πŸš€


github-actions commented on September 9, 2024

πŸ‘‹ Hello @Ashwini4869, thank you for your interest in YOLOv5 πŸš€! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.

If this is a πŸ› Bug Report, please provide a minimum reproducible example to help us debug it.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our Tips for Best Training Results.

Requirements

Python>=3.8.0 with all requirements.txt installed including PyTorch>=1.8. To get started:

git clone https://github.com/ultralytics/yolov5  # clone
cd yolov5
pip install -r requirements.txt  # install
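
To verify the installation, a quick sanity check is:

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"  # expect >=1.8, and True on GPU setups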

Environments

YOLOv5 may be run in any of our up-to-date verified environments, with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled.

Status

YOLOv5 CI

If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training, validation, inference, export and benchmarks on macOS, Windows, and Ubuntu every 24 hours and on every commit.

Introducing YOLOv8 πŸš€

We're excited to announce the launch of our latest state-of-the-art (SOTA) object detection model for 2023 - YOLOv8 πŸš€!

Designed to be fast, accurate, and easy to use, YOLOv8 is an ideal choice for a wide range of object detection, image segmentation and image classification tasks. With YOLOv8, you'll be able to quickly and accurately detect objects in real-time, streamline your workflows, and achieve new levels of accuracy in your projects.

Check out our YOLOv8 Docs for details and get started with:

pip install ultralytics
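
A minimal usage sketch (the yolov8n.pt weights name and image path here are placeholders):

from ultralytics import YOLO

model = YOLO('yolov8n.pt')                    # load a pretrained YOLOv8 nano model
results = model.predict('path/to/image.jpg')  # run inference on a single image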


Ashwini4869 commented on September 9, 2024

Well, it didn't work.

  1. There is no class named YOLOv5 in module yolov5.
  2. I tried to do the same after loading the model from Torch Hub, but the DetectMultiBackend class has no loss attribute.

Can you please suggest a working solution?


Ashwini4869 commented on September 9, 2024

I loaded the model using:
model = torch.hub.load('ultralytics/yolov5','yolov5s',autoshape=False, pretrained=True)

However, when I try to define the loss function from yolov5.utils.loss as:
compute_loss = ComputeLoss(model)  # YOLOv5's custom loss function
it throws the following error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[32], line 2
      1 optimizer = optim.Adam(model.parameters(), lr=0.001)
----> 2 compute_loss = ComputeLoss(model)  # YOLOv5's custom loss function

File /mnt/nvme2/ashwini/PersonalTasks/object_detection_with_yolo/.venv/lib/python3.10/site-packages/yolov5/utils/loss.py:97, in ComputeLoss.__init__(self, model, autobalance)
     95 def __init__(self, model, autobalance=False):
     96     device = next(model.parameters()).device  # get model device
---> 97     h = model.hyp  # hyperparameters
     99     # Define criteria
    100     BCEcls = nn.BCEWithLogitsLoss(pos_weight=torch.tensor([h['cls_pw']], device=device))

File /mnt/nvme2/ashwini/PersonalTasks/object_detection_with_yolo/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py:1709, in Module.__getattr__(self, name)
   1707 if name in modules:
   1708     return modules[name]
-> 1709 raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")

AttributeError: 'DetectMultiBackend' object has no attribute 'hyp'

What should I do to resolve this?


glenn-jocher commented on September 9, 2024

Hello @Ashwini4869,

Thank you for providing the detailed error message. It looks like the issue arises because the DetectMultiBackend object does not have the hyp attribute, which is required by the ComputeLoss function.

To resolve this, you can manually set the hyperparameters (hyp) for the model. Here's how you can do it:

Step-by-Step Solution

  1. Load the Model:
    Ensure you load the model correctly using PyTorch Hub.

    import torch
    model = torch.hub.load('ultralytics/yolov5', 'yolov5s', autoshape=False, pretrained=True)
  2. Set Hyperparameters:
    Manually set the hyperparameters for the model. You can use the default hyperparameters from the YOLOv5 repository.

    from yolov5.utils.general import check_yaml
    from yolov5.utils.loss import ComputeLoss
    
    # Load default hyperparameters
    hyp = check_yaml('data/hyp.scratch.yaml')  # or use your custom hyperparameters file
    
    # Manually set the hyperparameters to the model
    model.hyp = hyp
  3. Define the Loss Function:
    Now you can define the loss function using the ComputeLoss class.

    compute_loss = ComputeLoss(model)  # YOLOv5's custom loss function
  4. Training Loop:
    Implement the training loop to compute loss and optimize the model.

    import torch.optim as optim
    from torch.utils.data import DataLoader
    
    # Define optimizer
    optimizer = optim.Adam(model.parameters(), lr=0.001)
    
    # Custom dataset and dataloader (assuming you have already defined these)
    dataset = CustomDataset(img_dir='path/to/images', label_dir='path/to/labels')
    dataloader = DataLoader(dataset, batch_size=16, shuffle=True)
    
    # Training loop
    for epoch in range(num_epochs):
        model.train()
        for images, labels in dataloader:
            optimizer.zero_grad()
            outputs = model(images)
            loss, _ = compute_loss(outputs, labels)
            loss.backward()
            optimizer.step()
            
            print(f"Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}")

Additional Tips

  • Transforms: Ensure you apply the necessary transformations to your images to match the input requirements of YOLOv5.
  • Validation: Implement a validation loop to monitor the model's performance on a separate validation set.
  • Hyperparameters: Experiment with different hyperparameters (learning rate, batch size, etc.) to optimize training.

If you encounter any further issues, please ensure you are using the latest version of the YOLOv5 repository and dependencies. You can update your local repository and dependencies by running:

git pull
pip install -r requirements.txt

I hope this helps! If you have any further questions, feel free to ask. Happy fine-tuning! πŸš€


Ashwini4869 commented on September 9, 2024

Setting model.hyp gives another error because it is assigned as a string, so I set model.hyp by loading the YAML into a dictionary.
That issue is resolved; however, I run into another error.

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[38], line 2
      1 optimizer = optim.Adam(model.parameters(), lr=0.001)
----> 2 compute_loss = ComputeLoss(model)  # YOLOv5's custom loss function

File /mnt/nvme2/ashwini/PersonalTasks/object_detection_with_yolo/.venv/lib/python3.10/site-packages/yolov5/utils/loss.py:111, in ComputeLoss.__init__(self, model, autobalance)
    108 if g > 0:
    109     BCEcls, BCEobj = FocalLoss(BCEcls, g), FocalLoss(BCEobj, g)
--> 111 m = de_parallel(model).model[-1]  # Detect() module
    112 self.balance = {3: [4.0, 1.0, 0.4]}.get(m.nl, [4.0, 1.0, 0.25, 0.06, 0.02])  # P3-P7
    113 self.ssi = list(m.stride).index(16) if autobalance else 0  # stride 16 index

TypeError: 'DetectionModel' object is not subscriptable

Can you please give me a solution? I think you should put together an easy-to-understand guide to training via a custom dataloader, since many people are looking for the same solution. Thank you!


glenn-jocher commented on September 9, 2024

Hello @Ashwini4869,

Thank you for your patience and for providing detailed information about the issues you're encountering. I understand the challenges you're facing, and I'm here to help you resolve them.

Addressing the Error

The error you're seeing, TypeError: 'DetectionModel' object is not subscriptable, indicates that the model object is not being accessed correctly in the ComputeLoss function. This typically happens when the model structure is not as expected by the loss function.

To resolve this, let's ensure that the model's structure is correctly set up for the loss function. Here's a revised approach:

  1. Load the Model:
    Ensure you load the model correctly using PyTorch Hub.

    import torch
    model = torch.hub.load('ultralytics/yolov5', 'yolov5s', autoshape=False, pretrained=True)
  2. Set Hyperparameters:
    As you noted, check_yaml only resolves the path to the hyperparameters file, so load the YAML into a dictionary before assigning it to the model.

    import yaml

    from yolov5.utils.general import check_yaml
    from yolov5.utils.loss import ComputeLoss
    
    # Resolve the default hyperparameters file, then load it into a dict
    hyp_path = check_yaml('data/hyp.scratch.yaml')  # or use your custom hyperparameters file
    with open(hyp_path, errors='ignore') as f:
        hyp = yaml.safe_load(f)
    
    # Manually set the hyperparameters on the model
    model.hyp = hyp
  3. Ensure Model Structure:
    Make sure the model's structure is compatible with the ComputeLoss function.

    from yolov5.models.yolo import Model
    
    # Ensure the model is of type Model
    if not isinstance(model, Model):
        model = Model(cfg='models/yolov5s.yaml', ch=3, nc=80).to(device)
        model.load_state_dict(torch.load('yolov5s.pt')['model'])
        model.hyp = hyp
  4. Define the Loss Function:
    Now you can define the loss function using the ComputeLoss class.

    compute_loss = ComputeLoss(model)  # YOLOv5's custom loss function
  5. Training Loop:
    Implement the training loop to compute loss and optimize the model.

    import torch.optim as optim
    from torch.utils.data import DataLoader
    
    # Define optimizer
    optimizer = optim.Adam(model.parameters(), lr=0.001)
    
    # Custom dataset and dataloader (assuming you have already defined these)
    dataset = CustomDataset(img_dir='path/to/images', label_dir='path/to/labels')
    dataloader = DataLoader(dataset, batch_size=16, shuffle=True)
    
    # Training loop
    for epoch in range(num_epochs):
        model.train()
        for images, labels in dataloader:
            optimizer.zero_grad()
            outputs = model(images)
            loss, _ = compute_loss(outputs, labels)
            loss.backward()
            optimizer.step()
            
            print(f"Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}")

Additional Resources

I understand the need for a more comprehensive guide on training with a custom dataloader. The YOLOv5 community and the Ultralytics team are continuously working on improving documentation and resources. In the meantime, you can refer to the YOLOv5 documentation and the Ultralytics GitHub Discussions for more insights and community support.

If you encounter any further issues, please ensure you are using the latest version of the YOLOv5 repository and dependencies. You can update your local repository and dependencies by running:

git pull
pip install -r requirements.txt

Thank you for your understanding and patience. If you have any further questions, feel free to ask. Happy fine-tuning! πŸš€


Ashwini4869 commented on September 9, 2024

When running this code:

if not isinstance(model,Model):
    model = Model(cfg='yolov5/models/yolov5s.yaml',ch=3,nc=80).to('cuda')
    model.load_state_dict(torch.load('yolov5s.pt')['model'])
    model.hyp = hyp

I get the following error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[14], line 3
      1 if not isinstance(model, Model):
      2     model = Model(cfg='yolov5/models/yolov5s.yaml', ch=3, nc=80).to('cuda')
----> 3     model.load_state_dict(torch.load('yolov5s.pt')['model'])
      4     model.hyp = hyp

File /mnt/nvme2/ashwini/PersonalTasks/object_detection_with_yolo/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py:2140, in Module.load_state_dict(self, state_dict, strict, assign)
   2139 if not isinstance(state_dict, Mapping):
-> 2140     raise TypeError(f"Expected state_dict to be dict-like, got {type(state_dict)}.")
   2142 missing_keys: List[str] = []
   2143 unexpected_keys: List[str] = []

TypeError: Expected state_dict to be dict-like, got <class 'models.yolo.DetectionModel'>.

Can you please test the solution yourself before suggesting it?


Ashwini4869 commented on September 9, 2024

import torch
from torchvision import transforms as transforms
from torch.utils.data import Dataset,DataLoader
import torchvision
from torch.nn.utils.rnn import pad_sequence
from PIL import Image
import os
import matplotlib.pyplot as plt
import torch.optim as optim
# yolov5 imports
from yolov5.utils.loss import ComputeLoss
from yolov5.utils.general import check_yaml
from yolov5.models.yolo import Model
import numpy as np
%matplotlib inline

Custom Dataset Class

class WIDERFaceDataset(Dataset):
    def __init__(self,img_dir, annotations_file,transform=None):
        self.img_dir = img_dir
        self.annotations = self.parse_annotations(annotations_file)
        self.transform = transform
        
    def parse_annotations(self, annotations_file):
        annotations=[]
        with open(annotations_file,'r') as f:
            lines = f.readlines()
            i = 0
            while i < len(lines):
                img_path = lines[i].strip()
                num_boxes = int(lines[i+1].strip())
                boxes = []
                for j in range(max(num_boxes,1)):
                    box = list(map(int, lines[i+2+j].strip().split()))
                    image = Image.open(self.img_dir + img_path)
                    img_width, img_height = image.size
                    x_min, y_min, width, height = box[:4]
                    label = self.convert_to_yolo_label(x_min,y_min, width, height, img_width, img_height)
                    boxes.append(label)
                annotations.append({'image':img_path,'labels':torch.tensor(boxes,dtype=torch.float32)})
                # annotations.append({'image':img_path,'labels':boxes})
                i +=2 + max(num_boxes,1)
        return annotations
    
    def convert_to_yolo_label(self, x_min, y_min, width, height, img_width, img_height):
        norm_center_x= (x_min + width/2)/img_width
        norm_center_y = (y_min + height/2)/img_height
        norm_width = width/img_width
        norm_height = height/img_height
        label=[0,norm_center_x,norm_center_y,norm_width,norm_height]
        return label 
    
    def __len__(self):
        return len(self.annotations)
    
    def __getitem__(self,idx):
        img_path = os.path.join(self.img_dir, self.annotations[idx]['image'])
        labels = self.annotations[idx]['labels']
        
        image = Image.open(img_path).convert('RGB')
        target = torch.zeros((len(labels), 6))
        target[:, 1:] = labels
         
        return self.transform(image), target

Creating Dataset

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Resize((640,640))
])
train_dataset = WIDERFaceDataset(img_dir=r'../Dataset/WIDER_train/images/', annotations_file=r'../Dataset/WIDER_train/wider_face_train_bbx_gt.txt',transform=transform)

valid_dataset = WIDERFaceDataset(img_dir=r"../Dataset/WIDER_val/images/", annotations_file=r'../Dataset/WIDER_val/wider_face_val_bbx_gt.txt',transform=transform)
train_dataset[2][1]

Creating Dataloader

def collate_fn(batch):
    images, labels = zip(*batch)
    
    images = torch.stack(images, 0)

    max_num_boxes = max(label.size(0) for label in labels)
    padded_labels = []
    for label in labels:
        num_boxes = label.size(0)
        padding = (0, 0, 0, max_num_boxes-num_boxes)
        padded_labels.append(torch.nn.functional.pad(label,padding))
    padded_labels = torch.stack(padded_labels, 0 )

    return images, padded_labels
train_dataloader = DataLoader(train_dataset, batch_size=16, shuffle=True, collate_fn= collate_fn)
valid_dataloader = DataLoader(valid_dataset, batch_size=32, shuffle=True, collate_fn= collate_fn)

Visualizing a Batch

def show_batch(sample_batched):
    images_batch, labels_batch = sample_batched
    batch_size = len(images_batch)
    im_size = images_batch.size(2)
    
    grid = torchvision.utils.make_grid(images_batch)
    grid = grid.clip(0,1)
    plt.imshow(grid.numpy().transpose((1,2,0)))
    plt.title("Batch from dataloader")    
for i, sample_batched in enumerate(train_dataloader):
    print(f"Batch {i + 1}")
    images_batch, labels_batch = sample_batched
    print("Image batch shape:", images_batch.size())
    print("Labels batch shape:", [label.size() for label in labels_batch])
    
    if i == 0:  # Display the first batch
        show_batch(sample_batched)
        plt.axis('off')
        plt.show()
    break

Loading model

model = torch.load("yolov5s.pt",map_location='cuda')
model = model['model']
compute_loss = ComputeLoss(model) # YOLOv5's custom loss function
optimizer = optim.Adam(model.parameters(), lr = 0.001)

Hyperparameters

num_epochs = 1

Training/Validation Loop

for epoch in range(num_epochs):
    # Training 
    training_loss = 0.0
    model.train()
    for images, targets in train_dataloader:
        images = images.to('cuda')
        images = images.half()
        targets = targets.to('cuda')
        optimizer.zero_grad()
        outputs = model(images)
        print(outputs[0])
        print(targets)
        loss,_ = compute_loss(outputs, targets)
        loss.backward()
        optimizer.step()
        training_loss +=loss.item() * images.size(0)
    
    training_loss /= len(train_dataloader)
    print(f"Epoch [{epoch+1}/{num_epochs}, Loss: {training_loss:.4f}]")
    

Error:

Cell In[165], line 13
     11 print(outputs[0])
     12 print(targets)
---> 13 loss, _ = compute_loss(outputs, targets)
     14 loss.backward()
     15 optimizer.step()

File /mnt/nvme2/ashwini/PersonalTasks/object_detection_with_yolo/.venv/lib/python3.10/site-packages/yolov5/utils/loss.py:125, in ComputeLoss.__call__(self, p, targets)
    123 lbox = torch.zeros(1, device=self.device)  # box loss
    124 lobj = torch.zeros(1, device=self.device)  # object loss
--> 125 tcls, tbox, indices, anchors = self.build_targets(p, targets)  # targets
    127 # Losses
    128 for i, pi in enumerate(p):  # layer index, layer predictions

File /mnt/nvme2/ashwini/PersonalTasks/object_detection_with_yolo/.venv/lib/python3.10/site-packages/yolov5/utils/loss.py:183, in ComputeLoss.build_targets(self, p, targets)
    181 gain = torch.ones(7, device=self.device)  # normalized to gridspace gain
    182 ai = torch.arange(na, device=self.device).float().view(na, 1).repeat(1, nt)  # same as .repeat_interleave(nt)
--> 183 targets = torch.cat((targets.repeat(na, 1, 1), ai[..., None]), 2)  # append anchor indices
    185 g = 0.5  # bias
    186 off = torch.tensor(
    187     [
    188         [0, 0],
   (...)
    194     ],
    195     device=self.device).float() * g  # offsets

RuntimeError: Sizes of tensors must match except in dimension 2. Expected size 48 but got size 3 for tensor number 1 in the list.

Where is the error now?
@glenn-jocher


glenn-jocher commented on September 9, 2024

Hello @Ashwini4869,

Thank you for sharing the detailed code and error message. It looks like the issue is related to the shape of the targets tensor when passed to the ComputeLoss function. The error message indicates a mismatch in tensor sizes, which usually happens when the target labels are not in the expected format.

Addressing the Error

The ComputeLoss function expects the targets to be in a specific format. Let's ensure that the targets are correctly formatted before passing them to the loss function.

Step-by-Step Solution

  1. Ensure Target Format:
    ComputeLoss expects all of a batch's targets in a single 2D tensor of shape [num_targets, 6], where each row is [image_index, class, x_center, y_center, width, height]. Padding the labels into a 3D [batch_size, num_boxes, 6] tensor is what triggers the size mismatch inside build_targets, so the per-image labels must be concatenated instead (a concrete example of this layout follows the training loop below).

  2. Modify the Dataset Class:
    Ensure that the targets are correctly formatted in the __getitem__ method of your custom dataset class.

    class WIDERFaceDataset(Dataset):
        def __init__(self, img_dir, annotations_file, transform=None):
            self.img_dir = img_dir
            self.annotations = self.parse_annotations(annotations_file)
            self.transform = transform
    
        def parse_annotations(self, annotations_file):
            annotations = []
            with open(annotations_file, 'r') as f:
                lines = f.readlines()
                i = 0
                while i < len(lines):
                    img_path = lines[i].strip()
                    num_boxes = int(lines[i + 1].strip())
                    boxes = []
                    for j in range(max(num_boxes, 1)):
                        box = list(map(int, lines[i + 2 + j].strip().split()))
                        image = Image.open(self.img_dir + img_path)
                        img_width, img_height = image.size
                        x_min, y_min, width, height = box[:4]
                        label = self.convert_to_yolo_label(x_min, y_min, width, height, img_width, img_height)
                        boxes.append(label)
                    annotations.append({'image': img_path, 'labels': torch.tensor(boxes, dtype=torch.float32)})
                    i += 2 + max(num_boxes, 1)
            return annotations
    
        def convert_to_yolo_label(self, x_min, y_min, width, height, img_width, img_height):
            norm_center_x = (x_min + width / 2) / img_width
            norm_center_y = (y_min + height / 2) / img_height
            norm_width = width / img_width
            norm_height = height / img_height
            label = [0, norm_center_x, norm_center_y, norm_width, norm_height]
            return label
    
        def __len__(self):
            return len(self.annotations)
    
        def __getitem__(self, idx):
            img_path = os.path.join(self.img_dir, self.annotations[idx]['image'])
            labels = self.annotations[idx]['labels']
    
            image = Image.open(img_path).convert('RGB')
            if self.transform:
                image = self.transform(image)
    
            target = torch.zeros((len(labels), 6))  # column 0 is reserved for the image index (set in collate_fn)
            target[:, 1:] = labels
    
            return image, target
  3. Modify the Collate Function:
    Replace the padding-based collate with one that writes the image index into column 0 and concatenates all per-image labels into a single [num_targets, 6] tensor (this mirrors LoadImagesAndLabels.collate_fn in the YOLOv5 repository):

    def collate_fn(batch):
        images, labels = zip(*batch)
    
        images = torch.stack(images, 0)
    
        for i, label in enumerate(labels):
            label[:, 0] = i  # write the image index into column 0 for build_targets()
    
        return images, torch.cat(labels, 0)
  4. Training Loop:
    Ensure the training loop correctly processes the images and targets.

    import torch.optim as optim
    from torch.utils.data import DataLoader
    
    # Define optimizer
    optimizer = optim.Adam(model.parameters(), lr=0.001)
    
    # Custom dataset and dataloader
    train_dataset = WIDERFaceDataset(img_dir='path/to/images', annotations_file='path/to/annotations', transform=transform)
    train_dataloader = DataLoader(train_dataset, batch_size=16, shuffle=True, collate_fn=collate_fn)
    
    # Training loop
    for epoch in range(num_epochs):
        model.train()
        training_loss = 0.0
        for images, targets in train_dataloader:
            images = images.to('cuda')
            targets = targets.to('cuda')
            optimizer.zero_grad()
            outputs = model(images)
            loss, _ = compute_loss(outputs, targets)
            loss.backward()
            optimizer.step()
            training_loss += loss.item() * images.size(0)
    
        training_loss /= len(train_dataloader.dataset)
        print(f"Epoch [{epoch+1}/{num_epochs}], Loss: {training_loss:.4f}")

Additional Tips

  • Transforms: Ensure you apply the necessary transformations to your images to match the input requirements of YOLOv5.
  • Validation: Implement a validation loop to monitor the model's performance on a separate validation set.
  • Hyperparameters: Experiment with different hyperparameters (learning rate, batch size, etc.) to optimize training.

If you encounter any further issues, please ensure you are using the latest version of the YOLOv5 repository and dependencies. You can update your local repository and dependencies by running:

git pull
pip install -r requirements.txt

I hope this helps! If you have any further questions, feel free to ask. Happy fine-tuning! πŸš€
