ultralytics / yolov3

YOLOv3 in PyTorch > ONNX > CoreML > TFLite

Home Page: https://docs.ultralytics.com

License: GNU Affero General Public License v3.0

Shell 0.77% Python 76.52% Dockerfile 0.41% Jupyter Notebook 22.29%
yolov3 object-detection yolo yolov5 deep-learning machine-learning ultralytics

yolov3's Introduction

中文 | 한국어 | 日本語 | Русский | Deutsch | Français | Español | Português | हिन्दी | العربية


YOLOv3 🚀 is the world's most loved vision AI, representing Ultralytics open-source research into future vision AI methods, incorporating lessons learned and best practices evolved over thousands of hours of research and development.

We hope that the resources here will help you get the most out of YOLOv3. Please browse the YOLOv3 Docs for details, raise an issue on GitHub for support, and join our Discord community for questions and discussions!

To request an Enterprise License please complete the form at Ultralytics Licensing.


YOLOv8 🚀 NEW

We are thrilled to announce the launch of Ultralytics YOLOv8 🚀, our NEW cutting-edge, state-of-the-art (SOTA) model released at https://github.com/ultralytics/ultralytics. YOLOv8 is designed to be fast, accurate, and easy to use, making it an excellent choice for a wide range of object detection, image segmentation and image classification tasks.

See the YOLOv8 Docs for details and get started with:


pip install ultralytics

Documentation

See the YOLOv3 Docs for full documentation on training, testing and deployment. See below for quickstart examples.

Install

Clone repo and install requirements.txt in a Python>=3.7.0 environment, including PyTorch>=1.7.

git clone https://github.com/ultralytics/yolov3  # clone
cd yolov3
pip install -r requirements.txt  # install
Inference

YOLOv3 PyTorch Hub inference. Models download automatically from the latest YOLOv3 release.

import torch

# Model
model = torch.hub.load("ultralytics/yolov3", "yolov3")  # or yolov3-spp, yolov3-tiny, custom

# Images
img = "https://ultralytics.com/images/zidane.jpg"  # or file, Path, PIL, OpenCV, numpy, list

# Inference
results = model(img)

# Results
results.print()  # or .show(), .save(), .crop(), .pandas(), etc.
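
For example, detections can be pulled into a pandas DataFrame via the .pandas() accessor listed above (a minimal sketch; the column names in the comment are those produced by YOLOv5-family Hub models):

df = results.pandas().xyxy[0]  # per-image DataFrame: xmin, ymin, xmax, ymax, confidence, class, name
print(df)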
Inference with detect.py

detect.py runs inference on a variety of sources, downloading models automatically from the latest YOLOv3 release and saving results to runs/detect.

python detect.py --weights yolov5s.pt --source 0                               # webcam
                                               img.jpg                         # image
                                               vid.mp4                         # video
                                               screen                          # screenshot
                                               path/                           # directory
                                               list.txt                        # list of images
                                               list.streams                    # list of streams
                                               'path/*.jpg'                    # glob
                                               'https://youtu.be/LNwODJXcvt4'  # YouTube
                                               'rtsp://example.com/media.mp4'  # RTSP, RTMP, HTTP stream
Training

The commands below reproduce YOLOv3 COCO results. Models and datasets download automatically from the latest YOLOv3 release. Training times for YOLOv5n/s/m/l/x are 1/2/4/6/8 days on a single V100 GPU (multi-GPU setups train proportionally faster). Use the largest --batch-size your hardware allows, or pass --batch-size -1 for YOLOv3 AutoBatch, which selects it automatically. Batch sizes shown are for a V100-16GB.

python train.py --data coco.yaml --epochs 300 --weights '' --cfg yolov5n.yaml  --batch-size 128
                                                                 yolov5s                    64
                                                                 yolov5m                    40
                                                                 yolov5l                    24
                                                                 yolov5x                    16
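
Multi-GPU DDP training is also supported; a minimal sketch (the 2-GPU device list here is illustrative):

python -m torch.distributed.run --nproc_per_node 2 train.py --data coco.yaml --epochs 300 --weights '' --cfg yolov5s.yaml --batch-size 128 --device 0,1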
Tutorials

Integrations




  • Roboflow: Label and export your custom datasets directly to YOLOv3 for training.
  • ClearML ⭐ NEW: Automatically track, visualize and even remotely train YOLOv3 using ClearML (open source!).
  • Comet ⭐ NEW: Free forever, Comet lets you save YOLOv3 models, resume training, and interactively visualize and debug predictions.
  • Neural Magic ⭐ NEW: Run YOLOv3 inference up to 6x faster with Neural Magic DeepSparse.

Ultralytics HUB

Experience seamless AI with Ultralytics HUB ⭐, the all-in-one solution for data visualization, YOLO 🚀 model training and deployment, without any coding. Transform images into actionable insights and bring your AI visions to life with ease using our cutting-edge platform and user-friendly Ultralytics App. Start your journey for Free now!

Why YOLOv3

YOLOv3 is designed to be easy to get started with and simple to learn. We prioritize real-world results.

YOLOv3-P5 640 Figure

Figure Notes
  • COCO AP val denotes the mAP@0.5:0.95 metric measured on the 5000-image COCO val2017 dataset over various inference sizes from 256 to 1536.
  • GPU Speed measures average inference time per image on the COCO val2017 dataset using an AWS p3.2xlarge V100 instance at batch size 32.
  • EfficientDet data from google/automl at batch size 8.
  • Reproduce by python val.py --task study --data coco.yaml --iou 0.7 --weights yolov5n6.pt yolov5s6.pt yolov5m6.pt yolov5l6.pt yolov5x6.pt

Pretrained Checkpoints

Model       size      mAPval   mAPval   Speed    Speed    Speed     params   FLOPs
            (pixels)  50-95    50       CPU b1   V100 b1  V100 b32  (M)      @640 (B)
                                        (ms)     (ms)     (ms)
YOLOv5n     640       28.0     45.7     45       6.3      0.6       1.9      4.5
YOLOv5s     640       37.4     56.8     98       6.4      0.9       7.2      16.5
YOLOv5m     640       45.4     64.1     224      8.2      1.7       21.2     49.0
YOLOv5l     640       49.0     67.3     430      10.1     2.7       46.5     109.1
YOLOv5x     640       50.7     68.9     766      12.1     4.8       86.7     205.7
YOLOv5n6    1280      36.0     54.4     153      8.1      2.1       3.2      4.6
YOLOv5s6    1280      44.8     63.7     385      8.2      3.6       12.6     16.8
YOLOv5m6    1280      51.3     69.3     887      11.1     6.8       35.7     50.0
YOLOv5l6    1280      53.7     71.3     1784     15.8     10.5      76.8     111.4
YOLOv5x6    1280      55.0     72.7     3136     26.2     19.4      140.7    209.8
 + TTA      1536      55.8     72.7     -        -        -         -        -
Table Notes
  • All checkpoints are trained to 300 epochs with default settings. Nano and Small models use hyp.scratch-low.yaml hyps, all others use hyp.scratch-high.yaml.
  • mAPval values are for single-model single-scale on the COCO val2017 dataset.
    Reproduce by python val.py --data coco.yaml --img 640 --conf 0.001 --iou 0.65
  • Speed averaged over COCO val images using an AWS p3.2xlarge instance. NMS times (~1 ms/img) not included.
    Reproduce by python val.py --data coco.yaml --img 640 --task speed --batch 1
  • TTA (Test Time Augmentation) includes reflection and scale augmentations.
    Reproduce by python val.py --data coco.yaml --img 1536 --iou 0.7 --augment

Segmentation

Our new YOLOv5 release v7.0 instance segmentation models are the fastest and most accurate in the world, beating all current SOTA benchmarks. We've made them super simple to train, validate and deploy. See full details in our Release Notes and visit our YOLOv5 Segmentation Colab Notebook for quickstart tutorials.

Segmentation Checkpoints

We trained YOLOv5 segmentation models on COCO for 300 epochs at image size 640 using A100 GPUs. We exported all models to ONNX FP32 for CPU speed tests and to TensorRT FP16 for GPU speed tests. We ran all speed tests on Google Colab Pro notebooks for easy reproducibility.

Model        size      mAPbox   mAPmask  Train time    Speed     Speed     params   FLOPs
             (pixels)  50-95    50-95    300 epochs    ONNX CPU  TRT A100  (M)      @640 (B)
                                         A100 (hours)  (ms)      (ms)
YOLOv5n-seg  640       27.6     23.4     80:17         62.7      1.2       2.0      7.1
YOLOv5s-seg  640       37.6     31.7     88:16         173.3     1.4       7.6      26.4
YOLOv5m-seg  640       45.0     37.1     108:36        427.0     2.2       22.0     70.8
YOLOv5l-seg  640       49.0     39.9     66:43 (2x)    857.4     2.9       47.9     147.7
YOLOv5x-seg  640       50.7     41.4     62:56 (3x)    1579.2    4.5       88.8     265.7
  • All checkpoints are trained to 300 epochs with the SGD optimizer (lr0=0.01, weight_decay=5e-5) at image size 640 and all default settings.
    Runs logged to https://wandb.ai/glenn-jocher/YOLOv5_v70_official
  • Accuracy values are for single-model single-scale on COCO dataset.
    Reproduce by python segment/val.py --data coco.yaml --weights yolov5s-seg.pt
  • Speed averaged over 100 inference images using a Colab Pro A100 High-RAM instance. Values indicate inference speed only (NMS adds about 1ms per image).
    Reproduce by python segment/val.py --data coco.yaml --weights yolov5s-seg.pt --batch 1
  • Export to ONNX at FP32 and TensorRT at FP16 done with export.py.
    Reproduce by python export.py --weights yolov5s-seg.pt --include engine --device 0 --half
Segmentation Usage Examples

Train

YOLOv5 segmentation training supports auto-download of the COCO128-seg dataset via the --data coco128-seg.yaml argument, and manual download of the COCO-segments dataset with bash data/scripts/get_coco.sh --train --val --segments followed by python train.py --data coco.yaml.

# Single-GPU
python segment/train.py --data coco128-seg.yaml --weights yolov5s-seg.pt --img 640

# Multi-GPU DDP
python -m torch.distributed.run --nproc_per_node 4 --master_port 1 segment/train.py --data coco128-seg.yaml --weights yolov5s-seg.pt --img 640 --device 0,1,2,3

Val

Validate YOLOv5s-seg mask mAP on COCO dataset:

bash data/scripts/get_coco.sh --val --segments  # download COCO val segments split (780MB, 5000 images)
python segment/val.py --weights yolov5s-seg.pt --data coco.yaml --img 640  # validate

Predict

Use pretrained YOLOv5m-seg.pt to predict bus.jpg:

python segment/predict.py --weights yolov5m-seg.pt --data data/images/bus.jpg

model = torch.hub.load(
    "ultralytics/yolov5", "custom", "yolov5m-seg.pt"
)  # load from PyTorch Hub (WARNING: inference not yet supported)

(Example result images: zidane, bus)

Export

Export YOLOv5s-seg model to ONNX and TensorRT:

python export.py --weights yolov5s-seg.pt --include onnx engine --img 640 --device 0

Classification

YOLOv5 release v6.2 brings support for classification model training, validation and deployment! See full details in our Release Notes and visit our YOLOv5 Classification Colab Notebook for quickstart tutorials.

Classification Checkpoints

We trained YOLOv5-cls classification models on ImageNet for 90 epochs using a 4xA100 instance, and we trained ResNet and EfficientNet models alongside them with the same default training settings for comparison. We exported all models to ONNX FP32 for CPU speed tests and to TensorRT FP16 for GPU speed tests. We ran all speed tests on Google Colab Pro for easy reproducibility.

Model            size      acc    acc    Training        Speed     Speed          params  FLOPs
                 (pixels)  top1   top5   90 epochs       ONNX CPU  TensorRT V100  (M)     @224 (B)
                                         4xA100 (hours)  (ms)      (ms)
YOLOv5n-cls      224       64.6   85.4   7:59            3.3       0.5            2.5     0.5
YOLOv5s-cls      224       71.5   90.2   8:09            6.6       0.6            5.4     1.4
YOLOv5m-cls      224       75.9   92.9   10:06           15.5      0.9            12.9    3.9
YOLOv5l-cls      224       78.0   94.0   11:56           26.9      1.4            26.5    8.5
YOLOv5x-cls      224       79.0   94.4   15:04           54.3      1.8            48.1    15.9
ResNet18         224       70.3   89.5   6:47            11.2      0.5            11.7    3.7
ResNet34         224       73.9   91.8   8:33            20.6      0.9            21.8    7.4
ResNet50         224       76.8   93.4   11:10           23.4      1.0            25.6    8.5
ResNet101        224       78.5   94.3   17:10           42.1      1.9            44.5    15.9
EfficientNet_b0  224       75.1   92.4   13:03           12.5      1.3            5.3     1.0
EfficientNet_b1  224       76.4   93.2   17:04           14.9      1.6            7.8     1.5
EfficientNet_b2  224       76.6   93.4   17:10           15.9      1.6            9.1     1.7
EfficientNet_b3  224       77.7   94.0   19:19           18.9      1.9            12.2    2.4
Table Notes
  • All checkpoints are trained to 90 epochs with the SGD optimizer (lr0=0.001, weight_decay=5e-5) at image size 224 and all default settings.
    Runs logged to https://wandb.ai/glenn-jocher/YOLOv5-Classifier-v6-2
  • Accuracy values are for single-model single-scale on ImageNet-1k dataset.
    Reproduce by python classify/val.py --data ../datasets/imagenet --img 224
  • Speed averaged over 100 inference images using a Google Colab Pro V100 High-RAM instance.
    Reproduce by python classify/val.py --data ../datasets/imagenet --img 224 --batch 1
  • Export to ONNX at FP32 and TensorRT at FP16 done with export.py.
    Reproduce by python export.py --weights yolov5s-cls.pt --include engine onnx --imgsz 224
Classification Usage Examples

Train

YOLOv5 classification training supports auto-download of MNIST, Fashion-MNIST, CIFAR10, CIFAR100, Imagenette, Imagewoof, and ImageNet datasets with the --data argument. To start training on MNIST for example use --data mnist.

# Single-GPU
python classify/train.py --model yolov5s-cls.pt --data cifar100 --epochs 5 --img 224 --batch 128

# Multi-GPU DDP
python -m torch.distributed.run --nproc_per_node 4 --master_port 1 classify/train.py --model yolov5s-cls.pt --data imagenet --epochs 5 --img 224 --device 0,1,2,3

Val

Validate YOLOv5m-cls accuracy on ImageNet-1k dataset:

bash data/scripts/get_imagenet.sh --val  # download ImageNet val split (6.3G, 50000 images)
python classify/val.py --weights yolov5m-cls.pt --data ../datasets/imagenet --img 224  # validate

Predict

Use pretrained YOLOv5s-cls.pt to predict bus.jpg:

python classify/predict.py --weights yolov5s-cls.pt --data data/images/bus.jpg

model = torch.hub.load(
    "ultralytics/yolov5", "custom", "yolov5s-cls.pt"
)  # load from PyTorch Hub

Export

Export a group of trained YOLOv5s-cls, ResNet and EfficientNet models to ONNX and TensorRT:

python export.py --weights yolov5s-cls.pt resnet50.pt efficientnet_b0.pt --include onnx engine --img 224

Environments

Get started in seconds with our verified environments.

Contribute

We love your input! We want to make contributing to YOLOv3 as easy and transparent as possible. Please see our Contributing Guide to get started, and fill out the YOLOv3 Survey to send us feedback on your experiences. Thank you to all our contributors!

License

Ultralytics offers two licensing options to accommodate diverse use cases:

  • AGPL-3.0 License: This OSI-approved open-source license is ideal for students and enthusiasts, promoting open collaboration and knowledge sharing. See the LICENSE file for more details.
  • Enterprise License: Designed for commercial use, this license permits seamless integration of Ultralytics software and AI models into commercial goods and services, bypassing the open-source requirements of AGPL-3.0. If your scenario involves embedding our solutions into a commercial offering, reach out through Ultralytics Licensing.

Contact

For YOLOv3 bug reports and feature requests please visit GitHub Issues, and join our Discord community for questions and discussions!



yolov3's People

Contributors

adrianboguszewski, changhsinlee, d-j-kendall, dependabot[bot], developer0hye, dsuess, e96031413, falaktheoptimist, fatihbaltaci, franciscoreveriano, gabrielbianconi, glenn-jocher, googlewiki, guigarfr, ilyaovodov, jas-nat, jrmh96, jveitchmichaelis, lincoce, linzzzzzz, lukeai, nanocode012, nirzarrabi, ownmarc, pderrenger, perry0418, pre-commit-ci[bot], roulbac, s-mohaghegh97, ttayu


yolov3's Issues

Training code

Is the following for loop necessary? Except for the last batch, len(imgs) = n, so j can only be 0 in the loop. In the last batch, if len(imgs) is smaller than n, then int(len(imgs) / n) = 0 and the loop is skipped; otherwise len(imgs) = n and j can again only be 0.

yolov3/train.py

Lines 118 to 119 in 68de92f

n = opt.batch_size  # number of pictures at a time
for j in range(int(len(imgs) / n)):
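
A runnable demonstration of the reasoning above (hypothetical values; 16 stands in for opt.batch_size):

n = 16  # stand-in for opt.batch_size
for num_imgs in (16, 7):  # a full batch, then a short final batch
    print(num_imgs, list(range(int(num_imgs / n))))  # -> 16 [0]  /  7 []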

Checkpoint from PyTorch-trained model?

Thank you for this awesome repository.
Due to some changes in the training scheme vs. the original darknet code, I wonder if the provided PyTorch weights are converted from the original .weights file, or rather are the results of a fresh training session in PyTorch.
If it's the former, it would be wonderful if the results of PyTorch training could be provided as well!

Thanks!

Lines 9-20 in train.py may not work

import test on line 24 of train.py causes lines 9-20 to misbehave: the argument parsing at the top of test.py runs at import time, so two namespaces are printed:

Namespace(batch_report=False, batch_size=16, cfg='cfg/yolov2.cfg', data_config_path='cfg/coco.data', epochs=100, freeze_darknet53=False, img_size=416, optimizer='SGD', resume=False, var=0)

Namespace(batch_size=32, cfg='cfg/yolov3.cfg', class_path='data/coco.names', conf_thres=0.3, data_config_path='cfg/coco.data', img_size=416, iou_thres=0.5, n_cpu=0, nms_thres=0.45, weights_path='weights/yolov3.pt')

Putting lines 7-19 of test.py inside an if __name__ == '__main__': guard would be better.
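
A minimal sketch of the suggested guard (the single option shown is abbreviated from the namespace above):

# test.py
import argparse

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--batch_size', type=int, default=32)  # remaining options omitted for brevity
    opt = parser.parse_args()
    print(opt)
    # run the test here; `import test` from train.py no longer parses or prints this namespace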

Train error

The following error occurred while I was training on COCO.

Traceback (most recent call last):
  File "/project/yolov3/train.py", line 202, in <module>
    main(opt)
  File "/project/yolov3/train.py", line 132, in main
    loss = model(imgs.to(device), targets, requestPrecision=True)
  File "/data_b/VirEnv/project/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/data_b/VirEnv/project/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 123, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/data_b/VirEnv/project/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 133, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/data_b/VirEnv/project/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 77, in parallel_apply
    raise output
  File "/data_b/VirEnv/project/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 53, in _worker
    output = module(*input, **kwargs)
  File "/data_b/VirEnv/project/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/project/yolov3/models.py", line 238, in forward
    x, *losses = module[0](x, targets, requestPrecision)
  File "/data_b/VirEnv/project/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/project/yolov3/models.py", line 156, in forward
    requestPrecision)
  File "/project/yolov3/utils/utils.py", line 278, in build_targets
    tmp = pred_cls[b, a, gj, gi]
IndexError: index 8 is out of bounds for dimension 0 with size 8

Nothing was detected

When I load the trained model and run detect.py, no bounding boxes are detected. I am very confused; can anyone give me some suggestions to solve this problem? Thanks.

rloss['nT'] is zero when training

when training:

Traceback (most recent call last):
  File "train.py", line 211, in <module>
    main(opt)
  File "train.py", line 177, in main
    loss_per_target = rloss['loss'] / rloss['nT']
ZeroDivisionError: float division by zero
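
A minimal defensive sketch (variable names follow the traceback; the guard itself is an assumption, not the repo's fix):

# Avoid dividing by zero when no targets were accumulated yet.
n_targets = rloss['nT']
loss_per_target = rloss['loss'] / n_targets if n_targets > 0 else 0.0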

Darknet Training Comparison

All, I've started training using the official darknet repo to compare. The first things I noticed:

  1. Darknet training speed appears quite slow. In darknet's yolov3.cfg, max_batches = 500200 is the total training length and batch=64 is the number of images per batch, so training will take about 28 days on a GCP P100 at roughly 18,000 batches per day (all training settings at default).
  2. Darknet appears set to train for 267 epochs. This is 500200 batches times 64 images per batch divided by 120,000 images in the training set. Can this be right? This seems like a lot.
  3. Darknet uses multi-scale training, changing the image size every 10 batches. I've set this behavior in this repo as well if multi_scale = True in train.py (though currently this changes the size every batch); a sketch of the intended behavior follows.
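
A minimal sketch of darknet-style multi-scale selection (the 320-608 range in steps of 32 is darknet's default; the loop is illustrative):

import random

img_size = 416  # current network input size
for batch_i in range(100):
    if batch_i % 10 == 0:  # darknet re-rolls the resolution every 10 batches
        img_size = random.randrange(320, 608 + 1, 32)  # multiples of 32
    # resize this batch to (img_size, img_size) before the forward pass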

VOC mAP

Hi glenn-jocher:
When I ran train.py on PASCAL VOC2007 (about 160 epochs), I got 87% recall and 85% precision, but when I ran test.py on the PASCAL VOC2007 test set I only got 0.62 mAP. mAP stopped growing around epoch 100, staying near 0.62. How can I further improve mAP?

Loss Constants: _coord, _obj and _noobj

The correct YOLO v3 loss constants are:

lambda_coord = 1.0
lambda_obj = 1.0
lambda_noobj = 1.0

rather than the below constants, which derive from the original yolov1.cfg file:
https://github.com/pjreddie/darknet/blob/61c9d02ec461e30d55762ec7669d6a1d3c356fb2/cfg/yolov1.cfg#L257-L260

lambda_coord = 5.0
lambda_obj = 1.0
lambda_noobj = 0.5

The latest yolov3 constants appear to be hard-coded into parser.c rather than set in yolov3.cfg. Credit to @ydixon, who originally noticed this discrepancy in issue #12.
https://github.com/pjreddie/darknet/blob/680d3bde1924c8ee2d1c1dea54d3e56a05ca9a26/src/parser.c#L376-L381

RuntimeError: invalid argument 2: size '[16 x 3 x 6 x 10 x 10]' is invalid for input with 408000 elements at /opt/conda/conda-bld/pytorch_1535491974311/work/aten/src/TH/THStorage.cpp:84

Dear @glenn-jocher,
I am facing an issue when training on my dataset with your code.
Dataset: 1 class
pytorch: 0.4.1
Ubuntu 16.04
1 GPU
+++++++++++++++++++++++++++++++++++++++++++++++
The logs are:
Traceback (most recent call last):
  File "train.py", line 198, in <module>
    main(opt)
  File "train.py", line 132, in main
    loss = model(imgs.to(device), targets, requestPrecision=True)
  File "/home/khanhdd/anaconda3/envs/actiondetectionYOLO/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/khanhdd/KhanhWorkSpace/realtime-action-detection/YOLOv3_Training/models.py", line 237, in forward
    x, *losses = module[0](x, targets, requestPrecision)
  File "/home/khanhdd/anaconda3/envs/actiondetectionYOLO/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/khanhdd/KhanhWorkSpace/realtime-action-detection/YOLOv3_Training/models.py", line 117, in forward
    p = p.view(bs, self.nA, self.bbox_attrs, nG, nG).permute(0, 1, 3, 4, 2).contiguous()  # prediction
RuntimeError: invalid argument 2: size '[16 x 3 x 6 x 10 x 10]' is invalid for input with 408000 elements at /opt/conda/conda-bld/pytorch_1535491974311/work/aten/src/TH/THStorage.cpp:84
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Please give me your advice about this problem.
Thank you,
Khanh

PyCharm Printing Numpy Arrays (IndexError: tuple index out of range)

When I print(labels) in datasets.py at line 143, there is a problem:
I cannot print(labels), but I can print(labels0[0][0]) and print(labels0.shape).

# Load labels

        if os.path.isfile(label_path):
            labels0 = np.loadtxt(label_path, dtype=np.float32).reshape(-1, 5)
            print(labels0.shape)    # 143
            print(labels0[0][0])    # 144
            print(labels)           # 145
            exit()

#############

Traceback (most recent call last):
  File "train.py", line 193, in <module>
    main(opt)
  File "train.py", line 116, in main
    for i, (imgs, targets) in enumerate(dataloader):
  File "/home/chenfei/Downloads/yolov3-master1/utils/datasets.py", line 143, in __next__
    print(labels)
  File "/home/chenfei/anaconda3/lib/python3.6/site-packages/numpy/core/arrayprint.py", line 1504, in array_str
    return array2string(a, max_line_width, precision, suppress_small, ' ', "")
  File "/home/chenfei/anaconda3/lib/python3.6/site-packages/numpy/core/arrayprint.py", line 668, in array2string
    return _array2string(a, options, separator, prefix)
  File "/home/chenfei/anaconda3/lib/python3.6/site-packages/numpy/core/arrayprint.py", line 460, in wrapper
    return f(self, *args, **kwargs)
  File "/home/chenfei/anaconda3/lib/python3.6/site-packages/numpy/core/arrayprint.py", line 495, in _array2string
    summary_insert, options['legacy'])
  File "/home/chenfei/anaconda3/lib/python3.6/site-packages/numpy/core/arrayprint.py", line 796, in _formatArray
    curr_width=line_width)
  File "/home/chenfei/anaconda3/lib/python3.6/site-packages/numpy/core/arrayprint.py", line 750, in recurser
    word = recurser(index + (-i,), next_hanging_indent, next_width)
  File "/home/chenfei/anaconda3/lib/python3.6/site-packages/numpy/core/arrayprint.py", line 704, in recurser
    return format_function(a[index])
IndexError: tuple index out of range

I have a question about 'x1y1x2y2' in 'bbox_iou'

Thanks a lot for sharing your project.
I have a small question about the function bbox_iou in utils/utils.py.
Line 174: def bbox_iou(box1, box2, x1y1x2y2=True):
I find that the yolo-darknet53 model output is in 'xywh' format, but here you set x1y1x2y2=True.
And at line 400: ious = bbox_iou(max_detections[-1], detections_class[1:]), there is no argument passed to change the x1y1x2y2 value.
I manually changed it from True to False, but detection mAP declined.
Could you tell me the reason?

Thank you very much!

IoU step in build_targets compared to Darknet implementation

Hi @glenn-jocher,

Thanks for this wonderful port of Yolo v3. I had two questions, however, about the matching step in build_targets -- where you compute which anchor box corresponds to each ground truth box.

You seem to be computing IoU using only the width and height of each anchor box against each target. Darknet doesn't appear to do this -- if I'm reading the implementation correctly, it iterates through the grid cells and computes the IoU using the X and Y of each cell. Is there a reason you compute IoU using width and height only? Optimization?

Also during this step, the ignore_threshold is set to 0.5 in the Darknet paper, and 0.7 in the Darknet implementation, while you seem to be using 0.1 in build_targets. Is there a reason for that?

Thanks!
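
For context, a minimal sketch of the width-height-only matching described above (a hypothetical helper, not necessarily the repo's exact code):

import torch

def wh_iou(wh1, wh2):
    # IoU of boxes assumed to share a common center, so overlap depends only on w and h.
    # wh1: [N, 2] anchor sizes, wh2: [M, 2] target sizes -> [N, M] IoU matrix.
    wh1 = wh1[:, None]   # [N, 1, 2]
    wh2 = wh2[None]      # [1, M, 2]
    inter = torch.min(wh1, wh2).prod(2)
    return inter / (wh1.prod(2) + wh2.prod(2) - inter)

print(wh_iou(torch.tensor([[10., 13.]]), torch.tensor([[10., 13.], [33., 23.]])))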

different training results

Hi,
I started to train yolov3 using 1 GPU without changing your code, and I got the graphs below, which are all slightly different from your results. The shapes are roughly the same but the values are all in a different range. I am a bit confused; it would be great if you could point me in the right direction. Thank you!

(figure: training results)

mean_mAP issue

I'm training on VOC2007; after 1 epoch, an error shows up:

F:\pytorch-yolov3-master-ul\test.py:122: RuntimeWarning: invalid value encountered in double_scalars
print('%15s: %-.4f' % (c, AP_accum[i] / AP_accum_count[i]))
aeroplane: nan
bicycle: nan
bird: nan
boat: nan
bottle: nan
bus: nan
car: nan
cat: nan
chair: nan
cow: nan
diningtable: nan
dog: nan
horse: nan
motorbike: nan
person: nan
pottedplant: nan
sheep: nan
sofa: nan
train: nan
tvmonitor: nan
Traceback (most recent call last):
  File "train.py", line 268, in <module>
    var=opt.var,
  File "train.py", line 224, in train
    img_size=img_size,
  File "F:\pytorch-yolov3-master-ul\test.py", line 125, in test
    return mean_mAP, mean_R, mean_P
UnboundLocalError: local variable 'mean_mAP' referenced before assignment

What's the problem?

Path and File Separators

I ran detect.py but got nothing in the output folder, so I changed results_img_path and results_txt_path as follows:

results_img_path = os.path.join(output, path.split('/')[-1].split('\\')[-1])
results_txt_path = results_img_path.split('.')[-2] + '.txt'

Is this a small bug?

P.S. I'm a rookie in deep learning, no offense.
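
A portable alternative is to let pathlib handle the host OS's separators; a minimal sketch (the two variables are stand-ins for the corresponding names in detect.py):

from pathlib import Path

path = 'data/samples/img1.jpg'  # stand-in for the loop variable in detect.py
output = 'output'               # stand-in for the output directory

results_img_path = Path(output) / Path(path).name        # output/img1.jpg
results_txt_path = results_img_path.with_suffix('.txt')  # output/img1.txt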

Sum False Positives from unassigned anchors

yolov3/models.py

Lines 207 to 213 in fd6619d

# Sum False Positives from unassigned anchors
FPe = torch.zeros(self.nC)
if batch_report:
    i = torch.sigmoid(pred_conf[~mask]) > 0.5
    if i.sum() > 0:
        FP_classes = torch.argmax(pred_cls[~mask][i], 1)
        FPe = torch.bincount(FP_classes, minlength=self.nC).float().cpu()  # extra FPs

Can somebody explain this?

The pretrained weights on ImageNet

@glenn-jocher
Hi, first of all, thank you very much for your code. I have been training for 24 hours (10 epochs) on my GTX 1080, but it seems that I can't load the ImageNet pretrained weights (the darknet53.conv file), and it takes a long time to train from scratch. TAT

Own dataset doesn't work on latest commit

For some reason I can't seem to train my own dataset on the latest commit, though I am able to from an earlier commit, e.g. this state. In that state, if I run my training (with the exact same cfg files, dataset, etc.), I get these results after a couple of epochs:

      Epoch      Batch          x          y          w          h       conf        cls      total          P          R   nTargets         TP         FP         FN       time
       0/99      99/99       1.49       1.47       7.46       12.8        111       7.38        141          0          0          3          0    2.9e+03          0      0.124
       1/99      99/99       1.26       1.19       1.99       3.07       16.1       7.35       30.9          0          0         10          0          1          8      0.131
       2/99      99/99       1.04       1.02      0.831       1.08       4.64       7.25       15.9          0          0          7          0          3          4      0.129
       3/99      99/99      0.756      0.796      0.666       0.79       3.67       7.25       13.9   0.000769    0.00187         10          0          2          7      0.129
       4/99      99/99       0.58      0.683      0.574      0.739       2.93       7.15       12.7    0.00314     0.0167          3          0          3          1      0.129
       5/99      99/99      0.455       0.54      0.462       0.64       2.62       7.14       11.9    0.00636     0.0221          6          0          6          1      0.132

If I use the latest commit, I get this:

      Epoch      Batch          x          y          w          h       conf        cls      total          P          R   nTargets         TP         FP         FN       time
       0/99      99/99       1.49       1.47       7.41       12.6        111       7.38        141          0          0          3          0          0          0      0.117
       1/99      99/99       4.91       4.93        nan        nan        nan        nan        nan          0          0         10          0          0          0      0.125
       2/99      99/99       5.52       5.24        nan        nan        nan        nan        nan          0          0          7          0          0          0      0.124
       3/99      99/99       5.23       5.21        nan        nan        nan        nan        nan          0          0         10          0          0          0      0.129
       4/99      99/99       5.35       5.17        nan        nan        nan        nan        nan          0          0          3          0          0          0      0.125
       5/99      99/99       5.65       5.41        nan        nan        nan        nan        nan          0          0          6          0          0          0      0.124
       6/99      99/99       5.13       5.13        nan        nan        nan        nan        nan          0          0          9          0          0          0      0.126

Also in the latest commit, line 197 of train.py causes the following error:

Traceback (most recent call last):
  File "/home/rick/Documents/yolov3-master/train.py", line 208, in <module>
    main(opt)
  File "/home/rick/Documents/yolov3-master/train.py", line 195, in main
    mAP, R, P = test.main(test.opt)
  File "/home/rick/Documents/yolov3-master/test.py", line 42, in main
    model.load_state_dict(checkpoint['model'])
  File "/media/rick/HDD/Env/yolov3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 719, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for Darknet:
	size mismatch for module_list.81.conv_81.weight: copying a param of torch.Size([255, 1024, 1, 1]) from checkpoint, where the shape is torch.Size([303, 1024, 1, 1]) in current model.
	size mismatch for module_list.81.conv_81.bias: copying a param of torch.Size([255]) from checkpoint, where the shape is torch.Size([303]) in current model.
	size mismatch for module_list.93.conv_93.weight: copying a param of torch.Size([255, 512, 1, 1]) from checkpoint, where the shape is torch.Size([303, 512, 1, 1]) in current model.
	size mismatch for module_list.93.conv_93.bias: copying a param of torch.Size([255]) from checkpoint, where the shape is torch.Size([303]) in current model.
	size mismatch for module_list.105.conv_105.weight: copying a param of torch.Size([255, 256, 1, 1]) from checkpoint, where the shape is torch.Size([303, 256, 1, 1]) in current model.
	size mismatch for module_list.105.conv_105.bias: copying a param of torch.Size([255]) from checkpoint, where the shape is torch.Size([303]) in current model.

So I replaced it with (the old) code:

        with open('results.txt', 'a') as file:
            file.write(s + '\n')

I don't know if this is normal, but I can't seem to find a solution.
Do you know why I get nan while using the exact same cfg files and data? My txt files for each image are spot on; the bounding box x, y, width, and height are all relative to the image width and height.

Multi-GPU Training

Hi,
Have you tried running training on multiple GPUs?
I get the error below when I try to do that. Thank you.

Traceback (most recent call last):
  File "train.py", line 194, in <module>
    main(opt)
  File "train.py", line 128, in main
    loss = model(imgs, targets, requestPrecision=True)
  File "/opt/anaconda/envs/pytorch_p35/lib/python3.5/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/anaconda/envs/pytorch_p35/lib/python3.5/site-packages/torch/nn/parallel/data_parallel.py", line 123, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/opt/anaconda/envs/pytorch_p35/lib/python3.5/site-packages/torch/nn/parallel/data_parallel.py", line 133, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/opt/anaconda/envs/pytorch_p35/lib/python3.5/site-packages/torch/nn/parallel/parallel_apply.py", line 77, in parallel_apply
    raise output
  File "/opt/anaconda/envs/pytorch_p35/lib/python3.5/site-packages/torch/nn/parallel/parallel_apply.py", line 53, in _worker
    output = module(*input, **kwargs)
  File "/opt/anaconda/envs/pytorch_p35/lib/python3.5/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
TypeError: forward() missing 1 required positional argument: 'x'

Darknet Polynomial LR Curve

I found darknet's polynomial learning rate curve here:

case POLY:
    return net->learning_rate * pow(1 - (float)batch_num / net->max_batches, net->power);

https://github.com/pjreddie/darknet/blob/680d3bde1924c8ee2d1c1dea54d3e56a05ca9a26/src/network.c#L111

If I use power = 4 from parser.c then I plot the following curve (in MATLAB), assuming max_batches = 1563360 (160 epochs at batch_size 12, i.e. 9771 batches/epoch). This leaves the final lr(1563360) = 0, which means it is impossible to begin training from the official YOLOv3 weights and resume at lr = 0.001 without problems: the model will clearly bounce out of its local minimum back into the huge gradients it first saw at epoch 0.

>> batch = 0:(9771*160);
>> lr = 1e-3 * (1 - batch./1563360).^4;
>> fig; plot(batch,lr,'.-'); xyzlabel('batch','learning rate'); fcnfontsize(14); fcntight;
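
For reference, the same curve in Python (same constants as the MATLAB snippet above):

import numpy as np

max_batches = 9771 * 160                    # 160 epochs at 9771 batches/epoch
batch = np.arange(max_batches + 1)
lr = 1e-3 * (1 - batch / max_batches) ** 4  # darknet POLY schedule with power = 4
print(lr[0], lr[max_batches // 2], lr[-1])  # 0.001 -> 6.25e-05 -> 0.0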

Cuda out of memory while training

Hi,

first, thanks for your work here.

I have a problem: whenever I move from epoch 0 to epoch 1 I get a "CUDA out of memory" error.
I decreased the batch size to 1 and still get the error; the first epoch runs fine at batch sizes of 8 and below.

I am training on a custom dataset. My image sizes vary.

Running it on a GTX 1070.

Thanks in advance.

Edit:
multi_scale is set to false.
While training, my GTX's memory usage is 2445/8116 MiB.
After the first epoch the VRAM usage bloats; I could only check it mid epoch-change, and it was nearly completely used before it ran out of memory again. What runs between epochs that is so memory-intensive?

build_targets function

Can you explain why you have used the following constants? I have inspected a few different yolov3 implementations, but none had a similar operation.

u = gi.float() * 0.4361538773074043 + gj.float() * 0.28012496588736746 + a.float() * 0.6627147212460307

5k.txt / 5k.part file extension

When I run:
~/PycharmProjects/yolov3-master$ python test.py -weights_path checkpoints/latest.pt
there is an error at line 218 of datasets.py:

img_all = np.stack(img_all)[:, :, :, ::-1].transpose(0, 3, 1, 2)  # row 218 (datasets.py)

Traceback (most recent call last):
  File "test.py", line 59, in <module>
    for batch_i, (imgs, targets) in enumerate(dataloader):
  File "/home/chenfei/PycharmProjects/yolov3-master/utils/datasets.py", line 218, in __next__
    img_all = np.stack(img_all)[:, :, :, ::-1].transpose(0, 3, 1, 2)  # BGR to RGB and cv2 to pytorch
  File "/home/chenfei/anaconda3/lib/python3.6/site-packages/numpy/core/shape_base.py", line 349, in stack
    raise ValueError('need at least one array to stack')
ValueError: need at least one array to stack

Can you help me?

Unexpected key(s) in state_dict when running test.py

Hi,
Thank you very much for the code, but when I run test.py with yolov3.pt/latest.pt I'm getting the error below.
File "test.py", line 40, in <module> model.load_state_dict(checkpoint['model']) File "/home/xxiaofan/.local/lib/python3.5/site-packages/torch/nn/modules/module.py", line 721, in load_state_dict self.__class__.__name__, "\n\t".join(error_msgs))) RuntimeError: Error(s) in loading state_dict for Darknet: Unexpected key(s) in state_dict: "module_list.0.batch_norm_0.num_batches_tracked", "module_list.1.batch_norm_1.num_batches_tracked", "module_list.2.batch_norm_2.num_batches_tracked", "module_list.3.batch_norm_3.num_batches_tracked", "module_list.5.batch_norm_5.num_batches_tracked", "module_list.6.batch_norm_6.num_batches_tracked", "module_list.7.batch_norm_7.num_batches_tracked", "module_list.9.batch_norm_9.num_batches_tracked", "module_list.10.batch_norm_10.num_batches_tracked", "module_list.12.batch_norm_12.num_batches_tracked", "module_list.13.batch_norm_13.num_batches_tracked", "module_list.14.batch_norm_14.num_batches_tracked", "module_list.16.batch_norm_16.num_batches_tracked", "module_list.17.batch_norm_17.num_batches_tracked", "module_list.19.batch_norm_19.num_batches_tracked", "module_list.20.batch_norm_20.num_batches_tracked", "module_list.22.batch_norm_22.num_batches_tracked", "module_list.23.batch_norm_23.num_batches_tracked", "module_list.25.batch_norm_25.num_batches_tracked", "module_list.26.batch_norm_26.num_batches_tracked", "module_list.28.batch_norm_28.num_batches_tracked", "module_list.29.batch_norm_29.num_batches_tracked", "module_list.31.batch_norm_31.num_batches_tracked", "module_list.32.batch_norm_32.num_batches_tracked", "module_list.34.batch_norm_34.num_batches_tracked", "module_list.35.batch_norm_35.num_batches_tracked", "module_list.37.batch_norm_37.num_batches_tracked", "module_list.38.batch_norm_38.num_batches_tracked", "module_list.39.batch_norm_39.num_batches_tracked", "module_list.41.batch_norm_41.num_batches_tracked", "module_list.42.batch_norm_42.num_batches_tracked", "module_list.44.batch_norm_44.num_batches_tracked", "module_list.45.batch_norm_45.num_batches_tracked", "module_list.47.batch_norm_47.num_batches_tracked", "module_list.48.batch_norm_48.num_batches_tracked", "module_list.50.batch_norm_50.num_batches_tracked", "module_list.51.batch_norm_51.num_batches_tracked", "module_list.53.batch_norm_53.num_batches_tracked", "module_list.54.batch_norm_54.num_batches_tracked", "module_list.56.batch_norm_56.num_batches_tracked", "module_list.57.batch_norm_57.num_batches_tracked", "module_list.59.batch_norm_59.num_batches_tracked", "module_list.60.batch_norm_60.num_batches_tracked", "module_list.62.batch_norm_62.num_batches_tracked", "module_list.63.batch_norm_63.num_batches_tracked", "module_list.64.batch_norm_64.num_batches_tracked", "module_list.66.batch_norm_66.num_batches_tracked", "module_list.67.batch_norm_67.num_batches_tracked", "module_list.69.batch_norm_69.num_batches_tracked", "module_list.70.batch_norm_70.num_batches_tracked", "module_list.72.batch_norm_72.num_batches_tracked", "module_list.73.batch_norm_73.num_batches_tracked", "module_list.75.batch_norm_75.num_batches_tracked", "module_list.76.batch_norm_76.num_batches_tracked", "module_list.77.batch_norm_77.num_batches_tracked", "module_list.78.batch_norm_78.num_batches_tracked", "module_list.79.batch_norm_79.num_batches_tracked", "module_list.80.batch_norm_80.num_batches_tracked", "module_list.84.batch_norm_84.num_batches_tracked", "module_list.87.batch_norm_87.num_batches_tracked", "module_list.88.batch_norm_88.num_batches_tracked", "module_list.89.batch_norm_89.num_batches_tracked", 
"module_list.90.batch_norm_90.num_batches_tracked", "module_list.91.batch_norm_91.num_batches_tracked", "module_list.92.batch_norm_92.num_batches_tracked", "module_list.96.batch_norm_96.num_batches_tracked", "module_list.99.batch_norm_99.num_batches_tracked", "module_list.100.batch_norm_100.num_batches_tracked", "module_list.101.batch_norm_101.num_batches_tracked", "module_list.102.batch_norm_102.num_batches_tracked", "module_list.103.batch_norm_103.num_batches_tracked", "module_list.104.batch_norm_104.num_batches_tracked".

Did anybody train voc with this code?

I've trained COCO with this code and the result is impressive, so I want to try VOC with it. I made train_list.txt as: cls_name x_center y_center width height. But the result is not as good as I expected. Could anybody who has successfully trained VOC give me some suggestions?

Classification Loss: CE vs BCE

When developing the training code I found that replacing Binary Cross Entropy (BCE) loss with Cross Entropy (CE) loss significantly improves Precision, Recall and mAP. All show about 2X improvements using CE, though the YOLOv3 paper states these loss terms as BCE in darknet.

The two loss terms are on lines 162 and 163 of models.py. If anyone has any insight into this phenomenon I'd be very interested to hear it. For now you can swap the two back and forth. Note that SGD does not converge using either BCE or CE, so that issue appears independent of this one.

(figure: CE vs BCE training comparison)
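
For reference, a runnable sketch of the two interchangeable class-loss terms being compared (dummy tensors stand in for the pred_cls and tcls variables in models.py; shapes are illustrative):

import torch
import torch.nn as nn

nM, nC = 8, 80                           # matched anchors, number of classes
pred_cls = torch.randn(nM, nC)           # raw class logits for matched anchors
tcls = torch.zeros(nM, nC)               # one-hot class targets
tcls[torch.arange(nM), torch.randint(nC, (nM,))] = 1.0

lcls_bce = nn.BCEWithLogitsLoss()(pred_cls, tcls)                 # per-class sigmoid (darknet / YOLOv3 paper)
lcls_ce = nn.CrossEntropyLoss()(pred_cls, torch.argmax(tcls, 1))  # softmax over classes; CE wants indices
print(float(lcls_bce), float(lcls_ce))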

tiny-yolo needed!

Could you please add code for tiny-yolo?
I tried tiny-yolo.cfg and got a very bad result, mAP = 0.04, after training for about 50 epochs.

mAP Computation in test.py

COCO2014 mAP computation on the official YOLOv3 weights matches the expected value of 0.58 (same as darknet), but mAP computed on trained checkpoints seems higher than it should be. In particular, many false positives do not seem to negatively impact mAP.

For example, validation image 2 should have 4 people and 1 baseball bat. At epoch 37, I see ~140 objects detected. Precision and recall look like this:
(figure: precision and recall)

The precision-recall curve looks like this:
(figure: precision-recall curve)

AP for this image is then calculated as 0.78, which is strangely high for 4 TPs and ~140 FPs.

AP = compute_ap(recall, precision)
Out[66]: 0.78596

Lastly, I believe mAP is supposed to be calculated per class in each image, but here all the classes seem combined.

Can't seem to load weights from a custom training session.

I've got a model trained using darknet on a custom dataset containing 43 classes. This requires changing the number of filters in 3 conv layers from 255 to 144, i.e. (43 classes + 5) x 3 anchors (and of course the number of classes, although this shouldn't affect the bug I'm describing).

Alas, when loading this weight file (with the relevant cfg file), the network loading part crashes due to the weights file 'ending' before it should.
The actual error is generated by:
conv_w = torch.from_numpy(weights[ptr:ptr + num_w]).view_as(conv_layer.weight)

While the error is:

RuntimeError: invalid argument 2: size '[144 x 256 x 1 x 1]' is invalid for input with 36863 elements at /opt/conda/conda-bld/pytorch_1535491974311/work/aten/src/TH/THStorage.cpp:84

Now, the size of weights is 61802511, and ptr+num_w is 61802512 - i.e. it's one float shorter.
I'll note that this doesn't happen when loading the original yolov3.weights file that comes with the original Darknet/YOLOv3 repository (in that one, the numbers perfectly align).

I can't think of a cause other than something causing the counter to advance one index too fast, but can't think of a reason why this would happen. Any ideas?

Thanks!

Resume training from official yolov3 weights

Thanks for your improvements to this YOLOv3 implementation.
I have just tested training and hit some problems.
I followed these steps:

  1. Load the original yolov3.weights into the model.
  2. Train it on COCO 2014 with your train.py.
  3. Got the following logs: precision drops fast from 0.5 to 0.1, but recall rises to 0.35.
     (screenshot: training log)
  4. I saved the weights at precision 0.2 and ran detect.py; the result looks like this:
     (image: 000000000019, many spurious boxes)
     If I do not train, the original weights give this result:
     (image: 000000000019, correct detections)

I do not know whether I used wrong parameters or something else that leads to the generation of so many bboxes.
Could you give me some suggestions?
Thank you~

Optimizer Choice: SGD vs Adam

When developing the training code I found that SGD caused divergence very quickly at the default LR of 1e-4: loss terms began to grow exponentially, becoming Inf within about 10 batches of starting training.

Adam always seems to converge in contrast, which is why I use it as the default optimizer in train.py. I don't understand why Adam works and SGD does not, as darknet uses SGD successfully. This is one of the key differences between darknet and this repo, so any insights into how we can get SGD to converge would be appreciated.

It might be that I simply don't have the proper learning rate (and scheduler) in place.

line 82 in train.py

# optimizer = torch.optim.SGD(model.parameters(), lr=.001, momentum=.9, weight_decay=5e-4)
optimizer = torch.optim.Adam(filter(lambda p: p.requires_grad, model.parameters()), lr=1e-4, weight_decay=5e-4)
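
For anyone experimenting: darknet ramps the LR up over an initial "burn-in" period before applying the full rate, and a sketch of that idea follows (this is an assumption about a possible remedy, not this repo's code; the stand-in model is hypothetical):

import torch

model = torch.nn.Linear(10, 10)  # stand-in for the Darknet model
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9, weight_decay=5e-4)

burn_in = 1000  # batches; darknet's yolov3.cfg uses burn_in=1000

def burn_in_lr(batches_done, base_lr=1e-3, power=4):
    # LR climbs from ~0 to base_lr over the first `burn_in` batches, then stays flat.
    return base_lr * min((batches_done / burn_in) ** power, 1.0)

for g in optimizer.param_groups:
    g['lr'] = burn_in_lr(100)  # update each batch inside the training loop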

Something about mAP of COCO API

Hello, sorry to trouble you. I trained on the COCO dataset myself and got a test mAP of about 55.9%; I can't reach 58% (batch size = 20, 80 training epochs). Do you have any advice? I also evaluated your module with yolov3.weights, using the COCO API to compute mAP, and it comes out much lower than the reported 57.9%. I found there may be a difference in how resizing is handled: darknet defines its own function to resize the image to 416x416 and fills the space around the image with the value 0.5, but this PyTorch module uses 0.502, and other pixel values differ as well. When I changed the resize function in the PyTorch module to match darknet, it produced the same detection results as darknet. Sorry, my English is bad!
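
A minimal letterbox sketch of the resize-and-pad scheme being compared (the 0.5 pad value follows the darknet behavior described above; the function itself is illustrative, not the repo's exact code):

import cv2
import numpy as np

def letterbox(img, size=416, pad_value=0.5):
    # img: float32 HxWx3 array scaled to [0, 1]; resize keeping aspect
    # ratio, then pad to a square size x size canvas filled with pad_value.
    h, w = img.shape[:2]
    r = size / max(h, w)
    nh, nw = int(round(h * r)), int(round(w * r))
    resized = cv2.resize(img, (nw, nh))
    canvas = np.full((size, size, 3), pad_value, dtype=np.float32)
    top, left = (size - nh) // 2, (size - nw) // 2
    canvas[top:top + nh, left:left + nw] = resized
    return canvas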

Could you tell me the meaning of 'Normalized xywh to pixel xyxy format'

# Normalized xywh to pixel xyxy format
labels = labels0.copy()
labels[:, 1] = ratio * w * (labels0[:, 1] - labels0[:, 3] / 2) + padw
labels[:, 2] = ratio * h * (labels0[:, 2] - labels0[:, 4] / 2) + padh
labels[:, 3] = ratio * w * (labels0[:, 1] + labels0[:, 3] / 2) + padw
labels[:, 4] = ratio * h * (labels0[:, 2] + labels0[:, 4] / 2) + padh

Is (x, y) of (xywh) the center coordinate?
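
(For reference: in YOLO label format, x and y are indeed the normalized box center.) A self-contained version of the conversion above for illustration, assuming no letterbox padding unless given:

import numpy as np

def xywhn_to_xyxy(labels0, w, h, ratio=1.0, padw=0, padh=0):
    # labels0: [N, 5] rows of (class, x_center, y_center, width, height), normalized to 0-1.
    labels = labels0.copy()
    labels[:, 1] = ratio * w * (labels0[:, 1] - labels0[:, 3] / 2) + padw  # x1
    labels[:, 2] = ratio * h * (labels0[:, 2] - labels0[:, 4] / 2) + padh  # y1
    labels[:, 3] = ratio * w * (labels0[:, 1] + labels0[:, 3] / 2) + padw  # x2
    labels[:, 4] = ratio * h * (labels0[:, 2] + labels0[:, 4] / 2) + padh  # y2
    return labels

print(xywhn_to_xyxy(np.array([[0, 0.5, 0.5, 0.2, 0.4]], dtype=np.float32), w=640, h=480))
# -> [[  0. 256. 144. 384. 336.]]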

Windows vs Unix/MacOS pathnames

Hi,
I am new to all this. I am trying to get this working in PyCharm on Windows, but I get the error below.
The detections work, but there are no bounding boxes and no images in the output folder.
Can anyone please help?

Namespace(batch_size=1, cfg='cfg/yolov3.cfg', class_path='data/coco.names', conf_thres=0.5, image_folder='data/samples', img_size=416, nms_thres=0.45, output_folder='output', plot_flag=True, txt_out=False)
'rm' is not recognized as an internal or external command,
operable program or batch file.
0 (3, 416, 416)
C:\Users--\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\nn\modules\upsampling.py:122: UserWarning: nn.Upsampling is deprecated. Use nn.functional.interpolate instead.
  warnings.warn("nn.Upsampling is deprecated. Use nn.functional.interpolate instead.")
Batch 0... (Done 1.172s)
image 0: 'data/samples\img1.jpg'
1 trucks
1 dogs
1 bicycles
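
(The 'rm' is not recognized message above comes from shelling out to a Unix command. A portable sketch of the same cleanup, illustrative rather than the repo's exact code:)

import shutil
from pathlib import Path

output = Path('output')
if output.exists():
    shutil.rmtree(output)  # works on Windows and Unix, unlike `rm -rf`
output.mkdir(parents=True)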

Model Loss

Hey,
Following Section 2.2 of YOLO, I have a few questions about the loss calculation shown at the end of this issue.

  1. We are using λ_coord = 5 on lines 156 to 159. Should we also use λ_noobj = 0.5 on line 167?

  2. Why are we multiplying BCELoss by 1.5 on line 160? I have not found any reference to this in the papers.

  3. pred_conf gives us a [batch_size x anchor_number x grid_size x grid_size] tensor. Assuming batch_size = 1, anchor_number=3 and grid_size = 2, there are 12 elements in this tensor. If nM = 3, pred_conf[~mask] contains 9 elements, so does mask[~mask].float(). BCEWithLogitsLoss1 gives the sum of BCE loss for these 9 elements, whereas BCEWithLogitsLoss2 takes the mean of BCEWithLogitsLoss1 (i.e. divides it by 9 for our case). Now, my question is why are we multiplying BCEWithLogitsLoss2 with nM instead of using BCEWithLogitsLoss1 (should divide by batch_size too prob.) in line 167? There is no division in Section 2.2 of YOLO. Btw, pred_conf[~mask] could contain 15k elements normally, so we are practically ignoring the confidence loss in line 167.

  4. Similar to 3, we should use BCEWithLogitsLoss1 (should divide by batch_size too prob.) in line 163. Because
    BCEWithLogitsLoss1(pred_cls[mask], tcls.float()) / BCEWithLogitsLoss2(pred_cls[mask], tcls.float()) = batch_size x nM x number_of_classes.

  5. Why are we not dividing all the losses by the batch_size? As the batch_size increases, the loss increases too. However, we should minimize the expected loss per sample.

yolov3/models.py

Lines 155 to 167 in 9514e74

if nM > 0:
    lx = 5 * MSELoss(x[mask], tx[mask])
    ly = 5 * MSELoss(y[mask], ty[mask])
    lw = 5 * MSELoss(w[mask], tw[mask])
    lh = 5 * MSELoss(h[mask], th[mask])
    lconf = 1.5 * BCEWithLogitsLoss1(pred_conf[mask], mask[mask].float())
    # lcls = nM * CrossEntropyLoss(pred_cls[mask], torch.argmax(tcls, 1))
    lcls = nM * BCEWithLogitsLoss2(pred_cls[mask], tcls.float())
else:
    lx, ly, lw, lh, lcls, lconf = FT([0]), FT([0]), FT([0]), FT([0]), FT([0]), FT([0])
lconf += nM * BCEWithLogitsLoss2(pred_conf[~mask], mask[~mask].float())

ValueError: need at least one array to stack

I have encountered a problem and really need your help. Can you help me?
I want to train my own COCO dataset, but when I run train.py the following problem occurs:

Traceback (most recent call last):
  File "train.py", line 193, in <module>
    main(opt)
  File "train.py", line 116, in main
    for i, (imgs, targets) in enumerate(dataloader):
  File "/home/pytorch/github/yolov3/utils/datasets.py", line 189, in __next__
    img_all = np.stack(img_all)[:, :, :, ::-1].transpose(0, 3, 1, 2)  # BGR to RGB and cv2 to pytorch
  File "/home/pytorch/anaconda3/envs/pytorch0.4/lib/python3.6/site-packages/numpy/core/shape_base.py", line 349, in stack
    raise ValueError('need at least one array to stack')
ValueError: need at least one array to stack

I changed coco.data like this:

classes= 3
train=data/coco/trainval.txt
valid=data/coco/test.txt
names=data/coco.names
backup=backup/
eval=coco

and the train_path in train.py:

if platform == 'darwin':  # MacOS (local)
    train_path = data_config['train']
else:  # linux (cloud, i.e. gcp)
    train_path = 'data/coco/trainval.txt'

Thank you in advance!
