
pytorch_yolov3's Introduction

YOLOv3 in PyTorch

PyTorch implementation of YOLOv3

What's New

  • 19/12/17 Our repo now exactly reproduces the train / eval performance of darknet!
  • 19/12/17 The AP difference in evaluation between darknet and our repo has been eliminated by modifying the postprocess: one-hot class output to multiple-class output (see the sketch after this list).
  • 19/05/05 We have verified that our repo exactly reproduces darknet's training with the default configuration, reaching COCO AP ~= 0.277 on train / val2017.
  • 19/02/12 Verified inference COCO AP [IoU=0.50:0.95] = 0.297 on val2017, 416x416, batchsize = 8 and w/o random distortion
  • 18/11/27 COCO AP results of darknet (training) are reproduced with the same training conditions
  • 18/11/20 Verified inference COCO AP [IoU=0.50:0.95] = 0.302 (paper: 0.310), val5k, 416x416
  • 18/11/20 Verified inference COCO AP [IoU=0.50] = 0.544 (paper: 0.553), val5k, 416x416
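The postprocess fix above can be illustrated with a short, self-contained sketch (schematic NumPy code with illustrative names, not the repo's actual postprocess): instead of keeping only the argmax class for each box, every class whose combined confidence clears the threshold yields its own detection, which matches darknet's behavior.

import numpy as np

def multi_class_outputs(boxes, obj_conf, cls_scores, conf_thresh=0.005):
    # boxes: (N, 4), obj_conf: (N,), cls_scores: (N, C)
    conf = obj_conf[:, None] * cls_scores  # combined confidence per (box, class)
    # one detection per (box, class) pair above threshold (multiple-class output),
    # rather than a single argmax class per box (one-hot output)
    return [(boxes[i], conf[i, j], j) for i, j in zip(*np.where(conf > conf_thresh))]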

Performance

Inference using yolov3.weights

                                   Original (darknet)   Ours (PyTorch)
COCO AP[IoU=0.50:0.95], inference        0.310               0.311
COCO AP[IoU=0.50], inference             0.553               0.558

Training

The benchmark results below have been obtained by training models for 500k iterations on the COCO 2017 train dataset using the darknet repo and our repo. The models have been evaluated on the COCO 2017 val dataset using our repo.

  • Our repo reproduces the results of the darknet repo exactly.
  • The AP of the pretrained weights (yolov3.weights) cannot be reproduced by the default setting of the darknet repo.
                                   darknet weights   darknet repo   Ours (PyTorch)   Ours (PyTorch)
batchsize                                ??                4               4                8
speed [iter/min] (*)                     ??              19.2            19.4             21.0
COCO AP[IoU=0.50:0.95], training        0.311           0.284           0.283            0.298
COCO AP[IoU=0.50], training             0.558           0.488           0.491            0.511

(*) measured on a Tesla V100

Installation

Requirements

  • Python 3.6.3+
  • NumPy (verified as operable: 1.15.2)
  • OpenCV
  • Matplotlib
  • PyTorch 1.0.0+ (verified as operable: v0.4.0, v1.0.0)
  • Cython (verified as operable: v0.29.1)
  • pycocotools (verified as operable: v2.0.0)
  • CUDA (verified as operable: v9.0)

optional:

  • tensorboard (>1.7.0)
  • tensorboardX
  • CuDNN (verified as operable: v7.0)

Docker Environment

We provide a Dockerfile to build an environment that meets the above requirements.

# build docker image
$ nvidia-docker build -t yolov3-in-pytorch-image --build-arg UID=`id -u` -f docker/Dockerfile .
# create a docker container and log in with bash
$ nvidia-docker run -it -v `pwd`:/work --name yolov3-in-pytorch-container yolov3-in-pytorch-image
docker@4d69df209f4a:/work$ python train.py --help

Download pretrained weights

Download the pretrained weights file from the author's project page:

$ mkdir weights
$ cd weights/
$ bash ../requirements/download_weights.sh

COCO 2017 dataset:

The COCO dataset is downloaded and unzipped by:

$ bash requirements/getcoco.sh

Inference with Pretrained Weights

To detect objects in the sample image, just run:

$ python demo.py --image data/mountain.png --detect_thresh 0.5 --weights_path weights/yolov3.weights

To run the demo using the non-interactive backend, add --background.

Train

$ python train.py --help
usage: train.py [-h] [--cfg CFG] [--weights_path WEIGHTS_PATH] [--n_cpu N_CPU]
                [--checkpoint_interval CHECKPOINT_INTERVAL]
                [--eval_interval EVAL_INTERVAL] [--checkpoint CHECKPOINT]
                [--checkpoint_dir CHECKPOINT_DIR] [--use_cuda USE_CUDA]
                [--debug] [--tfboard TFBOARD]

optional arguments:
  -h, --help            show this help message and exit
  --cfg CFG             config file. see readme
  --weights_path WEIGHTS_PATH
                        darknet weights file
  --n_cpu N_CPU         number of workers
  --checkpoint_interval CHECKPOINT_INTERVAL
                        interval between saving checkpoints
  --eval_interval EVAL_INTERVAL
                        interval between evaluations
  --checkpoint CHECKPOINT
                        pytorch checkpoint file path
  --checkpoint_dir CHECKPOINT_DIR
                        directory where checkpoint files are saved
  --use_cuda USE_CUDA
  --debug               debug mode where only one image is trained
  --tfboard TFBOARD     tensorboard path for logging

example:

$ python train.py --weights_path weights/darknet53.conv.74 --tfboard log

The training configuration is written in YAML files located in the config folder, in the following format (a sketch of how the learning-rate fields interact follows the example):

MODEL:
  TYPE: YOLOv3
  BACKBONE: darknet53
  ANCHORS: [[10, 13], [16, 30], [33, 23],
            [30, 61], [62, 45], [59, 119],
            [116, 90], [156, 198], [373, 326]] # the anchors used in the YOLO layers
  ANCH_MASK: [[6, 7, 8], [3, 4, 5], [0, 1, 2]] # anchor filter for each YOLO layer
  N_CLASSES: 80 # number of object classes
TRAIN:
  LR: 0.001
  MOMENTUM: 0.9
  DECAY: 0.0005
  BURN_IN: 1000 # duration (iters) for learning rate burn-in
  MAXITER: 500000
  STEPS: (400000, 450000) # lr-drop iter points
  BATCHSIZE: 4 
  SUBDIVISION: 16 # num of minibatch inner-iterations
  IMGSIZE: 608 # initial image size
  LOSSTYPE: l2 # loss type for w, h
  IGNORETHRE: 0.7 # IoU threshold for learning conf
AUGMENTATION: # data augmentation section only for training
  RANDRESIZE: True # enable random resizing
  JITTER: 0.3 # amplitude of jitter for resizing
  RANDOM_PLACING: True # enable random placing
  HUE: 0.1 # random distortion parameter
  SATURATION: 1.5 # random distortion parameter
  EXPOSURE: 1.5 # random distortion parameter
  LRFLIP: True # enable horizontal flip
  RANDOM_DISTORT: False # enable random distortion in HSV space
TEST:
  CONFTHRE: 0.8 # not used
  NMSTHRE: 0.45 # same as official darknet
  IMGSIZE: 416 # this can be changed to measure acc-speed tradeoff
NUM_GPUS: 1
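For reference, the learning-rate fields above interact roughly as in the following sketch (scheduled_lr is an illustrative name, and the power-4 burn-in curve is darknet's default, which this repo follows): the configured LR is divided by the effective batch size, ramped up during BURN_IN, and dropped 10x at each STEPS milestone.

def scheduled_lr(i, lr=0.001, batchsize=4, subdivision=16,
                 burn_in=1000, steps=(400000, 450000)):
    # normalize the configured LR by the effective batch size
    base_lr = lr / batchsize / subdivision
    if i < burn_in:
        # ramp up from ~0 to base_lr over the burn-in period
        return base_lr * (i / burn_in) ** 4
    # 10x drop at each milestone passed so far
    return base_lr * 0.1 ** sum(i >= s for s in steps)

With the defaults this reaches base_lr = 0.001 / 4 / 16 = 1.5625e-05 at the end of burn-in.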

Evaluate COCO AP

$ python train.py --cfg config/yolov3_eval.cfg --eval_interval 1 [--ckpt ckpt_path] [--weights_path weights_path]

TODOs

  • Precision Evaluator (bbox, COCO metric)
  • Modify the target builder
  • Modify loss calculation
  • Training Scheduler
  • Weight initialization
  • Augmentation : Resizing
  • Augmentation : Jitter
  • Augmentation : Flip
  • Augmentation : Random Distortion
  • Add the YOLOv3 Tiny Model

Paper

YOLOv3: An Incremental Improvement

Joseph Redmon, Ali Farhadi

[Paper] [Original Implementation] [Author's Project Page]

Credit

@article{yolov3,
  title={YOLOv3: An Incremental Improvement},
  author={Redmon, Joseph and Farhadi, Ali},
  journal = {arXiv},
  year={2018}
}

pytorch_yolov3's People

Contributors

hirotomusiker


pytorch_yolov3's Issues

On License

I would appreciate it if you could clarify whether the license is MIT. Thanks.

`YOLOv3(cfg['MODEL'], ignore_thre=ignore_thre)` return error

Hi @hirotomusiker,

I got the error below when executing python train.py --cfg config/yolov3_eval.cfg.

$sudo nvidia-docker build -t $USER/sample1 --build-arg UID=`id -u` -f docker/Dockerfile .
$sudo nvidia-docker run --rm -it -v `pwd`:/work --name $USER.sample1 $USER/sample1

# in Docker container
docker@eb8c716f1302:/work$ mkdir weights
docker@eb8c716f1302:/work$ cd weights/
docker@eb8c716f1302:/work/weights$ sh ../requirements/download_weights.sh
docker@eb8c716f1302:/work/weights$ cd ..

# Train.py
docker@eb8c716f1302:/work$ python train.py --cfg config/yolov3_eval.cfg
Setting Arguments.. :  Namespace(cfg='config/yolov3_eval.cfg', checkpoint=None, checkpoint_dir='checkpoints', checkpoint_interval=1000, debug=False, eval_interval=4000, n_cpu=0, tfboard=None, use_cuda=True, weights_path=None)
successfully loaded config file:  {'MODEL': {'TYPE': 'YOLOv3', 'BACKBONE': 'darknet53', 'ANCHORS': [[10, 13], [16, 30], [33, 23], [30, 61], [62, 45], [59, 119], [116, 90], [156, 198], [373, 326]], 'ANCH_MASK': [[6, 7, 8], [3, 4, 5], [0, 1, 2]], 'N_CLASSES': 80}, 'TRAIN': {'LR': 0.0, 'MOMENTUM': 0.9, 'DECAY': 0.0005, 'BURN_IN': 0, 'MAXITER': 2, 'STEPS': '(99, 999)', 'BATCHSIZE': 1, 'SUBDIVISION': 1, 'LOSSTYPE': 'l2', 'IGNORETHRE': 0.7, 'IMGSIZE': 608}, 'AUGMENTATION': {'RANDRESIZE': False, 'JITTER': 0, 'RANDOM_PLACING': False, 'HUE': 0, 'SATURATION': 1, 'EXPOSURE': 1, 'LRFLIP': False, 'RANDOM_DISTORT': False}, 'TEST': {'CONFTHRE': 0.8, 'NMSTHRE': 0.45, 'IMGSIZE': 416}, 'NUM_GPUS': 1}
effective_batch_size = batch_size * iter_size = 1 * 1
Traceback (most recent call last):
  File "train.py", line 212, in <module>
    main()
  File "train.py", line 79, in main
    model = YOLOv3(cfg['MODEL'], ignore_thre=ignore_thre)
TypeError: __init__() got multiple values for argument 'ignore_thre'

When I execute demo.py (changing YOLOv3(cfg['MODEL']) to YOLOv3(cfg['MODEL'], ignore_thre=0.7)), I get the same error.

Please check this issue.

Hello, I have a question about the 'target assignment'

I have a question about the 'target assignment'. You use the 'truth_box' and 'ref_anchors' to calculate the IoUs, but you use nine anchors. I think it should be three. Why do you use nine? I can't understand it. Can you help me? Thank you very much!

The evaluation process is too slow!!!

Thanks for sharing this repo! I tried to use it to train on my own dataset. After solving the problems I met and starting training, I found that the evaluation process is really slow. I set eval_interval to 20 to observe the evaluation, but it just gets stuck there.
Looking forward to your reply, thanks!

Difference between .ckpt and .pt files

I am a beginner. Would you mind telling me why you use .ckpt files rather than .pt files here?
    # save checkpoint
    if args.checkpoint_dir and iter_i > 0 and (iter_i % args.checkpoint_interval == 0):
        torch.save({'iter': iter_i,
                    'model_state_dict': model.state_dict(),
                    'optimizer_state_dict': optimizer.state_dict(),
                    },
                    os.path.join(args.checkpoint_dir, "snapshot"+str(iter_i)+".ckpt"))
I found this in the PyTorch documentation:
>>> torch.load('tensors.pt')
# Load all tensors onto the CPU
>>> torch.load('tensors.pt', map_location=lambda storage, loc: storage)

Are .ckpt files originally a TensorFlow convention, and are they similar to .pt files in PyTorch?
Thank you!
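For what it's worth, torch.save and torch.load do not care about the file extension; .ckpt here is just a naming choice for a dict that bundles model and optimizer state, while a bare state_dict is often saved as .pt. A minimal loading sketch (assuming model is an already-constructed YOLOv3 instance; the filename is hypothetical):

import torch

state = torch.load('snapshot1000.ckpt', map_location='cpu')  # extension is arbitrary
if 'model_state_dict' in state:     # full checkpoint dict saved by train.py
    model.load_state_dict(state['model_state_dict'])
else:                               # bare state_dict (.pt-style file)
    model.load_state_dict(state)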

Hi,

Hi,
The dimension 0 of obj_mask should be batchsize and 'b' stands for the current batch number in the for loop.

        obj_mask = torch.ones(batchsize, self.n_anchors,
                              fsize, fsize).type(dtype)
        for b in range(batchsize):
...
                obj_mask[b, a, j, i] = 1

So b changes from 0 to 39 if your batchsize is 40.

Which part of yolo_layer.py have you changed?

Originally posted by @hirotomusiker in #46 (comment)

Error when training YOLO on my dataset: IndexError: index 69 is out of bounds for dimension 3 with size 68

I want to use YOLO on my dataset, which has 10 classes. I changed N_CLASSES in data/yolov3_default.cfg to 10; nothing else is changed. But I get the following error when training:

File "train.py", line 216, in <module>
    main()
File "train.py", line 174, in main
    loss = model(imgs, targets)
File "/home/omnisky/.conda/envs/dw/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
File "/home/omnisky/dongwei/PyTorch_YOLOv3/models/yolov3.py", line 154, in forward
    x, *loss_dict = module(x, targets)
File "/home/omnisky/.conda/envs/dw/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
File "/home/omnisky/dongwei/PyTorch_YOLOv3/models/yolo_layer.py", line 164, in forward
    obj_mask[b, a, j, i] = 1
IndexError: index 69 is out of bounds for dimension 3 with size 68

I would appreciate it if you could help me solve this problem. Waiting for your reply, thank you.

Low confidence score with high average precision

@hirotomusiker
I am training YOLOv3 on a custom dataset with a single class. The dataset has 1240 images of size 256x256 with 2029 class instances of size ~35x35. I have generated anchor boxes for my dataset using k-means.
I trained for 10+100 epochs at lr=0.001 (cyclic learning rate).

I am getting low confidence scores (0.1~0.45) for true positives, but relatively high average precision (~88 at conf_thresh=0.1).
Is this expected behavior, or could there be something I am not doing right?

What is the FPS(testing speed)?

Is darknet much faster than the PyTorch implementation?

While the Gaussian YOLOv3 paper (which uses darknet) reports 42 fps, I get 10 fps with pytorch_GaussianYOLOv3 https://github.com/motokimura/PyTorch_Gaussian_YOLOv3
(forked from this repo) on a Tesla M60 at image size 1600x1200.
Testing on 416x416 images, fps = 21.

I save the resized image by adding cv2.imwrite('myname', img) after img, info_img = preprocess(img, imgsize, jitter=0) in:

img = cv2.imread(image_path)
# Preprocess image
img_raw = img.copy()[:, :, ::-1].transpose((2, 0, 1))
img, info_img = preprocess(img, imgsize, jitter=0)  # info = (h, w, nh, nw, dx, dy)
img = np.transpose(img / 255., (2, 0, 1))
img = torch.from_numpy(img).float().unsqueeze(0)

if gpu >= 0:
    # Send model to GPU
    img = Variable(img.type(torch.cuda.FloatTensor))
else:
    img = Variable(img.type(torch.FloatTensor))
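As an aside, when comparing FPS across frameworks, make sure the GPU has finished before stopping the clock, since CUDA calls return asynchronously. A minimal timing sketch (a stand-in Conv2d replaces the real detector; names are illustrative):

import time
import torch

model = torch.nn.Conv2d(3, 64, 3).cuda()   # stand-in for the detector
img = torch.randn(1, 3, 416, 416).cuda()

torch.cuda.synchronize()                   # flush pending GPU work first
start = time.time()
with torch.no_grad():
    for _ in range(100):
        model(img)
torch.cuda.synchronize()                   # wait for all kernels to finish
print('FPS:', 100 / (time.time() - start))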

RuntimeError: Subtraction, the `-` operator,

File "train.py", line 172, in main
loss = model(imgs, targets)
File "anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in call
result = self.forward(*input, **kwargs)

return _C._VariableFunctions.rsub(self, other)
RuntimeError: Subtraction, the - operator, with a bool tensor is not supported. If you are trying to invert a mask, use the ~ or bitwise_not() operator instead.

padded label

@hirose31,
I have a question: why do you pad the labels with zeros up to max_labels in this line? I can't understand the purpose of it, or why you chose max_labels = 50.

padded_labels[range(len(labels))[:self.max_labels]] = labels[:self.max_labels]
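Presumably the padding exists because images carry different numbers of boxes, and the default DataLoader collate can only stack same-shaped tensors; writing each image's labels into a fixed (max_labels, 5) zero tensor makes batching possible, with max_labels = 50 acting as a capacity bound assumed large enough for any image. A minimal sketch (illustrative values):

import numpy as np

max_labels = 50
labels = np.array([[0, 0.5, 0.5, 0.2, 0.3],    # two ground-truth boxes:
                   [1, 0.3, 0.7, 0.1, 0.1]])   # [class, xc, yc, w, h]
padded = np.zeros((max_labels, 5), dtype=np.float32)
padded[:min(len(labels), max_labels)] = labels[:max_labels]
# every sample now has the same shape, so the default collate can stack a batch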

I have a question about resuming training

Thanks for your great work.
I want to resume training from snapshot.ckpt, so I loaded the checkpoint and changed the starting iteration.
But the following error occurred:

line 190, in forward
    loss_xy = bceloss(output[..., :2], target[..., :2])
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/loss.py", line 504, in forward
    return F.binary_cross_entropy(input, target, weight=self.weight, reduction=self.reduction)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py", line 2027, in binary_cross_entropy
    input, target, weight, reduction_enum)
RuntimeError: reduce failed to synchronize: device-side assert triggered

batchsize and subdivision

The default config sets batchsize to 4 and subdivision to 16, whereas darknet takes batch/subdivisions images into each inner training step. What is the difference between your config and the darknet config?
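In this repo the two values multiply: the effective batch size is BATCHSIZE x SUBDIVISION = 4 x 16 = 64 images, accumulated over SUBDIVISION forward/backward passes before each optimizer step (the training log prints "effective_batch_size = batch_size * iter_size"). That matches darknet's batch=64, subdivisions=16, where each inner step processes batch/subdivisions = 4 images, so the two configs describe the same schedule from different directions. A minimal, self-contained gradient-accumulation sketch (torch.nn.Linear stands in for the detector):

import torch

model = torch.nn.Linear(10, 1)                  # stand-in for YOLOv3
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
batchsize, subdivision = 4, 16                  # as in the default config

optimizer.zero_grad()
for _ in range(subdivision):                    # inner iterations
    imgs = torch.randn(batchsize, 10)           # stand-in minibatch
    loss = model(imgs).mean()
    loss.backward()                             # gradients accumulate in .grad
optimizer.step()                                # one update per 4 x 16 = 64 images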

Isn't it necessary to fix checkpoint corrections in demo.py as well?

Loading "weight" from a checkpoint with "demo.py" will fail.
Don't you need to fix "Merge pull request #28 from DeNA/feature/full_checkpoint " to "demo.py" as well.

'''
model.load_state_dict(torch.load(args.ckpt))
state = torch.load(args.ckpt)
if 'model_state_dict' in state.keys():
model.load_state_dict(state['model_state_dict'])
else:
model.load_state_dict(state)
'''

class YOLOv3 forward function returns an error when using custom data

When I use my own dataset, the model returns this error:

 x = torch.cat((x, route_layers[1]), 1)
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 1. Got 15 and 16 in dimension 2

The x shape is torch.Size([4, 256, 16, 16])
and route_layers[1] is torch.Size([4, 512, 15, 15]).

Why are the last two dimensions 15?
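A likely cause is an input size that is not a multiple of 32: YOLOv3 downsamples with stride-2 convolutions to strides 8, 16 and 32, and the upsampled stride-32 path is concatenated with the stride-16 route map, so the two spatial sizes disagree whenever the halving is not exact. A back-of-the-envelope check (each stride-2 conv with kernel 3 and pad 1 yields ceil(n/2); feature_sizes is an illustrative helper, not repo code):

import math

def feature_sizes(img_size):
    s16 = img_size
    for _ in range(4):            # four stride-2 convs -> stride-16 route map
        s16 = math.ceil(s16 / 2)
    s32 = math.ceil(s16 / 2)      # one more -> stride-32 map
    return s16, s32 * 2           # route map vs 2x-upsampled map

print(feature_sizes(256))  # (16, 16): multiple of 32, torch.cat works
print(feature_sizes(240))  # (15, 16): mismatch, reproducing the error above

The repo's own preprocess letterboxes images to IMGSIZE (a multiple of 32), so this typically appears when a custom data pipeline skips that step.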

IMPORTANT: PyTorch_YOLOv3 will no longer be maintained from April 2020

Hi everyone,
It's a pity to announce that hirotomusiker, the developer of PyTorch_YOLOv3, will no longer be able to maintain the project from April, as he will leave the DeNA organization.

I will come up with a way to keep on sharing my knowledge about the project, and would like to thank you guys for having given me great questions that always made the repo better.
Please follow me:
github
medium
twitter

Hi, I simply copied your code and ran it, but I also meet this problem. I changed the batchsize to 8 when training on the VisDrone dataset

Hi,
The dimension 0 of obj_mask should be batchsize and 'b' stands for the current batch number in the for loop.

        obj_mask = torch.ones(batchsize, self.n_anchors,
                              fsize, fsize).type(dtype)
        for b in range(batchsize):
...
                obj_mask[b, a, j, i] = 1

So b changes from 0 to 39 if your batchsize is 40.

Which part of yolo_layer.py have you changed?

Originally posted by @hirotomusiker in #46 (comment)

zero AP

I trained on a custom dataset (1k images / 1 class) with the default config and get zero AP, but the loss is quite low.
The same dataset works well in other models.

[Iter 7950/50000] [lr 0.001000] [Losses: xy 0.002690, wh 0.008716, conf 2.793764, cls 0.002621, total 1.024159, imgsize 352]
[Iter 7960/50000] [lr 0.001000] [Losses: xy 0.001360, wh 0.000597, conf 2.744817, cls 0.001324, total 0.983985, imgsize 320]
[Iter 7970/50000] [lr 0.001000] [Losses: xy 0.002676, wh 0.005998, conf 2.818687, cls 0.002608, total 1.026828, imgsize 608]
[Iter 7980/50000] [lr 0.001000] [Losses: xy 0.005337, wh 0.004117, conf 2.839865, cls 0.005201, total 1.035145, imgsize 576]
[Iter 7990/50000] [lr 0.001000] [Losses: xy 0.004003, wh 0.001313, conf 2.766982, cls 0.003901, total 0.993268, imgsize 544]
Loading and preparing results...
DONE (t=0.01s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=0.10s).
Accumulating evaluation results...
DONE (t=0.02s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000

What should I look at to fix this?

Confused about the train loss, size_average, and the performance

Hi, @hirotomusiker.
I come here again. As the title says, I am confused about the train loss, size_average, and the performance. I have trained the original darknet repo and this repo on my own dataset (3 classes), and I want to share the results here.
The params are the same: MAXITER: 6000, STEPS: (4800, 5400), IMGSIZE: 608 (both for train and test).
With darknet, I get AP@0.5 = 79.0, and the final loss was 0.76 (avg).
With this repo, AP@0.5 was 76.9, and the final loss was 4.7 (total).
It seems that with this repo the loss is harder to converge. So I changed the params for this repo (MAXITER: 8000, STEPS: (6400, 7200)) and got AP@0.5 = 78.3, with a final loss of 8.2 (total).
So I have some questions:

  1. The performance seems different; could it be caused by the shuffling of the dataset?
  2. The loss of this repo is larger and harder to converge compared to darknet. What's the reason?
  3. In #44 you talked about the size_average param and said that the loss of darknet is also high?

Hi, I have a question about 'yolo_layer.py'

In 'yolo_layer.py', might the loss of obj_conf have some error?

 loss_obj = self.bce_loss(output[..., 4], target[..., 4])

According to the paper, I think this line should have two parts: obj and no_obj. I am not sure of this, but I have some doubt; I think it should have two parts. Thanks!
Btw, should the line obj_mask[b, a, j, i] = 1 be zero instead? Hoping for your reply! Thanks!

About the backbone

Thanks for your awesome work. I have some questions about the backbone:

  • I searched the whole project and cannot find where the backbone weights are loaded. How can I change the backbone weights? For example, I want to use COCO pre-trained weights instead of darknet53.conv.74.

  • Do you have plans to support yolov3-tiny?

Iteration, subdivision and training time

Hi, I have some confusion about how batch size and subdivision work. As described by @hirotomusiker in #14 (comment):

1 iteration = 16 batches, 1 batch = 4 images. So 100,000 iter = 1,600,000 batches = 6,400,000 images. In trainvalno5k.part there are 117,264 images, so 6,400,000 / 117,264 ~ 54.6 dataset epochs. Am I correct?
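The arithmetic checks out; as a quick sanity check in Python:

iters, subdivision, batchsize, n_images = 100_000, 16, 4, 117_264
epochs = iters * subdivision * batchsize / n_images
print(round(epochs, 1))   # 54.6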

Besides, the figure on your repo homepage shows that you trained YOLOv3 for 500,000 iterations. How long did it take to achieve that?

if 'optimizer_state_dict' in state.keys():

Hi,
python train.py --cfg config/yolov3_default.cfg --weight /media/yolov3-pytoch/PyTorch-YOLOv3/weights/yolov3.weights --n_cpu 0 --checkpoint_interval 1000 --eval_interval 4000 --checkpoint str --checkpoint_dir ./checkpoints/

Traceback (most recent call last):
File "train.py", line 213, in <module>
    main()
File "train.py", line 145, in main
    if 'optimizer_state_dict' in state.keys():
UnboundLocalError: local variable 'state' referenced before assignment

training our data

Hello, I want to train the network using my own data, but I don't know how to do that. Could you explain how? Thank you!!

Question about data preprocess

Hi, I found several confusing parts in the data preprocessing code.

  • In cocodataset.py, line 87, img = cv2.imread(img_file). As far as I know, OpenCV reads images in BGR, but the random_distort function treats the image as RGB (it uses cv2.COLOR_RGB2HSV).
  • In the preprocess function of utils.py, you simply use 127 for all 3 channels; would it be better to use the mean color here?
  • In the yolobox2label function of utils.py, I think there is a fault in the docstring about box; should it be [y1, x1, y2, x2] in [0, 1]?

Is my code for sorting the clusters by area and matching clusters with ANCH_MASK right?

Using k-means, I get these boxes:
[[0.0671875 0.09814815]
[0.07552083 0.11388889]
[0.05572917 0.08796296]
[0.24010417 0.17222222]
[0.32552083 0.26759259]
[0.0953125 0.12685185]
[0.05833333 0.08333333]
[0.12604167 0.14074074]
[0.18177083 0.15740741]]

Then I copy and paste the numbers above, flatten the array by hand, and get a list a:
(The numbers in Boxes and a are from different runs, so they are independent; this just shows that a comes from Boxes.)
a=[0.452,0.69375,0.116,0.104,0.22,0.512,0.1,0.30303703,0.038,0.064,0.056,0.16266667,0.41,0.33866667,0.192,0.224,0.83766667,0.78469484]

For training YOLOv3, the images are resized to 608x608.
My images are 1920x1080. After k-means, I can ignore 1920x1080, right?

b=[round(608*x) for x in a]
boxes=[]
areas=[]
for i in range(0,len(a),2):
    boxes.append([b[i],b[i+1]])
    areas.append([b[i]*b[i+1]])
#print(boxes)
#print(areas)
new_areas=sorted(areas)
new_boxes=[]
#print(new_areas)
for i in range(0,len(boxes)):
    mylist=list(range(0,len(boxes)))
    for j in mylist:
        if new_areas[i]==areas[j]:
            new_boxes.append(boxes[j])
            mylist.remove(j)
print(new_boxes)

new_boxes=
[[23, 39], [34, 99], [71, 63], [61, 184], [117, 136], [134, 311], [249, 206], [275, 422], [509, 477]]

Should the following part of gaussian_yolov3_default.cfg be like this?

  ANCHORS: [[23, 39], [34, 99], [71, 63], 
            [61, 184], [117, 136], [134, 311],
            [249, 206], [275, 422], [509, 477]]
  ANCH_MASK: [[6, 7, 8], [3, 4, 5], [0, 1, 2]]

Many thanks!

got a strange mAP

I got mAP 0.67 and do not know why. Help~
I just followed your readme file:

python.sh train.py --cfg config/yolov3_eval.cfg --eval_interval 1 --weights_path yolov3.weight

and added the following code to train.py:

def set_lr(tmp_lr):
    for param_group in optimizer.param_groups:
        param_group['lr'] = tmp_lr / batch_size / subdivision

if args.eval_interval == 1:
    ap50_95, ap50 = evaluator.evaluate(model)
    print('val/COCOAP50',ap50)
    print('val/COCOAP50_95',ap50_95)
    return


0.67!!?

This is an error!

Hi, when I run the code, I find this issue:
tensor([[ 0., 0., 0., 0., 12.],
[ 0., 0., 0., 0., 14.],
[ 0., 0., 0., 0., 14.],
[ 0., 0., 0., 0., 14.],
[ 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0.],
I am testing on the VOC dataset, but the target coordinates are 0.
Also, I use

labels[:, 2] = labels[:, 2] * nw / w / maxsize
labels[:, 3] = labels[:, 3] * nh / h / maxsize

to replace your code

labels[:, 2] *= nw / w / maxsize
labels[:, 3] *= nh / h / maxsize

because the original form always raises an error when I use it.
Do you know why this is? Thanks!

Learning rate is low.

Hi,

I am not sure, but when I ran train.py I found the initial learning rate is very low (I am not doing fine-tuning).

base_lr: 1.5625e-05
current_lr: 1.4641e-07 (used in the first 100 iterations)

However, the cfg file looks like a good fit, with 1e-3. It is probably related to this line:

base_lr = cfg['TRAIN']['LR'] / batch_size / subdivision

Is this OK? Why is the initial LR so low?
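That division is expected: the configured LR is normalized by the effective batch size (see the schedule sketch in the Train section above), and during BURN_IN the rate additionally ramps up from near zero, which explains the even smaller current_lr. The reported number is consistent with the default config:

base_lr = 0.001 / 4 / 16   # LR / BATCHSIZE / SUBDIVISION
print(base_lr)             # 1.5625e-05, matching the value reported above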

problem about NMS in the evaluation.

I just solved all my problems and began training; everything runs OK. But when I run a test on the validation set, it takes a long time to get the result from the cocoapi evaluator. I then timed the individual parts and found that NMS takes about 3.7 s per image for a single class (10 classes in total). I wonder if there is a setting I forgot?
Note: the code is unmodified; I only changed the data for training, and training runs fine.

final training results?

I see that all the data augmentation methods are done, so will the final training results be released for comparison with the original training results?

Hello,I have a question about this code!

Why do you use

x2 = (labels[:, 1] + labels[:, 3]) / w
y2 = (labels[:, 2] + labels[:, 4]) / h

in the label2yolobox function in utils.py? I think it has some error. Can you tell me why? Thank you very much!

Hello, I have a question about your code

Hi, I have a question about the yolobox2label function:

def yolobox2label(box, info_img):
    """
    Transform yolo box labels to yxyx box labels.
    Args:
        box (list): box data with the format of [yc, xc, w, h]
            in the coordinate system after pre-processing.
        info_img : tuple of h, w, nh, nw, dx, dy.
            h, w (int): original shape of the image
            nh, nw (int): shape of the resized image without padding
            dx, dy (int): pad size
    Returns:
        label (list): box data with the format of [y1, x1, y2, x2]
            in the coordinate system of the input image.
    """
    h, w, nh, nw, dx, dy = info_img
    y1, x1, y2, x2 = box
    box_h = ((y2 - y1) / nh) * h
    box_w = ((x2 - x1) / nw) * w
    y1 = ((y1 - dy) / nh) * h
    x1 = ((x1 - dx) / nw) * w
    label = [y1, x1, y1 + box_h, x1 + box_w]
    return label

Here y1, x1, y2, x2 = box and box_h = ((y2 - y1) / nh) * h. The problem is that I think box_h should come from x2. Why do you use y2 - y1? Doesn't that mean w - yc? I am very uncertain here. Can you explain this to me? Thank you very much!
