0000duck / yolov7

This project forked from lucasjinreal/yolov7_d2

🔥🔥🔥🔥 YOLO with Transformers and Instance Segmentation, with TensorRT acceleration! 🔥🔥🔥

License: GNU General Public License v3.0

Shell 0.01% C++ 3.55% Python 96.37% CMake 0.08%

yolov7's Introduction

In short: YOLOv7 adds instance segmentation to the YOLO architecture, along with many transformer backbones and archs. If you look carefully, you'll find that our ultimate vision is to make YOLO great again through the power of transformers and multi-task training. With a convnext-tiny backbone, YOLOv7 achieves 43 mAP and its AP-s exceeds MaskRCNN by 10, at a speed similar to YOLOX-s. More models are listed below; they are more accurate and even lighter!

🔥🔥🔥 Just another YOLO variant implemented on top of detectron2. Note that YOLOv7 is not meant to be a successor of the YOLO family; 7 is just a magic and lucky number. Instead, YOLOv7 extends YOLO to many other vision tasks, such as instance segmentation and one-stage keypoint detection.

YOLOv7 currently supports:

  • YOLOv4 with CSP-Darknet53;
  • YOLOv7 arch with ResNet backbones;
  • YOLOv7 arch with ResNet-vd backbone (similar to PP-YOLO), deformable conv, Mish, etc.;
  • GridMask augmentation from PP-YOLO;
  • Mosaic transform, supported through a custom dataset mapper;
  • YOLOv7 arch with Swin-Transformer (higher accuracy but lower speed);
  • YOLOv7 arch with EfficientNet + BiFPN;
  • YOLOv5-style positive sample selection and a new coordinate coding style;
  • RandomColorDistortion, RandomExpand, RandomCrop, RandomFlip;
  • CIoU loss (DIoU, GIoU) and label smoothing (from YOLOv5 & YOLOv4);
  • YOLOF;
  • YOLOv7 with Res2Net + FPN;
  • Pyramid Vision Transformer v2 (PVTv2);
  • WBF (Weighted Box Fusion), which works better than NMS;
  • YOLOX-like head design and anchor design, with training support;
  • YOLOX s, m, l backbones and PAFPN, plus a new combination of YOLOX backbone and PAFPN;
  • YOLOv7 with Res2Net-v1d backbone; we found that Res2Net-v1d has better accuracy than Darknet53;
  • PPYOLOv2 PAN neck with SPP and DropBlock;
  • YOLOX arch, so you can train YOLOX models (anchor-free YOLO) as well;
  • DETR: transformer-based detection model with ONNX export and TensorRT acceleration;
  • AnchorDETR: a faster-converging version of DETR, now supported!
  • Almost all models can be exported to ONNX;
  • TensorRT deployment for DETR and other transformer models;
  • It will integrate with wanwu, a torch-free deployment framework that runs fastest on your target platform.
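
As an illustration of the CIoU loss mentioned in the list above, here is a minimal pure-Python sketch for a single pair of boxes. It follows the commonly published CIoU definition (IoU, centre-distance penalty, aspect-ratio penalty) and is not taken from this repo's code; the function name `ciou_loss` and the `(x1, y1, x2, y2)` box layout are assumptions made for the example.

```python
import math


def ciou_loss(box_a, box_b, eps=1e-9):
    """Return the CIoU loss for two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Plain IoU from the intersection rectangle.
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    wa, ha = ax2 - ax1, ay2 - ay1
    wb, hb = bx2 - bx1, by2 - by1
    union = wa * ha + wb * hb - inter
    iou = inter / (union + eps)

    # DIoU term: squared centre distance over the squared diagonal
    # of the smallest box enclosing both inputs.
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    c2 = cw ** 2 + ch ** 2 + eps
    rho2 = ((ax1 + ax2 - bx1 - bx2) ** 2 + (ay1 + ay2 - by1 - by2) ** 2) / 4.0

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4.0 / math.pi ** 2) * (
        math.atan(wb / (hb + eps)) - math.atan(wa / (ha + eps))
    ) ** 2
    alpha = v / (1.0 - iou + v + eps)

    return 1.0 - iou + rho2 / c2 + alpha * v
```

Dropping the `alpha * v` term gives the DIoU loss. Unlike a plain IoU loss, the centre-distance term keeps the loss informative for boxes that do not overlap at all.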

⚠️ Important note: YOLOv7 on GitHub is not the latest version; many features are closed-source, but you can get them from https://manaai.cn

Features that are ready but not open-sourced yet:

  • Convnext training on YOLOX, with higher accuracy than the original YOLOX;
  • GFL loss support;
  • MobileVit-V2 backbone available;
  • CSPRep-Resnet: a RepVGG-style ResNet used in PP-YOLOE, but in PyTorch rather than Paddle;
  • VitDet support;
  • Simple-FPN support from VitDet;
  • PP-YOLOE head supported.

If you want the full version of YOLOv7, either become a contributor or get it from https://manaai.cn .

🆕 News!

  • 2022.06.13: New model YOLOX-Convnext-tiny reached 43 mAP (updated from 41.3), beating yolox-s, with even higher AP-small!
  • 2022.06.09: GFL (generalized focal loss) supported;
  • 2022.05.26: Added YOLOX-ConvNext config;
  • 2022.05.18: DINO and DABDetr are about to be added, with new records on COCO of up to 63.3 AP!
  • 2022.05.09: Big new feature added! We adopt YOLOX with a Keypoints Head! The model is still training, but you can already check the code;
  • 2022.04.23: We finished the int8 quantization of SparseInst! It works perfectly! Download the ONNX model and try it out yourself.
  • 2022.04.15: We now support SparseInst ONNX export!
  • 2022.03.25: New instance segmentation supported! 40 FPS @ 37 mAP, which is fast;
  • 2021.09.16: First transformer-based DETR model added; more DETR-series models will be explored;
  • 2021.08.02: YOLOX arch added; you can train YOLOX in this repo as well;
  • 2021.07.25: We found that YOLOv7-Res2net50 beats res50 and darknet53 at the same speed level! 5% AP boost on a custom dataset;
  • 2021.07.04: Added YOLOF, so we have anchor-free support as well; YOLOF achieves a better trade-off between speed and accuracy;
  • 2021.06.25: this project started.

🌹 Contribution Wanted

If you have spare time or a GPU card, please help make YOLOv7 stronger! Here is how to contribute:

  1. Claim a task: I have some ideas but not enough time to implement them. If you want to implement one, claim the task; I will give you full advice on how to do it, and you can learn a lot from it;
  2. Test mAP: When you have finished implementing a new idea, create a thread to report the experiment's mAP; if it works, it will be merged into our main master branch;
  3. Pull request: YOLOv7 is open and always tracks SOTA and lightweight models. If a model is useful, we will merge it, deploy it, and distribute it to all users who want to try it.

Here are some tasks that need to be claimed:

Just join our in-house contributor plan, and you can share our newest code in return for your contribution!

💁‍♂️ Results

YOLOv7 Instance Face & Detection

🧑‍🦯 Installation && Quick Start

Special requirements (other versions may also work, but these are tested and give the best performance, including the best ONNX export support):

  • torch 1.11 (stable version)
  • onnx
  • onnx-simplifier 0.3.7
  • alfred-py latest
  • detectron2 latest

If you use a lower torch version, ONNX export might not work as expected.
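
To compare an environment against the list above before exporting, a small standard-library check like the following can help. This is only a sketch: the distribution names in `REQUIRED` are assumptions based on the names used in the list (for example, detectron2 built from source may register under a different name), and `installed_versions` is a hypothetical helper, not part of this repo.

```python
from importlib import metadata

# Assumed distribution names, taken from the requirements list above.
REQUIRED = ("torch", "onnx", "onnx-simplifier", "alfred-py", "detectron2")


def installed_versions(packages=REQUIRED):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions
```

Calling `installed_versions()` returns a dictionary keyed by distribution name; entries that map to `None` are not installed in the current environment.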

🤔 Features

Some highlights of YOLOv7 are:

  • A simple and standard training framework for any detection && instance segmentation task, based on detectron2;
  • Supports DETR and many transformer-based detection frameworks out of the box;
  • Supports an easy-to-deploy pipeline through ONNX;
  • This is the only framework that supports YOLOv4 + InstanceSegmentation in a single-stage style;
  • Easily plugs into transformer-based detectors.

We strongly recommend that you send a PR if you develop this project further. The only reason for open-sourcing it is to use the power of the community to make it stronger and take it further. Anyone is very welcome to contribute any feature!

🧙‍♂️ Pretrained Models

model                 backbone     input  aug  APval  AP    FPS   weights
SparseInst            R-50         640    ✘    32.8   -     44.3  model
SparseInst            R-50-vd      640    ✘    34.1   -     42.6  model
SparseInst (G-IAM)    R-50         608    ✘    33.4   -     44.6  model
SparseInst (G-IAM)    R-50         608    ✓    34.2   34.7  44.6  model
SparseInst (G-IAM)    R-50-DCN     608    ✓    36.4   36.8  41.6  model
SparseInst (G-IAM)    R-50-vd      608    ✓    35.6   36.1  42.8  model
SparseInst (G-IAM)    R-50-vd-DCN  608    ✓    37.4   37.9  40.0  model
SparseInst (G-IAM)    R-50-vd-DCN  640    ✓    37.7   38.1  39.3  model
SparseInst Int8 onnx                                              google drive

🧙‍♂️ Models trained in YOLOv7

model                     backbone       input  aug  AP    AP50  APs   FPS   weights
YoloFormer-Convnext-tiny  Convnext-tiny  800    ✓    43    63.7  26.5  39.3  model
YOLOX-s                   -              800    ✓    40.5  -     -     39.3  model

Note: We report AP-s here because we want to know how each model performs on small objects; AP-s is notably higher for models with a transformer backbone! Some of the models above might not be open-sourced, but we provide the weights.

🥰 Demo

Running a quick demo looks like this:

python3 demo.py --config-file configs/wearmask/darknet53.yaml --input ./datasets/wearmask/images/val2017 --opts MODEL.WEIGHTS output/model_0009999.pth

Run SparseInst:

python demo.py --config-file configs/coco/sparseinst/sparse_inst_r50vd_giam_aug.yaml --video-input ~/Movies/Videos/86277963_nb2-1-80.flv -c 0.4 --opts MODEL.WEIGHTS weights/sparse_inst_r50vd_giam_aug_8bc5b3.pth

An update based on detectron2's newly introduced LazyConfig system: run a LazyConfig model using:

python3 demo_lazyconfig.py --config-file configs/new_baselines/panoptic_fpn_regnetx_0.4g.py --opts train.init_checkpoint=output/model_0004999.pth

😎 Train

Training is quite simple, the same as in detectron2:

python train_net.py --config-file configs/coco/darknet53.yaml --num-gpus 8

If you want to train YOLOX, you can use the config file configs/coco/yolox_s.yaml. All supported archs are:

  • YOLOX: anchor-free YOLO;
  • YOLOv7: traditional YOLO with some explorations, mainly focused on loss experiments;
  • YOLOv7P: traditional YOLO merged with decent archs from YOLOX;
  • YOLOMask: an arch that does detection and segmentation at the same time (tbd);
  • YOLOInsSeg: instance segmentation based on YOLO detection (tbd).

😎 Rules

There are some rules you must follow if you want to train on your own dataset:

  • Rule No.1: Always set your own anchors for your dataset using tools/compute_anchors.py; this applies to any other anchor-based detection method as well (EfficientDet etc.);
  • Rule No.2: Keep faith that your loss will eventually go down; if it does not, dig deeper to find out why (but do not post repeated issues, because I might not know either);
  • Rule No.3: No one will tell you this, but it's true: do not change the backbone lightly. All the params are coupled with your backbone, so it is not as simple as you think. Being a deep learning engineer is also not as easy as you think; the whole body of knowledge is like an ocean, and your knowledge is just a tiny drop of water...
  • Rule No.4: You must use pretrained weights for transformer-based backbones, otherwise your loss will spike.
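
To make Rule No.1 more concrete, here is a rough sketch of the usual way dataset-specific anchors are derived: k-means over the (width, height) of the ground-truth boxes, with 1 - IoU as the distance. This illustrates the general technique only and is not the code in tools/compute_anchors.py; `wh_iou` and `kmeans_anchors` are hypothetical names, and the input is assumed to be a list of `(width, height)` tuples in pixels.

```python
import random


def wh_iou(box, anchor):
    """IoU of two boxes that share a corner, each given as (width, height)."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    return inter / (box[0] * box[1] + anchor[0] * anchor[1] - inter)


def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster (width, height) pairs into k anchors using 1 - IoU as distance."""
    rng = random.Random(seed)
    anchors = rng.sample(boxes, k)
    for _ in range(iters):
        # Assign every box to the anchor it overlaps most.
        clusters = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda i: wh_iou(box, anchors[i]))
            clusters[best].append(box)
        # Move each anchor to the mean of its cluster; keep it if the cluster is empty.
        updated = [
            (sum(b[0] for b in c) / len(c), sum(b[1] for b in c) / len(c)) if c else a
            for c, a in zip(clusters, anchors)
        ]
        if updated == anchors:
            break
        anchors = updated
    # Sort by area so the smallest anchors come first.
    return sorted(anchors, key=lambda a: a[0] * a[1])
```

For example, `kmeans_anchors(list_of_wh, k=9)` would produce nine anchors sorted from smallest to largest area, which could then be written into a config by hand.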

Make sure you have read the rules before asking me any questions.

🔨 Export ONNX && TensorRT && TVM

  1. DETR:
python export_onnx.py --config-file detr/config/file

This work is done; an inference script is included in tools.

  2. AnchorDETR:

AnchorDETR also supports training and exporting to ONNX.

  3. SparseInst: SparseInst already supports exporting to ONNX!
python export_onnx.py --config-file configs/coco/sparseinst/sparse_inst_r50_giam_aug.yaml --video-input ~/Videos/a.flv  --opts MODEL.WEIGHTS weights/sparse_inst_r50_giam_aug_2b7d68.pth INPUT.MIN_SIZE_TEST 512

If you are on a CPU device, please use:

python export_onnx.py --config-file configs/coco/sparseinst/sparse_inst_r50_giam_aug.yaml --input images/COCO_val2014_000000002153.jpg --verbose  --opts MODEL.WEIGHTS weights/sparse_inst_r50_giam_aug_2b7d68.pth MODEL.DEVICE 'cpu'

You will then have weights/sparse_inst_r50_giam_aug_2b7d68_sim.onnx generated; this ONNX model can be run with ORT without any unsupported ops.
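
A quick way to confirm that the exported file loads in ORT is to push an all-zero tensor through it. The sketch below uses the public onnxruntime Python API (`InferenceSession`, `get_inputs`, `run`); everything about the model's input is an assumption here (a single float32 input, with any dynamic dimension set to 1), and `concrete_shape` and `run_dummy_inference` are hypothetical helper names, not scripts from this repo.

```python
def concrete_shape(shape, default=1):
    """Replace symbolic or unknown dimensions (strings or None) with a fixed size."""
    return [d if isinstance(d, int) else default for d in shape]


def run_dummy_inference(onnx_path):
    """Feed an all-zero tensor through the exported model and return the output shapes."""
    # Imported lazily so the helper above stays usable without these packages.
    import numpy as np
    import onnxruntime as ort

    session = ort.InferenceSession(onnx_path, providers=["CPUExecutionProvider"])
    model_input = session.get_inputs()[0]
    # Assumption: the first (and only) input is a float32 image tensor.
    dummy = np.zeros(concrete_shape(model_input.shape), dtype=np.float32)
    outputs = session.run(None, {model_input.name: dummy})
    return [out.shape for out in outputs]
```

Calling `run_dummy_inference("weights/sparse_inst_r50_giam_aug_2b7d68_sim.onnx")` should return one shape per model output if the graph contains only ops that ORT supports. Real inference would replace the zero tensor with a preprocessed image.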

🤒 Performance

Here is a dedicated performance comparison with other packages.

tbd.

🪜 Some Tiny Object Datasets supported

  • Wearmask: supports three formats (VOC, YOLO, COCO). You can use the COCO format here. Download link: https://pan.baidu.com/s/1ozAgUFLqfTXLp-iOecddqQ (extraction code: xgep). Use configs/wearmask to train on this dataset.
  • more: to come.

👋 Detection Results

Image Detections

😯 Discussion Group

Wechat QQ
  • If the WeChat code has expired, please contact me via a GitHub issue so I can update it. The group is for general discussion, not only for YOLOv7.

🀄️ Some Exp Visualizations

GridMask Mosaic

©️ License

Code released under the GPL license. Please submit a pull request to this source repo before you make your changes public or use them commercially. All rights reserved by Lucas Jin.

yolov7's People

Contributors

acai66, laughing-q, lucasjinreal, luoxiaofeifly, tomguluson92
