
megvii-basedetection / yolox


YOLOX is a high-performance anchor-free YOLO, exceeding yolov3~v5 with MegEngine, ONNX, TensorRT, ncnn, and OpenVINO supported. Documentation: https://yolox.readthedocs.io/

License: Apache License 2.0

Python 92.07% C++ 7.93%
yolox yolov3 onnx tensorrt ncnn openvino pytorch megengine object-detection yolo


yolox's People

Contributors

791136190, ankandrew, armaxik, deftruth, developer0hye, f0xzz, fatescript, goatmessi7, haolongzhangm, jario-jin, joker316701882, manangoel99, nihui, r-b-g-b, rangilyu, roachsinai, sauravkdeo, sped0n, stephanxu, swhl, tonysy, waynemao, woowonjin, wwqgtxx, xin-li-67, xxr3376, yancie-yjr, yuangpeng, yulv-git, zhiqwang


yolox's Issues

Training error: An error has been caught in function 'launch', process 'MainProcess'

I'm trying to run 'python tools/train.py -n yolox-s -d 1 -b 6 --fp16 -o', and the following error occurs.

2021-07-22 15:18:00 | ERROR | yolox.core.launch:68 - An error has been caught in function 'launch', process 'MainProcess' (15429), thread 'MainThread' (139668994281664):
Traceback (most recent call last):

AssertionError: Caught AssertionError in DataLoader worker process 0.
Original Traceback (most recent call last):

File "/home/lijq/deeplearn/YOLOX/yolox/data/datasets/mosaicdetection.py", line 91, in getitem
img, _labels, _, _ = self._dataset.pull_item(index)
File "/home/ll/deeplearn/YOLOX/yolox/data/datasets/coco.py", line 99, in pull_item
assert img is not None
AssertionError

PermissionError: [Errno 13] Permission denied: '/data'

(YOLOX) ncy@Lenovo:~/PycharmProjects/YOLOX$ python tools/demo.py image -n yolox-s -c yolox_s.pth.tar --path assets/dog.jpg --conf 0.3 --nms 0.65 --tsize 640 --save_result
Traceback (most recent call last):
File "/home/ncy/PycharmProjects/YOLOX/tools/demo.py", line 277, in
main(exp, args)
File "/home/ncy/PycharmProjects/YOLOX/tools/demo.py", line 210, in main
os.makedirs(file_name, exist_ok=True)
File "/home/ncy/anaconda3/envs/YOLOX/lib/python3.9/os.py", line 215, in makedirs
makedirs(head, exist_ok=exist_ok)
File "/home/ncy/anaconda3/envs/YOLOX/lib/python3.9/os.py", line 215, in makedirs
makedirs(head, exist_ok=exist_ok)
File "/home/ncy/anaconda3/envs/YOLOX/lib/python3.9/os.py", line 225, in makedirs
mkdir(name, mode)
PermissionError: [Errno 13] Permission denied: '/data'

ImportError: /home/ncy/anaconda3/envs/YOLOX/lib/python3.9/site-packages/yolox-0.1.0-py3.9-linux-x86_64.egg/exps/default/yolox_s.py doesn't contains class named 'Exp'

(YOLOX) ncy@Lenovo:~/PycharmProjects/YOLOX$ python tools/trt.py -n yolox-s -c yolox_s.pth.tar
2021-07-20 20:34:44.442 | ERROR | __main__:<module>:77 - An error has been caught in function '<module>', process 'MainProcess' (180525), thread 'MainThread' (140272824201600):
Traceback (most recent call last):

File "/home/ncy/anaconda3/envs/YOLOX/lib/python3.9/site-packages/yolox-0.1.0-py3.9-linux-x86_64.egg/yolox/exp/build.py", line 13, in get_exp_by_file
current_exp = importlib.import_module(os.path.basename(exp_file).split(".")[0])
│ │ │ │ │ └ '/home/ncy/anaconda3/envs/YOLOX/lib/python3.9/site-packages/yolox-0.1.0-py3.9-linux-x86_64.egg/exps/default/yolox_s.py'
│ │ │ │ └ <function basename at 0x7f93ceec7e50>
│ │ │ └ <module 'posixpath' from '/home/ncy/anaconda3/envs/YOLOX/lib/python3.9/posixpath.py'>
│ │ └ <module 'os' from '/home/ncy/anaconda3/envs/YOLOX/lib/python3.9/os.py'>
│ └ <function import_module at 0x7f93cee564c0>
└ <module 'importlib' from '/home/ncy/anaconda3/envs/YOLOX/lib/python3.9/importlib/__init__.py'>
File "/home/ncy/anaconda3/envs/YOLOX/lib/python3.9/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
│ │ │ │ │ └ 0
│ │ │ │ └ None
│ │ │ └ 0
│ │ └ 'yolox_s'
│ └ <function _gcd_import at 0x7f93cef8c310>
└ <module 'importlib._bootstrap' (frozen)>
File "", line 1030, in _gcd_import
File "", line 1007, in _find_and_load
File "", line 984, in _find_and_load_unlocked

ModuleNotFoundError: No module named 'yolox_s'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):

File "/home/ncy/PycharmProjects/YOLOX/tools/trt.py", line 77, in
main()
└ <function main at 0x7f93045813a0>

File "/home/ncy/PycharmProjects/YOLOX/tools/trt.py", line 36, in main
exp = get_exp(args.exp_file, args.name)
│ │ │ │ └ 'yolox-s'
│ │ │ └ Namespace(experiment_name=None, name='yolox-s', exp_file=None, ckpt='yolox_s.pth.tar')
│ │ └ None
│ └ Namespace(experiment_name=None, name='yolox-s', exp_file=None, ckpt='yolox_s.pth.tar')
└ <function get_exp at 0x7f9304581310>

File "/home/ncy/anaconda3/envs/YOLOX/lib/python3.9/site-packages/yolox-0.1.0-py3.9-linux-x86_64.egg/yolox/exp/build.py", line 50, in get_exp
return get_exp_by_name(exp_name)
│ └ 'yolox-s'
└ <function get_exp_by_name at 0x7f9304581280>
File "/home/ncy/anaconda3/envs/YOLOX/lib/python3.9/site-packages/yolox-0.1.0-py3.9-linux-x86_64.egg/yolox/exp/build.py", line 34, in get_exp_by_name
return get_exp_by_file(exp_path)
│ └ '/home/ncy/anaconda3/envs/YOLOX/lib/python3.9/site-packages/yolox-0.1.0-py3.9-linux-x86_64.egg/exps/default/yolox_s.py'
└ <function get_exp_by_file at 0x7f93045811f0>
File "/home/ncy/anaconda3/envs/YOLOX/lib/python3.9/site-packages/yolox-0.1.0-py3.9-linux-x86_64.egg/yolox/exp/build.py", line 16, in get_exp_by_file
raise ImportError("{} doesn't contains class named 'Exp'".format(exp_file))
└ '/home/ncy/anaconda3/envs/YOLOX/lib/python3.9/site-packages/yolox-0.1.0-py3.9-linux-x86_64.egg/exps/default/yolox_s.py'

ImportError: /home/ncy/anaconda3/envs/YOLOX/lib/python3.9/site-packages/yolox-0.1.0-py3.9-linux-x86_64.egg/exps/default/yolox_s.py doesn't contains class named 'Exp'
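For context (not from the original thread): get_exp_by_file imports the experiment file by its bare module name, so the import only succeeds if the file's directory is on sys.path; with the egg-installed layout above it is not, which produces the ModuleNotFoundError that is re-raised as this ImportError. A hedged workaround sketch, loading the Exp class directly from the file path using only the standard library (load_exp_from_file is an illustrative helper, not part of YOLOX):

    import importlib.util
    import os
    import sys

    def load_exp_from_file(exp_file):
        """Load the 'Exp' class from an experiment file given its full path."""
        module_name = os.path.basename(exp_file).split(".")[0]
        spec = importlib.util.spec_from_file_location(module_name, exp_file)
        module = importlib.util.module_from_spec(spec)
        sys.modules[module_name] = module
        spec.loader.exec_module(module)
        if not hasattr(module, "Exp"):
            raise ImportError(f"{exp_file} doesn't contain a class named 'Exp'")
        return module.Exp()

    # e.g. exp = load_exp_from_file("exps/default/yolox_s.py")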

cannot import name 'UnencryptedCookieSessionFactoryConfig'

When running inference with YOLOX-main via python openvino_inference.py -m /project/train/yolox/YOLOX-main/yolox_nano/yolox_10.xml -i /home/1..jpg, the following problem occurs:
Traceback (most recent call last):
File "openvino_inference.py", line 17, in
from yolox.data.data_augment import preproc as preprocess
File "/project/train/yolox/YOLOX-main/yolox/init.py", line 4, in
from .utils import configure_module
File "/project/train/yolox/YOLOX-main/yolox/utils/init.py", line 10, in
from .ema import ModelEMA
File "/project/train/yolox/YOLOX-main/yolox/utils/ema.py", line 7, in
import apex
File "/usr/local/lib/python3.6/dist-packages/apex/init.py", line 13, in
from pyramid.session import UnencryptedCookieSessionFactoryConfig
ImportError: cannot import name 'UnencryptedCookieSessionFactoryConfig'

bug: The dataset sampler during single GPU training should be infinite

In yolox/exp/yolox_base.py:122

if is_distributed:
    batch_size = batch_size // dist.get_world_size()
    sampler = InfiniteSampler(
        len(self.dataset), seed=self.seed if self.seed else 0
    )
else:
    sampler = torch.utils.data.RandomSampler(self.dataset)

torch.utils.data.RandomSampler is not infinite and will raise an exception after one epoch of training.
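A minimal sketch of a possible fix, assuming InfiniteSampler is importable from yolox.data exactly as it is used in the distributed branch quoted above (this mirrors the reporter's snippet; it is not a verified patch):

    # sketch: use an infinite sampler in both the single-GPU and distributed cases
    if is_distributed:
        batch_size = batch_size // dist.get_world_size()

    sampler = InfiniteSampler(
        len(self.dataset), seed=self.seed if self.seed else 0
    )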

Training AP is 0.0 after 10 epochs

2021-07-21 17:32:55.097 | INFO | yolox.core.trainer:after_iter:245 - epoch: 10/300, iter: 970/973, mem: 8646Mb, iter_time: 1.567s, data_time: 0.027s, total_loss: 3.7, iou_loss: 1.9, l1_loss: 0.0, conf_loss: 1.2, cls_loss: 0.7, lr: 9.993e-03, size: 512, ETA: 4 days, 23:01:45
2021-07-21 17:32:59.200 | INFO | yolox.core.trainer:save_ckpt:307 - Save weights to ./YOLOX_outputs/yolox_voc_s
2021-07-21 17:32:59.893 | INFO | yolox.core.trainer:after_train:184 - Training of experiment is done and the best AP is 0.00

Training time too long

I train yolox-s with batch size 128 on 8 x V100, and the task takes about 2 days 6 hours for 300 epochs. Is this training time normal?

Is this state normal?

(base) zjnyly@zjnyly:~/Desktop/YOLOX$ python ./tools/demo.py image -n yolox-s -c ./yolox_s.pth.tar --path ./assets/dog.jpg --conf 0.3 --nms 0.65 --tsize 640 --save_result
2021-07-20 19:03:23 | INFO     | __main__:219 - Args: Namespace(camid=0, ckpt='./yolox_s.pth.tar', conf=0.3, demo='image', exp_file=None, experiment_name='yolox_s', fp16=False, fuse=False, name='yolox-s', nms=0.65, path='./assets/dog.jpg', save_result=True, trt=False, tsize=640)
/home/zjnyly/.local/lib/python3.8/site-packages/torch/nn/functional.py:718: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at  /pytorch/c10/core/TensorImpl.h:1156.)
  return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
2021-07-20 19:03:23 | INFO     | __main__:229 - Model Summary: Params: 8.97M, Gflops: 26.81
2021-07-20 19:03:25 | INFO     | __main__:240 - loading checkpoint

The output after I run the code is as above, but I can't find any output in the YOLOX_outputs folder.

multi-gpu training error

I train yolox-s with batch size 128 and 8 x A100, but after 10 epochs an error occurs. Here is the training error log:

main_func(*args)
│ └ (╒══════════════════╤════════════════════════════════════════════════════════════════════════════════════════════════════════...
└ <function main at 0x7f723fd8d3b0>

File "/workdir/yolox/tools/train.py", line 101, in main
trainer.train()
│ └ <function Trainer.train at 0x7f7159aa0cb0>
└ <yolox.core.trainer.Trainer object at 0x7f7159125650>

File "/workdir/yolox/yolox/core/trainer.py", line 71, in train
self.train_in_epoch()
│ └ <function Trainer.train_in_epoch at 0x7f7159a35290>
└ <yolox.core.trainer.Trainer object at 0x7f7159125650>

File "/workdir/yolox/yolox/core/trainer.py", line 81, in train_in_epoch
self.after_epoch()
│ └ <function Trainer.after_epoch at 0x7f715916bcb0>
└ <yolox.core.trainer.Trainer object at 0x7f7159125650>

File "/workdir/yolox/yolox/core/trainer.py", line 212, in after_epoch
all_reduce_norm(self.model)
│ │ └ DistributedDataParallel(
│ │ (module): YOLOX(
│ │ (backbone): YOLOPAFPN(
│ │ (backbone): CSPDarknet(
│ │ (stem): Focus(
│ │ ...
│ └ <yolox.core.trainer.Trainer object at 0x7f7159125650>
└ <function all_reduce_norm at 0x7f7163f70ef0>

File "/workdir/yolox/yolox/utils/allreduce_norm.py", line 99, in all_reduce_norm
states = all_reduce(states, op="mean")
│ └ OrderedDict([('module.backbone.backbone.stem.conv.bn.weight', tensor([0.5405, 0.7837, 1.4152, 1.2409, 0.7082, 1.0130, 1.0755,...
└ <function all_reduce at 0x7f7163f70e60>

File "/workdir/yolox/yolox/utils/allreduce_norm.py", line 68, in all_reduce
group = _get_global_gloo_group()
└ <functools._lru_cache_wrapper object at 0x7f7163f6f0f0>

File "/workdir/yolox/yolox/utils/dist.py", line 103, in _get_global_gloo_group
return dist.new_group(backend="gloo")
│ └ <function new_group at 0x7f71647869e0>
└ <module 'torch.distributed' from '/workdir/anaconda3/envs/yolox/lib/python3.7/site-packages/torch/distributed/__init__.py'>

File "/workdir/anaconda3/envs/yolox/lib/python3.7/site-packages/torch/distributed/distributed_c10d.py", line 2037, in new_group
timeout=timeout)
└ datetime.timedelta(seconds=1800)
File "/workdir/anaconda3/envs/yolox/lib/python3.7/site-packages/torch/distributed/distributed_c10d.py", line 521, in _new_process_group_helper
timeout=timeout)
└ datetime.timedelta(seconds=1800)

RuntimeError: [enforce fail at /opt/conda/conda-bld/pytorch_1607370156314/work/third_party/gloo/gloo/transport/tcp/device.cc:83] ifa != nullptr. Unable to find address for: ib0
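For context (not from the thread): the "Unable to find address for: ib0" failure means the Gloo backend picked an InfiniBand interface that has no address on this host. A commonly suggested workaround is to point Gloo at a real interface via the GLOO_SOCKET_IFNAME environment variable before any process group is created; the interface name below is only an example.

    # Hedged workaround sketch: force Gloo (and optionally NCCL) onto a network
    # interface that actually exists on this machine. "eth0" is only an example;
    # check `ip addr` for a valid name. Set this before launching training.
    import os

    os.environ["GLOO_SOCKET_IFNAME"] = "eth0"            # interface with a valid address
    os.environ.setdefault("NCCL_SOCKET_IFNAME", "eth0")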

colab demo

Hi, can you please add a Google Colab demo for inference?

YOLOX fails to install

Hi, when I run the command pip3 install -v -e . # or python3 setup.py develop, the following error is reported:
[screenshot]

PermissionError

Traceback (most recent call last):
  File "tools/demo.py", line 277, in <module>
    main(exp, args)
  File "tools/demo.py", line 210, in main
    os.makedirs(file_name, exist_ok=True)
  File "/home/lengyihong/anaconda3/lib/python3.8/os.py", line 213, in makedirs
    makedirs(head, exist_ok=exist_ok)
  File "/home/lengyihong/anaconda3/lib/python3.8/os.py", line 213, in makedirs
    makedirs(head, exist_ok=exist_ok)
  File "/home/lengyihong/anaconda3/lib/python3.8/os.py", line 223, in makedirs
    mkdir(name, mode)
PermissionError: [Errno 13] Permission denied: '/data'
Has anyone met this issue?

Error when running the demo

(base) lw@lw-System:~/work/YOLOX$ python3 tools/demo.py image -n yolox-s -c path/to/yolox_s.pth.tar --path assets/dog.jpg --conf 0.3 --nms 0.65 --tsize 640 --save_result
2021-07-22 12:13:29 | INFO | __main__:219 - Args: Namespace(camid=0, ckpt='path/to/yolox_s.pth.tar', conf=0.3, demo='image', exp_file=None, experiment_name='yolox_s', fp16=False, fuse=False, name='yolox-s', nms=0.65, path='assets/dog.jpg', save_result=True, trt=False, tsize=640)
/home/lw/anaconda3/lib/python3.7/site-packages/torch/nn/functional.py:718: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at /pytorch/c10/core/TensorImpl.h:1156.)
return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
2021-07-22 12:13:29 | INFO | __main__:229 - Model Summary: Params: 8.97M, Gflops: 26.81

ConnectionResetError

Exception in thread Thread-1:
Traceback (most recent call last):
File "/root/anaconda3/envs/yolox2/lib/python3.8/threading.py", line 932, in _bootstrap_inner
Exception in thread Thread-1:
Traceback (most recent call last):
File "/root/anaconda3/envs/yolox2/lib/python3.8/threading.py", line 932, in _bootstrap_inner
self.run()
File "/root/anaconda3/envs/yolox2/lib/python3.8/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "/root/anaconda3/envs/yolox2/lib/python3.8/site-packages/torch/utils/data/_utils/pin_memory.py", line 25, in _pin_memory_loop
self.run()
File "/root/anaconda3/envs/yolox2/lib/python3.8/threading.py", line 870, in run
r = in_queue.get(timeout=MP_STATUS_CHECK_INTERVAL)
File "/root/anaconda3/envs/yolox2/lib/python3.8/multiprocessing/queues.py", line 116, in get
return _ForkingPickler.loads(res)
File "/root/anaconda3/envs/yolox2/lib/python3.8/site-packages/torch/multiprocessing/reductions.py", line 282, in rebuild_storage_fd
fd = df.detach()
File "/root/anaconda3/envs/yolox2/lib/python3.8/multiprocessing/resource_sharer.py", line 58, in detach
self._target(*self._args, **self._kwargs)
File "/root/anaconda3/envs/yolox2/lib/python3.8/site-packages/torch/utils/data/_utils/pin_memory.py", line 25, in _pin_memory_loop
return reduction.recv_handle(conn)
File "/root/anaconda3/envs/yolox2/lib/python3.8/multiprocessing/reduction.py", line 189, in recv_handle
r = in_queue.get(timeout=MP_STATUS_CHECK_INTERVAL)
File "/root/anaconda3/envs/yolox2/lib/python3.8/multiprocessing/queues.py", line 116, in get
return recvfds(s, 1)[0]
File "/root/anaconda3/envs/yolox2/lib/python3.8/multiprocessing/reduction.py", line 157, in recvfds
msg, ancdata, flags, addr = sock.recvmsg(1, socket.CMSG_SPACE(bytes_size))
return _ForkingPickler.loads(res)
ConnectionResetError File "/root/anaconda3/envs/yolox2/lib/python3.8/site-packages/torch/multiprocessing/reductions.py", line 282, in rebuild_storage_fd
: [Errno 104] Connection reset by peer
fd = df.detach()
File "/root/anaconda3/envs/yolox2/lib/python3.8/multiprocessing/resource_sharer.py", line 57, in detach
with _resource_sharer.get_connection(self._id) as conn:
File "/root/anaconda3/envs/yolox2/lib/python3.8/multiprocessing/resource_sharer.py", line 87, in get_connection
c = Client(address, authkey=process.current_process().authkey)
File "/root/anaconda3/envs/yolox2/lib/python3.8/multiprocessing/connection.py", line 509, in Client
deliver_challenge(c, authkey)
File "/root/anaconda3/envs/yolox2/lib/python3.8/multiprocessing/connection.py", line 740, in deliver_challenge
response = connection.recv_bytes(256) # reject large message
File "/root/anaconda3/envs/yolox2/lib/python3.8/multiprocessing/connection.py", line 216, in recv_bytes
buf = self._recv_bytes(maxlength)
File "/root/anaconda3/envs/yolox2/lib/python3.8/multiprocessing/connection.py", line 414, in _recv_bytes
buf = self._recv(4)
File "/root/anaconda3/envs/yolox2/lib/python3.8/multiprocessing/connection.py", line 379, in _recv
chunk = read(handle, remaining)
ConnectionResetError: [Errno 104] Connection reset by peer
Exception in thread Thread-1:
Traceback (most recent call last):
File "/root/anaconda3/envs/yolox2/lib/python3.8/threading.py", line 932, in _bootstrap_inner
self.run()
File "/root/anaconda3/envs/yolox2/lib/python3.8/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "/root/anaconda3/envs/yolox2/lib/python3.8/site-packages/torch/utils/data/_utils/pin_memory.py", line 25, in _pin_memory_loop
r = in_queue.get(timeout=MP_STATUS_CHECK_INTERVAL)
File "/root/anaconda3/envs/yolox2/lib/python3.8/multiprocessing/queues.py", line 116, in get
return _ForkingPickler.loads(res)
File "/root/anaconda3/envs/yolox2/lib/python3.8/site-packages/torch/multiprocessing/reductions.py", line 282, in rebuild_storage_fd
fd = df.detach()
File "/root/anaconda3/envs/yolox2/lib/python3.8/multiprocessing/resource_sharer.py", line 57, in detach
with _resource_sharer.get_connection(self._id) as conn:
File "/root/anaconda3/envs/yolox2/lib/python3.8/multiprocessing/resource_sharer.py", line 87, in get_connection
c = Client(address, authkey=process.current_process().authkey)
File "/root/anaconda3/envs/yolox2/lib/python3.8/multiprocessing/connection.py", line 508, in Client
answer_challenge(c, authkey)
File "/root/anaconda3/envs/yolox2/lib/python3.8/multiprocessing/connection.py", line 757, in answer_challenge
response = connection.recv_bytes(256) # reject large message
File "/root/anaconda3/envs/yolox2/lib/python3.8/multiprocessing/connection.py", line 216, in recv_bytes
buf = self._recv_bytes(maxlength)
File "/root/anaconda3/envs/yolox2/lib/python3.8/multiprocessing/connection.py", line 414, in _recv_bytes
buf = self._recv(4)
File "/root/anaconda3/envs/yolox2/lib/python3.8/multiprocessing/connection.py", line 379, in _recv
chunk = read(handle, remaining)
ConnectionResetError: [Errno 104] Connection reset by peer
2021-07-21 14:23:59 | INFO | yolox.core.trainer:183 - Training of experiment is done and the best AP is 0.00
2021-07-21 14:23:59 | ERROR | yolox.core.launch:104 - An error has been caught in function '_distributed_worker', process 'SpawnProcess-1' (75815), thread 'MainThread' (139758017422912):
Traceback (most recent call last):

File "", line 1, in
File "/root/anaconda3/envs/yolox2/lib/python3.8/multiprocessing/spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
│ │ └ 3
│ └ 36
└ <function _main at 0x7f1bea0ad820>
File "/root/anaconda3/envs/yolox2/lib/python3.8/multiprocessing/spawn.py", line 129, in _main
return self._bootstrap(parent_sentinel)
│ │ └ 3
│ └ <function BaseProcess._bootstrap at 0x7f1bea1178b0>

File "/root/anaconda3/envs/yolox2/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
self.run()
│ └ <function BaseProcess.run at 0x7f1bea107ee0>

File "/root/anaconda3/envs/yolox2/lib/python3.8/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
│ │ │ │ │ └ {}
│ │ │ │ └
│ │ │ └ (<function _distributed_worker at 0x7f1b92170670>, 0, (<function main at 0x7f1b092a2160>, 4, 4, 0, 'nccl', 'tcp://127.0.0.1:5...
│ │ └
│ └ <function _wrap at 0x7f1bd6a41040>

File "/root/anaconda3/envs/yolox2/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 19, in _wrap
fn(i, *args)
│ │ └ (<function main at 0x7f1b092a2160>, 4, 4, 0, 'nccl', 'tcp://127.0.0.1:57533', (╒══════════════════╤══════════════════════════...
│ └ 0
└ <function _distributed_worker at 0x7f1b92170670>

File "/data/zhangyong/workspace/detect/YOLOX-main/yolox/core/launch.py", line 104, in _distributed_worker
main_func(*args)
│ └ (╒══════════════════╤════════════════════════════════════════════════════════════════════════════════════════════════════════...
└ <function main at 0x7f1b092a2160>

File "/data/zhangyong/workspace/detect/YOLOX-main/tools/train.py", line 101, in main
trainer.train()
│ └ <function Trainer.train at 0x7f1b10c650d0>
└ <yolox.core.trainer.Trainer object at 0x7f1b092c2070>

File "/data/zhangyong/workspace/detect/YOLOX-main/yolox/core/trainer.py", line 70, in train
self.train_in_epoch()
│ └ <function Trainer.train_in_epoch at 0x7f1b09321430>
└ <yolox.core.trainer.Trainer object at 0x7f1b092c2070>

File "/data/zhangyong/workspace/detect/YOLOX-main/yolox/core/trainer.py", line 80, in train_in_epoch
self.after_epoch()
│ └ <function Trainer.after_epoch at 0x7f1b093298b0>
└ <yolox.core.trainer.Trainer object at 0x7f1b092c2070>

File "/data/zhangyong/workspace/detect/YOLOX-main/yolox/core/trainer.py", line 209, in after_epoch
all_reduce_norm(self.model)
│ │ └ DistributedDataParallel(
│ │ (module): YOLOX(
│ │ (backbone): YOLOPAFPN(
│ │ (backbone): CSPDarknet(
│ │ (stem): Focus(
│ │ ...
│ └ <yolox.core.trainer.Trainer object at 0x7f1b092c2070>
└ <function all_reduce_norm at 0x7f1bd6678940>

File "/data/zhangyong/workspace/detect/YOLOX-main/yolox/utils/allreduce_norm.py", line 99, in all_reduce_norm
states = all_reduce(states, op="mean")
│ └ OrderedDict([('module.backbone.backbone.stem.conv.bn.weight', tensor([0.7139, 0.7288, 0.4413, 1.3903, 0.7023, 0.4169, 1.4530,...
└ <function all_reduce at 0x7f1bd66788b0>

File "/data/zhangyong/workspace/detect/YOLOX-main/yolox/utils/allreduce_norm.py", line 68, in all_reduce
group = _get_global_gloo_group()
└ <functools._lru_cache_wrapper object at 0x7f1bd66783a0>

File "/data/zhangyong/workspace/detect/YOLOX-main/yolox/utils/dist.py", line 103, in _get_global_gloo_group
return dist.new_group(backend="gloo")
│ └ <function new_group at 0x7f1bd6cfc940>
└ <module 'torch.distributed' from '/root/anaconda3/envs/yolox2/lib/python3.8/site-packages/torch/distributed/__init__.py'>

File "/root/anaconda3/envs/yolox2/lib/python3.8/site-packages/torch/distributed/distributed_c10d.py", line 2032, in new_group
pg = _new_process_group_helper(group_world_size,
│ └ 4
└ <function _new_process_group_helper at 0x7f1bd6cfb790>
File "/root/anaconda3/envs/yolox2/lib/python3.8/site-packages/torch/distributed/distributed_c10d.py", line 517, in _new_process_group_helper
pg = ProcessGroupGloo(
└ <class 'torch.distributed.ProcessGroupGloo'>

RuntimeError: [enforce fail at /pytorch/third_party/gloo/gloo/transport/tcp/device.cc:83] ifa != nullptr. Unable to find address for: ib0

Question about inference speed

YOLOv5's official inference speed:

[screenshot]

YOLOv5's inference speed as reported in the paper:

[screenshot]
Both are measured on a V100, so why do they differ this much?! Could someone explain?

My running outputs remain the same for a long time

Hi, when I run the code, the output stays the same and it doesn't continue running.

2021-07-21 12:24:37 | INFO | yolox.core.trainer:130 - Model Summary: Params: 99.00M, Gflops: 281.52
2021-07-21 12:24:37 | INFO | apex.amp.frontend:328 - Selected optimization level O1: Insert automatic casts around Pytorch functions and Tensor methods.
2021-07-21 12:24:37 | INFO | apex.amp.frontend:329 - Defaults for this optimization level are:
2021-07-21 12:24:37 | INFO | apex.amp.frontend:331 - enabled : True
2021-07-21 12:24:37 | INFO | apex.amp.frontend:331 - opt_level : O1
2021-07-21 12:24:37 | INFO | apex.amp.frontend:331 - cast_model_type : None
2021-07-21 12:24:37 | INFO | apex.amp.frontend:331 - patch_torch_functions : True
2021-07-21 12:24:37 | INFO | apex.amp.frontend:331 - keep_batchnorm_fp32 : None
2021-07-21 12:24:37 | INFO | apex.amp.frontend:331 - master_weights : None
2021-07-21 12:24:37 | INFO | apex.amp.frontend:331 - loss_scale : dynamic
2021-07-21 12:24:37 | INFO | apex.amp.frontend:336 - Processing user overrides (additional kwargs that are not None)...
2021-07-21 12:24:37 | INFO | apex.amp.frontend:354 - After processing overrides, optimization options are:
2021-07-21 12:24:37 | INFO | apex.amp.frontend:356 - enabled : True
2021-07-21 12:24:37 | INFO | apex.amp.frontend:356 - opt_level : O1
2021-07-21 12:24:37 | INFO | apex.amp.frontend:356 - cast_model_type : None
2021-07-21 12:24:37 | INFO | apex.amp.frontend:356 - patch_torch_functions : True
2021-07-21 12:24:37 | INFO | apex.amp.frontend:356 - keep_batchnorm_fp32 : None
2021-07-21 12:24:37 | INFO | apex.amp.frontend:356 - master_weights : None
2021-07-21 12:24:37 | INFO | apex.amp.frontend:356 - loss_scale : dynamic
2021-07-21 12:24:37 | INFO | apex.amp.scaler:69 - Warning: multi_tensor_applier fused unscale kernel is unavailable, possibly because apex was installed without --cuda_ext --cpp_ext. Using Python fallback. Original ImportError was: ModuleNotFoundError("No module named 'amp_C'")
2021-07-21 12:24:37 | INFO | yolox.core.trainer:283 - loading checkpoint for fine tuning
2021-07-21 12:24:38 | WARNING | yolox.utils.checkpoint:26 - Shape of head.cls_preds.0.weight in checkpoint is torch.Size([80, 320, 1, 1]), while shape of head.cls_preds.0.weight in model is torch.Size([3, 320, 1, 1]).
2021-07-21 12:24:38 | WARNING | yolox.utils.checkpoint:26 - Shape of head.cls_preds.0.bias in checkpoint is torch.Size([80]), while shape of head.cls_preds.0.bias in model is torch.Size([3]).
2021-07-21 12:24:38 | WARNING | yolox.utils.checkpoint:26 - Shape of head.cls_preds.1.weight in checkpoint is torch.Size([80, 320, 1, 1]), while shape of head.cls_preds.1.weight in model is torch.Size([3, 320, 1, 1]).
2021-07-21 12:24:38 | WARNING | yolox.utils.checkpoint:26 - Shape of head.cls_preds.1.bias in checkpoint is torch.Size([80]), while shape of head.cls_preds.1.bias in model is torch.Size([3]).
2021-07-21 12:24:38 | WARNING | yolox.utils.checkpoint:26 - Shape of head.cls_preds.2.weight in checkpoint is torch.Size([80, 320, 1, 1]), while shape of head.cls_preds.2.weight in model is torch.Size([3, 320, 1, 1]).
2021-07-21 12:24:38 | WARNING | yolox.utils.checkpoint:26 - Shape of head.cls_preds.2.bias in checkpoint is torch.Size([80]), while shape of head.cls_preds.2.bias in model is torch.Size([3]).
2021-07-21 12:24:38 | INFO | yolox.data.datasets.coco:44 - loading annotations into memory...
2021-07-21 12:24:38 | INFO | yolox.data.datasets.coco:44 - Done (t=0.28s)
2021-07-21 12:24:38 | INFO | pycocotools.coco:92 - creating index...
2021-07-21 12:24:38 | INFO | pycocotools.coco:92 - index created!
2021-07-21 12:24:38 | INFO | yolox.core.trainer:149 - init prefetcher, this might take one minute or less...

What's the problem?

detector

A minimal detector example is necessary for YOLOX to become more popular.
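A hedged sketch of what such a minimal example could look like, using the helpers that tools/demo.py relied on at the time (get_exp, preproc, postprocess); the exact signatures, the checkpoint filename, and the mean/std values are assumptions taken from that demo, not an official API:

    import cv2
    import torch

    from yolox.exp import get_exp
    from yolox.data.data_augment import preproc
    from yolox.utils import postprocess

    exp = get_exp(None, "yolox-s")                       # built-in experiment by name
    model = exp.get_model().eval()
    ckpt = torch.load("yolox_s.pth.tar", map_location="cpu")
    model.load_state_dict(ckpt["model"])

    img = cv2.imread("assets/dog.jpg")
    # mean/std as used by tools/demo.py then; newer preproc versions drop these args
    img_in, ratio = preproc(img, exp.test_size, (0.485, 0.456, 0.406), (0.229, 0.224, 0.225))
    with torch.no_grad():
        outputs = model(torch.from_numpy(img_in).unsqueeze(0).float())
        outputs = postprocess(outputs, exp.num_classes, exp.test_conf, exp.nmsthre)
    # outputs[0] is None if nothing passes the thresholds, otherwise rows of
    # [x1, y1, x2, y2, obj_conf, cls_conf, cls]; divide boxes by `ratio` to map back
    print(outputs[0])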

How to remove apex?

Hello, could you please help me remove apex? I ran into difficulties when compiling apex.
Thanks!
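One hedged approach (not an official patch): replace apex.amp with PyTorch's built-in torch.cuda.amp, which needs no extra compilation. The loop below is only a sketch; model, optimizer, dataloader and use_fp16 are placeholders, and the "total_loss" key is assumed from the trainer's loss dict:

    import torch

    scaler = torch.cuda.amp.GradScaler(enabled=use_fp16)      # use_fp16: your flag

    for inps, targets in dataloader:                           # your training loop
        with torch.cuda.amp.autocast(enabled=use_fp16):
            outputs = model(inps, targets)
            loss = outputs["total_loss"]                       # assumed loss-dict key
        optimizer.zero_grad()
        scaler.scale(loss).backward()
        scaler.step(optimizer)
        scaler.update()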

multiclass_nms post-processing error

Traceback (most recent call last):
File "D:/coding/YOLO-stream/yolox-ort.py", line 178, in
yolox_detect(args.model, args.names, frame)
File "D:/coding/YOLO-stream/yolox-ort.py", line 153, in yolox_detect
dets = multiclass_nms(boxes_xyxy, scores, nms_thr=0.65, score_thr=0.1)
File "D:/coding/YOLO-stream/yolox-ort.py", line 103, in multiclass_nms
return np.concatenate(final_dets, 0)
File "<array_function internals>", line 5, in concatenate
ValueError: need at least one array to concatenate
When I run ONNX Runtime (ort) inference, the bug above is triggered whenever the camera sees no objects (or only a single object). I suggest changing the post-processing so NMS handles these cases directly.
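A hedged guard for this crash, assuming final_dets is the list built inside the demo's multiclass_nms helper right before the failing call; returning None (and checking for it in the caller) avoids concatenating an empty list:

    import numpy as np

    # ... inside multiclass_nms, after per-class filtering has filled final_dets ...
    if len(final_dets) == 0:
        return None                    # caller should check for None before drawing
    return np.concatenate(final_dets, 0)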

no module named yolox

Traceback (most recent call last):
  File "./tools/demo.py", line 15, in <module>
    from yolox.data.data_augment import preproc
ModuleNotFoundError: No module named 'yolox'

How can I import yolox?
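Two common remedies: install the package in development mode (pip3 install -v -e . from the repository root), or, as a hedged workaround sketch, put the repository root on sys.path before the import; the path below is only an example:

    import sys

    sys.path.insert(0, "/path/to/YOLOX")           # directory containing the yolox/ package

    from yolox.data.data_augment import preproc   # should now resolve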

The video demo produces no results

The image demo saves the dog result, but the video output is blank.

~/dev/YOLOX$ python3 tools/demo.py video -n yolox-s -c pretrained_models/yolox_s.pth.tar --path /assets/ch14_0616-0625.mp4 --conf 0.3 --nms 0.65 --tsize 640 --save_result
2021-07-22 14:34:23 | INFO | __main__:219 - Args: Namespace(camid=0, ckpt='pretrained_models/yolox_s.pth.tar', conf=0.3, demo='video', exp_file=None, experiment_name='yolox_s', fp16=False, fuse=False, name='yolox-s', nms=0.65, path='/assets/ch14_0616-0625.mp4', save_result=True, trt=False, tsize=640)
/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:718: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at /pytorch/c10/core/TensorImpl.h:1156.)
return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
2021-07-22 14:34:23 | INFO | __main__:229 - Model Summary: Params: 8.97M, Gflops: 26.81
2021-07-22 14:34:28 | INFO | __main__:240 - loading checkpoint
2021-07-22 14:34:29 | INFO | __main__:245 - loaded checkpoint done.
2021-07-22 14:34:29 | INFO | __main__:183 - video save_path is ./YOLOX_outputs/yolox_s/vis_res/2021_07_22_14_34_29/ch14_0616-0625.mp4
However, when I check that directory, there is no mp4 file.

Cannot create Swish layer Mul_43 id:38

When running the OpenVINO C++ demo, the following error occurred:
'''
Cannot create Swish layer Mul_43 id:38
C:\j\workspace\private-ci\ie\build-windows-icc2018\b\repos\closed-dldt\inference-engine\src\inference_engine\ie_ir_parser.cpp:424
C:\j\workspace\private-ci\ie\build-windows-icc2018\b\repos\closed-dldt\inference-engine\src\inference_engine\ie_core.cpp:493
'''

Env:
windows10, vs2019, openvino 2020.3.194

Model used:
YOLOX-S-nano, YOLOX-S downloaded from:
https://github.com/Megvii-BaseDetection/YOLOX/tree/main/demo/OpenVINO/cpp

Do I need to convert the model from ONNX format to the OpenVINO format myself?

Segmentation fault (core dumped)

Hi, thanks for open-sourcing the code! When I train on my own VOC-style dataset, the training process is normal, but the evaluation step raises 'Segmentation fault (core dumped)'. It seems to be a val_dataloader problem.

Problem with the demo results

Hi, I downloaded the official yolox-nano model and ran the demo program on dog.jpg. The output is slightly off: there is one extra detection. The command is:
python tools/demo.py image -n yolox-nano -c ./models/yolox_nano.pth.tar --path assets/dog.jpg --conf 0.3 --nms 0.65 --tsize 416 --save_result
The displayed result is shown in the screenshot below:
[screenshot]
The printed outputs are:
tensor([[ 71.5272, 116.8364, 172.1584, 295.3528, 0.9587, 0.8872, 16.0000], [ 60.9691, 75.9739, 307.4900, 231.2223, 0.9442, 0.8800, 1.0000], [252.5838, 41.6913, 375.4042, 91.2227, 0.7677, 0.8870, 2.0000], [252.0094, 41.5889, 375.6715, 93.2385, 0.4352, 0.8493, 7.0000]], device='cuda:0')
Another strange thing: after converting the PyTorch model to ncnn, the detection results are normal. I used the official yolox.cpp directly with the same parameter settings as above; the detection results are shown below:
[screenshot]
16 = 0.84702 at 132.07 215.84 185.92 x 329.64
1 = 0.82856 at 113.21 140.14 454.67 x 286.03
2 = 0.68373 at 466.35 77.03 226.64 x 91.41
Thank you very much for this project!

An error is reported when running the demo

resource.setrlimit(resource.RLIMIT_NOFILE, (ulimit_value, rlimit[1]))
ValueError: current limit exceeds maximum limit

How can this problem be solved?
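For context (not from the thread): this ValueError means the requested soft RLIMIT_NOFILE exceeds the hard limit allowed for the user. A hedged sketch of the usual fix is to clamp the requested value to the hard limit; the target value below is only an example:

    import resource

    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    desired = 8192                                   # example target, not a recommendation
    resource.setrlimit(resource.RLIMIT_NOFILE, (min(desired, hard), hard))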

comments on configs

Hi,

thank you for sharing such wonderful work. I would appreciate it if you could add comments for some of the parameters in the config files and the hyperparameters in the model structure. They would be really helpful for beginners.

Best

Distributed worker error during training caused by multiprocess connection error

I was training yolox-s on eight 2080 GPUs with batch size 64. Each time, the issue happens at epoch 10/300.
The error log is as follows:

Exception in thread Thread-1:
Traceback (most recent call last):
File "/home/hdzhang/miniconda3/envs/centernet/lib/python3.8/threading.py", line 932, in _bootstrap_inner
self.run()
File "/home/hdzhang/miniconda3/envs/centernet/lib/python3.8/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "/home/hdzhang/miniconda3/envs/centernet/lib/python3.8/site-packages/torch/utils/data/_utils/pin_memory.py", line 28, in _pin_memory_loop
r = in_queue.get(timeout=MP_STATUS_CHECK_INTERVAL)
File "/home/hdzhang/miniconda3/envs/centernet/lib/python3.8/multiprocessing/queues.py", line 116, in get
return _ForkingPickler.loads(res)
File "/home/hdzhang/miniconda3/envs/centernet/lib/python3.8/site-packages/torch/multiprocessing/reductions.py", line 289, in rebuild_storage_fd
fd = df.detach()
File "/home/hdzhang/miniconda3/envs/centernet/lib/python3.8/multiprocessing/resource_sharer.py", line 57, in detach
with _resource_sharer.get_connection(self._id) as conn:
File "/home/hdzhang/miniconda3/envs/centernet/lib/python3.8/multiprocessing/resource_sharer.py", line 87, in get_connection
c = Client(address, authkey=process.current_process().authkey)
File "/home/hdzhang/miniconda3/envs/centernet/lib/python3.8/multiprocessing/connection.py", line 508, in Client
answer_challenge(c, authkey)
File "/home/hdzhang/miniconda3/envs/centernet/lib/python3.8/multiprocessing/connection.py", line 752, in answer_challenge
message = connection.recv_bytes(256) # reject large message
File "/home/hdzhang/miniconda3/envs/centernet/lib/python3.8/multiprocessing/connection.py", line 216, in recv_bytes
buf = self._recv_bytes(maxlength)
File "/home/hdzhang/miniconda3/envs/centernet/lib/python3.8/multiprocessing/connection.py", line 414, in _recv_bytes
buf = self._recv(4)
File "/home/hdzhang/miniconda3/envs/centernet/lib/python3.8/multiprocessing/connection.py", line 383, in _recv
raise EOFError

EOFError

Exception in thread Thread-1:
Traceback (most recent call last):
File "/home/hdzhang/miniconda3/envs/centernet/lib/python3.8/threading.py", line 932, in _bootstrap_inner
self.run()
File "/home/hdzhang/miniconda3/envs/centernet/lib/python3.8/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "/home/hdzhang/miniconda3/envs/centernet/lib/python3.8/site-packages/torch/utils/data/_utils/pin_memory.py", line 28, in _pin_memory_loop
r = in_queue.get(timeout=MP_STATUS_CHECK_INTERVAL)
File "/home/hdzhang/miniconda3/envs/centernet/lib/python3.8/multiprocessing/queues.py", line 116, in get
return _ForkingPickler.loads(res)
File "/home/hdzhang/miniconda3/envs/centernet/lib/python3.8/site-packages/torch/multiprocessing/reductions.py", line 289, in rebuild_storage_fd
fd = df.detach()
File "/home/hdzhang/miniconda3/envs/centernet/lib/python3.8/multiprocessing/resource_sharer.py", line 57, in detach
with _resource_sharer.get_connection(self._id) as conn:
File "/home/hdzhang/miniconda3/envs/centernet/lib/python3.8/multiprocessing/resource_sharer.py", line 87, in get_connection
c = Client(address, authkey=process.current_process().authkey)
File "/home/hdzhang/miniconda3/envs/centernet/lib/python3.8/multiprocessing/connection.py", line 509, in Client
deliver_challenge(c, authkey)
File "/home/hdzhang/miniconda3/envs/centernet/lib/python3.8/multiprocessing/connection.py", line 740, in deliver_challenge
response = connection.recv_bytes(256) # reject large message
File "/home/hdzhang/miniconda3/envs/centernet/lib/python3.8/multiprocessing/connection.py", line 216, in recv_bytes
buf = self._recv_bytes(maxlength)
File "/home/hdzhang/miniconda3/envs/centernet/lib/python3.8/multiprocessing/connection.py", line 414, in _recv_bytes
buf = self._recv(4)
File "/home/hdzhang/miniconda3/envs/centernet/lib/python3.8/multiprocessing/connection.py", line 379, in _recv
chunk = read(handle, remaining)
ConnectionResetError: [Errno 104] Connection reset by peer
Exception in thread Thread-1:
Traceback (most recent call last):
File "/home/hdzhang/miniconda3/envs/centernet/lib/python3.8/threading.py", line 932, in _bootstrap_inner
self.run()
File "/home/hdzhang/miniconda3/envs/centernet/lib/python3.8/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "/home/hdzhang/miniconda3/envs/centernet/lib/python3.8/site-packages/torch/utils/data/_utils/pin_memory.py", line 28, in _pin_memory_loop
r = in_queue.get(timeout=MP_STATUS_CHECK_INTERVAL)
File "/home/hdzhang/miniconda3/envs/centernet/lib/python3.8/multiprocessing/queues.py", line 116, in get
return _ForkingPickler.loads(res)
File "/home/hdzhang/miniconda3/envs/centernet/lib/python3.8/site-packages/torch/multiprocessing/reductions.py", line 289, in rebuild_storage_fd
fd = df.detach()
File "/home/hdzhang/miniconda3/envs/centernet/lib/python3.8/multiprocessing/resource_sharer.py", line 57, in detach
with _resource_sharer.get_connection(self._id) as conn:
2021-07-21 09:11:10 | ERROR | yolox.core.launch:104 - An error has been caught in function '_distributed_worker', process 'SpawnProcess-1' (28776), thread 'MainThread' (140335833473600):
Traceback (most recent call last):

  File "<string>", line 1, in <module>
  File "/home/hdzhang/miniconda3/envs/centernet/lib/python3.8/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
               │     │   └ 3
               │     └ 37
               └ <function _main at 0x7fa27a2465e0>
  File "/home/hdzhang/miniconda3/envs/centernet/lib/python3.8/multiprocessing/spawn.py", line 129, in _main
    return self._bootstrap(parent_sentinel)
           │    │          └ 3
           │    └ <function BaseProcess._bootstrap at 0x7fa27a369820>
           └ <SpawnProcess name='SpawnProcess-1' parent=28716 started>
  File "/home/hdzhang/miniconda3/envs/centernet/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
    │    └ <function BaseProcess.run at 0x7fa27a37fe50>
    └ <SpawnProcess name='SpawnProcess-1' parent=28716 started>
  File "/home/hdzhang/miniconda3/envs/centernet/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
    │    │        │    │        │    └ {}
    │    │        │    │        └ <SpawnProcess name='SpawnProcess-1' parent=28716 started>
    │    │        │    └ (<function _distributed_worker at 0x7fa0130f5550>, 0, (<function main at 0x7fa00a620af0>, 8, 8, 0, 'nccl', 'tcp://127.0.0.1:4...
    │    │        └ <SpawnProcess name='SpawnProcess-1' parent=28716 started>
    │    └ <function _wrap at 0x7fa01731edc0>
    └ <SpawnProcess name='SpawnProcess-1' parent=28716 started>
  File "/home/hdzhang/miniconda3/envs/centernet/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 59, in _wrap
    fn(i, *args)
    │  │   └ (<function main at 0x7fa00a620af0>, 8, 8, 0, 'nccl', 'tcp://127.0.0.1:48781', (╒══════════════════╤══════════════════════════...
    │  └ 0
    └ <function _distributed_worker at 0x7fa0130f5550>

> File "/home/hdzhang/YOLOX/yolox/core/launch.py", line 104, in _distributed_worker
    main_func(*args)
    │          └ (╒══════════════════╤════════════════════════════════════════════════════════════════════════════════════════════════════════...
    └ <function main at 0x7fa00a620af0>

  File "/home/hdzhang/YOLOX/tools/train.py", line 101, in main
    trainer.train()
    │       └ <function Trainer.train at 0x7fa01767d5e0>
    └ <yolox.core.trainer.Trainer object at 0x7fa00a635d30>

  File "/home/hdzhang/YOLOX/yolox/core/trainer.py", line 70, in train
    self.train_in_epoch()
    │    └ <function Trainer.train_in_epoch at 0x7fa00a8e8310>
    └ <yolox.core.trainer.Trainer object at 0x7fa00a635d30>

  File "/home/hdzhang/YOLOX/yolox/core/trainer.py", line 80, in train_in_epoch
    self.after_epoch()
    │    └ <function Trainer.after_epoch at 0x7fa00a6005e0>
    └ <yolox.core.trainer.Trainer object at 0x7fa00a635d30>

  File "/home/hdzhang/YOLOX/yolox/core/trainer.py", line 209, in after_epoch
    all_reduce_norm(self.model)
    │               │    └ DistributedDataParallel(
    │               │        (module): YOLOX(
    │               │          (backbone): YOLOPAFPN(
    │               │            (backbone): CSPDarknet(
    │               │              (stem): Focus(
    │               │       ...
    │               └ <yolox.core.trainer.Trainer object at 0x7fa00a635d30>
    └ <function all_reduce_norm at 0x7fa016ef95e0>

  File "/home/hdzhang/YOLOX/yolox/utils/allreduce_norm.py", line 99, in all_reduce_norm
    states = all_reduce(states, op="mean")
             │          └ OrderedDict([('module.backbone.backbone.stem.conv.bn.weight', tensor([1.4156, 2.5198, 2.6882, 1.5280, 3.4103, 2.3906, 2.5711,...
             └ <function all_reduce at 0x7fa016ef9550>

  File "/home/hdzhang/YOLOX/yolox/utils/allreduce_norm.py", line 68, in all_reduce
    group = _get_global_gloo_group()
            └ <functools._lru_cache_wrapper object at 0x7fa016ef9040>

  File "/home/hdzhang/YOLOX/yolox/utils/dist.py", line 103, in _get_global_gloo_group
    return dist.new_group(backend="gloo")
           │    └ <function new_group at 0x7fa017800820>
           └ <module 'torch.distributed' from '/home/hdzhang/miniconda3/envs/centernet/lib/python3.8/site-packages/torch/distributed/__ini...

  File "/home/hdzhang/miniconda3/envs/centernet/lib/python3.8/site-packages/torch/distributed/distributed_c10d.py", line 2694, in new_group
    pg = _new_process_group_helper(group_world_size,
         │                         └ 8
         └ <function _new_process_group_helper at 0x7fa0177ff1f0>
  File "/home/hdzhang/miniconda3/envs/centernet/lib/python3.8/site-packages/torch/distributed/distributed_c10d.py", line 616, in _new_process_group_helper
    pg = ProcessGroupGloo(
         └ <class 'torch._C._distributed_c10d.ProcessGroupGloo'>

Variable names in Focus Layer

Thanks for sharing your work!

patch_top_left = x[..., ::2, ::2]
patch_top_right = x[..., ::2, 1::2]
patch_bot_left = x[..., 1::2, ::2]
patch_bot_right = x[..., 1::2, 1::2]

I think these variable names can cause misunderstanding. The tensors produced by this slicing are not top-left, top-right, bottom-left, and bottom-right patches; the slicing selects elements whose y and x indices are even or odd.

It's hard to name these values... I tried to think of the best names for these variables, but I failed.

 tensor_even_y_even_x = x[..., ::2, ::2] 
 tensor_even_y_odd_x = x[..., ::2, 1::2] 
 tensor_odd_y_even_x = x[..., 1::2, ::2] 
 tensor_odd_y_odd_x = x[..., 1::2, 1::2] 
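A tiny, purely illustrative demonstration of what these strided slices actually select, which supports the even/odd naming above:

    import torch

    x = torch.arange(16).reshape(1, 1, 4, 4)   # one 4x4 channel, values 0..15
    print(x[..., ::2, ::2])    # rows 0,2 / cols 0,2 -> even-y, even-x samples
    print(x[..., ::2, 1::2])   # rows 0,2 / cols 1,3 -> even-y, odd-x samples
    # Each result is a 2x2 grid sampled across the whole image, not a spatial
    # "top-left" or "top-right" quadrant of the input.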
