Questions about testing my own videos of different resolutions (nvds, CLOSED)

yavon818 commented on September 3, 2024

Questions about testing my own videos of different resolutions

Comments (1)

RaymondWang987 commented on September 3, 2024

I wonder if the input image size is fixed, as I run into problems when I use images of a different resolution (e.g., 688×384). The command and its output are:

CUDA_VISIBLE_DEVICES=0 python infer_NVDS_dpt_bi.py --base_dir ./demo_outputs/dpt_init/kid_running/ --vnum kid_running --infer_w 688 --infer_h 384
let us begin test NVDS(DPT) demo
Load checkpoint: ./gmflow/checkpoints/gmflow_sintel-0c07dcb3.pth
******self.shift_size: 0
here mask none
******self.shift_size: 0
here mask none
******self.shift_size: 0
here mask none
******self.shift_size: 0
here mask none
/opt/conda/envs/NVDS/lib/python3.8/site-packages/torch/nn/functional.py:3609: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
  warnings.warn(
Traceback (most recent call last):
  File "infer_NVDS_dpt_bi.py", line 396, in <module>
    outputs = dpt.forward(rgb)
  File "/data_ssd/home/z00647125/NVDS/dpt/models.py", line 115, in forward
    inv_depth = super().forward(x).squeeze(dim=1)
  File "/data_ssd/home/z00647125/NVDS/dpt/models.py", line 80, in forward
    path_3 = self.scratch.refinenet3(path_4, layer_3_rn)
  File "/opt/conda/envs/NVDS/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/data_ssd/home/z00647125/NVDS/dpt/blocks.py", line 372, in forward
    output = self.skip_add.add(output, res)
  File "/opt/conda/envs/NVDS/lib/python3.8/site-packages/torch/nn/quantized/modules/functional_modules.py", line 43, in add
    r = torch.add(x, y)
RuntimeError: The size of tensor a (44) must match the size of tensor b (43) at non-singleton dimension 3

The input image size can be changed. However, --infer_w and --infer_h should both be set to integer multiples of 32. For example, you can use --infer_w 672 or --infer_w 704 in your case (--infer_h 384 is already a multiple of 32).
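As a rough illustration of the multiple-of-32 rule, the sketch below computes the nearest valid sizes below and above a requested value. The helper name `nearest_multiples_of_32` is hypothetical and is not part of the NVDS code; it only does the arithmetic.

```python
def nearest_multiples_of_32(size, base=32):
    """Return the closest multiples of `base` at or below and at or above `size`.

    Hypothetical helper for illustration; not part of the NVDS repository.
    """
    lower = (size // base) * base
    # If size is already a multiple, both bounds are the size itself.
    upper = lower if size % base == 0 else lower + base
    return lower, upper


print(nearest_multiples_of_32(688))  # (672, 704)
print(nearest_multiples_of_32(384))  # (384, 384)
```

Either returned value can then be passed as --infer_w or --infer_h.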

For the initial depth predictors (DPT in your case) and for our NVDS, the smallest feature maps produced by the backbone are 1/32 of the input width and height. Since 688/32 = 21.5 is not an integer, the resolutions become misaligned during down-sampling and up-sampling (the 44 and 43 in your error message). This applies to DPT, MiDaS, and our NVDS alike.
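The 44-versus-43 mismatch can be reproduced with plain arithmetic. The sketch below assumes that the 1/16-scale feature map is halved by a 3×3 convolution with stride 2 and padding 1, and that the result is upsampled by a factor of 2 before the skip addition. These layer details are an assumption made for illustration, and the function names are hypothetical.

```python
def conv_out(size, kernel=3, stride=2, padding=1):
    """Output length of a convolution along one dimension (standard formula)."""
    return (size + 2 * padding - kernel) // stride + 1


def skip_widths(width):
    """Return (upsampled 1/32 width, 1/16 width) for a given input width.

    Assumed pipeline: 1/16 map -> stride-2 conv -> 2x upsample -> add to 1/16 map.
    """
    w16 = width // 16          # width of the 1/16-scale feature map
    w32 = conv_out(w16)        # width after the stride-2 convolution
    upsampled = w32 * 2        # width after 2x upsampling
    return upsampled, w16


print(skip_widths(688))  # (44, 43): the two tensors cannot be added
print(skip_widths(672))  # (42, 42)
print(skip_widths(704))  # (44, 44)
```

With a width that is a multiple of 32, both branches end up with the same width, so the addition succeeds.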
