RaymondWang987 avatar RaymondWang987 commented on September 3, 2024

I wonder if the input image size is fixed, as I run into problems when I use images of a different resolution (e.g., 688×384):
CUDA_VISIBLE_DEVICES=0 python infer_NVDS_dpt_bi.py --base_dir ./demo_outputs/dpt_init/kid_running/ --vnum kid_running --infer_w 688 --infer_h 384
let us begin test NVDS(DPT) demo
Load checkpoint: ./gmflow/checkpoints/gmflow_sintel-0c07dcb3.pth
******self.shift_size: 0
here mask none
******self.shift_size: 0
here mask none
******self.shift_size: 0
here mask none
******self.shift_size: 0
here mask none
/opt/conda/envs/NVDS/lib/python3.8/site-packages/torch/nn/functional.py:3609: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
warnings.warn(
Traceback (most recent call last):
File "infer_NVDS_dpt_bi.py", line 396, in <module>
outputs = dpt.forward(rgb)
File "/data_ssd/home/z00647125/NVDS/dpt/models.py", line 115, in forward
inv_depth = super().forward(x).squeeze(dim=1)
File "/data_ssd/home/z00647125/NVDS/dpt/models.py", line 80, in forward
path_3 = self.scratch.refinenet3(path_4, layer_3_rn)
File "/opt/conda/envs/NVDS/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/data_ssd/home/z00647125/NVDS/dpt/blocks.py", line 372, in forward
output = self.skip_add.add(output, res)
File "/opt/conda/envs/NVDS/lib/python3.8/site-packages/torch/nn/quantized/modules/functional_modules.py", line 43, in add
r = torch.add(x, y)
RuntimeError: The size of tensor a (44) must match the size of tensor b (43) at non-singleton dimension 3

The input image size can be changed. However, --infer_w and --infer_h should be set to integer multiples of 32. For example, you can use --infer_w 672 or --infer_w 704 in your case; your --infer_h 384 is already a multiple of 32.
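As a quick way to pick valid values, here is a minimal Python sketch that finds the multiples of 32 on either side of a requested size. The helper name neighbouring_multiples is hypothetical and is not part of the NVDS code.

```python
def neighbouring_multiples(size, base=32):
    """Return the multiples of `base` at or below and at or above `size`.

    Hypothetical helper for illustration; not part of the NVDS repository.
    """
    lower = (size // base) * base
    upper = lower if size % base == 0 else lower + base
    return lower, upper


print(neighbouring_multiples(688))  # (672, 704)
print(neighbouring_multiples(384))  # (384, 384)
```

Either value in the returned pair can be passed as --infer_w or --infer_h; when both are equal, the size is already valid.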

For the initial depth predictors (DPT in your case) and for our NVDS, the smallest feature maps produced by the backbone are 1/32 of the input width and height. Since 688/32 = 21.5 is not an integer, the resolutions become misaligned (the 44 and 43 in your error message) during down-sampling and up-sampling. This applies to DPT, MiDaS, and our NVDS alike.
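The 44 versus 43 mismatch can be reproduced with plain arithmetic. The sketch below is only an illustration, not the actual DPT layers: it assumes each down-sampling stage halves the width like a stride-2 convolution with kernel 3 and padding 1, and that the coarsest (1/32) map is then upsampled by a factor of 2 before being added to the 1/16 skip feature. The function names are hypothetical.

```python
def halve(n):
    # Output length of a stride-2 convolution with kernel 3 and padding 1.
    # This layer configuration is an assumption made for illustration.
    return (n - 1) // 2 + 1


def skip_connection_widths(width):
    """Return (upsampled 1/32 width, 1/16 width) for a given input width."""
    w = width
    for _ in range(4):       # four halvings: 1/2, 1/4, 1/8, 1/16
        w = halve(w)
    w16 = w
    w32 = halve(w16)         # fifth halving: 1/32, rounds up for odd widths
    return 2 * w32, w16      # 2x upsampled coarsest map vs. the skip feature


print(skip_connection_widths(688))  # (44, 43): sizes differ, the add fails
print(skip_connection_widths(672))  # (42, 42)
print(skip_connection_widths(704))  # (44, 44)
```

Under these assumptions, the two widths agree only when the input width is a multiple of 32, which matches the sizes reported in the RuntimeError.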

from nvds.
