Comments (12)

yukang2017 commented on July 26, 2024

Hi Naren,

You can try removing these two lines from pcdet/utils/spconv_utils.py:

if float(spconv.__version__[2:]) >= 2.2:
    spconv.constants.SPCONV_USE_DIRECT_TABLE = False
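
As a side note on the check itself: the slice `__version__[2:]` drops the major version, so a string such as '3.0.1' would be read as 0.1. A minimal sketch of a tuple-based comparison is below. The helper names `parse_major_minor` and `needs_direct_table_flag` are hypothetical and are not part of pcdet or spconv.

```python
from typing import Tuple


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Return (major, minor) from a version string such as '2.1.25'."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def needs_direct_table_flag(version: str) -> bool:
    """True when the version is 2.2 or newer, using tuple comparison."""
    return parse_major_minor(version) >= (2, 2)


# Hypothetical sample versions, only to show how the comparison behaves.
for v in ("1.2.1", "2.1.25", "2.2.3", "3.0.1"):
    print(v, needs_direct_table_flag(v))
```

With a helper like this, the guard would read `if needs_direct_table_flag(spconv.__version__):`, assuming the installed spconv exposes `__version__`.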

Regards,
Yukang Chen

from voxelnext.

naren2cmu commented on July 26, 2024

Thanks Yukang. I tried that. It went past that error into another one.

root@369701605e02:/mnt/VoxelNeXt/tools# bash scripts/dist_test.sh 1 --cfg_file /mnt/tools/cfgs/nuscenes_models/cbgs_voxel0075_voxelnext.yaml --ckpt /mnt/VoxelNeXt/voxelnext_nuscenes_kernel1.pth
+ NGPUS=1
+ PY_ARGS='--cfg_file /mnt/tools/cfgs/nuscenes_models/cbgs_voxel0075_voxelnext.yaml --ckpt /mnt/VoxelNeXt/voxelnext_nuscenes_kernel1.pth'
+ python -m torch.distributed.launch --nproc_per_node=1 test.py --launcher pytorch --cfg_file /mnt/tools/cfgs/nuscenes_models/cbgs_voxel0075_voxelnext.yaml --ckpt /mnt/VoxelNeXt/voxelnext_nuscenes_kernel1.pth
Traceback (most recent call last):
  File "test.py", line 14, in <module>
    from eval_utils import eval_utils
  File "/mnt/VoxelNeXt/tools/eval_utils/eval_utils.py", line 8, in <module>
    from pcdet.models import load_data_to_gpu
  File "../pcdet/models/__init__.py", line 6, in <module>
    from .detectors import build_detector
  File "../pcdet/models/detectors/__init__.py", line 1, in <module>
    from .detector3d_template import Detector3DTemplate
  File "../pcdet/models/detectors/detector3d_template.py", line 8, in <module>
    from .. import backbones_2d, backbones_3d, dense_heads, roi_heads
  File "../pcdet/models/backbones_3d/__init__.py", line 7, in <module>
    from .spconv_backbone_voxelnext_sps import VoxelResBackBone8xVoxelNeXtSPS
  File "../pcdet/models/backbones_3d/spconv_backbone_voxelnext_sps.py", line 4, in <module>
    from spconv.core import ConvAlgo
ModuleNotFoundError: No module named 'spconv.core'
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/opt/conda/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/opt/conda/lib/python3.7/site-packages/torch/distributed/launch.py", line 261, in <module>
    main()
  File "/opt/conda/lib/python3.7/site-packages/torch/distributed/launch.py", line 257, in main
    cmd=cmd)
subprocess.CalledProcessError: Command '['/opt/conda/bin/python', '-u', 'test.py', '--local_rank=0', '--launcher', 'pytorch', '--cfg_file', '/mnt/tools/cfgs/nuscenes_models/cbgs_voxel0075_voxelnext.yaml', '--ckpt', '/mnt/VoxelNeXt/voxelnext_nuscenes_kernel1.pth']' returned non-zero exit status 1.

The spconv.core module is missing. Are the dependencies not being installed properly? I also ran python setup.py develop.

yukang2017 commented on July 26, 2024

Hi,

What is your version of spconv?

Regards,
Yukang Chen

naren2cmu commented on July 26, 2024

1.2.1 according to pip show spconv

Name: spconv
Version: 1.2.1
Summary: spatial sparse convolution for pytorch
Home-page: UNKNOWN
Author: Yan Yan
Author-email: [email protected]
License: UNKNOWN
Location: /opt/conda/lib/python3.7/site-packages
Requires: 
Required-by: 
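
The same check can be scripted for several packages at once. This is a minimal sketch that assumes Python 3.8 or newer for `importlib.metadata` (on the Python 3.7 shown in the logs, the `importlib_metadata` backport would be needed). The helper name `installed_version` is hypothetical. Note that the CUDA builds of spconv are published under separate distribution names such as spconv-cu113, so querying only "spconv" may miss them.

```python
from importlib import metadata  # standard library from Python 3.8 onward
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Packages discussed in this thread.
for name in ("spconv", "kornia", "torch"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```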

yukang2017 commented on July 26, 2024

Please try a newer spconv version, for example: pip install spconv==2.1.25

naren2cmu commented on July 26, 2024

Hi Yukang,

I got a different error after pip install spconv==2.1.25.

bash scripts/dist_test.sh 2 --cfg_file /mnt/tools/cfgs/nuscenes_models/cbgs_voxel0075_voxelnext.yaml --ckpt /mnt/VoxelNeXt/voxelnext_nuscenes_kernel1.pth
+ NGPUS=2
+ PY_ARGS='--cfg_file /mnt/tools/cfgs/nuscenes_models/cbgs_voxel0075_voxelnext.yaml --ckpt /mnt/VoxelNeXt/voxelnext_nuscenes_kernel1.pth'
+ python -m torch.distributed.launch --nproc_per_node=2 test.py --launcher pytorch --cfg_file /mnt/tools/cfgs/nuscenes_models/cbgs_voxel0075_voxelnext.yaml --ckpt /mnt/VoxelNeXt/voxelnext_nuscenes_kernel1.pth
*****************************************
Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. 
*****************************************
Traceback (most recent call last):
  File "test.py", line 14, in <module>
    from eval_utils import eval_utils
  File "/mnt/VoxelNeXt/tools/eval_utils/eval_utils.py", line 8, in <module>
    from pcdet.models import load_data_to_gpu
  File "../pcdet/models/__init__.py", line 6, in <module>
    from .detectors import build_detector
  File "../pcdet/models/detectors/__init__.py", line 1, in <module>
    from .detector3d_template import Detector3DTemplate
  File "../pcdet/models/detectors/detector3d_template.py", line 8, in <module>
    from .. import backbones_2d, backbones_3d, dense_heads, roi_heads
  File "../pcdet/models/backbones_3d/__init__.py", line 2, in <module>
    from .spconv_backbone import VoxelBackBone8x, VoxelResBackBone8x
  File "../pcdet/models/backbones_3d/spconv_backbone.py", line 30, in <module>
    class SparseBasicBlock(spconv.SparseModule):
AttributeError: module 'spconv' has no attribute 'SparseModule'
Traceback (most recent call last):
  File "test.py", line 14, in <module>
    from eval_utils import eval_utils
  File "/mnt/VoxelNeXt/tools/eval_utils/eval_utils.py", line 8, in <module>
    from pcdet.models import load_data_to_gpu
  File "../pcdet/models/__init__.py", line 6, in <module>
    from .detectors import build_detector
  File "../pcdet/models/detectors/__init__.py", line 1, in <module>
    from .detector3d_template import Detector3DTemplate
  File "../pcdet/models/detectors/detector3d_template.py", line 8, in <module>
    from .. import backbones_2d, backbones_3d, dense_heads, roi_heads
  File "../pcdet/models/backbones_3d/__init__.py", line 2, in <module>
    from .spconv_backbone import VoxelBackBone8x, VoxelResBackBone8x
  File "../pcdet/models/backbones_3d/spconv_backbone.py", line 30, in <module>
    class SparseBasicBlock(spconv.SparseModule):
AttributeError: module 'spconv' has no attribute 'SparseModule'
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/opt/conda/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/opt/conda/lib/python3.7/site-packages/torch/distributed/launch.py", line 261, in <module>
    main()
  File "/opt/conda/lib/python3.7/site-packages/torch/distributed/launch.py", line 257, in main
    cmd=cmd)
subprocess.CalledProcessError: Command '['/opt/conda/bin/python', '-u', 'test.py', '--local_rank=1', '--launcher', 'pytorch', '--cfg_file', '/mnt/tools/cfgs/nuscenes_models/cbgs_voxel0075_voxelnext.yaml', '--ckpt', '/mnt/VoxelNeXt/voxelnext_nuscenes_kernel1.pth']' returned non-zero exit status 1.
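
One possible reading of the AttributeError above: in spconv 2.x the PyTorch-facing classes live under `spconv.pytorch`, so code that ends up with the top-level `spconv` module will not find `SparseModule` there. The sketch below shows a fallback import pattern. The helper `import_first_available` is hypothetical, and whether this matches the actual cause in this setup is an assumption.

```python
import importlib
from types import ModuleType
from typing import Iterable


def import_first_available(names: Iterable[str]) -> ModuleType:
    """Import and return the first module in `names` that can be imported."""
    candidates = tuple(names)
    last_error = None
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError as exc:  # also covers ModuleNotFoundError
            last_error = exc
    raise ImportError(f"none of {candidates} could be imported") from last_error


try:
    # Prefer the 2.x layout, then fall back to the 1.x top-level module.
    spconv = import_first_available(["spconv.pytorch", "spconv"])
    print("using", spconv.__name__,
          "| SparseModule present:", hasattr(spconv, "SparseModule"))
except ImportError as exc:
    print("spconv is not installed:", exc)
```

If the printed module name is plain `spconv` while a 2.x version is installed, the `spconv.pytorch` import itself is failing and is worth running on its own to see the underlying error.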

I tried other versions of spconv (CUDA variants) available in https://github.com/traveller59/spconv

I am getting a different runtime error with them.

bash scripts/dist_test.sh 2 --cfg_file /mnt/tools/cfgs/nuscenes_models/cbgs_voxel0075_voxelnext.yaml --ckpt /mnt/VoxelNeXt/voxelnext_nuscenes_kernel1.pth
+ NGPUS=2
+ PY_ARGS='--cfg_file /mnt/tools/cfgs/nuscenes_models/cbgs_voxel0075_voxelnext.yaml --ckpt /mnt/VoxelNeXt/voxelnext_nuscenes_kernel1.pth'
+ python -m torch.distributed.launch --nproc_per_node=2 test.py --launcher pytorch --cfg_file /mnt/tools/cfgs/nuscenes_models/cbgs_voxel0075_voxelnext.yaml --ckpt /mnt/VoxelNeXt/voxelnext_nuscenes_kernel1.pth
*****************************************
Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. 
*****************************************
Traceback (most recent call last):
  File "test.py", line 14, in <module>
    from eval_utils import eval_utils
  File "/mnt/VoxelNeXt/tools/eval_utils/eval_utils.py", line 8, in <module>
    from pcdet.models import load_data_to_gpu
  File "../pcdet/models/__init__.py", line 6, in <module>
    from .detectors import build_detector
  File "../pcdet/models/detectors/__init__.py", line 12, in <module>
    from .mppnet import MPPNet
  File "../pcdet/models/detectors/mppnet.py", line 9, in <module>
    from pcdet.datasets.augmentor import augmentor_utils, database_sampler
  File "../pcdet/datasets/__init__.py", line 15, in <module>
    from .argo2.argo2_dataset import Argo2Dataset
  File "../pcdet/datasets/argo2/argo2_dataset.py", line 11, in <module>
    from .argo2_utils.so3 import yaw_to_quat
  File "../pcdet/datasets/argo2/argo2_utils/so3.py", line 10, in <module>
    def quat_to_mat(quat_wxyz: Tensor) -> Tensor:
  File "/opt/conda/lib/python3.7/site-packages/torch/jit/__init__.py", line 1550, in script
    fn = torch._C._jit_script_compile(qualified_name, ast, _rcb, get_default_args(obj))
  File "/opt/conda/lib/python3.7/site-packages/torch/jit/_recursive.py", line 583, in try_compile_fn
    return torch.jit.script(fn, _rcb=rcb)
  File "/opt/conda/lib/python3.7/site-packages/torch/jit/__init__.py", line 1550, in script
    fn = torch._C._jit_script_compile(qualified_name, ast, _rcb, get_default_args(obj))
RuntimeError: 
python value of type 'QuaternionCoeffOrder' cannot be used as a value:
  File "/opt/conda/lib/python3.7/site-packages/kornia/geometry/conversions.py", line 492
            raise ValueError(f"order must be one of {QuaternionCoeffOrder.__members__.keys()}")

    if order == QuaternionCoeffOrder.XYZW:
                ~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
        warnings.warn(
            "`XYZW` quaternion coefficient order is deprecated and"
'quaternion_to_rotation_matrix' is being compiled since it was called from 'quat_to_mat'
  File "../pcdet/datasets/argo2/argo2_utils/so3.py", line 19
        (...,3,3) 3D rotation matrices.
    """
    return C.quaternion_to_rotation_matrix(
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~...  <--- HERE
        quat_wxyz, order=C.QuaternionCoeffOrder.WXYZ
    )

Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/opt/conda/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/opt/conda/lib/python3.7/site-packages/torch/distributed/launch.py", line 261, in <module>
    main()
  File "/opt/conda/lib/python3.7/site-packages/torch/distributed/launch.py", line 257, in main
    cmd=cmd)
subprocess.CalledProcessError: Command '['/opt/conda/bin/python', '-u', 'test.py', '--local_rank=1', '--launcher', 'pytorch', '--cfg_file', '/mnt/tools/cfgs/nuscenes_models/cbgs_voxel0075_voxelnext.yaml', '--ckpt', '/mnt/VoxelNeXt/voxelnext_nuscenes_kernel1.pth']' returned non-zero exit status 1.
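
For context on what the failing call computes: `quat_to_mat` in so3.py converts quaternions in WXYZ order to 3x3 rotation matrices through kornia, and the TorchScript compile of that kornia function is what breaks here. The sketch below shows the same conversion in plain Python for a single quaternion, using the standard unit-quaternion formula. It is an illustration only, not the pcdet or kornia implementation, and the name `quat_wxyz_to_mat` is hypothetical.

```python
import math
from typing import List, Sequence


def quat_wxyz_to_mat(q: Sequence[float]) -> List[List[float]]:
    """Convert one quaternion (w, x, y, z) to a 3x3 rotation matrix."""
    w, x, y, z = q
    norm = math.sqrt(w * w + x * x + y * y + z * z)
    if norm == 0.0:
        raise ValueError("zero-length quaternion")
    # Normalize so the result is a proper rotation matrix.
    w, x, y, z = w / norm, x / norm, y / norm, z / norm
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


# A 90-degree rotation about the z axis maps the x axis onto the y axis.
half = math.sqrt(0.5)
for row in quat_wxyz_to_mat([half, 0.0, 0.0, half]):
    print([round(v, 6) for v in row])
```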

yukang2017 commented on July 26, 2024

Hi,

I think this is a kornia problem. Could you please check its version?

Regards,
Yukang Chen

naren2cmu commented on July 26, 2024

Hi,

It was 0.5.7.

I uninstalled it and installed the latest one (0.6.11) using pip install kornia and reran the evaluation command. I got the following error (test.py failed).

 bash scripts/dist_test.sh 2 --cfg_file /mnt/tools/cfgs/nuscenes_models/cbgs_voxel0075_voxelnext.yaml --ckpt /mnt/VoxelNeXt/voxelnext_nuscenes_kernel1.pth
+ NGPUS=2
+ PY_ARGS='--cfg_file /mnt/tools/cfgs/nuscenes_models/cbgs_voxel0075_voxelnext.yaml --ckpt /mnt/VoxelNeXt/voxelnext_nuscenes_kernel1.pth'
+ python -m torch.distributed.launch --nproc_per_node=2 test.py --launcher pytorch --cfg_file /mnt/tools/cfgs/nuscenes_models/cbgs_voxel0075_voxelnext.yaml --ckpt /mnt/VoxelNeXt/voxelnext_nuscenes_kernel1.pth
/opt/conda/lib/python3.7/site-packages/torch/distributed/launch.py:188: FutureWarning: The module torch.distributed.launch is deprecated
and will be removed in future. Use torchrun.
Note that --use_env is set by default in torchrun.
If your script expects `--local_rank` argument to be set, please
change it to read from `os.environ['LOCAL_RANK']` instead. See 
https://pytorch.org/docs/stable/distributed.html#launch-utility for 
further instructions

  FutureWarning,
WARNING:torch.distributed.run:
*****************************************
Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. 
*****************************************
Traceback (most recent call last):
  File "test.py", line 14, in <module>
    from eval_utils import eval_utils
  File "/mnt/VoxelNeXt/tools/eval_utils/eval_utils.py", line 8, in <module>
    from pcdet.models import load_data_to_gpu
  File "../pcdet/models/__init__.py", line 6, in <module>
    from .detectors import build_detector
  File "../pcdet/models/detectors/__init__.py", line 1, in <module>
    from .detector3d_template import Detector3DTemplate
  File "../pcdet/models/detectors/detector3d_template.py", line 6, in <module>
    from ...ops.iou3d_nms import iou3d_nms_utils
  File "../pcdet/ops/iou3d_nms/iou3d_nms_utils.py", line 9, in <module>
    from . import iou3d_nms_cuda
ImportError: ../pcdet/ops/iou3d_nms/iou3d_nms_cuda.cpython-37m-x86_64-linux-gnu.so: undefined symbol: _ZN6caffe28TypeMeta21_typeMetaDataInstanceIdEEPKNS_6detail12TypeMetaDataEv
Traceback (most recent call last):
  File "test.py", line 14, in <module>
    from eval_utils import eval_utils
  File "/mnt/VoxelNeXt/tools/eval_utils/eval_utils.py", line 8, in <module>
    from pcdet.models import load_data_to_gpu
  File "../pcdet/models/__init__.py", line 6, in <module>
    from .detectors import build_detector
  File "../pcdet/models/detectors/__init__.py", line 1, in <module>
    from .detector3d_template import Detector3DTemplate
  File "../pcdet/models/detectors/detector3d_template.py", line 6, in <module>
    from ...ops.iou3d_nms import iou3d_nms_utils
  File "../pcdet/ops/iou3d_nms/iou3d_nms_utils.py", line 9, in <module>
    from . import iou3d_nms_cuda
ImportError: ../pcdet/ops/iou3d_nms/iou3d_nms_cuda.cpython-37m-x86_64-linux-gnu.so: undefined symbol: _ZN6caffe28TypeMeta21_typeMetaDataInstanceIdEEPKNS_6detail12TypeMetaDataEv
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 114) of binary: /opt/conda/bin/python
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/opt/conda/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/opt/conda/lib/python3.7/site-packages/torch/distributed/launch.py", line 195, in <module>
    main()
  File "/opt/conda/lib/python3.7/site-packages/torch/distributed/launch.py", line 191, in main
    launch(args)
  File "/opt/conda/lib/python3.7/site-packages/torch/distributed/launch.py", line 176, in launch
    run(args)
  File "/opt/conda/lib/python3.7/site-packages/torch/distributed/run.py", line 756, in run
    )(*cmd_args)
  File "/opt/conda/lib/python3.7/site-packages/torch/distributed/launcher/api.py", line 132, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/opt/conda/lib/python3.7/site-packages/torch/distributed/launcher/api.py", line 248, in launch_agent
    failures=result.failures,
torch.distributed.elastic.multiprocessing.errors.ChildFailedError: 
============================================================
test.py FAILED
------------------------------------------------------------
Failures:
[1]:
  time      : 2023-04-18_19:27:29
  host      : 97863fda069f
  rank      : 1 (local_rank: 1)
  exitcode  : 1 (pid: 115)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2023-04-18_19:27:29
  host      : 97863fda069f
  rank      : 0 (local_rank: 0)
  exitcode  : 1 (pid: 114)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================
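
An undefined-symbol ImportError from a compiled extension usually means the .so file was built against a different PyTorch than the one now installed. The launcher output changed between runs (torch.distributed.run and the FutureWarning appear only in this last log), which suggests that reinstalling kornia also pulled in a different PyTorch build. That is an inference from the logs, not something the thread confirms. A minimal sketch for locating the previously compiled extensions so they can be removed before rebuilding follows. The helper name `find_compiled_extensions` is hypothetical, and the `../pcdet/ops` path is taken from the traceback.

```python
from pathlib import Path
from typing import List


def find_compiled_extensions(root: str) -> List[Path]:
    """Return all compiled extension files (*.so) below `root`, sorted by path."""
    base = Path(root)
    if not base.is_dir():
        return []
    return sorted(p for p in base.rglob("*.so") if p.is_file())


# Path relative to the tools/ directory, as it appears in the traceback.
for path in find_compiled_extensions("../pcdet/ops"):
    print(path)
```

After deleting the listed files (and any build/ directory), rerunning python setup.py develop from the repository root would recompile them against the currently installed PyTorch.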

Do you, by any chance, have an alternate docker image that you may be using (with all the dependencies properly satisfied)?

Thank you
Naren

yukang2017 commented on July 26, 2024

Hi,

Have you installed OpenPCDet?

python3 setup.py develop

Sorry. I did not use docker.

Regards,
Yukang Chen

naren2cmu commented on July 26, 2024

Yes, I ran python3 setup.py develop.

I was using the Docker image you specified at https://github.com/dvlab-research/VoxelNeXt/tree/master/docker
(docker pull djiajun1206/pcdet:pytorch1.6). I guess that image has problems, then.

I tried to build the docker image as well. (docker build ./ -t openpcdet-docker).

That failed too with the following error.

W: GPG error: https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64  InRelease: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY A4B469963BF863CC
E: The repository 'https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64  InRelease' is not signed.
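
This NO_PUBKEY error is commonly seen with Dockerfiles written before NVIDIA rotated the signing key of its CUDA apt repositories in 2022, in which case the new public key has to be added to the image before apt-get update. Whether that applies to this Dockerfile is an assumption. The sketch below only extracts the missing key IDs from apt output so they can be compared with the key NVIDIA currently publishes. The helper name `missing_pubkeys` is hypothetical.

```python
import re
from typing import List


def missing_pubkeys(apt_output: str) -> List[str]:
    """Return the unique 16-hex-digit key IDs reported as NO_PUBKEY."""
    return sorted(set(re.findall(r"NO_PUBKEY\s+([0-9A-Fa-f]{16})", apt_output)))


# The warning line quoted above.
log = (
    "W: GPG error: https://developer.download.nvidia.com/compute/cuda/repos/"
    "ubuntu1804/x86_64  InRelease: The following signatures couldn't be verified "
    "because the public key is not available: NO_PUBKEY A4B469963BF863CC"
)
print(missing_pubkeys(log))
```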

yukang2017 commented on July 26, 2024

Hi, what is the output of python setup.py develop?

It seems that the problem is still in the OpenPCDet installation.

AmrinKareem commented on July 26, 2024

I have problems installing OpenPCDet too. @yukang2017 I get the following error when running python3 setup.py develop:

ERROR: No supported gcc/g++ host compiler found.
Use 'nvcc -ccbin ' to specify a host compiler.

I have kornia 0.7.0, spconv-117 and pytorch with cuda 11.7 installed.
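
A first diagnostic step for the nvcc message is to check which compilers are actually visible on PATH in the environment where setup.py runs. This is a minimal sketch; the helper name `find_host_compilers` is hypothetical. It does not check whether the gcc version found is one that the installed CUDA toolkit accepts, which is a separate constraint.

```python
import shutil
from typing import Dict, Optional, Sequence


def find_host_compilers(
    names: Sequence[str] = ("gcc", "g++", "nvcc"),
) -> Dict[str, Optional[str]]:
    """Map each tool name to its resolved path on PATH, or None if not found."""
    return {name: shutil.which(name) for name in names}


for tool, location in find_host_compilers().items():
    print(f"{tool}: {location or 'not found on PATH'}")
```

If gcc or g++ is reported as not found, installing a compiler toolchain or pointing nvcc at one with -ccbin, as the error message suggests, would be the next thing to try.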
