nvidia / minkowskiengine

Minkowski Engine is an auto-diff neural network library for high-dimensional sparse tensors

Home Page: https://nvidia.github.io/MinkowskiEngine

License: Other

Python 36.74% Makefile 0.39% C++ 30.42% Cuda 32.34% Shell 0.04% Dockerfile 0.07%

neural-network computer-vision sparse-tensors convolutional-neural-networks semantic-segmentation auto-differentiation spatio-temporal-analysis space-time deep-learning 3d-convolutional-network

minkowskiengine's Introduction

Minkowski Engine

The Minkowski Engine is an auto-differentiation library for sparse tensors. It supports all standard neural network layers such as convolution, pooling, unpooling, and broadcasting operations for sparse tensors. For more information, please visit the documentation page.

News

2021-08-11 Docker installation instruction added
2021-08-06 All installation errors with pytorch 1.8 and 1.9 have been resolved.
2021-04-08 Due to recent errors in pytorch 1.8 + CUDA 11, it is recommended to use anaconda for installation.
2020-12-24 v0.5 is now available! The new version provides CUDA accelerations for all coordinate management functions.

Example Networks

The Minkowski Engine supports various functions that can be built on a sparse tensor. We list a few popular network architectures and applications here. To run the examples, please install the package and run the command in the package root directory.

Examples	Networks and Commands
Semantic Segmentation	`python -m examples.indoor`
Classification	`python -m examples.classification_modelnet40`
Reconstruction	`python -m examples.reconstruction`
Completion	`python -m examples.completion`
Detection

Sparse Tensor Networks: Neural Networks for Spatially Sparse Tensors

Compressing a neural network to speed up inference and minimize memory footprint has been studied widely. One of the popular techniques for model compression is pruning the weights in convnets, which is also known as sparse convolutional networks. Such parameter-space sparsity used for model compression compresses networks that operate on dense tensors, and all intermediate activations of these networks are also dense tensors.

However, in this work, we focus on spatially sparse data, in particular, spatially sparse high-dimensional inputs and 3D data, and convolution on the surface of 3D objects, first proposed in Siggraph'17. We can also represent these data as sparse tensors, and these sparse tensors are commonplace in high-dimensional problems such as 3D perception, registration, and statistical data. We define neural networks specialized for these inputs as sparse tensor networks, and these sparse tensor networks process and generate sparse tensors as outputs. To construct a sparse tensor network, we build all standard neural network layers such as MLPs, non-linearities, convolution, normalizations, and pooling operations in the same way we define them on a dense tensor, and implement them in the Minkowski Engine.

Below, we visualize one sparse tensor network operation, convolution, on a sparse tensor. The convolution layer on a sparse tensor works similarly to that on a dense tensor. However, on a sparse tensor, we compute convolution outputs only on a few specified points, which we can control in the generalized convolution. For more information, please visit the documentation page on sparse tensor networks and the terminology page.

Dense Tensor	Sparse Tensor
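
To make the idea of computing outputs only on specified points concrete, here is a minimal pure-Python sketch of a 1D convolution over a sparse signal. It illustrates the concept only and is not how the Minkowski Engine implements convolution; the function name `sparse_conv1d`, the dictionary-based layout, and the scalar features are assumptions made for brevity.

```python
def sparse_conv1d(features, kernel, out_coords=None):
    """Convolve a 1D sparse signal stored as {coordinate: value}.

    kernel maps an integer offset to a weight. Outputs are computed only at
    out_coords (defaulting to the input coordinates), and only input sites
    that actually exist contribute to each sum.
    """
    if out_coords is None:
        out_coords = features.keys()
    out = {}
    for u in out_coords:
        total = 0.0
        for offset, weight in kernel.items():
            value = features.get(u + offset)
            if value is not None:
                total += weight * value
        out[u] = total
    return out


signal = {0: 1.0, 1: 2.0, 5: 3.0}           # three non-zero sites
kernel = {-1: 0.5, 0: 1.0, 1: 0.5}          # a kernel of size 3
same_sites = sparse_conv1d(signal, kernel)  # outputs on the input sites
custom = sparse_conv1d(signal, kernel, out_coords=[4, 6])  # chosen outputs
```

Passing `out_coords` mirrors the point made above: the caller decides where outputs are produced, and empty neighbors are skipped instead of being treated as stored zeros.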

Features

Unlimited high-dimensional sparse tensor support
All standard neural network layers (Convolution, Pooling, Broadcast, etc.)
Dynamic computation graph
Custom kernel shapes
Multi-GPU training
Multi-threaded kernel map
Multi-threaded compilation
Highly-optimized GPU kernels

Requirements

Ubuntu >= 14.04
CUDA >= 10.1.243 and the same CUDA version used for pytorch (e.g. if you use conda cudatoolkit=11.1, use CUDA=11.1 for MinkowskiEngine compilation)
pytorch >= 1.7 (To specify the CUDA version, please use conda for installation. The CUDA version pytorch uses must match the CUDA version used for the Minkowski Engine installation, e.g. `conda install -y -c nvidia -c pytorch pytorch=1.8.1 cudatoolkit=10.2`)
python >= 3.6
ninja (for installation)
GCC >= 7.4.0

Installation

You can install the Minkowski Engine with pip, with anaconda, or on the system directly. If you experience issues installing the package, please check out the installation wiki page. If you cannot find a relevant problem, please report the issue on the github issue page.

PIP installation
Conda installation
Python installation
Docker installation

Pip

The MinkowskiEngine is distributed via PyPI MinkowskiEngine which can be installed simply with pip. First, install pytorch following the instruction. Next, install openblas.

sudo apt install build-essential python3-dev libopenblas-dev
pip install torch ninja
pip install -U MinkowskiEngine --install-option="--blas=openblas" -v --no-deps

# For pip installation from the latest source
# pip install -U git+https://github.com/NVIDIA/MinkowskiEngine --no-deps

If you want to specify arguments for the setup script, please refer to the following command.

# Uncomment some options if things don't work
# export CXX=c++; # set this if you want to use a different C++ compiler
# export CUDA_HOME=/usr/local/cuda-11.1; # or select the correct cuda version on your system.
pip install -U git+https://github.com/NVIDIA/MinkowskiEngine -v --no-deps \
#                           \ # uncomment the following line if you want to force cuda installation
#                           --install-option="--force_cuda" \
#                           \ # uncomment the following line if you want to force no cuda installation. force_cuda supersedes cpu_only
#                           --install-option="--cpu_only" \
#                           \ # uncomment the following line to override to openblas, atlas, mkl, blas
#                           --install-option="--blas=openblas" \

Anaconda

MinkowskiEngine supports both CUDA 10.2 and CUDA 11.1, which work with most recent pytorch versions.

CUDA 10.2

We recommend python>=3.6 for installation. First, follow the anaconda documentation to install anaconda on your computer.

sudo apt install g++-7  # For CUDA 10.2, must use GCC < 8
# Make sure `g++-7 --version` is at least 7.4.0
conda create -n py3-mink python=3.8
conda activate py3-mink

conda install openblas-devel -c anaconda
conda install pytorch=1.9.0 torchvision cudatoolkit=10.2 -c pytorch -c nvidia

# Install MinkowskiEngine
export CXX=g++-7
# Uncomment the following line to specify the cuda home. Make sure `$CUDA_HOME/bin/nvcc --version` is 10.2
# export CUDA_HOME=/usr/local/cuda-10.2
pip install -U git+https://github.com/NVIDIA/MinkowskiEngine -v --no-deps --install-option="--blas_include_dirs=${CONDA_PREFIX}/include" --install-option="--blas=openblas"

# Or if you want local MinkowskiEngine
git clone https://github.com/NVIDIA/MinkowskiEngine.git
cd MinkowskiEngine
export CXX=g++-7
python setup.py install --blas_include_dirs=${CONDA_PREFIX}/include --blas=openblas

CUDA 11.X

We recommend python>=3.6 for installation. First, follow the anaconda documentation to install anaconda on your computer.

conda create -n py3-mink python=3.8
conda activate py3-mink

conda install openblas-devel -c anaconda
conda install pytorch=1.9.0 torchvision cudatoolkit=11.1 -c pytorch -c nvidia

# Install MinkowskiEngine

# Uncomment the following line to specify the cuda home. Make sure `$CUDA_HOME/bin/nvcc --version` is 11.X
# export CUDA_HOME=/usr/local/cuda-11.1
pip install -U git+https://github.com/NVIDIA/MinkowskiEngine -v --no-deps --install-option="--blas_include_dirs=${CONDA_PREFIX}/include" --install-option="--blas=openblas"

# Or if you want local MinkowskiEngine
git clone https://github.com/NVIDIA/MinkowskiEngine.git
cd MinkowskiEngine
python setup.py install --blas_include_dirs=${CONDA_PREFIX}/include --blas=openblas

System Python

Like the anaconda installation, make sure that you install pytorch with the same CUDA version that nvcc uses.

# install system requirements
sudo apt install build-essential python3-dev libopenblas-dev

# Skip if you already have pip installed on your python3
curl https://bootstrap.pypa.io/get-pip.py | python3

# Get pip and install python requirements
python3 -m pip install torch numpy ninja

git clone https://github.com/NVIDIA/MinkowskiEngine.git

cd MinkowskiEngine

python setup.py install
# To specify blas, CXX, CUDA_HOME and force CUDA installation, use the following command
# export CXX=c++; export CUDA_HOME=/usr/local/cuda-11.1; python setup.py install --blas=openblas --force_cuda

Docker

git clone https://github.com/NVIDIA/MinkowskiEngine
cd MinkowskiEngine
docker build -t minkowski_engine docker

Once the Docker image is built, check that it loads MinkowskiEngine correctly. Use the image tag given to `docker build` above.

docker run minkowski_engine python3 -c "import MinkowskiEngine; print(MinkowskiEngine.__version__)"

CPU only build and BLAS configuration (MKL)

The Minkowski Engine supports a CPU-only build on platforms that do not have NVIDIA GPUs. Please refer to quick start for more details.

Quick Start

To use the Minkowski Engine, first import the engine, then define the network. If your data is not quantized, you need to voxelize or quantize the (spatial) data into a sparse tensor. Fortunately, the Minkowski Engine provides a quantization function (MinkowskiEngine.utils.sparse_quantize).
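
As a rough illustration of what quantization means here, the following pure-Python sketch maps continuous points to integer voxel coordinates and keeps one point per voxel. It is not the implementation of `MinkowskiEngine.utils.sparse_quantize`, whose collision handling and return values may differ; the helper name `quantize_points` and the keep-the-first-point rule are assumptions.

```python
import math


def quantize_points(points, voxel_size):
    """Map continuous points to integer voxel coordinates, one point per voxel.

    Returns (voxel_coords, kept_indices); kept_indices[i] is the index of the
    first input point that fell into voxel_coords[i].
    """
    seen = {}
    for index, point in enumerate(points):
        voxel = tuple(math.floor(c / voxel_size) for c in point)
        seen.setdefault(voxel, index)  # keep the first point in each voxel
    return list(seen.keys()), list(seen.values())


points = [(0.02, 0.01), (0.03, 0.04), (0.26, 0.01), (-0.01, 0.0)]
coords, kept = quantize_points(points, voxel_size=0.05)
# The first two points share voxel (0, 0), so only index 0 survives for it.
```

The kept indices can then be used to select the matching feature rows before building a sparse tensor.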

Creating a Network

import torch.nn as nn
import MinkowskiEngine as ME

class ExampleNetwork(ME.MinkowskiNetwork):

    def __init__(self, in_feat, out_feat, D):
        super(ExampleNetwork, self).__init__(D)
        self.conv1 = nn.Sequential(
            ME.MinkowskiConvolution(
                in_channels=in_feat,
                out_channels=64,
                kernel_size=3,
                stride=2,
                dilation=1,
                bias=False,
                dimension=D),
            ME.MinkowskiBatchNorm(64),
            ME.MinkowskiReLU())
        self.conv2 = nn.Sequential(
            ME.MinkowskiConvolution(
                in_channels=64,
                out_channels=128,
                kernel_size=3,
                stride=2,
                dimension=D),
            ME.MinkowskiBatchNorm(128),
            ME.MinkowskiReLU())
        self.pooling = ME.MinkowskiGlobalPooling()
        self.linear = ME.MinkowskiLinear(128, out_feat)

    def forward(self, x):
        out = self.conv1(x)
        out = self.conv2(out)
        out = self.pooling(out)
        return self.linear(out)

Forward and backward using the custom network

    # loss and network
    criterion = nn.CrossEntropyLoss()
    net = ExampleNetwork(in_feat=3, out_feat=5, D=2)
    print(net)

    # a data loader must return a tuple of coords, features, and labels.
    coords, feat, label = data_loader()
    input = ME.SparseTensor(feat, coordinates=coords)
    # Forward
    output = net(input)

    # Loss
    loss = criterion(output.F, label)

    # Backward
    loss.backward()

Discussion and Documentation

For discussion and questions, please use [email protected]. For API and general usage, please refer to the MinkowskiEngine documentation page for more detail.

For issues not listed on the API and feature requests, feel free to submit an issue on the github issue page.

Known Issues

Specifying CUDA architecture list

In some cases, you need to explicitly specify which compute capability your GPU uses. The default list might not contain your architecture.

export TORCH_CUDA_ARCH_LIST="5.2 6.0 6.1 7.0 7.5 8.0 8.6+PTX"; python setup.py install --force_cuda
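
If you are unsure which values to put in the list, one option is to build the string from the compute capabilities of the GPUs you actually have. The sketch below only does the formatting; the helper name `cuda_arch_list` is hypothetical, and the capability values are made-up examples. With pytorch installed, `torch.cuda.get_device_capability` returns a `(major, minor)` tuple per device.

```python
def cuda_arch_list(capabilities, add_ptx=True):
    """Format (major, minor) compute capabilities for TORCH_CUDA_ARCH_LIST."""
    unique = sorted(set(capabilities))
    archs = [f"{major}.{minor}" for major, minor in unique]
    if add_ptx and archs:
        archs[-1] += "+PTX"  # emit PTX for the newest architecture only
    return " ".join(archs)


# With pytorch installed, the capabilities could be gathered with:
#   caps = [torch.cuda.get_device_capability(i)
#           for i in range(torch.cuda.device_count())]
caps = [(7, 5), (8, 6), (7, 5)]  # hypothetical example values
arch_string = cuda_arch_list(caps)
```

The resulting string can be exported as `TORCH_CUDA_ARCH_LIST` before running the install command.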

Unhandled Out-Of-Memory thrust::system exception

There is a known issue in thrust with CUDA 10 that leads to an unhandled thrust exception. Please refer to the issue for detail.

Too much GPU memory usage or Frequent Out of Memory

There are a few causes for this error.

Out of memory during a long running training

MinkowskiEngine is a specialized library that can handle a different number of points, or a different number of non-zero elements, at every iteration during training, which is common in point cloud data. However, pytorch is implemented assuming that the number of points, or the size of the activations, does not change between iterations. Thus, the GPU memory caching used by pytorch can result in unnecessarily large memory consumption.

Specifically, pytorch caches chunks of memory to speed up the allocation performed in every tensor creation. If it fails to find a cached chunk of the requested size, it splits an existing cached chunk, or allocates new space if no cached chunk is large enough. Thus, every time the number of points (number of non-zero elements) changes, pytorch either splits an existing cached chunk or reserves new memory. If the cache becomes too fragmented and occupies all GPU memory, pytorch raises an out-of-memory error.

To prevent this, you must clear the cache at regular intervals with torch.cuda.empty_cache().
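
A minimal sketch of "at regular intervals" is shown below. The helper `every_n_steps` is a hypothetical name, and the interval of 100 in the comment is an arbitrary assumption to tune for your workload. The helper takes the action as an argument so that the example runs without a GPU.

```python
def every_n_steps(interval, action):
    """Return a function that runs `action` on every `interval`-th call."""
    if interval < 1:
        raise ValueError("interval must be a positive integer")
    state = {"calls": 0}

    def step():
        state["calls"] += 1
        if state["calls"] % interval == 0:
            action()
            return True
        return False

    return step


# In a real training loop the action would be torch.cuda.empty_cache:
#   clear_cache = every_n_steps(100, torch.cuda.empty_cache)
#   for batch in loader:
#       ...forward, backward, optimizer step...
#       clear_cache()
calls = []
clear_cache = every_n_steps(3, lambda: calls.append("cleared"))
fired = [clear_cache() for _ in range(7)]
```

Clearing the cache forces later allocations to go back to the CUDA allocator, so a very small interval is likely to slow training down. The interval trades memory headroom against speed.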

CUDA 11.1 Installation

wget https://developer.download.nvidia.com/compute/cuda/11.1.1/local_installers/cuda_11.1.1_455.32.00_linux.run
sudo sh cuda_11.1.1_455.32.00_linux.run --toolkit --silent --override

# Install MinkowskiEngine with CUDA 11.1
export CUDA_HOME=/usr/local/cuda-11.1; pip install MinkowskiEngine -v --no-deps

Running the MinkowskiEngine on nodes with a large number of CPUs

The MinkowskiEngine uses OpenMP to parallelize the kernel map generation. However, when the number of threads used for parallelization is too large (e.g. OMP_NUM_THREADS=80), the efficiency drops rapidly as all threads simply wait for multithread locks to be released. In such cases, set the number of threads used for OpenMP. Usually, any number below 24 would be fine, but search for the optimal setup on your system.

export OMP_NUM_THREADS=<number of threads to use>; python <your_program.py>
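
The same limit can be applied from inside a Python entry point, as sketched below. The helper name `pick_omp_threads` is hypothetical, and the cap of 24 simply follows the rule of thumb above; it is not a tuned value. The variable generally needs to be set before any library that starts an OpenMP runtime is imported.

```python
import os


def pick_omp_threads(cpu_count, cap=24):
    """Choose a thread count: all cores on small machines, capped on large ones."""
    return max(1, min(cpu_count or 1, cap))


# setdefault keeps any value that was already exported in the shell.
os.environ.setdefault("OMP_NUM_THREADS", str(pick_omp_threads(os.cpu_count())))
```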

Citing Minkowski Engine

If you use the Minkowski Engine, please cite:

4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks, CVPR'19, [pdf]

@inproceedings{choy20194d,
  title={4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks},
  author={Choy, Christopher and Gwak, JunYoung and Savarese, Silvio},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  pages={3075--3084},
  year={2019}
}

For multi-threaded kernel map generation, please cite:

@inproceedings{choy2019fully,
  title={Fully Convolutional Geometric Features},
  author={Choy, Christopher and Park, Jaesik and Koltun, Vladlen},
  booktitle={Proceedings of the IEEE International Conference on Computer Vision},
  pages={8958--8966},
  year={2019}
}

For strided pooling layers for high-dimensional convolutions, please cite:

@inproceedings{choy2020high,
  title={High-dimensional Convolutional Networks for Geometric Pattern Recognition},
  author={Choy, Christopher and Lee, Junha and Ranftl, Rene and Park, Jaesik and Koltun, Vladlen},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  year={2020}
}

For generative transposed convolution, please cite:

@inproceedings{gwak2020gsdn,
  title={Generative Sparse Detection Networks for 3D Single-shot Object Detection},
  author={Gwak, JunYoung and Choy, Christopher B and Savarese, Silvio},
  booktitle={European conference on computer vision},
  year={2020}
}

Unittest

For unittests and gradcheck, use torch >= 1.7

Projects using Minkowski Engine

Please feel free to update the wiki page to add your projects!

Projects using MinkowskiEngine
Segmentation: 3D and 4D Spatio-Temporal Semantic Segmentation, CVPR'19
Representation Learning: Fully Convolutional Geometric Features, ICCV'19
3D Registration: Learning multiview 3D point cloud registration, CVPR'20
3D Registration: Deep Global Registration, CVPR'20
Pattern Recognition: High-Dimensional Convolutional Networks for Geometric Pattern Recognition, CVPR'20
Detection: Generative Sparse Detection Networks for 3D Single-shot Object Detection, ECCV'20
Image matching: Sparse Neighbourhood Consensus Networks, ECCV'20

minkowskiengine's Issues

error when running example.py

Hi, all,

I had followed the installation steps (conda virtual environment), and installed MinkowskiEngine successfully. However, when I run python examples/example.py, I got the following error:

Traceback (most recent call last):
  File "examples/example.py", line 5, in <module>
    import MinkowskiEngine as ME
  File "/root/anaconda3/envs/pytorch1.1/lib/python3.6/site-packages/MinkowskiEngine-0.2.4-py3.6-linux-x86_64.egg/MinkowskiEngine/__init__.py", line 12, in <module>
    from SparseTensor import SparseTensor
  File "/root/anaconda3/envs/pytorch1.1/lib/python3.6/site-packages/MinkowskiEngine-0.2.4-py3.6-linux-x86_64.egg/MinkowskiEngine/SparseTensor.py", line 6, in <module>
    from MinkowskiCoords import CoordsKey, CoordsManager
  File "/root/anaconda3/envs/pytorch1.1/lib/python3.6/site-packages/MinkowskiEngine-0.2.4-py3.6-linux-x86_64.egg/MinkowskiEngine/MinkowskiCoords.py", line 4, in <module>
    import MinkowskiEngineBackend as MEB
ImportError: /root/anaconda3/envs/pytorch1.1/lib/python3.6/site-packages/MinkowskiEngine-0.2.4-py3.6-linux-x86_64.egg/MinkowskiEngineBackend.cpython-36m-x86_64-linux-gnu.so: undefined symbol: cblas_dgemm

It seems something wrong or missing related to cblas_dgemm. Could you suggest me how to fix it?

THX!

PS.
my system config:
ubuntu 16.04
python 3.6
pytorch 1.1

Typo in example.py

https://github.com/StanfordVL/MinkowskiEngine/blob/7e116d819e03ad7abddbab48f1436497e2fc5fb4/examples/example.py#L30

I was getting an error at the line, seems like it's supposed to be from common import data_loader?

slow compared to spconv

I build the same model using spconv and MinkowskiEngine.
self.middle_conv = nn.Sequential(
    MinkowskiConvolution(num_input_features, 16, kernel_size=3, dimension=3),
    MinkowskiBatchNorm(16),
    MinkowskiReLU(),
    MinkowskiConvolution(16, 16, kernel_size=3, dimension=3),
    MinkowskiBatchNorm(16),
    MinkowskiReLU(),
    MinkowskiConvolution(16, 32, kernel_size=3, stride=2, dimension=3),
    MinkowskiBatchNorm(32),
    MinkowskiReLU(),
    MinkowskiConvolution(32, 32, kernel_size=3, dimension=3),
    MinkowskiBatchNorm(32),
    MinkowskiReLU(),
    MinkowskiConvolution(32, 32, kernel_size=3, dimension=3),
    MinkowskiBatchNorm(32),
    MinkowskiReLU(),
    MinkowskiConvolution(32, 64, kernel_size=3, stride=2, dimension=3),
    MinkowskiBatchNorm(64),
    MinkowskiReLU(),
    MinkowskiConvolution(64, 64, kernel_size=3, dimension=3),
    MinkowskiBatchNorm(64),
    MinkowskiReLU(),
    MinkowskiConvolution(64, 64, kernel_size=3, dimension=3),
    MinkowskiBatchNorm(64),
    MinkowskiReLU(),
    MinkowskiConvolution(64, 64, kernel_size=3, dimension=3),
    MinkowskiBatchNorm(64),
    MinkowskiReLU(),
    MinkowskiConvolution(64, 64, kernel_size=3, stride=2, dimension=3),
    MinkowskiBatchNorm(64),
    MinkowskiReLU(),
    MinkowskiConvolution(64, 64, kernel_size=3, dimension=3),
    MinkowskiBatchNorm(64),
    MinkowskiReLU(),
    MinkowskiConvolution(64, 64, kernel_size=3, dimension=3),
    MinkowskiBatchNorm(64),
    MinkowskiReLU(),
    MinkowskiConvolution(64, 64, kernel_size=3, dimension=3),
    MinkowskiBatchNorm(64),
    MinkowskiReLU(),
    MinkowskiConvolution(64, 64, kernel_size=(3,1,1), stride=(2,1,1), dimension=3),
    MinkowskiBatchNorm(64),
    MinkowskiReLU(),
)
and measure the time. The spconv one is about 0.2s, however, the minkowskiengine is about 3s. Is that normal?

python setup.py install error

Environment:
OS: Ubuntu 16.04
Cuda version: 9.0.176
CuDNN version: 7.0
Python version: 3.6.8
GCC version: 5.4.0

Error:
running install
running bdist_egg
running egg_info
writing MinkowskiEngine.egg-info/PKG-INFO
writing dependency_links to MinkowskiEngine.egg-info/dependency_links.txt
writing requirements to MinkowskiEngine.egg-info/requires.txt
writing top-level names to MinkowskiEngine.egg-info/top_level.txt
reading manifest file 'MinkowskiEngine.egg-info/SOURCES.txt'
writing manifest file 'MinkowskiEngine.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
running build_ext
building 'MinkowskiEngineBackend' extension
gcc -pthread -B /home/thanhnv/anaconda3/envs/tensorflow_gpu/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I./ -I/home/thanhnv/anaconda3/envs/tensorflow_gpu/include/python3.6m/.. -I/home/thanhnv/.local/lib/python3.6/site-packages/torch/include -I/home/thanhnv/.local/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -I/home/thanhnv/.local/lib/python3.6/site-packages/torch/include/TH -I/home/thanhnv/.local/lib/python3.6/site-packages/torch/include/THC -I/usr/local/cuda/include -I/home/thanhnv/anaconda3/envs/tensorflow_gpu/include/python3.6m -c pybind/minkowski.cpp -o build/temp.linux-x86_64-3.6/pybind/minkowski.o -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=MinkowskiEngineBackend -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
g++ -pthread -shared -B /home/thanhnv/anaconda3/envs/tensorflow_gpu/compiler_compat -L/home/thanhnv/anaconda3/envs/tensorflow_gpu/lib -Wl,-rpath=/home/thanhnv/anaconda3/envs/tensorflow_gpu/lib -Wl,--no-as-needed -Wl,--sysroot=/ build/temp.linux-x86_64-3.6/pybind/minkowski.o -Lobjs -L/usr/local/cuda/lib64 -lminkowski -lopenblas -lcudart -o build/lib.linux-x86_64-3.6/MinkowskiEngineBackend.cpython-36m-x86_64-linux-gnu.so
/home/thanhnv/anaconda3/envs/tensorflow_gpu/compiler_compat/ld: cannot find -lminkowski
collect2: error: ld returned 1 exit status
error: command 'g++' failed with exit status 1
Does anyone know what causes this error and how to fix it?

error when use convolution after generating new coords

Hi, I got an error RuntimeError: src/coords_kernelmaps.hpp:50, assertion (existsCoordsKey(in_coords_key) and existsCoordsKey(out_coords_key)) faild. The coords map doesn't exist for the given coords_key. in_coords_key: 1681692777 , out_coords_key: 592599417076398501 when using convolution after generating new coords.

This can be reproduced by modifying the test code in tests/pruning.py:

conv_tr1 = MinkowskiConvolutionTranspose(
            channels[0],
            channels[1],
            kernel_size=3,
            stride=2,
            generate_new_coords=True,
            dimension=D).double()
conv1 = MinkowskiConvolution(channels[1], channels[1], kernel_size=3, dimension=D).double()


out1 = conv_tr1(input)
out1 = conv1(out1)

Compilation issue.

Hello, thank you for sharing. I followed the instructions but still got the following error:
src/common.hpp:40:10: fatal error: cublas_v2.h: No such file or directory. I have tried updating CUDA_DIR in the Makefile to /usr/local/cuda, but it still shows the same error message.

RuntimeError: an illegal memory access was encountered at src/convolution.cu:259 device number specification and assertion

I can run my code on a GTX1060,
but it doesn't work on a Tesla P100 on the server.
The output:

Traceback (most recent call last):
  File "train.py", line 135, in <module>
    trainer.train(epoch)
  File "train.py", line 56, in train
    output_sparse = self.model(point)
  File "/home/gaoqiyu/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/gaoqiyu/PointCloudSeg_Minkowski/model/minkunet.py", line 122, in forward
    out = self.conv0p1s1(x)
  File "/home/gaoqiyu/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/gaoqiyu/PointCloudSeg_Minkowski/MinkowskiEngine/MinkowskiConvolution.py", line 269, in forward
    out_coords_key, input.coords_man)
  File "/home/gaoqiyu/PointCloudSeg_Minkowski/MinkowskiEngine/MinkowskiConvolution.py", line 91, in forward
    ctx.coords_man.CPPCoordsManager)
RuntimeError: an illegal memory access was encountered at src/convolution.cu:259

Can you provide a pre-trained model on S3DIS dataset?

Can you provide a pre-trained model on S3DIS dataset? Thank you.

multi-gpu training got OOM

I can use batch_size=8 when training on a single GPU, but I get OOM when training on 4 GPUs with batch_size=1. The multi-GPU training code just follows examples/multigpu.py. I am not sure where the problem lies; could you give me some suggestions?

Adaptive AvgPool

Could you support AdaptiveAvgPool similar to nn.AdaptiveAvgPool2d?

TS-CRF

I trained ResUNetIN14 with Adam(lr=0.1, weight_decay=1e-4) on the S3DIS fold1 split,
but it only achieves 30% mIoU.
Then I tested the trained model and found that points on the same object have multiple labels.
I think the reason may be that there is no CRF?
Sorry, I could not find where TS-CRF is implemented in the code.

MinkowskiEngineBackend.cpython-37m-x86_64-linux-gnu.so: undefined symbol: cusparseDcsrmv for Pytorch v1.3

system: Ubuntu18.04.3 LTS 64x
memory: 15.6 GiB
cpu:Intel® Core™ i5-9400F CPU @ 2.90GHz × 6
gpu:GeForce RTX 2060/PCIe/SSE2
cuda: 10.1
python:3.7.4(pyenv)

python3.7/site-packages/MinkowskiEngine-0.2.7a4-py3.7-linux-x86_64.egg/MinkowskiEngineBackend.cpython-37m-x86_64-linux-gnu.so: undefined symbol: cusparseDcsrmv

I tried adding the libcusparse.so path to LD_LIBRARY_PATH, but it doesn't work. I don't know how to deal with it.

how to train in batch

Each input may have different sparsity, so COO representation may have different number of coordinates. How to concatenate them into batches? Shall I allocate the same space for each input?

Thanks for your attention!
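
One common way to batch sparse inputs of different sizes, sketched below in plain Python, is to tag every coordinate with the index of its sample and concatenate everything, so no padding is needed. The helper name `batch_coordinates` is hypothetical, and putting the batch index in the first column is an assumption: the collation code in another issue on this page appends it as the last column instead, so check the convention of your installed version and prefer the library's own collation utilities where available.

```python
def batch_coordinates(coords_list, feats_list):
    """Concatenate per-sample coordinates and features into one batch.

    Each coordinate gets its sample index prepended, so samples with
    different numbers of points share one list without padding.
    """
    batched_coords, batched_feats = [], []
    for batch_index, (coords, feats) in enumerate(zip(coords_list, feats_list)):
        if len(coords) != len(feats):
            raise ValueError("each sample needs one feature row per coordinate")
        batched_coords.extend((batch_index, *c) for c in coords)
        batched_feats.extend(feats)
    return batched_coords, batched_feats


sample_a = ([(0, 0), (1, 2)], [[1.0], [2.0]])  # two points
sample_b = ([(3, 3)], [[5.0]])                 # one point
coords, feats = batch_coordinates(
    [sample_a[0], sample_b[0]], [sample_a[1], sample_b[1]])
```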

Pybind error when compilation and installation.

/root/miniconda3/envs/py3-mink/lib/python3.7/site-packages/torch/include/pybind11/pybind11.h:1404:26: error: no matching function for call to ‘pybind11::cpp_function::cpp_function(pybind11::detail::enum_base::init(bool, bool)::<lambda(pybind11::handle)>, pybind11::is_method)’
}, is_method(m_base)
^

Env:

Python 3.7
CUDA 9.0

>>> import torch
>>> print(torch.__version__)
1.2.0

How to put SparseTensor from GPU back to CPU

C++ version problem

hi, when I compile it on the server, I face another problem.
I think maybe the C++ version problem.
Because usually I compile it with /usr/include/c++/7 and it works, but this time it uses /usr/include/c++/6.
I found there exists both c++/6 and c++/7 in /usr/include on server
here is the output of the terminal
https://drive.google.com/open?id=1ONYSu36mpwCCw95iHAzf5JQlgatVV5sZ

If it is the problem of c++ version, what should I do to modify?

Could you support MSE loss?

Hi,

MSE loss in pytorch cannot be directly applied since your sparse tensor shape utilizes the feature vector's shape. Subtraction might cause dimension mismatching or unexpected broadcasting.

Could you help support this feature? Thanks!

How to Scatter the sparse output to a dense tensor?

Importing Open3D before torch leads to core dump in examples/indoor.py

https://pastebin.com/bKJYeFgT

MinkowskiNet not learning anything on ModelNet40 Classification

I have a labeled 3D point cloud dataset very similar to ScanNet. I only have the 3D points and the intensity value of each point, so I am feeding the intensity as the feature and the 3D points as the coordinates.
Here's the code I am using to run the network:
criterion = nn.CrossEntropyLoss()
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = ResNet14(in_channels=1, out_channels=40, D=3).cuda()
# model = vgg16(input_channel=1, num_classes=40, D=3).cuda()
optimizer = Adam(model.parameters(), lr=1e-1)

trainIterator = data.train_cls()
validIterator = data.valid()
print('#parameters', sum([x.nelement() for x in model.parameters()]))
for epoch in range(10):
    model.train()
    i = 0
    start = time.time()
    for batch in trainIterator:
        optimizer.zero_grad()

        coords = batch[0]
        feat = batch[1]
        label = batch[2].to(device)

        sinput = ME.SparseTensor(feat, coords=coords).to(device)
        predictions = model(sinput)
        loss = criterion(predictions.F, label)
        i += 1
        print('Iteration: ', i, ', Loss: ', loss.item())
        loss.backward()
        optimizer.step()
    torch.save(epoch, 'epoch.pth')
    torch.save(model.state_dict(), 'model.pth')

Here is the data preparation code:
class dataset_valid_cls(torch.utils.data.Dataset):
    def __init__(self):
        l1 = glob.glob('/home/zhang/桌面/SparseConvNet-master/data/val_pts/*.pts')
        self.path = l1
        print(len(self.path))

    def __len__(self):
        return len(self.path)

    def __getitem__(self, ind):
        curpath = self.path[ind]
        data = load_file(curpath, 0.001)
        return data


def load_file(file_name, voxel_size):
    coords = np.loadtxt(file_name)
    feats = np.ones((coords.shape[0], 1))
    labels = np.loadtxt(file_name[:-3] + 'label').astype('int64')
    discrete_coords, unique_feats = ME.utils.sparse_quantize(
        coords=coords,
        feats=feats,
        labels=None,
        hash_type='ravel',
        set_ignore_label_when_collision=False,
        quantization_size=voxel_size)
    return discrete_coords, unique_feats, labels


def collation_fn(data_labels):
    coords, feats, labels = list(zip(*data_labels))
    coords_batch, feats_batch, labels_batch = [], [], []

    for batch_id, _ in enumerate(coords):
        N = coords[batch_id].shape[0]

        coords_batch.append(
            torch.cat((torch.from_numpy(coords[batch_id]).int(),
                       torch.ones(N, 1).int() * batch_id), 1))
        feats_batch.append(torch.from_numpy(feats[batch_id]))
        labels_batch.append(torch.from_numpy(labels[batch_id]))

    # Concatenate all lists
    coords_batch = torch.cat(coords_batch, 0).int()
    feats_batch = torch.cat(feats_batch, 0).float()
    labels_batch = torch.tensor(labels_batch).long()

    return coords_batch, feats_batch, labels_batch


def train_cls():
    return torch.utils.data.DataLoader(dataset_train_cls(),
                                       batch_size=20,
                                       collate_fn=collation_fn,
                                       num_workers=0,
                                       shuffle=True)

MinkowskiNet not learning anything on ModelNet40 Classification

I try to classify ModelNet40 using ResNet, but it seems to learn nothing and the loss can not go down. Here's the code I am using to run the network:

criterion = nn.CrossEntropyLoss()
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = ResNet14(in_channels=1, out_channels=40, D=3).cuda()
# model = vgg16(input_channel=1, num_classes=40, D=3).cuda()
optimizer = Adam(model.parameters(), lr=1e-1)

trainIterator = data.train_cls()
validIterator = data.valid()
print('#parameters', sum([x.nelement() for x in model.parameters()]))
for epoch in range(10):
    model.train()
    i = 0
    start = time.time()
    for batch in trainIterator:
        optimizer.zero_grad()

        coords = batch[0]
        feat = batch[1]
        label = batch[2].to(device)

        sinput = ME.SparseTensor(feat, coords=coords).to(device)
        predictions = model(sinput)
        loss = criterion(predictions.F, label)
        i += 1
        print('Iteration: ', i, ', Loss: ', loss.item())
        loss.backward()
        optimizer.step()
    torch.save(epoch, 'epoch.pth')
    torch.save(model.state_dict(), 'model.pth')

The data preparation code is:

    def train_cls():
        return torch.utils.data.DataLoader(dataset_train_cls(),
                                           batch_size=20,
                                           collate_fn=collation_fn,
                                           num_workers=0,
                                           shuffle=True)


    def collation_fn(data_labels):
        coords, feats, labels = list(zip(*data_labels))
        coords_batch, feats_batch, labels_batch = [], [], []

        for batch_id, _ in enumerate(coords):
            N = coords[batch_id].shape[0]

            coords_batch.append(
                torch.cat((torch.from_numpy(coords[batch_id]).int(),
                           torch.ones(N, 1).int() * batch_id), 1))
            feats_batch.append(torch.from_numpy(feats[batch_id]))
            labels_batch.append(torch.from_numpy(labels[batch_id]))

        # Concatenate all lists
        coords_batch = torch.cat(coords_batch, 0).int()
        feats_batch = torch.cat(feats_batch, 0).float()
        labels_batch = torch.tensor(labels_batch).long()

        return coords_batch, feats_batch, labels_batch


    def load_file(file_name, voxel_size):
        coords = np.loadtxt(file_name)
        feats = np.ones((coords.shape[0], 1))
        labels = np.loadtxt(file_name[:-3] + 'label').astype('int64')
        discrete_coords, unique_feats = ME.utils.sparse_quantize(
            coords=coords,
            feats=feats,
            labels=None,
            hash_type='ravel',
            set_ignore_label_when_collision=False,
            quantization_size=voxel_size)
        return discrete_coords, unique_feats, labels


    class dataset_train_cls(torch.utils.data.Dataset):
        def __init__(self):
            l1 = glob.glob('/home/zhang/桌面/SparseConvNet-master/data/train_pts/*.pts')
            self.path = l1
            print(len(self.path))

        def __len__(self):
            return len(self.path)

        def __getitem__(self, ind):
            curpath = self.path[ind]
            data = load_file(curpath, 0.001)
            return data

The batches during training

Hi. Thanks for releasing the code.

I am wondering whether the network can be trained with multiple batches on a single GPU, instead of n batches on n GPUs.

Looking forward to your reply.

Performance of MultiGPU

What is the performance of multi-GPU training (for example, with distributed data parallel) compared to single-GPU training?

RuntimeError: CUDA error: invalid configuration argument in pytorch batch normalization on GPUs with VRAM>16G

I'm getting this error on a Tesla V100 (CUDA 10.0) card. I don't see it on a 1080 Ti (CUDA 9.0) with the same code.

<ipython-input-10-0312a913726b> in <module>
     17         #batch['y']=batch['y'].cuda()
     18         labels = torch.cat(batch[2]).long().cuda()
---> 19         predictions = unet(input)
     20         loss = criterion(predictions.F, labels)
     21         #loss = criterion(predictions, labels)

~/miniconda3/envs/torch10/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    539             result = self._slow_forward(*input, **kwargs)
    540         else:
--> 541             result = self.forward(*input, **kwargs)
    542         for hook in self._forward_hooks.values():
    543             hook_result = hook(self, input, result)

/chei-ml/plots/fr3/2013/all_labels/examples/minkunet.py in forward(self, x)
    122     def forward(self, x):
    123         out = self.conv0p1s1(x)
--> 124         out = self.bn0(out)
    125         out_p1 = self.relu(out)
    126 

~/miniconda3/envs/torch10/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    539             result = self._slow_forward(*input, **kwargs)
    540         else:
--> 541             result = self.forward(*input, **kwargs)
    542         for hook in self._forward_hooks.values():
    543             hook_result = hook(self, input, result)

~/miniconda3/envs/torch10/lib/python3.7/site-packages/MinkowskiEngine-0.2.8-py3.7-linux-x86_64.egg/MinkowskiEngine/MinkowskiNormalization.py in forward(self, input)
     56 
     57     def forward(self, input):
---> 58         output = self.bn(input.F)
     59         return SparseTensor(
     60             output,

~/miniconda3/envs/torch10/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    539             result = self._slow_forward(*input, **kwargs)
    540         else:
--> 541             result = self.forward(*input, **kwargs)
    542         for hook in self._forward_hooks.values():
    543             hook_result = hook(self, input, result)

~/miniconda3/envs/torch10/lib/python3.7/site-packages/torch/nn/modules/batchnorm.py in forward(self, input)
     70             # TODO: if statement only here to tell the jit to skip emitting this when it is None
     71             if self.num_batches_tracked is not None:
---> 72                 self.num_batches_tracked += 1
     73                 if self.momentum is None:  # use cumulative moving average
     74                     exponential_average_factor = 1.0 / float(self.num_batches_tracked)

RuntimeError: CUDA error: invalid configuration argument

Example code for 4D MinkUNet

Could you provide example code for 4D MinkUNet? Like how to prepare data for 4D computation and other details?

Plan for Tensorflow version?

Hi,

Are you planning to implement these as custom layers for TensorFlow as well?

Best,

scannet training and inference code

Is it possible to make the training and inference code for the ScanNet dataset public? It would be very helpful.

MinkowskiNet not learning anything on my 3D dataset

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
# Define a model and load the weights
model = MinkUNet34C(1, data.N_CLASSES).to(device)

training_epochs=512
optimizer = optim.Adam(model.parameters())

for epoch in range(training_epoch, training_epochs+1):
    model.train()
    start = time.time()
    train_loss=0
    for i,batch in enumerate(data.train_data_loader):
        start2 = time.time()
        # Get new data
        coords = batch['x'][0]
        feat = batch['x'][1]
        label = batch['y']
        inputs = ME.SparseTensor(feat, coords=coords).to(device)
        label = label.to(device)
        
        # Forward
        output = model(inputs)        
        predictions = output.F
        # Measure loss
        loss = torch.nn.functional.cross_entropy(predictions,label)
        train_loss+=loss.item()
        # Back propagate
        loss.backward()
        optimizer.step()
        del loss
        del predictions
        del output
        if i%100 == 0:
            log_string('Batch_processed = %03d / %03d,  time = %f' % ( i, data.len_train//data.batch_size, time.time()-start2))
    log_string('%03d  Train loss %f,  time = %f s' % (epoch, train_loss/(i+1), time.time() - start))
    del train_loss

Not to mention that this is pretty slow. However, even after waiting a few days, it is clear that the network is not learning anything.

Support for float16

Hi,

Does MinkowskiEngine support the float16 (half) type and NVIDIA/apex?

I want to use NVIDIA/apex to reduce memory use, but it seems MinkowskiEngine doesn't support float16 (half) computation. I tried instantiating the convolution function in src/convolution.cu with type half, but it failed to compile.

Could you officially support float16, which (according to NVIDIA) could save nearly half of the GPU memory?

undefined symbol: _ZN3c1019UndefinedTensorImpl10_singletonE

I followed the installation instructions, which went smoothly. Then, when I tried to run example.py, an error was reported. Below is the error message:

Traceback (most recent call last):
  File "examples/example.py", line 28, in <module>
    import MinkowskiEngine as ME
  File "/media/shengjie/other/IndoorSemanticIns/MinkowskiEngine/MinkowskiEngine/__init__.py", line 35, in <module>
    from SparseTensor import SparseTensor
  File "/media/shengjie/other/IndoorSemanticIns/MinkowskiEngine/MinkowskiEngine/SparseTensor.py", line 29, in <module>
    from MinkowskiCoords import CoordsKey, CoordsManager
  File "/media/shengjie/other/IndoorSemanticIns/MinkowskiEngine/MinkowskiEngine/MinkowskiCoords.py", line 27, in <module>
    import MinkowskiEngineBackend as MEB
ImportError: /home/shengjie/anaconda3/envs/py36/lib/python3.6/site-packages/MinkowskiEngine-0.2.4-py3.6-linux-x86_64.egg/MinkowskiEngineB

It's quite similar to issue #1, but slightly different. I would really appreciate it if you could resolve the issue.

GlobalMaxPooling

Hi Chris,

Do you have a plan to support global max pooling?

GPU Memory leak in MinkowskiSumPooling

It seems to learn nothing for classification of ModelNet40 via ResNet

I am trying to classify ModelNet40 with a ResNet, but the network seems to learn nothing and the loss does not go down. Here's the code I am using to run the network:

`-----------training code-----------
criterion = nn.CrossEntropyLoss()

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = ResNet14(in_channels=1, out_channels=40, D=3).cuda()
# model = vgg16(input_channel=1, num_classes=40, D=3).cuda()
optimizer = Adam(model.parameters(), lr=1e-1)

trainIterator = data.train_cls()
validIterator = data.valid()
print('#parameters', sum([x.nelement() for x in model.parameters()]))
for epoch in range(10):
    model.train()
    i = 0
    start = time.time()
    for batch in trainIterator:
        optimizer.zero_grad()

        coords = batch[0]
        feat = batch[1]
        label = batch[2].to(device)

        sinput = ME.SparseTensor(feat, coords=coords).to(device)
        predictions = model(sinput)
        loss = criterion(predictions.F, label)
        i += 1
        print('Iteration: ', i, ', Loss: ', loss.item())
        loss.backward()
        optimizer.step()
    torch.save(epoch, 'epoch.pth')
    torch.save(model.state_dict(), 'model.pth')

Here is the data preparation code:
-----------data preparation code-----------
class dataset_train_cls(torch.utils.data.Dataset):
    def __init__(self):
        l1 = glob.glob('/home/zhang/桌面/SparseConvNet-master/data/train_pts/*.pts')
        self.path = l1
        print(len(self.path))

    def __len__(self):
        return len(self.path)

    def __getitem__(self, ind):
        curpath = self.path[ind]
        data = load_file(curpath, 0.001)
        return data

def load_file(file_name, voxel_size):
    coords = np.loadtxt(file_name)
    feats = np.ones((coords.shape[0], 1))
    labels = np.loadtxt(file_name[:-3] + 'label').astype('int64')
    discrete_coords, unique_feats = ME.utils.sparse_quantize(
        coords=coords,
        feats=feats,
        labels=None,
        hash_type='ravel',
        set_ignore_label_when_collision=False,
        quantization_size=voxel_size)
    return discrete_coords, unique_feats, labels

def collation_fn(data_labels):
    coords, feats, labels = list(zip(*data_labels))
    coords_batch, feats_batch, labels_batch = [], [], []

    for batch_id, _ in enumerate(coords):
        N = coords[batch_id].shape[0]

        coords_batch.append(
            torch.cat((torch.from_numpy(coords[batch_id]).int(),
                       torch.ones(N, 1).int() * batch_id), 1))
        feats_batch.append(torch.from_numpy(feats[batch_id]))
        labels_batch.append(torch.from_numpy(labels[batch_id]))

    # Concatenate all lists
    coords_batch = torch.cat(coords_batch, 0).int()
    feats_batch = torch.cat(feats_batch, 0).float()
    labels_batch = torch.tensor(labels_batch).long()

    return coords_batch, feats_batch, labels_batch

def train_cls():
    return torch.utils.data.DataLoader(dataset_train_cls(),
                                       batch_size=20,
                                       collate_fn=collation_fn,
                                       num_workers=0,
                                       shuffle=True)`

How to use CPU-only MinkowskiEngine

Hello, is it possible to use MinkowskiEngine in CPU-only mode?
I want to use sparse convolution on my MacBook, which has no CUDA.

When I tried to install MinkowskiEngine, I got this error:
OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root.

Thanks!

GPU problem

When I use a single GPU that is not the cuda:0 device,
it always prints this:

Traceback (most recent call last):
  File "train.py", line 208, in <module>
    trainer.train(epoch)
  File "train.py", line 81, in train
    output_sparse = self.model(point)
  File "/home/gaoqiyu/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/gaoqiyu/PointCloudSeg_Minkowski/model/res16unet.py", line 197, in forward
    out = self.conv0p1s1(x)
  File "/home/gaoqiyu/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/gaoqiyu/PointCloudSeg_Minkowski/MinkowskiEngine/MinkowskiConvolution.py", line 270, in forward
    out_coords_key, input.coords_man)
  File "/home/gaoqiyu/PointCloudSeg_Minkowski/MinkowskiEngine/MinkowskiConvolution.py", line 91, in forward
    ctx.coords_man.CPPCoordsManager)
RuntimeError: an illegal memory access was encountered at src/convolution.cu:259

Is this a problem with my system?

Multi-GPU training performance issue

Hi,

I found that training a model with multiple GPUs takes just as long as training with a single GPU.

I ran some experiments with the demo examples/multigpu.py:

for i in range(10): 
    optimizer.zero_grad()
    # Get new data
    inputs, labels = [], []
    for j in range(num_devices):
        coords, feat, label = data_loader()
        #print(coords)
        inputs.append(ME.SparseTensor(feat, coords=coords).to(devices[j]))
        labels.append(label.to(devices[j]))

    # The raw version of the parallel_apply
    t = time.time()
    replicas = parallel.replicate(net, devices)
    outputs = parallel.parallel_apply(replicas, inputs, devices=devices)
    # Time consumed
    print(time.time() - t, 's')
    
    #print(outputs)
    # Extract features from the sparse tensors to use a pytorch criterion
    out_features = [output.F for output in outputs]
    losses = parallel.parallel_apply(
        criterions, tuple(zip(out_features, labels)), devices=devices)
    loss = parallel.gather(losses, target_device, dim=0).mean()

    # Gradient
    loss.backward()
    optimizer.step()

When I run the code with 2 GPUs, the output is as follows:

4.883297681808472 s
0.00571441650390625 s
0.005162477493286133 s
0.004988670349121094 s
0.004297733306884766 s
0.00394439697265625 s
0.004381656646728516 s
0.003942728042602539 s
0.004689216613769531 s
0.004220247268676758 s

When I run it with 8 GPUs:

14.508163213729858 s
0.01927471160888672 s
0.017940759658813477 s
0.017798900604248047 s
0.017972707748413086 s
0.018915414810180664 s
0.020047664642333984 s
0.018342256546020508 s
0.01836872100830078 s
0.018011808395385742 s

I expected the time with 8 GPUs to be close to the time with 2 GPUs, because the computation should run in parallel. However, the result contradicts my expectation.

Is there a bug, or am I missing something?
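
One way to read the timings above is to drop the first iteration of each run, which presumably includes one-time initialization, and compare the steady-state means. The sketch below does only that arithmetic on the numbers reported in this issue; the helper name `steady_state_mean` is hypothetical. Note that the timed section covers only `replicate` and `parallel_apply`, and because CUDA kernels are launched asynchronously, wall-clock numbers taken without a call such as `torch.cuda.synchronize()` may not reflect the actual GPU time.

```python
from statistics import mean

# Timings in seconds, copied from the two runs reported above.
TWO_GPU = [
    4.883297681808472, 0.00571441650390625, 0.005162477493286133,
    0.004988670349121094, 0.004297733306884766, 0.00394439697265625,
    0.004381656646728516, 0.003942728042602539, 0.004689216613769531,
    0.004220247268676758,
]
EIGHT_GPU = [
    14.508163213729858, 0.01927471160888672, 0.017940759658813477,
    0.017798900604248047, 0.017972707748413086, 0.018915414810180664,
    0.020047664642333984, 0.018342256546020508, 0.01836872100830078,
    0.018011808395385742,
]


def steady_state_mean(timings, warmup=1):
    """Mean of the timings after skipping the first `warmup` iterations."""
    return mean(timings[warmup:])


t2 = steady_state_mean(TWO_GPU)
t8 = steady_state_mean(EIGHT_GPU)

# Each iteration processes one batch per device.
throughput_2 = 2 / t2
throughput_8 = 8 / t8

print(f"2 GPUs: {t2 * 1e3:.2f} ms/iter, {throughput_2:.0f} batches/s")
print(f"8 GPUs: {t8 * 1e3:.2f} ms/iter, {throughput_8:.0f} batches/s")
print(f"time ratio, 8 vs 2 GPUs: {t8 / t2:.2f}")
```

If the replicas ran fully in parallel, the time ratio would stay near 1 and the throughput would grow with the number of devices. A time ratio close to the ratio of device counts would instead suggest that the timed section is effectively serialized.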

-D_GLIBCXX_USE_CXX11_ABI=1 and _ZNK13CoordsManagerILh5EiE8toStringB5cxx11Ev

This is for future reference.

If you encounter a compilation issue like undefined symbol: _ZNK13CoordsManagerILh5EiE8toStringB5cxx11Ev, it is caused by an ABI flag mismatch.

Earlier versions of the MinkowskiEngine assumed that the local gcc would be a newer version than the one used to compile the PyTorch binary. Since PyTorch v1.1 exposes the ABI flag more directly (https://github.com/pytorch/pytorch/blob/v1.1.0/torch/utils/cpp_extension.py#L392), it is easier to set the correct flag, and commit ebe077c now picks it up dynamically.
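
As a quick diagnostic for this kind of mismatch: a symbol compiled against the new libstdc++ ABI carries the `[abi:cxx11]` tag, which shows up as `B5cxx11` in the mangled name. The sketch below checks for that tag and builds the matching compiler flag. The function names are hypothetical, and the commented-out lines assume PyTorch exposes the setting as `torch._C._GLIBCXX_USE_CXX11_ABI`, which is what the linked `cpp_extension.py` reads. A missing tag does not prove the old ABI was used, since only symbols that involve affected types such as `std::string` carry it.

```python
CXX11_ABI_TAG = "B5cxx11"  # mangled form of the [abi:cxx11] tag


def has_cxx11_abi_tag(mangled_symbol: str) -> bool:
    """Return True if the mangled symbol carries the cxx11 ABI tag."""
    return CXX11_ABI_TAG in mangled_symbol


def abi_flag(use_cxx11_abi: bool) -> str:
    """Build the preprocessor flag that selects the libstdc++ ABI."""
    return f"-D_GLIBCXX_USE_CXX11_ABI={int(use_cxx11_abi)}"


symbol = "_ZNK13CoordsManagerILh5EiE8toStringB5cxx11Ev"
print(has_cxx11_abi_tag(symbol))
print(abi_flag(False))

# With PyTorch installed, the flag matching the PyTorch binary could be
# obtained like this (assumption: the attribute exists in your version):
#   import torch
#   print(abi_flag(torch._C._GLIBCXX_USE_CXX11_ABI))
```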

Compilation issue

I used these commands to create the environment and install the requirements for the MinkowskiNet:
conda create -n py3-mink python=3.7 anaconda
conda activate py3-mink
conda install openblas numpy
conda install -c bioconda google-sparsehash
conda install pytorch torchvision -c pytorch

Then I followed these steps to compile the program:
conda activate py3-mink
git clone https://github.com/StanfordVL/MinkowskiEngine.git
cd MinkowskiEngine
python setup.py install

The first error I hit is on this line, while installing openblas:
conda install openblas numpy

I can get around this error by installing openblas with: conda install -c anaconda openblas

However, once I run this command: python setup.py install
I get the following error:

I am not sure how to resolve this. Any help would be greatly appreciated.

Thanks

Understanding the minkunet code

Hello,
I was going through the MinkUnet code and I realized that in examples/minkunet.py in the class MinkUNetBase there is a function on line 50:
def network_initialization(self, in_channels, out_channels, D):

I failed to see where this function is being called when running the architecture. In fact, I believe the ResNetBase is the one being initialized because of line 48:
ResNetBase.__init__(self, in_channels, out_channels, D)

In fact, I also fail to see why ResNetBase is required at all in the MinkUNetBase(ResNetBase) class. What do we use from ResNetBase that requires initializing it?

Can someone please take a few minutes to explain how this code works and why ResNetBase is needed here? More importantly, where is the network_initialization function called? If the function at line 50 is not used, then MinkUNet34C would not work as described by its variables.
Thanks.
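
The pattern being asked about can be reproduced in plain Python. The sketch below uses stand-in classes (`Base` and `Derived` are hypothetical, not the real `ResNetBase` and `MinkUNetBase`) and assumes that the base constructor calls `self.network_initialization(...)`; whether `ResNetBase.__init__` actually does so should be checked in the example's ResNet source. Under that assumption, the call dispatches to the subclass override even though it is issued from the base class.

```python
class Base:
    def __init__(self, in_channels, out_channels, D):
        self.D = D
        # `self` is the subclass instance here, so this call resolves to
        # the most-derived override of network_initialization.
        self.network_initialization(in_channels, out_channels, D)

    def network_initialization(self, in_channels, out_channels, D):
        self.layers = ["base"]


class Derived(Base):
    def __init__(self, in_channels, out_channels, D=3):
        # Only the base constructor is called explicitly.
        Base.__init__(self, in_channels, out_channels, D)

    def network_initialization(self, in_channels, out_channels, D):
        self.layers = [f"derived: {in_channels} -> {out_channels}, D={D}"]


net = Derived(3, 20)
print(net.layers)
```

In this sketch the subclass never calls `network_initialization` itself, yet its version is the one that runs.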

Using PyTorch's Dataset and DataLoader, and increasing the batch size

Thank you for your work! In particular, for open sourcing the source code!

My question is: how can I use PyTorch's Dataset and DataLoader classes to increase the batch size (if it is possible at all)?

As far as I understand, I simply have to append the mini-batch index to the coordinate matrix, according to this. Your example code shows how this is done with the function data_loader in common.py. However, I wonder whether this is also possible the standard PyTorch way, by defining a class that inherits from PyTorch's Dataset class and using PyTorch's DataLoader.
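
A minimal sketch of the idea, written in plain Python so that it runs without PyTorch: a collate function that appends the batch index to every coordinate row and concatenates the samples. The function name `collate_with_batch_index` is hypothetical, and putting the batch index in the last column is an assumption; the column the library expects may depend on the version in use. In practice the lists would be tensors, and a function like this would be passed as `collate_fn` to `torch.utils.data.DataLoader`.

```python
def collate_with_batch_index(samples):
    """Merge (coords, feats, label) samples into one batch.

    coords: list of integer coordinate tuples for one sample
    feats:  list of feature rows, one per coordinate
    label:  a single label for the sample
    """
    coords_batch, feats_batch, labels_batch = [], [], []
    for batch_id, (coords, feats, label) in enumerate(samples):
        if len(coords) != len(feats):
            raise ValueError("coords and feats must have the same length")
        # Append the batch index as an extra (last) coordinate column.
        coords_batch.extend(tuple(c) + (batch_id,) for c in coords)
        feats_batch.extend(feats)
        labels_batch.append(label)
    return coords_batch, feats_batch, labels_batch


samples = [
    ([(0, 0, 0), (1, 2, 3)], [[1.0], [1.0]], 7),
    ([(4, 5, 6)], [[1.0]], 2),
]
coords, feats, labels = collate_with_batch_index(samples)
print(coords)
print(labels)
```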

Thank you in advance!

RuntimeError: CUDA error: no kernel image is available for execution on the device

Hi, I am trying this on an AWS server because my personal computer doesn't have enough memory.

I run it inside Docker, and the compilation succeeded.

g++ version:
g++ (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0

Cuda version:
release 10.1, V10.1.243

error:
Traceback (most recent call last):
  File "/opt/conda/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/opt/conda/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/workspace/share/Projects/ThanhNV/MinkowskiEngine/examples/example.py", line 80, in <module>
    output = net(input)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 539, in __call__
    result = self.forward(*input, **kwargs)
  File "/workspace/share/Projects/ThanhNV/MinkowskiEngine/examples/example.py", line 56, in forward
    return self.net(x)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 539, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/container.py", line 92, in forward
    input = module(input)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 539, in __call__
    result = self.forward(*input, **kwargs)
  File "/workspace/share/Projects/ThanhNV/MinkowskiEngine/MinkowskiEngine/MinkowskiNormalization.py", line 58, in forward
    output = self.bn(input.F)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 539, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/batchnorm.py", line 72, in forward
    self.num_batches_tracked += 1
RuntimeError: CUDA error: no kernel image is available for execution on the device

pytorch version:
1.2.0a0+e6a7071

When I installed this library in Docker, I also ran this command: apt install libgl1-mesa-glx, because the default open3d could not be imported.

Does anyone have any ideas about this problem?
Thank you.

Sparse Convolution Implementation

Hi,
Thank you so much for your code.

I have several questions regarding sparse conv and submanifold conv. According to my understanding of Submanifold Sparse Convolutional Networks, sparse conv should produce the same output as a standard conv, but it may bring computational benefits.

Does MinkowskiConvolution use submanifold convolution by default?
How would you implement sparse conv? I didn't find any related code. For example, would you let out_coords_key include the neighborhood of in_coords_key?
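
To make the distinction in the question concrete, the sketch below computes only the output coordinate sets of the two definitions for a stride-1 kernel, in plain Python. It illustrates the concept as it is commonly described and is not the MinkowskiEngine implementation; the function names are hypothetical and an odd kernel size is assumed.

```python
from itertools import product


def kernel_offsets(kernel_size, dim):
    """All integer offsets covered by a centered, odd-sized kernel."""
    half = kernel_size // 2
    return list(product(range(-half, half + 1), repeat=dim))


def submanifold_output_coords(in_coords, kernel_size):
    """Submanifold definition: outputs exist only where inputs exist."""
    return set(in_coords)


def dilating_output_coords(in_coords, kernel_size):
    """Output is active wherever the kernel overlaps any active input."""
    in_coords = set(in_coords)
    if not in_coords:
        return set()
    dim = len(next(iter(in_coords)))
    offsets = kernel_offsets(kernel_size, dim)
    return {
        tuple(c + o for c, o in zip(coord, offset))
        for coord in in_coords
        for offset in offsets
    }


in_coords = {(0, 0), (2, 0)}
same = submanifold_output_coords(in_coords, kernel_size=3)
grown = dilating_output_coords(in_coords, kernel_size=3)
print(len(same), len(grown))
```

With the second definition the active set grows at every layer, which is the loss of sparsity that the submanifold variant is meant to avoid.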

Using multiple features for one coordinate

I found that ME provides ME.utils.sparse_quantize() to downsample a dense coordinate array and avoid duplicate coordinates.
However, I would prefer to keep the original number of points in the point cloud. I only round the coordinates, so some coordinates may be identical.
How can I solve this problem?

Traceback (most recent call last):
  File "train.py", line 130, in <module>
    sinput = ME.SparseTensor(features, coords=coordinates).to(device)
  File "/home/stevencui/anaconda3/lib/python3.6/site-packages/MinkowskiEngine-0.2.4-py3.6-linux-x86_64.egg/MinkowskiEngine/SparseTensor.py", line 119, in __init__
    coords_manager.initialize(coords, coords_key)
  File "/home/stevencui/anaconda3/lib/python3.6/site-packages/MinkowskiEngine-0.2.4-py3.6-linux-x86_64.egg/MinkowskiEngine/MinkowskiCoords.py", line 51, in initialize
    enforce_creation)
ValueError: A duplicate key found. Existing coord: [97, 273, 49, 0], new coord: : [97, 273, 49, 0]. If the duplication was intentional, use initialize_coords_with_duplicates.
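
One possible workaround, sketched below in plain Python, is to keep a single entry per unique coordinate for the sparse tensor while remembering, for every original point, which unique coordinate it maps to. Per-voxel outputs can then be copied back to all original points. This is a sketch under assumptions, not a MinkowskiEngine API: the function name `unique_coords_with_inverse` is hypothetical, and averaging the features of colliding points is just one possible choice.

```python
def unique_coords_with_inverse(coords, feats):
    """Deduplicate coordinates and average the features that collide.

    Returns (unique_coords, mean_feats, inverse), where inverse[i] is the
    index into unique_coords for the i-th original point.
    """
    index_of = {}
    unique_coords, sums, counts, inverse = [], [], [], []
    for coord, feat in zip(coords, feats):
        key = tuple(coord)
        idx = index_of.get(key)
        if idx is None:
            idx = len(unique_coords)
            index_of[key] = idx
            unique_coords.append(key)
            sums.append([0.0] * len(feat))
            counts.append(0)
        sums[idx] = [s + f for s, f in zip(sums[idx], feat)]
        counts[idx] += 1
        inverse.append(idx)
    mean_feats = [[s / n for s in row] for row, n in zip(sums, counts)]
    return unique_coords, mean_feats, inverse


coords = [(97, 273, 49), (1, 2, 3), (97, 273, 49)]
feats = [[1.0], [5.0], [3.0]]
unique, mean_feats, inverse = unique_coords_with_inverse(coords, feats)

# Copy a per-voxel value back to every original point.
per_point = [mean_feats[i] for i in inverse]
print(unique, inverse, per_point)
```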

usage of MinkowskiConvolutionTranspose when given output coords

It seems MinkowskiConvolutionTranspose supports specifying coordinates for the output, but I cannot find a way to pass the output coordinates.

As the documentation says, we need to pass out_coords_key to the function, but out_coords_key seems to contain only dimension and tensor stride information, not the coordinates themselves. Am I right? Should we also pass the CoordsManager object to the function?

`thrust::system::system_error` when `sparse_quantize` called with label

Hi,

When I run the code below:

import MinkowskiEngine as ME, MinkowskiEngine.utils as ME_utils
import numpy as np

coords = np.random.rand(10000, 3) * 100
feats = np.random.rand(10000, 3)
labels = np.ones((10000, 1))
print(ME_utils.sparse_quantize(coords, feats=feats, labels=labels))

I get this error:

terminate called after throwing an instance of 'thrust::system::system_error'
  what():  parallel_for failed: invalid argument

However, the code above works fine after I reduce the size of coords (to 100, for example) or remove the 'labels' argument.

Do you have any suggestions?

THX!

Are the coordinates (i.e geometric information) automatically used as features?

Thank you for your work and making it public!

My question relates to the SparseTensor. In the provided example you wrote

sinput = ME.SparseTensor(features - 0.5, coords=coordinates).to(device)

where features are the colors, i.e. red, green, and blue. Am I correct that this does not take the coordinates into account as features? In other words, if I want to include the geometric information, do I have to add it to the features myself? I'm asking because you do not seem to normalize the coordinates.
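
If geometric information does need to enter through the features, one option is to concatenate normalized coordinates to each feature row. The plain-Python sketch below shows only that idea; whether it is needed at all is the open question here, and both the function name and the min-max normalization to [-0.5, 0.5] are assumptions rather than something the library prescribes.

```python
def append_normalized_coords(coords, feats):
    """Return feature rows extended with coordinates scaled to [-0.5, 0.5]."""
    dims = len(coords[0])
    lo = [min(c[d] for c in coords) for d in range(dims)]
    hi = [max(c[d] for c in coords) for d in range(dims)]
    augmented = []
    for coord, feat in zip(coords, feats):
        normalized = [
            (coord[d] - lo[d]) / (hi[d] - lo[d]) - 0.5 if hi[d] > lo[d] else 0.0
            for d in range(dims)
        ]
        augmented.append(list(feat) + normalized)
    return augmented


coords = [(0, 0, 0), (10, 20, 40), (5, 10, 20)]
feats = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9]]
augmented = append_normalized_coords(coords, feats)
print(augmented[0])
```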

Thank you in advance!

Neither CPU nor GPU usage at 100%

Hello,
Thanks for your great implementation. I wanted to know what the bottleneck is in the training. In my example the MinkowskiEngine uses the 20 cores of my CPU, but the usage is always pretty low. Also the GPU is not at 100%.

I wonder if it would make sense to use multiple workers in the dataloader. Right now I get an error when activating multiprocessing in the dataloader with num_workers > 0, because SparseTensor already uses multiprocessing. To fix this, would it be possible to split the CPU cores differently? For instance, one could use 2 workers and each worker uses 10 cores to construct the SparseTensor.

Do you think a speedup would be possible like this? I found that the number of threads is hidden in the CoordsManager, but maybe it could be made available in the SparseTensor arguments.

Thanks

GPU memory grows rapidly when training with MinkUNet

I modified the code to train on the KITTI dataset. The problem is that GPU memory grows rapidly when I use the network from MinkUNet, while it is fine with the UNet example. The network is also extremely slow: about 8 s per step with the simplest UNet. The batch size is 16, and the data loader takes about 2 s. The voxel scale is [1/8, 1/12, 1/20], and the optimizer is Adam. Does anyone have an idea what is going on? I compared my code with the example and could not find the problem. I would appreciate any advice.

My data loader:

def trainMerge(tbl):
    locs=[]
    feats=[]
    labels=[]
    for idx,i in enumerate(tbl):
        if not mem_cache:
            a,b,c=torch.load(train[i])
        else:
            a,b,c=train[i]
        m=np.eye(3)+np.random.randn(3,3)*0.1
        m[0][0]*=np.random.randint(0,2)*2-1
        a=np.matmul(a,m)
        offset=np.random.rand(3)*0.05
        a+=offset
        a = a -a.min(0)
        b = b.reshape(-1,1)
        a,b,c = ME.utils.sparse_quantize(a, feats=b, labels=c, ignore_label=0,quantization_size=scale)
        locs.append(a)
        feats.append(b+np.random.randn(1)*0.1)
        labels.append(c.astype(np.int32))
    locs, feats,labels = ME.utils.sparse_collate(locs, feats, labels = labels)
    return {'x':[feats,locs], 'y': labels.long()}
def get_train_data_loader():
    return torch.utils.data.DataLoader(
    list(range(len(train))),batch_size=batch_size, collate_fn=trainMerge, num_workers=20, shuffle=True)

Part of my train code:

for i, batch in tqdm.tqdm(enumerate(data.get_train_data_loader())):
    optimizer.zero_grad()
    input = ME.SparseTensor(batch['x'][0], coords=batch['x'][1]).to(device)
    predictions = unet(input)
    loss = criterion(predictions.F, batch['y'].to(device))
    train_loss += loss.data  # .item()
    loss.backward()
    optimizer.step()

My environment is: 9700K, 2080 Ti, Python 3.7, CUDA 10.0, and PyTorch 1.2.

initialize_coords

I noticed that you have a comment in your code at
https://github.com/chrischoy/SpatioTemporalSegmentation/blob/9eb1b7b05acc92ec2b5e98f4f34c67dedcecb00c/models/resunet.py#L20

but I found that you commented out 'model.initialize_coords' at
https://github.com/chrischoy/SpatioTemporalSegmentation/blob/9eb1b7b05acc92ec2b5e98f4f34c67dedcecb00c/lib/train.py#L94

I also couldn't find functions named 'model.initialize_coords' and 'clear'.

I'd like to know whether this is still needed and, if so, where I can find it.

ImportError:_Z8cpu_gemmIfEv11CBLAS_ORDER15CBLAS_TRANSPOSES1_iiiT_PKS2_S4_S2_PS2_

system: Ubuntu18.04.3 LTS 64x
memory: 16GB
cpu:Intel® Core™ i7-7700 CPU @ 3.60GHz × 8
gpu:GeForce GTX 1060 6GB/PCIe/SSE2
cuda: 10.1
python:3.7.4(pyenv)

ImportError: /home/user/.pyenv/versions/3.7.4/lib/python3.7/site-packages/MinkowskiEngine-0.2.7a4-py3.7-linux-x86_64.egg/MinkowskiEngineBackend.cpython-37m-x86_64-linux-gnu.so: undefined symbol: _Z8cpu_gemmIfEv11CBLAS_ORDER15CBLAS_TRANSPOSES1_iiiT_PKS2_S4_S2_PS2_

What should I do?