
Implementation of the FastFlow3D architecture for scene flow estimation from LiDAR point clouds in PyTorch using PyTorch Lightning.

License: MIT License


fastflow3d's Introduction

FastFlow3D Implementation

This repository contains an implementation of the FastFlow3D architecture from "Scalable Scene Flow from Point Clouds in the Real World (Jund et al. 2021)" in PyTorch (with PyTorch Lightning). Paper on arXiv.

As a baseline, the FlowNet3D architecture by Liu et al. (2019) is implemented as well.

The repository supports working with the Waymo dataset and its scene flow annotations as well as the FlyingThings3D dataset.

The documentation is not yet final; for now, most of it refers to FastFlow3D on the Waymo dataset. Find early development notes below and detailed comments in the code.

See USAGE for how to jump right in and RESULTS for our results.

LICENSE

See LICENSE

Cite

If you use this code in your own work, please use the following BibTeX entry:

@misc{fastflow3d-pytorch-2021,
  title={FastFlow3D-PyTorch: Implementation of the FastFlow3D scene flow architecture (Jund et al. 2021) in PyTorch},
  author={Jablonski, Felix and Distelzweig, Aron and Maranes, Carlos},
  year={2021},
  publisher={GitHub},
  howpublished={\url{https://github.com/Jabb0/FastFlow3D}}
}

Please don't forget to cite the underlying papers as well!

Contributors

This repository was created by Aron, Carlos and Felix.

Bugs / Improvements

If you encounter any bugs or want to suggest improvements, feel free to create an issue or pull request.

Results

Experimental Setup

We trained the network for 19 epochs (3 days) on the full Waymo train set (157,283 samples) and validated on the full Waymo validation set.
87% of the points in the Waymo dataset are background points with no flow.

Training follows the original paper as closely as possible, but because of hardware limitations the batch size is set to 16 instead of 64. The loss function is the average L2 error in m/s over all points, with downweighting of points belonging to the background class.
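As a rough illustration, a minimal sketch of such a downweighted loss is shown below; the function and argument names are ours, not the repository's exact API.

import torch

def weighted_l2_loss(pred_flow, target_flow, is_background, bg_weight=0.1):
    # pred_flow, target_flow: (N, 3) per-point velocity vectors in m/s.
    # is_background: (N,) boolean mask marking background-class points.
    # Per-point L2 error between predicted and target velocity vectors.
    error = torch.linalg.norm(pred_flow - target_flow, dim=1)
    # Downweight background points (the paper uses a factor of 0.1).
    weights = torch.ones_like(error)
    weights[is_background] = bg_weight
    return (weights * error).mean()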

Waymo Dataset Distribution

The distribution of points per label in the Waymo dataset has been analyzed.

| Split | Total Samples | Total Points | Unlabeled | Background | Vehicle | Pedestrian | Sign | Cyclist |
|-------|---------------|--------------|-----------|------------|---------|------------|------|---------|
| Train | 157,283       | 24,265M      | 1.03%     | 87.36%     | 10.52%  | 0.78%      | 0.28% | 0.03%  |
| Valid | 39,785        | 6,193M       | 1.02%     | 88%        | 9.98%   | 0.71%      | 0.25% | 0.03%  |

Metrics

Here we present two error metrics from the original paper. Like Jund et al., we group results by class (vehicle, pedestrian, background, ...) to show imbalances in performance.

mean L2 error in m/s: the L2 error between the predicted and target 3D velocity vectors, averaged over all points. Lower is better.

<= 1.0 m/s: the percentage of points predicted correctly to within 1.0 m/s. Higher is better.
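A sketch of how these two metrics could be computed from per-point predictions (illustrative, not the repository's exact code):

import torch

def flow_metrics(pred_flow, target_flow, threshold=1.0):
    # pred_flow, target_flow: (N, 3) per-point velocity vectors in m/s.
    error = torch.linalg.norm(pred_flow - target_flow, dim=1)  # (N,)
    mean_l2 = error.mean()                        # mean L2 error in m/s
    within = (error <= threshold).float().mean()  # fraction within 1.0 m/s
    return mean_l2, within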

Quantitative Results

Waymo Dataset

Comparison of "our" experiment, as described above and run with this code, against the results reported by Jund et al.

Note: Differences in performance are likely due to the different batch size used (16 vs. 64).

(Image: table of quantitative results on the Waymo dataset, comparing this implementation against Jund et al.)

Usage

This repository contains three parts: preprocessing, training, and visualization.

The hardware requirements of FastFlow3D are high due to the large size of the pseudo-images. Reasonable results can be expected within a few days using 4x NVIDIA Titan X GPUs with 12GB VRAM each.

Compatibility Note

Some dependencies still require Python 3.8, as Python 3.9 versions are not available via pip. This is the case for open3d, which is used for visualization.

Preprocessing

In order to use the data, it needs to be preprocessed. During preprocessing the dataset is extracted and the important information is stored on disk in a directly accessible form. The full Waymo dataset is about 1TB of data; the preprocessed data can be stored in about 400GB. It is not necessary to use the full dataset, although this is recommended for best performance.

Download the Waymo dataset as tfrecord files from here. You have to register with Waymo to be able to see it. Then, download it into <raw_data_directory>/train and <raw_data_directory>/valid, respectively.

Start the preprocessing for train and val (and test) separately using:

python preprocess.py <raw_directory> <out_directory>

The output directory has to have the structure <directory>/train for the training data and <directory>/valid for the validation data. If test data is available, put it into <directory>/test.

Training

Start an experiment (with TensorBoard logging) using:

python train.py <data_directory> <experiment_name>

Have a look at the run.sh shell scripts and the available parameters of the train.py script.

This project has been built for use with Weights & Biases (W&B) as a logging service, so logging with W&B is supported via command line arguments.

Visualization

To create a visualization of the point clouds with ground truth and predicted data use the visualization.py script.

python visualization.py <directory>/valid <config_file> --model_path <path_to_checkpoint>

NOTE: the current code requires a Weights & Biases config.yaml, so this logger needs to be used (or the code adapted).

Data

These are the points considered to be seen by the LiDAR: a 170m x 170m grid centered on the AV, represented by 512 x 512 pillars (approx. 0.33m x 0.33m each). For the height (z-dimension), the valid pillar range is from -3m to 3m.
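A minimal sketch of this pillarization, assuming the grid is centered on the AV (so x and y range from -85m to 85m); the constants and function names are ours:

import numpy as np

X_MIN, X_MAX = -85.0, 85.0
Y_MIN, Y_MAX = -85.0, 85.0
Z_MIN, Z_MAX = -3.0, 3.0
GRID_SIZE = 512

def pillar_indices(points):
    # points: (N, 3) array of x, y, z coordinates in the AV frame.
    # Keep only points inside the grid. The upper bound is exclusive so
    # that a point at exactly x_max cannot map to an out-of-range cell.
    mask = ((points[:, 0] >= X_MIN) & (points[:, 0] < X_MAX)
            & (points[:, 1] >= Y_MIN) & (points[:, 1] < Y_MAX)
            & (points[:, 2] >= Z_MIN) & (points[:, 2] < Z_MAX))
    pts = points[mask]
    cell = (X_MAX - X_MIN) / GRID_SIZE  # approx. 0.33m
    ix = ((pts[:, 0] - X_MIN) / cell).astype(np.int64)
    iy = ((pts[:, 1] - Y_MIN) / cell).astype(np.int64)
    return ix, iy, mask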

WaymoDataset

This reads the Waymo dataset with the extended flow information, which can be found here.

Each file is a compressed session, which has to be decompressed and parsed to access its fields. A session consists of a number of frames, and each frame contains all the information needed. The information available in each frame is documented here. Note that some fields cannot be accessed directly, so the utility functions may need to be extended.

Of the available fields, we are interested in the 3D LiDAR points. The vehicle on which this information was logged carries 3D LiDAR sensors, and each sensor records a first and a second return.

The utility functions take both returns from all five LiDARs and concatenate them, so all points are treated equally. The main.py file includes an example of how to read the data of a frame.

More details: https://waymo.com/open/data/perception/

References

Problem Definition

  • 3D scene flow in a setting where the scene at time $t_i$ is represented as a point cloud $P_i$ as measured by a LiDAR sensor mounted on the AV.
  • Each point cloud point has a 3D motion vector (v_x, v_y, v_z). Each component gives the velocity in m/s.
  • Each point also has a label identifying its class. This is mostly used for loss weighting but some metrics depend on it too.
  • Predict the flow given two consecutive point clouds.
  • With a high frame rate, the flow computed between two timesteps is a good approximation of the current flow.

Label Creation

The labels are already provided for each point in the point cloud; the paper describes how they are created. The authors present a scalable, automated approach bootstrapped from existing labeled, tracked objects in the LiDAR sequences. Each object is a 3D label bounding box with a unique ID.

Ego motion of the AV is removed by computing, from the AV's movement between frames, the position of each object from the previous timeframe within the current timeframe. Not compensating for ego motion has been shown to decrease performance significantly. One could also choose not to remove the AV's ego motion, but removing it yields object motion independent of the AV and allows better reasoning about an object's own movement.

The movement of whole bounding boxes, including rotations of the objects, is used initially to identify the movement of their point cloud points. This rotation and ego-motion compensation is combined into a single transformation matrix T. For each point x_0 in the bounding box of an object in the current point cloud, the previous point position x_{-1} is computed based on this matrix. This does not mean that the previous point cloud actually had a point at that position; it merely states "this point has likely moved from this position and thus has this speed". Because of this point-wise computation, points in the same object can have different speeds. The label is then the change over time from the previous to the current point (a sketch of this computation follows). This label creation can be applied to any point cloud dataset with 3D bounding boxes and tracklets.
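A sketch of this computation for a single rigid object; the transform convention, names, and the 0.1s frame interval (Waymo records at 10 Hz) are our assumptions:

import numpy as np

def flow_labels(points_t0, T_prev_from_curr, dt=0.1):
    # points_t0: (N, 3) points inside the object's box at the current frame.
    # T_prev_from_curr: (4, 4) rigid transform (rotation + ego compensation)
    # mapping a current-frame point to its previous-frame position.
    homog = np.hstack([points_t0, np.ones((len(points_t0), 1))])  # (N, 4)
    points_tm1 = (homog @ T_prev_from_curr.T)[:, :3]
    # Label: change over time from the previous to the current position.
    return (points_t0 - points_tm1) / dt  # (N, 3) velocity in m/s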

Limitations

  • Objects are assumed to be rigid. This does not hold for pedestrians, but as the time between frames is so small, this is considered of minimal consequence.
  • Objects may lack a previous frame, in which case no flow label can be computed.
  • Some rare moving objects have no bounding boxes and therefore belong to the "background" class with zero movement. This still needs to be overcome.

Metrics

  • Common metrics are the L2 error of the point-wise flow, as the label is a 3D vector.
  • Points with L2 error below a given threshold.
  • Metrics per object class as they have inherently different characteristics
  • Binary moving/not-moving classification based on a movement threshold. A threshold of 0.5 m/s is selected, although it is not easy to define. Standard binary classification metrics apply (a sketch follows).
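A sketch of the binary evaluation with assumed names (the 0.5 m/s threshold comes from the paper):

import torch

def moving_classification(pred_flow, target_flow, threshold=0.5):
    pred_moving = torch.linalg.norm(pred_flow, dim=1) > threshold
    true_moving = torch.linalg.norm(target_flow, dim=1) > threshold
    # Standard binary classification metrics, e.g. precision and recall.
    tp = (pred_moving & true_moving).sum()
    precision = tp / pred_moving.sum().clamp(min=1)
    recall = tp / true_moving.sum().clamp(min=1)
    return precision, recall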

Architecture

Input

  • Two subsequent point clouds. Each cloud is an (N_points, 5) matrix: each point has 5 features, 3 coordinates and 2 laser features.

Scene Encoder

  • Performed on each input point cloud
  • Every point is assigned to a fixed 512x512 grid depending on its x, y positions.
  • First, each point is encoded depending on its grid cell; each cell is a pillar in the z (upward) direction.
  1. Take each input point cloud
  2. Compute the center coordinates of all 512x512 pillars
  3. Compute the pillar that each point falls into
  4. Encode each point as 8D (pillarCenter_x, pillarCenter_y, pillarCenter_z, offset_x, offset_y, offset_z, feature_0, feature_1) with offset being the offset from the point to its pillar center and the features being the laser features of the point.
  5. For each point an embedding is computed based on its 8D encoding using an MLP
  6. Sum up the 64-dimensional embeddings of all points in a pillar to get the pillar embedding
  7. The final point cloud embedding is a 512x512 2D pseudo-image with depth 64

This part is not straightforward, as each point cloud has a different number of points, so the point clouds cannot simply be batched. This is solved without dropping any points using a scatter-gather approach (see pillarFeatureNetScatter.py; a sketch of the scatter step follows).
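A sketch of the scatter step, assuming each point already carries a flattened pillar index (ix * 512 + iy); this is not the repository's exact code:

import torch

def scatter_pillars(point_emb, pillar_idx, grid_size=512, depth=64):
    # point_emb: (N, depth) per-point embeddings from the MLP.
    # pillar_idx: (N,) flattened pillar index of each point.
    pseudo_image = torch.zeros(grid_size * grid_size, depth,
                               dtype=point_emb.dtype, device=point_emb.device)
    idx = pillar_idx.unsqueeze(1).expand(-1, depth)  # (N, depth)
    # Sum the embeddings of all points that fall into the same pillar.
    pseudo_image.scatter_add_(0, idx, point_emb)
    # Reshape into the (depth, 512, 512) pseudo-image consumed by the CNN.
    return pseudo_image.view(grid_size, grid_size, depth).permute(2, 0, 1)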

Contextual Information Encoder

  • Convolutional autoencoder (U-Net) whose first half (the encoder) shares its weights across the two frames.
  • Both inputs are processed with the same weights in the conv net.
  • The frames are not concatenated for the encoder pass; each frame is passed on its own.
    • However, to get the highest possible batch size for BatchNorm, the previous and current frames are passed concatenated in the batch dimension (see the sketch after this list).
  • Bottleneck convolutions have been introduced.
  • The encoder consists of blocks that each reduce to a hidden size.
  • The output of each block is saved (used later for the decoder's skip connections).
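A sketch of the weight sharing via batch concatenation, where encoder stands for any 2D convolutional encoder module; the function name is ours:

import torch

def encode_both_frames(encoder, prev_img, curr_img):
    # prev_img, curr_img: (B, 64, 512, 512) pseudo-images of both frames.
    # Concatenating in the batch dimension reuses the same weights and
    # gives BatchNorm the largest possible effective batch size.
    both = torch.cat([prev_img, curr_img], dim=0)  # (2B, 64, 512, 512)
    features = encoder(both)
    prev_feat, curr_feat = features.chunk(2, dim=0)
    return prev_feat, curr_feat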

Decoder to obtain Pointwise Flow

  • Decoder part of the conv autoencoder
  • Deconvolutions are replaced with bilinear upsampling.
  • Skip connections come from the concatenated encoder embeddings at the corresponding spatial resolution.
  • Output is a 512x512 grid-structured flow embedding
  • No activation function in this part (a sketch of one decoder step follows)
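A sketch of one decoder step under these constraints (bilinear upsampling, skip connection, no activation); the layer sizes and names are illustrative:

import torch
import torch.nn as nn
import torch.nn.functional as F

class UpBlock(nn.Module):
    def __init__(self, in_ch, skip_ch, out_ch):
        super().__init__()
        self.conv = nn.Conv2d(in_ch + skip_ch, out_ch, kernel_size=3, padding=1)

    def forward(self, x, skip):
        # Bilinear upsampling instead of a deconvolution.
        x = F.interpolate(x, scale_factor=2, mode="bilinear", align_corners=False)
        # Skip connection from the concatenated encoder embeddings.
        x = torch.cat([x, skip], dim=1)
        # No activation function in the decoder.
        return self.conv(x)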

Unpillar Operation

  • Select for each point its corresponding cell and that cell's flow embedding.
  • This lookup also has to deal with the differently sized point clouds; a gather approach solves this (see the sketch after this list).
  • The point feature (the 64D embedding) of the point to predict is concatenated (we predict the current timeframe t0 only, not t-1).
  • An MLP regresses the point-wise motion prediction.
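A sketch of the unpillar lookup via gather, using the same flattened pillar indices as above (names are ours, not the repository's):

import torch

def unpillar(flow_grid, point_emb, pillar_idx):
    # flow_grid: (C, 512, 512) grid-structured flow embedding (decoder output).
    # point_emb: (N, 64) embeddings of the current-frame points.
    # pillar_idx: (N,) flattened pillar index of each current-frame point.
    c = flow_grid.shape[0]
    flat = flow_grid.view(c, -1)  # (C, 512 * 512)
    # Gather each point's cell embedding from the grid.
    per_point = flat.gather(1, pillar_idx.unsqueeze(0).expand(c, -1)).T  # (N, C)
    # Concatenate the point's own feature; an MLP then regresses the motion.
    return torch.cat([per_point, point_emb], dim=1)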

Implementation Notes

  • Objects that do not have a previous frame cannot be predicted, and these need to be removed from weight updates and scene flow evaluation metrics. Their points are still part of the input, but their labels are not used for training.
  • The z-axis is the height.
  • Using vertical pillarization makes sense, as objects can only rotate around the z-axis in the Waymo dataset. A pillar therefore captures the same points of an object well.
  • The mean L2 loss is used for training, i.e., the average speed error over all points.
  • Ego motion is removed by a transformation matrix that maps the previous frame into the view of the current frame.
  • The original authors trained their model for 19 epochs using the Adam optimizer.
  • The original authors applied an artificial downweighting of background points in the L2 loss by a factor of 0.1. This value was found using a hyperparameter search. Maybe redo this with a better searcher.

General PyTorch Development Notes

Tutorials / Technical Information

Structure

Models

Full PyTorch Lightning modules that are trained.

Networks

PyTorch modules

Data

All dataloading routines and transformations.


fastflow3d's Issues

Question about the dataset

Thank you again for this outstanding work! I have a question about the Waymo dataset version you used. I noticed that the dataset download path you provided in the readme.md (https://console.cloud.google.com/storage/browser/waymo_open_dataset_scene_flow) differs from every dataset version on the official Waymo website (the latest is now 1.4.0). I would like to know the difference between the dataset used in this work and the datasets available for download on the official website. Sincerely looking forward to your reply!

About the model performance

Hello @Jabb0 , thanks for your implementation!

I have some questions about the performance reported in the paper. After your model is trained, do the test results reach the accuracy in the paper? Could you share the test results?

Accelerator='ddp' is an invalid accelerator name

Hello again! I encounter an error when trying to run train.py, which says that accelerator='ddp' is an invalid accelerator name. The error message is shown at the end of this issue.
My environment is:
CUDA 11.3
Python 3.10.8
PyTorch 1.12.1
PyTorch lightning 1.8.3

I've also tried the following environment and still encounter the same problem:
CUDA 11.3
Python 3.8.13
PyTorch 1.10.0
PyTorch lightning 1.7.7

Can you kindly offer some suggestions? Thanks a lot and looking forward to your reply!

~/FastFlow3D-main$ python train.py --accelerator='ddp' --batch_size=16 --gpus=4 --num_workers=16 --learning_rate=0.0001 --disable_ddp_unused_check=True
No weights and biases API key set. Using tensorboard instead!
Disabling unused parameter check for DDP
Traceback (most recent call last):
File "/home/fjy/FastFlow3D-main/train.py", line 286, in
cli()
File "/home/fjy/FastFlow3D-main/train.py", line 263, in cli
trainer = pl.Trainer.from_argparse_args(args,
File "/home/fjy/anaconda3/envs/fastflow/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1917, in from_argparse_args
return from_argparse_args(cls, args, **kwargs)
File "/home/fjy/anaconda3/envs/fastflow/lib/python3.10/site-packages/pytorch_lightning/utilities/argparse.py", line 66, in from_argparse_args
return cls(**trainer_kwargs)
File "/home/fjy/anaconda3/envs/fastflow/lib/python3.10/site-packages/pytorch_lightning/utilities/argparse.py", line 340, in insert_env_defaults
return fn(self, **kwargs)
File "/home/fjy/anaconda3/envs/fastflow/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 408, in init
self._accelerator_connector = AcceleratorConnector(
File "/home/fjy/anaconda3/envs/fastflow/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/accelerator_connector.py", line 192, in init
self._check_config_and_set_final_flags(
File "/home/fjy/anaconda3/envs/fastflow/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/accelerator_connector.py", line 291, in _check_config_and_set_final_flags
raise ValueError(
ValueError: You selected an invalid accelerator name: accelerator='ddp'. Available names are: cpu, cuda, hpu, ipu, mps, tpu.

About the metric.

Hi,
Thanks for your implementation! I have a question about the calculation of the metric. In your code, I see that you compute the point-wise metric at each step. PyTorch Lightning then automatically averages the metric over steps to get the mean metric for the epoch. In my understanding, the point-wise metric in the paper is computed over the entire epoch. I want to know whether this leads to some bias in the evaluation.

Index out of bounds error in pillarFeatureNetScatter.py:35 (grid.scatter_add_(1, indices, x))

Thanks for providing this FastFlow3D implementation. I'm using it with a custom dataset. Some of the data in it triggers an index out of bounds error; an example trace is pasted at the end.

I think the error happens because the upper limit of the grid (x_max, y_max, z_max) is an exclusive boundary, and LiDAR points that fall exactly on that value are then out of bounds. For example, in a 1D grid from x_min=-2 to x_max=2 with a grid_size of 4, the grid cells would contain

      0           1           2         3
[-2.0, -1.0) [-1.0, 0.0) [0.0, 1.0) [1.0, 2.0)

A point at x=2.0 (x=x_max) would fall into the cell with index 4, which is out of bounds.

The easiest workaround I see is to change remove_out_of_bounds_points in utils/pillars.py to exclude the *_max values, i.e. change <= to < for x_max, y_max, z_max. This seems to fix the error for me. Does this make sense?

diff --git a/utils/pillars.py b/utils/pillars.py
index 5714c8d..88f0125 100644
--- a/utils/pillars.py
+++ b/utils/pillars.py
@@ -4,9 +4,9 @@ import numpy as np
 def remove_out_of_bounds_points(pc, y, x_min, x_max, y_min, y_max, z_min, z_max):
     # Calculate the cell id that this entry falls into
     # Store the X, Y indices of the grid cells for each point cloud point
-    mask = (pc[:, 0] >= x_min) & (pc[:, 0] <= x_max) \
-           & (pc[:, 1] >= y_min) & (pc[:, 1] <= y_max) \
-           & (pc[:, 2] >= z_min) & (pc[:, 2] <= z_max)
+    mask = (pc[:, 0] >= x_min) & (pc[:, 0] < x_max) \
+           & (pc[:, 1] >= y_min) & (pc[:, 1] < y_max) \
+           & (pc[:, 2] >= z_min) & (pc[:, 2] < z_max)
     pc_valid = pc[mask]
     y_valid = None
     if y is not None:
[...]
../aten/src/ATen/native/cuda/ScatterGatherKernel.cu:111: operator(): block: [118834,0,0], thread: [124,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.              
../aten/src/ATen/native/cuda/ScatterGatherKernel.cu:111: operator(): block: [118834,0,0], thread: [125,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.              
../aten/src/ATen/native/cuda/ScatterGatherKernel.cu:111: operator(): block: [118834,0,0], thread: [126,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.              
../aten/src/ATen/native/cuda/ScatterGatherKernel.cu:111: operator(): block: [118834,0,0], thread: [127,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.              
Traceback (most recent call last):                                                                                                                                                                         
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/trainer/trainer.py", line 722, in _call_and_handle_interrupt                                                                              
    return self.strategy.launcher.launch(trainer_fn, *args, trainer=self, **kwargs)                                                                                                                        
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/strategies/launchers/subprocess_script.py", line 93, in launch                                                                            
    return function(*args, **kwargs)                                                                                                                                                                       
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/trainer/trainer.py", line 812, in _fit_impl                                                                                               
    results = self._run(model, ckpt_path=self.ckpt_path)                                                                                                                                                   
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/trainer/trainer.py", line 1237, in _run                                                                                                   
    results = self._run_stage()                                                                                                                                                                            
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/trainer/trainer.py", line 1324, in _run_stage                                                                                             
    return self._run_train()                                                                                                                                                                               
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/trainer/trainer.py", line 1354, in _run_train                                                                                             
    self.fit_loop.run()                                                                                                                                                                                    
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/loops/base.py", line 204, in run                                                                                                          
    self.advance(*args, **kwargs)                                                                                                                                                                          
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/loops/fit_loop.py", line 269, in advance                                                                                                  
    self._outputs = self.epoch_loop.run(self._data_fetcher)                                                                                                                                                
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/loops/base.py", line 204, in run                                                                                                          
    self.advance(*args, **kwargs)                                                                                                                                                                          
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py", line 208, in advance                                                                                 
    batch_output = self.batch_loop.run(batch, batch_idx)                                                                                                                                                   
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/loops/base.py", line 204, in run                                                                                                          
    self.advance(*args, **kwargs)                                                                                                                                                                          
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/loops/batch/training_batch_loop.py", line 88, in advance                                                                                  
    outputs = self.optimizer_loop.run(split_batch, optimizers, batch_idx)                                                                                                                                  
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/loops/base.py", line 204, in run                                                                                                          
    self.advance(*args, **kwargs)                                                                                                                                                                          
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 203, in advance                                                                               
    result = self._run_optimization(                                                                                                                                                                       
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 256, in _run_optimization                                                                     
    self._optimizer_step(optimizer, opt_idx, batch_idx, closure)                                                                                                                                           
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 369, in _optimizer_step                                                                       
    self.trainer._call_lightning_module_hook(                                                                                                                                                              
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/trainer/trainer.py", line 1596, in _call_lightning_module_hook                                                                            
    output = fn(*args, **kwargs)                                                                                                                                                                           
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/core/lightning.py", line 1625, in optimizer_step                                                                                          
    optimizer.step(closure=optimizer_closure)                                                                                                                                                              
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/core/optimizer.py", line 168, in step                                                                                                     
    step_output = self._strategy.optimizer_step(self._optimizer, self._optimizer_idx, closure, **kwargs)                                                                                                   
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/strategies/ddp.py", line 278, in optimizer_step                                                                                           
    optimizer_output = super().optimizer_step(optimizer, opt_idx, closure, model, **kwargs)                                                                                                                
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/strategies/strategy.py", line 193, in optimizer_step                                                                                      
    return self.precision_plugin.optimizer_step(model, optimizer, opt_idx, closure, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/plugins/precision/precision_plugin.py", line 155, in optimizer_step                                                                       
    return optimizer.step(closure=closure, **kwargs)                                                                                                                                                       
  File "/usr/local/lib/python3.8/dist-packages/torch/optim/optimizer.py", line 88, in wrapper                                                                                                              
    return func(*args, **kwargs)                                                                                                                                                                           
  File "/usr/local/lib/python3.8/dist-packages/torch/autograd/grad_mode.py", line 27, in decorate_context                                                                                                  
    return func(*args, **kwargs)                                                                                                                                                                           
  File "/usr/local/lib/python3.8/dist-packages/torch/optim/adam.py", line 100, in step                                                                                                                     
    loss = closure()                                                                                                                                                                                       
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/plugins/precision/precision_plugin.py", line 140, in _wrap_closure                                                                        
    closure_result = closure()                                                                                                                                                                             
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 148, in __call__                                                                              
    self._result = self.closure(*args, **kwargs)                                                                                                                                                           
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 134, in closure                                                                               
    step_output = self._step_fn()                                                                                                                                                                          
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 427, in _training_step                                                                        
    training_step_output = self.trainer._call_strategy_hook("training_step", *step_kwargs.values())                                                                                                        
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/trainer/trainer.py", line 1766, in _call_strategy_hook                                                                                    
    output = fn(*args, **kwargs)                                                                                                                                                                           
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/strategies/ddp.py", line 344, in training_step                                                                                            
    return self.model(*args, **kwargs)                                                                                                                                                                     
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1110, in _call_impl                                                                                                       
    return forward_call(*input, **kwargs)                                                                                                                                                                  
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/parallel/distributed.py", line 963, in forward                                                                                                     
    output = self.module(*inputs[0], **kwargs[0])                                                                                                                                                          
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1110, in _call_impl                                                                                                       
    return forward_call(*input, **kwargs)                                                                                                                                                                  
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/overrides/base.py", line 82, in forward                                                                                                   
    output = self.module.training_step(*inputs, **kwargs)                                                                                                                                                  
  File "/workspace/FastFlow3D/models/BaseModel.py", line 167, in training_step                                                                                                                             
    loss, metrics = self.general_step(batch, batch_idx, phase)                                                                                                                                             
  File "/workspace/FastFlow3D/models/BaseModel.py", line 119, in general_step                                                                                                                              
    y_hat = self(x)                               
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1110, in _call_impl                                                                                                       
    return forward_call(*input, **kwargs)                                                            
  File "/workspace/FastFlow3D/models/FastFlow3DModelScatter.py", line 93, in forward                                                                                                                       
    current_pillar_embeddings = self._pillar_feature_net(current_batch_pc_embedding, current_batch_grid)                                                                                                   
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1110, in _call_impl                                                                                                       
    return forward_call(*input, **kwargs)                                                            
  File "/workspace/FastFlow3D/networks/pillarFeatureNetScatter.py", line 35, in forward                                                                                                                    
    grid.scatter_add_(1, indices, x)                                                                 
RuntimeError: CUDA error: device-side assert triggered

[W CUDAGuardImpl.h:113] Warning: CUDA warning: device-side assert triggered (function destroyEvent)                                                                                              
terminate called after throwing an instance of 'c10::CUDAError'                                                                                                                                  
  what():  CUDA error: device-side assert triggered                                                                                                                                              
Exception raised from create_event_internal at ../c10/cuda/CUDACachingAllocator.cpp:1230 (most recent call first):                                                                               
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x42 (0x7f5becd167d2 in /usr/local/lib/python3.8/dist-packages/torch/lib/libc10.so)                                              
frame #1: <unknown function> + 0x2319e (0x7f5becf8319e in /usr/local/lib/python3.8/dist-packages/torch/lib/libc10_cuda.so)                                                                       
frame #2: c10::cuda::CUDACachingAllocator::raw_delete(void*) + 0x22d (0x7f5becf84d3d in /usr/local/lib/python3.8/dist-packages/torch/lib/libc10_cuda.so)                                         
frame #3: <unknown function> + 0x2ffc28 (0x7f5c40051c28 in /usr/local/lib/python3.8/dist-packages/torch/lib/libtorch_python.so)                                                                  
frame #4: c10::TensorImpl::release_resources() + 0x175 (0x7f5beccff005 in /usr/local/lib/python3.8/dist-packages/torch/lib/libc10.so)                                                            
frame #5: std::vector<c10d::Reducer::Bucket, std::allocator<c10d::Reducer::Bucket> >::~vector() + 0x2e9 (0x7f5c2b9018d9 in /usr/local/lib/python3.8/dist-packages/torch/lib/libtorch_cpu.so)     
frame #6: c10d::Reducer::~Reducer() + 0x205 (0x7f5c2b8f4015 in /usr/local/lib/python3.8/dist-packages/torch/lib/libtorch_cpu.so)                                                                 
frame #7: std::_Sp_counted_ptr<c10d::Reducer*, (__gnu_cxx::_Lock_policy)2>::_M_dispose() + 0x12 (0x7f5c4052f8d2 in /usr/local/lib/python3.8/dist-packages/torch/lib/libtorch_python.so)          
frame #8: std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() + 0x46 (0x7f5c3ff3fbc6 in /usr/local/lib/python3.8/dist-packages/torch/lib/libtorch_python.so)                         
frame #9: <unknown function> + 0x7e0eef (0x7f5c40532eef in /usr/local/lib/python3.8/dist-packages/torch/lib/libtorch_python.so)                                                                  
frame #10: <unknown function> + 0x1f51e0 (0x7f5c3ff471e0 in /usr/local/lib/python3.8/dist-packages/torch/lib/libtorch_python.so)                                                                 
frame #11: <unknown function> + 0x1f638e (0x7f5c3ff4838e in /usr/local/lib/python3.8/dist-packages/torch/lib/libtorch_python.so)                                                                 
frame #12: python() [0x5d0147]                  
frame #13: python() [0x5a9e9d]                  
frame #14: python() [0x5d0168]                  
frame #15: python() [0x5a6152]                  
frame #16: python() [0x4ef7f8]                  
<omitting python frames>                        
frame #22: __libc_start_main + 0xf3 (0x7f5c41db70b3 in /usr/lib/x86_64-linux-gnu/libc.so.6)                                                                                                      

Aborted (core dumped)

Do you have a pre-trained model?

Do you have the model you trained on Waymo available for download, or do we have to download the dataset, preprocess the data, and train for 3 days to reproduce the model you achieved?

Little bug in readme.md and request for open source checkpoint file

Thanks for your excellent work and detailed tutorial! I notice that there may be a little bug in the readme.md, since there is a doubled "offset_y" in Architecture > Scene Encoder > 4: "Encode each point as 8D (pillarCenter_x, pillarCenter_y, pillarCenter_z, offset_x, offset_y, offset_y, feature_0, feature_1)".
Moreover, will you kindly release the trained checkpoint file for the network? Sincerely looking forward to your reply!
