Comments (7)
- Use car.tiny.config (batch_size=1 uses about 2300 MB in my environment); this config can reach 84/74/68 3D AP.
- Reduce max_number_of_points_per_voxel.
- Reduce the VFE number of filters (e.g. [32, 64] to [16, 32]).
- Reduce the RPN conv filter numbers (e.g. [64, 128, 128] to [32, 64, 64], and num_upsample_filters [128, 128, 128] to [64, 64, 64]).
- Reduce the RPN layer numbers (e.g. [3, 5, 5] to [2, 3, 3]).
If you just want to study my code, you can set remove_environment to true, which removes all points except those inside the ground-truth boxes.
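For reference, the reductions above map onto the protobuf-text config roughly like this (a sketch only — the field names follow the style of this repo's config files, so check your local car.tiny.config for the exact structure and values):

```
model: {
  second: {
    voxel_generator {
      max_number_of_points_per_voxel: 5  # reduced from the default
    }
    voxel_feature_extractor: {
      num_filters: [16, 32]              # was [32, 64]
    }
    rpn: {
      layer_nums: [2, 3, 3]              # was [3, 5, 5]
      num_filters: [32, 64, 64]          # was [64, 128, 128]
      num_upsample_filters: [64, 64, 64] # was [128, 128, 128]
    }
  }
}
```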
from second.pytorch.
Thank you for your fast reply!
I first switched the parameters of "car.tiny.config" to the values you wrote and set remove_environment to true. Then I changed
max_number_of_points_per_voxel to 5,
VFE number of filters to [8, 16],
RPN conv filter numbers to [16, 32, 32],
num_upsample_filters to [32, 32, 32],
and RPN layer numbers to [1, 2, 2].
Now the first few steps work, but then I receive this. Is this still because of the "bad" GPU?
step=50, steptime=0.0568, cls_loss=3.15, cls_loss_rt=2.18, loc_loss=8.21, loc_loss_rt=4.3, rpn_acc=0.869, prec@10=0.022, rec@10=0.914, prec@30=0.0192, rec@30=0.524, prec@50=0.0209, rec@50=0.105, prec@70=0.0377, rec@70=0.0199, prec@80=0.0423, rec@80=0.00875, prec@90=0.0734, rec@90=0.00389, prec@95=0.0784, rec@95=0.00194, loss.loc_elem=[0.239, 0.304, 0.417, 0.319, 0.198, 0.252, 0.42], loss.cls_pos_rt=0.219, loss.cls_neg_rt=1.96, loss.dir_rt=0.883, num_vox=1019, num_pos=62, num_neg=2866, num_anchors=3011, lr=0.0002, image_idx=3245
step=100, steptime=0.0398, cls_loss=2.15, cls_loss_rt=1.21, loc_loss=6.27, loc_loss_rt=3.73, rpn_acc=0.924, prec@10=0.0221, rec@10=0.882, prec@30=0.0196, rec@30=0.409, prec@50=0.0209, rec@50=0.0531, prec@70=0.0377, rec@70=0.00985, prec@80=0.0423, rec@80=0.00432, prec@90=0.0734, rec@90=0.00192, prec@95=0.0784, rec@95=0.000961, loss.loc_elem=[0.195, 0.244, 0.398, 0.176, 0.23, 0.185, 0.439], loss.cls_pos_rt=0.25, loss.cls_neg_rt=0.956, loss.dir_rt=0.775, num_vox=1123, num_pos=69, num_neg=3284, num_anchors=3455, lr=0.0002, image_idx=2605
step=150, steptime=0.058, cls_loss=1.73, cls_loss_rt=0.588, loc_loss=5.15, loc_loss_rt=2.96, rpn_acc=0.944, prec@10=0.0225, rec@10=0.881, prec@30=0.02, rec@30=0.33, prec@50=0.0212, rec@50=0.0341, prec@70=0.0377, rec@70=0.00624, prec@80=0.0423, rec@80=0.00274, prec@90=0.0734, rec@90=0.00122, prec@95=0.0784, rec@95=0.000609, loss.loc_elem=[0.262, 0.166, 0.325, 0.107, 0.273, 0.0865, 0.259], loss.cls_pos_rt=0.537, loss.cls_neg_rt=0.0517, loss.dir_rt=0.983, num_vox=155, num_pos=18, num_neg=676, num_anchors=717, lr=0.0002, image_idx=2652
step=200, steptime=0.064, cls_loss=1.44, cls_loss_rt=0.462, loc_loss=4.43, loc_loss_rt=1.78, rpn_acc=0.953, prec@10=0.0231, rec@10=0.876, prec@30=0.0208, rec@30=0.265, prec@50=0.0212, rec@50=0.0255, prec@70=0.0377, rec@70=0.00466, prec@80=0.0423, rec@80=0.00205, prec@90=0.0734, rec@90=0.00091, prec@95=0.0784, rec@95=0.000455, loss.loc_elem=[0.088, 0.0605, 0.31, 0.0378, 0.0782, 0.0535, 0.26], loss.cls_pos_rt=0.315, loss.cls_neg_rt=0.147, loss.dir_rt=0.698, num_vox=691, num_pos=38, num_neg=1802, num_anchors=1901, lr=0.0002, image_idx=4313
step=250, steptime=0.0543, cls_loss=1.25, cls_loss_rt=0.553, loc_loss=3.94, loc_loss_rt=1.7, rpn_acc=0.957, prec@10=0.024, rec@10=0.881, prec@30=0.021, rec@30=0.217, prec@50=0.0212, rec@50=0.0205, prec@70=0.0377, rec@70=0.00375, prec@80=0.0423, rec@80=0.00165, prec@90=0.0734, rec@90=0.000732, prec@95=0.0784, rec@95=0.000366, loss.loc_elem=[0.135, 0.109, 0.135, 0.056, 0.112, 0.0804, 0.221], loss.cls_pos_rt=0.497, loss.cls_neg_rt=0.0565, loss.dir_rt=0.805, num_vox=337, num_pos=22, num_neg=981, num_anchors=1030, lr=0.0002, image_idx=2983
step=300, steptime=0.0569, cls_loss=1.13, cls_loss_rt=0.49, loc_loss=3.59, loc_loss_rt=2.01, rpn_acc=0.961, prec@10=0.0245, rec@10=0.884, prec@30=0.0211, rec@30=0.187, prec@50=0.0212, rec@50=0.0174, prec@70=0.0377, rec@70=0.00319, prec@80=0.0423, rec@80=0.0014, prec@90=0.0734, rec@90=0.000622, prec@95=0.0784, rec@95=0.000311, loss.loc_elem=[0.0888, 0.0439, 0.289, 0.0643, 0.112, 0.0827, 0.325], loss.cls_pos_rt=0.401, loss.cls_neg_rt=0.0888, loss.dir_rt=0.725, num_vox=476, num_pos=28, num_neg=1348, num_anchors=1424, lr=0.0002, image_idx=5868
step=350, steptime=0.069, cls_loss=1.03, cls_loss_rt=0.651, loc_loss=3.32, loc_loss_rt=3.04, rpn_acc=0.963, prec@10=0.0251, rec@10=0.891, prec@30=0.0216, rec@30=0.162, prec@50=0.0212, rec@50=0.0147, prec@70=0.0377, rec@70=0.0027, prec@80=0.0423, rec@80=0.00118, prec@90=0.0734, rec@90=0.000526, prec@95=0.0784, rec@95=0.000263, loss.loc_elem=[0.121, 0.193, 0.51, 0.0662, 0.169, 0.108, 0.353], loss.cls_pos_rt=0.592, loss.cls_neg_rt=0.0593, loss.dir_rt=0.839, num_vox=91, num_pos=22, num_neg=787, num_anchors=836, lr=0.0002, image_idx=3930
Traceback (most recent call last):
File "./pytorch/train.py", line 643, in <module>
fire.Fire()
File "/usr/local/lib/python3.6/dist-packages/fire/core.py", line 127, in Fire
component_trace = _Fire(component, args, context, name)
File "/usr/local/lib/python3.6/dist-packages/fire/core.py", line 366, in _Fire
component, remaining_args)
File "/usr/local/lib/python3.6/dist-packages/fire/core.py", line 542, in _CallCallable
result = fn(*varargs, **kwargs)
File "./pytorch/train.py", line 398, in train
raise e
File "./pytorch/train.py", line 234, in train
example = next(data_iter)
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 336, in __next__
return self._process_next_batch(batch)
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 357, in _process_next_batch
raise batch.exc_type(batch.exc_msg)
ValueError: Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 106, in _worker_loop
samples = collate_fn([dataset[i] for i in batch_indices])
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 106, in <listcomp>
samples = collate_fn([dataset[i] for i in batch_indices])
File "/home/par/Documents/second.pytorch/second/pytorch/builder/input_reader_builder.py", line 42, in __getitem__
return self._dataset[idx]
File "/home/par/Documents/second.pytorch/second/data/dataset.py", line 68, in __getitem__
prep_func=self._prep_func)
File "/home/par/Documents/second.pytorch/second/data/preprocess.py", line 350, in _read_and_prep_v9
example = prep_func(input_dict=input_dict)
File "/home/par/Documents/second.pytorch/second/data/preprocess.py", line 293, in prep_pointcloud
unmatched_thresholds=unmatched_thresholds)
File "/home/par/Documents/second.pytorch/second/core/target_assigner.py", line 56, in assign
box_code_size=self.box_coder.code_size)
File "/home/par/Documents/second.pytorch/second/core/target_ops.py", line 103, in create_target_np
gt_to_anchor_argmax = anchor_by_gt_overlap.argmax(axis=0)
ValueError: attempt to get argmax of an empty sequence
The ValueError is fixed now, but I receive a new one.
Maybe I should just change my GPU :-)
/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:52: UserWarning: size_average and reduce args will be deprecated, please use reduction='none' instead.
warnings.warn(warning.format(ret))
Traceback (most recent call last):
File "./pytorch/train.py", line 643, in <module>
fire.Fire()
File "/usr/local/lib/python3.6/dist-packages/fire/core.py", line 127, in Fire
component_trace = _Fire(component, args, context, name)
File "/usr/local/lib/python3.6/dist-packages/fire/core.py", line 366, in _Fire
component, remaining_args)
File "/usr/local/lib/python3.6/dist-packages/fire/core.py", line 542, in _CallCallable
result = fn(*varargs, **kwargs)
File "./pytorch/train.py", line 398, in train
raise e
File "./pytorch/train.py", line 234, in train
example = next(data_iter)
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 336, in __next__
return self._process_next_batch(batch)
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 357, in _process_next_batch
raise batch.exc_type(batch.exc_msg)
UnboundLocalError: Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 106, in _worker_loop
samples = collate_fn([dataset[i] for i in batch_indices])
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 106, in <listcomp>
samples = collate_fn([dataset[i] for i in batch_indices])
File "/home/par/Documents/second.pytorch/second/pytorch/builder/input_reader_builder.py", line 42, in __getitem__
return self._dataset[idx]
File "/home/par/Documents/second.pytorch/second/data/dataset.py", line 68, in __getitem__
prep_func=self._prep_func)
File "/home/par/Documents/second.pytorch/second/data/preprocess.py", line 350, in _read_and_prep_v9
example = prep_func(input_dict=input_dict)
File "/home/par/Documents/second.pytorch/second/data/preprocess.py", line 293, in prep_pointcloud
unmatched_thresholds=unmatched_thresholds)
File "/home/par/Documents/second.pytorch/second/core/target_assigner.py", line 56, in assign
box_code_size=self.box_coder.code_size)
File "/home/par/Documents/second.pytorch/second/core/target_ops.py", line 160, in create_target_np
labels[anchors_with_max_overlap] = gt_classes[gt_inds_force]
UnboundLocalError: local variable 'gt_inds_force' referenced before assignment
Should be fixed in 57af33a. (Why does a simple commit always close this issue...)
This is a code problem; it should only occur when remove_environment=True.
If you just want to debug or study my code, this GPU is enough, but you should use a better GPU for long-time training, at least a GTX 1060 6GB.
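The failure mode can be reproduced in isolation (my own illustration, not the repo's code): with remove_environment=True an aggressive filter can leave a sample with zero ground-truth boxes, so the anchor/GT overlap matrix in create_target_np is empty and argmax over the empty axis raises exactly the ValueError above.

```python
import numpy as np

# Hypothetical empty (anchors x gt_boxes) overlap matrix: no ground-truth
# boxes survived preprocessing, so the gt axis has length 0.
anchor_by_gt_overlap = np.zeros((0, 4))

try:
    # Reducing along a zero-length axis is what create_target_np attempts.
    anchor_by_gt_overlap.argmax(axis=0)
except ValueError as e:
    print(e)  # "attempt to get argmax of an empty sequence"
```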
The error is fixed 👍
Now I got the next one :'-)
Maybe I should debug on my own instead of annoying you with those little bugs :-)
Traceback (most recent call last):
File "./pytorch/train.py", line 643, in <module>
fire.Fire()
File "/usr/local/lib/python3.6/dist-packages/fire/core.py", line 127, in Fire
component_trace = _Fire(component, args, context, name)
File "/usr/local/lib/python3.6/dist-packages/fire/core.py", line 366, in _Fire
component, remaining_args)
File "/usr/local/lib/python3.6/dist-packages/fire/core.py", line 542, in _CallCallable
result = fn(*varargs, **kwargs)
File "./pytorch/train.py", line 398, in train
raise e
File "./pytorch/train.py", line 245, in train
ret_dict = net(example_torch)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 477, in __call__
result = self.forward(*input, **kwargs)
File "/home/par/Documents/second.pytorch/second/pytorch/models/voxelnet.py", line 671, in forward
voxel_features = self.voxel_feature_extractor(voxels, num_points)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 477, in __call__
result = self.forward(*input, **kwargs)
File "/home/par/Documents/second.pytorch/second/pytorch/models/voxelnet.py", line 127, in forward
points_mean = features[:, :, :3].sum(
IndexError: too many indices for tensor of dimension 1
You can set remove_environment=False (recommended), or add the following code in the train function in train.py, before ret_dict = net(example_torch):

    if example["voxels"].shape[0] == 0:
        continue

remove_environment should only be used for debugging; reset it if possible.
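The guard above works because a batch with every point removed collapses the voxel tensor to an empty 1-D shape, and the 3-axis slice in the voxel feature extractor then fails. A minimal sketch of my own, with NumPy standing in for torch (torch raises the analogous "too many indices for tensor of dimension 1"):

```python
import numpy as np

# Degenerate "voxels" batch: all points were removed, leaving a 1-D empty array
# instead of the expected (num_voxels, max_points, num_features) tensor.
features = np.zeros((0,))

try:
    # The slice used when computing the per-voxel points_mean.
    features[:, :, :3].sum(axis=1)
except IndexError as e:
    print(e)  # e.g. "too many indices for array ..."
```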
Now it works perfectly! Thank you so much!