
Openpilot-Deepdive

Level 2 Autonomous Driving on a Single Device: Diving into the Devils of Openpilot


Webpage | Paper | Zhihu


Introduction

This repository is the PyTorch implementation of our Openpilot-Deepdive. In contrast to most traditional autonomous driving solutions, where the perception, prediction, and planning modules are separate, Openpilot uses an end-to-end neural network, called Supercombo, to predict the trajectory directly from camera images. We reimplement the training details and test the pipeline on public benchmarks. Experimental results of OP-Deepdive on nuScenes, Comma2k19, CARLA, and in-house realistic scenarios (collected in Shanghai) verify that a low-cost device can indeed achieve most L2 functionalities and be on par with the original Supercombo model. We also test on a Comma Two device with a dual-model deployment framework, which is available in this repo: Openpilot-Deployment.


Directory Structure

Openpilot-Deepdive
├── tools           - Tools to generate splits on the Comma2k19 and nuScenes datasets.
├── utils_comma2k19 - The utils provided by comma, copied from `commaai/comma2k19.git/utils`
├── data
│   ├── nuscenes  -> soft link to the nuScenes-all dataset
│   └── comma2k19 -> soft link to the Comma2k19 dataset

Changelog

2022-6-17: We released the v1.0 code for Openpilot-Deepdive.

2022-6-26: We fixed some problems and updated the readme for using the code on bare-metal machines. Thanks @EliomEssaim and @MicroHest!

2022-7-13: We released the v1.0 code of Openpilot-Deployment, for dual-model deployment in the Openpilot framework.


Quick Start Examples

Before starting, we recommend reading the arXiv paper to understand the details of our work.

Installation

Clone the repo and install requirements.txt in a Python>=3.7.0 environment, including PyTorch>=1.7.

git clone https://github.com/OpenPerceptionX/Openpilot-Deepdive.git  # clone
cd Openpilot-Deepdive
pip install -r requirements.txt  # install
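
To quickly check that the environment was set up as expected, you can print the installed PyTorch version and CUDA availability (a simple sanity check, not part of the original instructions):

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"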

Dataset

We train and evaluate our model on two datasets, nuScenes and Comma2k19. The table below shows some of their key features.

| Dataset   | Raw FPS (Hz) | Aligned FPS (Hz) | Length per Sequence (Frames / Seconds) | Total Length (Minutes) | Scenario | Locations          |
|-----------|--------------|------------------|----------------------------------------|------------------------|----------|--------------------|
| nuScenes  | 12           | 2                | 40 / 20                                | 330                    | Street   | America, Singapore |
| Comma2k19 | 20           | 20               | 1000 / 60                              | 2000                   | Highway  | America            |

Please create a data folder and create soft links to the datasets.
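
For example (the source paths are placeholders for wherever the datasets live on your machine):

mkdir data
ln -s /path/to/nuScenes-all data/nuscenes
ln -s /path/to/comma2k19 data/comma2k19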

For dataset splits, you may create your own by running the scripts in the tools folder, or download them from https://github.com/OpenPerceptionX/Openpilot-Deepdive/issues/4.

Training and Testing

By default, the batch size is set to 6 per GPU, which consumes 27 GB of GPU memory. With 8 V100 GPUs, it takes approximately 120 hours to train for 100 epochs on the Comma2k19 dataset.

Note: Our lab uses Slurm to run and manage tasks, so the PyTorch distributed training processes are initialized manually by Slurm, since the automatic mp.spawn may cause unknown problems on Slurm clusters. For most people who do not use a cluster, it is fine to launch the training process on bare-metal machines, but you will have to open multiple terminals and set some environment variables manually if you want to use multiple GPUs. We explain this below.

Warning: Since we have to extract all the frames from the video before sending them into the network, the program is hungry for memory. The actual memory usage depends on batch_size and n_workers. By default, each process with n_workers=4 and batch_size=6 consumes around 40 to 50 GB of memory. You should keep htop open to monitor memory usage before the machine hangs.

# Training on a slurm cluster
export DIST_PORT=23333  # You may use whatever port you want
export NUM_GPUS=8
PORT=$DIST_PORT srun -p $PARTITION --job-name=openpilot -n $NUM_GPUS --gres=gpu:$NUM_GPUS --ntasks-per-node=$NUM_GPUS python main.py
# Training on a bare-metal machine with a single GPU
PORT=23333 SLURM_PROCID=0 SLURM_NTASKS=1 python main.py
# Training on a bare-metal machine with multiple GPUs
# You need to open multiple terminals

# Let's use 4 GPUs for example
# Terminal 1
PORT=23333 SLURM_PROCID=0 SLURM_NTASKS=4 python main.py
# Terminal 2
PORT=23333 SLURM_PROCID=1 SLURM_NTASKS=4 python main.py
# Terminal 3
PORT=23333 SLURM_PROCID=2 SLURM_NTASKS=4 python main.py
# Terminal 4
PORT=23333 SLURM_PROCID=3 SLURM_NTASKS=4 python main.py
# Then, the training process will start after all 4 processes are launched.
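
For reference, here is a minimal sketch, not the exact code in main.py, of how the PORT, SLURM_PROCID, and SLURM_NTASKS variables are typically turned into a distributed process group; it corresponds to the "DDP Initialized at localhost:PORT" line in the sample output below.

# Illustrative sketch only; main.py may differ in details.
import os
import torch
import torch.distributed as dist

def init_ddp():
    rank = int(os.environ['SLURM_PROCID'])        # index of this process (0 .. world_size-1)
    world_size = int(os.environ['SLURM_NTASKS'])  # total number of processes
    port = os.environ.get('PORT', '23333')
    dist.init_process_group(
        backend='nccl',
        init_method=f'tcp://localhost:{port}',
        rank=rank,
        world_size=world_size,
    )
    torch.cuda.set_device(rank % torch.cuda.device_count())
    return rank, world_size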

By default, the program will not output anything once the training process starts, because the widely-used tqdm might be buggy on Slurm clusters. So you may only see some debugging info like the output below, and the program may appear to be stuck.

[1656218909.68] starting job... 0 of 1
[1656218911.53] DDP Initialized at localhost:23333 0 of 1
Comma2k19SequenceDataset: DEMO mode is on.
Loaded pretrained weights for efficientnet-b2

Don't worry: you can open TensorBoard to see the loss and validation curves.

tensorboard --logdir runs --bind_all

Otherwise, you may want to pass --tqdm=True to show the progress bar in Terminal 1.

By default, the test process is executed once every epoch, so we did not implement an independent test script.

Demo

See more demos and test cases on our webpage.

You can generate your own demo video using demo.py. It will write frames into the ./vis folder (you may have to create it first). Then, you can stitch them into a video using ffmpeg, for example as shown below.
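
For example, assuming the frames are written as sequentially numbered .jpg files (adjust the pattern and frame rate to match what demo.py actually produces):

ffmpeg -r 20 -i ./vis/%d.jpg -c:v libx264 -pix_fmt yuv420p output.mp4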

(Demo video: output.mp4)

Baselines

Here we list several baselines for the trajectory prediction task on different datasets. You are welcome to open a pull request and add your work here!

nuScenes

| Method               | AP@0.5 (0-10) | AP@1 (10-20) | AP@1 (20-30) | AP@1 (30-50) |
|----------------------|---------------|--------------|--------------|--------------|
| Supercombo           | 0.237         | 0.064        | 0.038        | 0.053        |
| Supercombo-finetuned | 0.305         | 0.162        | 0.088        | 0.050        |
| OP-Deepdive (Ours)   | 0.28          | 0.14         | 0.067        | 0.038        |

Comma2k19

| Method             | AP@0.5 (0-10) | AP@1 (10-20) | AP@1 (20-30) | AP@1 (30-50) | AP@2 (50+) | Average Jerk* |
|--------------------|---------------|--------------|--------------|--------------|------------|---------------|
| Supercombo         | 0.7966        | 0.6170       | 0.2661       | 0.0889       | 0.0062     | 2.2243        |
| OP-Deepdive (Ours) | 0.909         | 0.808        | 0.651        | 0.465        | 0.239      | 4.7959        |

*: Lower is better. For comparison, the average jerk of human drivers' trajectories is 0.3232 m/s^2.
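
For clarity, below is a minimal sketch of how an average-jerk number of this kind can be computed from a predicted trajectory by finite differences. It is illustrative only; the sampling interval dt must match your data, and the exact evaluation code may differ.

import numpy as np

def average_jerk(traj_xyz, dt):
    # traj_xyz: (N, 3) trajectory points sampled every dt seconds
    vel = np.diff(traj_xyz, axis=0) / dt        # velocity,      (N-1, 3)
    acc = np.diff(vel, axis=0) / dt             # acceleration,  (N-2, 3)
    jerk = np.diff(acc, axis=0) / dt            # jerk,          (N-3, 3)
    return np.linalg.norm(jerk, axis=1).mean()  # mean jerk magnitude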


Citation

Please use the following citation when referencing our repo or arXiv.

@article{chen2022op,
   title={Level 2 Autonomous Driving on a Single Device: Diving into the Devils of Openpilot},
   author={Li Chen and Tutian Tang and Zhitian Cai and Yang Li and Penghao Wu and Hongyang Li and Jianping Shi and Junchi Yan and Yu Qiao},
   journal={arXiv preprint arXiv:2206.08176},
   year={2022}
}

License

All code within this repository is under Apache License 2.0.


Contributors

caizhitian, electronicelephant, faikit, hli2020, ilnehc, penghao-wu


Issues

Some questions on nuScenes

Thanks for your great work!
However, I ran into a problem when training and running the demo on nuScenes.
I have already followed your issue (https://github.com/OpenPerceptionX/Openpilot-Deepdive/issues/9).
Maybe because I only trained for 10 epochs on the nuScenes mini split, the predictions are all equal to 0 when I run the demo.
Have you ever met this?
I also drew the ground truth on the picture, and its direction does not seem right either; the car was clearly going to the left.
(screenshot omitted)

genetic -> generic ?

Frankly speaking, we do not know the answer, either. Of course there are some genetic solutions to make the model robust, like adding more data, making the model deeper and larger, etc., but they do not help solve a specific problem

Pip install failed.

The pip install failed because the Python dev kit was not installed on my system. Perhaps the readme could mention that the Python dev kit is required, so the next person does not run into the same problem.

To install the Python dev kit on Ubuntu, e.g.:

sudo apt-get install python3-dev

(from this Stackoverflow answer).

The error message is not clear about why it fails; an intensive Google search is needed to fix the error.

The length of sequence is too short

Hi, I am getting the following warning during training, but it doesn't seem to affect the training process. Meanwhile, I am training with 4 RTX 2080 Ti GPUs and batch_size set to 2, and the loss does not converge. I am not sure whether this is caused by the warning above, so I would like to ask about its cause.
(screenshot of the warning omitted)

How to set up the dataset

  ├── nuscenes  -> soft link to the nuScenes-all dataset
  ├── comma2k19 -> soft link to the Comma2k19 dataset

The trajectory predicted by the model tends to be a straight line

While training on the comma2k19 dataset, the trajectory predicted by the model always tends to be a straight line; it does not work well on curves.

Did you encounter this problem during training? How was it resolved? Can you provide your trained model and configuration?

Model training issue

The model's output is always a straight line, which may be a sample-imbalance problem. Have you encountered this issue, and what method did you use to solve it?

training error

Hi,

I was training on comma2k19 with two A6000 GPU cards in a PC with CUDA 11.5 and Ubuntu 20.04, with two terminals each running one of the following:

PORT=23345 SLURM_PROCID=0 SLURM_NTASKS=2 python main.py
PORT=23346 SLURM_PROCID=1 SLURM_NTASKS=2 python main.py

I got the error below from the first terminal after it started. I also tried with a single GPU card, but it gave the same error. How can I solve this? Thanks.

[1676912307.07] starting job... 0 of 2
[1676912608.11] DDP Initialized at localhost:23345 0 of 2
2023-02-20 09:03:28.404838: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
Comma2k19SequenceDataset: DEMO mode is on.
Traceback (most recent call last):
File "main.py", line 246, in
main(rank=int(os.environ['SLURM_PROCID']), world_size=int(os.environ['SLURM_NTASKS']), args=args)
File "main.py", line 119, in main
train_dataloader, val_dataloader = get_dataloader(rank, world_size, args.batch_size, False, args.n_workers)
File "main.py", line 69, in get_dataloader
train_sampler = DistributedSampler(train, **dist_sampler_params)
TypeError: __init__() got an unexpected keyword argument 'drop_last'

Dual-model deployment

Is the model in the dual-model deployment trained by yourselves? When I visualize the ONNX model, its structure looks different from the structure of the model I trained myself.

ModuleNotFoundError: No module named 'petrel_client'

File "/data/workspace/user/projects/OpenPilot/Openpilot-Deepdive/data.py", line 49, in _init_mc_
    from petrel_client.client import Client
ModuleNotFoundError: No module named 'petrel_client'

Hi,

I followed the steps in the readme, and my machine executed "pip install -r requirements.txt" successfully.
When I start training, this error appears.
After searching for solutions, I installed petrel, but it does not seem to help with this error.

I'm not familiar with this module, which cannot be found on PyPI.

Could you give some advice on how to solve it? Thanks.

About model deployment

About model deployment: is there any relevant guidance on how to replace the official "plan" channel? Thank you.

About GRU training method.

When I train a GRU, each sample usually contains a full sequence of data.
However, this program trains the GRU on samples with a sequence length of 1 and calls backward every 40 steps.
This seems equivalent to preparing a sample with a sequence length of 40. But in fact the program never clears the variable named hidden; it only detaches it. That differs from preparing samples with a sequence length of 40, where hidden would be reset to zero before computing each sequence.

https://github.com/OpenPerceptionX/Openpilot-Deepdive/blob/017d422f41216110043f9cda4d27658b7c3d92d9/main.py#L166

Could you tell me the purpose of applying such a training method to the GRU? It does not seem to be explained well in the paper. Thanks.
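
For readers following this discussion, below is a minimal, self-contained sketch of the pattern being described: truncated backpropagation through time, where the GRU runs with a sequence length of 1 per forward pass, backward is called once per chunk, and the hidden state is carried across chunks via detach() instead of being reset to zeros. It is illustrative only; sizes and names are not taken from main.py.

import torch
import torch.nn as nn

gru = nn.GRU(input_size=8, hidden_size=16)
head = nn.Linear(16, 1)
opt = torch.optim.AdamW(list(gru.parameters()) + list(head.parameters()), lr=1e-4)

seq = torch.randn(80, 1, 8)     # 80 time steps, batch 1, feature dim 8
target = torch.randn(80, 1, 1)
optimize_per_n_step = 40        # backward and optimizer step once every 40 steps

hidden = torch.zeros(1, 1, 16)  # initialized once, never re-zeroed inside the loop
chunk_loss = 0.0
for t in range(seq.size(0)):
    out, hidden = gru(seq[t:t+1], hidden)  # sequence length of 1 per forward pass
    chunk_loss = chunk_loss + (head(out) - target[t:t+1]).pow(2).mean()
    if (t + 1) % optimize_per_n_step == 0:
        opt.zero_grad()
        chunk_loss.backward()
        opt.step()
        hidden = hidden.detach()  # cut the graph, but keep the state (not reset to zero)
        chunk_loss = 0.0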

Question about the experiment

Greetings, thanks for your reproduction of Openpilot, which has greatly advanced the work of freelance scholars in this field. Your experimental results have also once again validated that great truths are all simple. I noticed in the experiments that you fine-tuned the Supercombo model based on your pipeline and achieved better results than the baseline on the nuScenes dataset (AP@2 0.809 vs. 0.692), but as far as I know Supercombo is not open source. I would like to know whether "fine-tuned" here refers to the OP-Deepdive model trained on Comma2k19 and then fine-tuned on the nuScenes dataset?
(screenshot omitted)

calibration

def calibration(extrinsic_matrix, cam_intrinsics, device_frame_from_road_frame=None):
    # Default extrinsics: flip the y and z axes and apply a 1.51 m vertical offset
    # between the road frame and the device frame.
    if device_frame_from_road_frame is None:
        device_frame_from_road_frame = np.hstack((np.diag([1, -1, -1]), [[0], [0], [1.51]]))
    # Homography from the ground plane (road frame with z = 0) to the virtual "medmodel" camera image.
    # medmodel_intrinsics and view_frame_from_device_frame are module-level constants defined elsewhere in the repo.
    med_frame_from_ground = medmodel_intrinsics@view_frame_from_device_frame@device_frame_from_road_frame[:,(0,1,3)]
    ground_from_med_frame = np.linalg.inv(med_frame_from_ground)

    # Homography from the ground plane to the real camera image, built from its intrinsics and extrinsics.
    extrinsic_matrix_eigen = extrinsic_matrix[:3]
    camera_frame_from_road_frame = np.dot(cam_intrinsics, extrinsic_matrix_eigen)
    camera_frame_from_ground = np.zeros((3,3))
    camera_frame_from_ground[:,0] = camera_frame_from_road_frame[:,0]
    camera_frame_from_ground[:,1] = camera_frame_from_road_frame[:,1]
    camera_frame_from_ground[:,2] = camera_frame_from_road_frame[:,3]
    # Composing the two gives a warp matrix that maps medmodel (virtual camera) pixel
    # coordinates to real camera pixel coordinates via the ground plane.
    warp_matrix = np.dot(camera_frame_from_ground, ground_from_med_frame)

    return warp_matrix

Could you explain this function? I roughly understand that it computes the transformation (warp) matrix from the camera to a virtual camera.

[DEPLOYMENT model] Need guidance to get the correct output on the frame for the Supercombo model

Hi, thanks for the great deep dive!
Btw, could I ask you two questions:

  1. The first 4955 outputs of supercombo's ONNX model contain 5 lane plans, but what is their order? FYI, I've tried to follow this REFERENCE, but the order of the outputs seems to be a bit off; could you help (CMIIW)?

  2. Is the way to get the best lane lines similar to the demo.py file? FYI, I've tried to reproduce it, but in demo.py the pred_trajectory seems to already have x, y, z separated before output, and you also apply "sinh" and "exp" here.

What I tried to get the xyz (CMIIW again):

  • Get the first 4955 outputs and reshape them into 5x991.
  • Try to separate the hypothesis probs. I've learned from the reference above that the last shape should be 2x33x15 (990). If I take the last value, I get a very high score (43.xx), so I guess it should be the last one (CMIIW), and then apply softmax to it.
    • The cls/hypothesis probs size is 5x1 (HP); the other values (B) are reshaped to 5x2x33x5x3.
  • Take the output with the highest HP (HL). (argmax, output size will be 1)
  • Filter B with HL; the output size will be 2x33x5x3.
  • Take the second output (since it should be the current lane, CMIIW); the output size will be 33x5x3.
  • Take the xyz positions out of the 5 outputs; the output size will be 33x3.
  • Apply "sinh" & "exp" to the first two outputs.
  • Then, do you know what else I should do to get the correct left & right lanes on the frame? Is it related to the references below? Or could you give me a little guidance? Thanks in advance.
    https://github.com/OpenDriveLab/Openpilot-Deepdive/blob/main/demo.py#L86
    https://github.com/OpenDriveLab/Openpilot-Deepdive/blob/main/utils.py#L151-L157
res = (array of 1 x 6742)  # output of supercombo.onnx
lanes_plan_ = res[0, :4955].reshape(5, 991)   # 5 hypotheses, each: 1 score + 990 values
lanes_cls = ss_softmax(lanes_plan_[:, 0])     # softmax over the 5 hypothesis scores
lanes_plan = lanes_plan_[:, 1:].reshape(5, 2, 33, 5, 3)[np.argmax(lanes_cls)]  # keep best hypothesis
lanes_plan_xyz = lanes_plan[1, :, 0, :]       # second branch, xyz of the 33 points -> (33, 3)
lanes_plan_xyz[..., 0] = np.exp(lanes_plan_xyz[..., 0])
lanes_plan_xyz[..., 1] = np.sinh(lanes_plan_xyz[..., 1])

How much CPU RAM and GPU memory is used in this project?

I ran this project, but it seems that I'm either out of RAM or out of GPU memory.

My computer has 8 NVIDIA GeForce RTX 3090 devices, each with roughly 24 GB of memory.
It has 2 Intel(R) Xeon(R) Gold 6226R CPUs, each with 26 cores.
The total RAM is 376 GiB.

At first, I ran this project with the default settings according to the paper.

# boom with CUDA out of memory.
parser.add_argument('--batch_size', type=int, default=48)
parser.add_argument('--lr', type=float, default=1e-4)
parser.add_argument('--n_workers', type=int, default=8)
parser.add_argument('--epochs', type=int, default=100)
parser.add_argument('--log_per_n_step', type=int, default=20)
parser.add_argument('--val_per_n_epoch', type=int, default=1)

parser.add_argument('--resume', type=str, default='')

parser.add_argument('--M', type=int, default=5)
parser.add_argument('--num_pts', type=int, default=33)
parser.add_argument('--mtp_alpha', type=float, default=1.0)
parser.add_argument('--optimizer', type=str, default='adamw')

Then I tried a smaller batch size of 8, but it still failed with CUDA out of memory. Finally, I changed optimize_per_n_step to 20.
This time it worked, but after a short while I found that worker process 2 had exited unexpectedly. After watching the top command's panel, I found that my computer had run out of RAM before the process exited.

Finally, my computer works well with the configuration below.

# optimize_per_n_step is 20.
parser.add_argument('--batch_size', type=int, default=8) # changed
parser.add_argument('--lr', type=float, default=1e-4)
parser.add_argument('--n_workers', type=int, default=2) # changed
parser.add_argument('--epochs', type=int, default=100)
parser.add_argument('--log_per_n_step', type=int, default=20)
parser.add_argument('--val_per_n_epoch', type=int, default=1)

parser.add_argument('--resume', type=str, default='')

parser.add_argument('--M', type=int, default=5)
parser.add_argument('--num_pts', type=int, default=33)
parser.add_argument('--mtp_alpha', type=float, default=1.0)
parser.add_argument('--optimizer', type=str, default='adamw') 

Could you tell me how much CPU RAM and GPU memory are recommended for this project? Or could you share your machine's specs?

supercombo_extra deployment

The supercombo_extra model is trained with RGB input, so why is its input at deployment time the same as the original model's?
