karnikram / rp-vio

[IROS-21] RP-VIO: Robust Plane-based Visual-Inertial Odometry for Dynamic Environments (Code & Dataset)

Home Page: https://karnikram.info/rp-vio

License: GNU General Public License v3.0

Languages: C++ 86.89%, Python 8.32%, CMake 3.91%, Shell 0.64%, C 0.24%
Topics: vio, slam

rp-vio's Introduction

RP-VIO: Robust Plane-based Visual-Inertial Odometry for Dynamic Environments

Karnik Ram, Chaitanya Kharyal, Sudarshan S. Harithas, K. Madhava Krishna.

[arXiv] [Project Page]

In IROS 2021

RP-VIO is a monocular visual-inertial odometry (VIO) system that uses only planar features and their induced homographies, during both initialization and sliding-window estimation, for increased robustness and accuracy in dynamic environments.
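As background, the plane-induced homography these modules rely on is the standard two-view relation. As a reference sketch (textbook form; not necessarily the exact parameterization used in the code), for a plane with normal n and distance d in the first camera frame, relative camera pose (R, t), and intrinsics K:

    H = K \left( R - \frac{t\,n^{\top}}{d} \right) K^{-1}, \qquad x' \sim H\,x

where x and x' are the pixel coordinates of the same plane point in the two views; sign conventions for t and n vary between references.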

Setup

Our evaluation setup is a 6-core Intel Core i5-8400 CPU with 8 GB RAM and a 1 TB HDD, running Ubuntu 18.04.1. We recommend using a more powerful setup, especially for heavy datasets like ADVIO or OpenLORIS.

Pre-requisites

ROS Melodic (OpenCV 3.2.0, Eigen 3.3.4-4)
Ceres Solver 1.14.0
EVO

The versions indicated are those used in our evaluation setup; we do not guarantee that our code runs on newer versions like ROS Noetic (OpenCV 4.2).

Build

Run the following commands in your terminal to clone and build our project,

    cd ~/catkin_ws/src
    git clone https://github.com/karnikram/rp-vio.git
    cd ../
    catkin_make -j4
    source ~/catkin_ws/devel/setup.bash

Evaluation

We provide evaluation scripts to run RP-VIO on the RPVIO-Sim dataset, and select sequences from the OpenLORIS-Scene, ADVIO, and VIODE datasets. The output errors from your evaluation might not be exactly the same as reported in our paper, but should be similar.

RPVIO-Sim Dataset

Download the dataset files to a parent folder, and then run the following commands to launch our evaluation script. The script runs rp-vio on each of the six sequences once and computes the ATE error statistics.

    cd ~/catkin_ws/src/rp-vio/scripts/
    ./run_rpvio_sim.sh <PATH-TO-DATASET-FOLDER>

To run the multiple-planes version (RPVIO-Multi), check out the corresponding branch by running git checkout rpvio-multi, and re-run the above script.

Real-world sequences

We evaluate on two real-world sequences: the market1-1 sequence from the OpenLORIS-Scene dataset and the metro station sequence (12) from the ADVIO dataset. Both of these sequences, along with their segmented plane masks, are available as bagfiles for download here. After downloading and extracting the files, run the following commands for evaluation,

    cd ~/catkin_ws/src/rp-vio/scripts/
    ./run_ol_market1.sh <PATH-TO-EXTRACTED-DATASET-FOLDER>
    ./run_advio_12.sh <PATH-TO-EXTRACTED-DATASET-FOLDER>

Own data

To run RP-VIO on your own data, you need to provide synchronized monocular images, IMU readings, and plane masks on three separate ROS topics. The camera and IMU need to be properly calibrated and synchronized, as there is no online calibration. A plane segmentation model for generating plane masks from images is provided below.

A semantic segmentation model can also be used, as long as the RGB labels of the (static) planar semantic classes are provided. As an example, we evaluate on a sequence from the VIODE dataset (provided here) using semantic segmentation labels, which are specified in the config file; a conceptual sketch of the mask selection follows the commands below. To run,

    cd ~/catkin_ws/src/rp-vio/scripts
    git checkout semantic-viode
    ./run_viode_night.sh <PATH-TO-EXTRACTED-DATASET-FOLDER>
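Conceptually, turning a semantic label image into a plane mask is a per-pixel color match against the configured planar classes. A minimal sketch (my illustration with hypothetical label colors, not the repo's implementation):

    import cv2
    import numpy as np

    # Hypothetical RGB labels of static planar classes (e.g., road, wall);
    # the actual values come from the RP-VIO config file.
    PLANAR_CLASS_COLORS = [(128, 64, 128), (102, 102, 156)]

    def planar_mask_from_semantics(label_img_bgr):
        """Return a binary mask of pixels belonging to static planar classes."""
        mask = np.zeros(label_img_bgr.shape[:2], dtype=np.uint8)
        for r, g, b in PLANAR_CLASS_COLORS:
            # OpenCV loads images as BGR, so compare in BGR order.
            mask[np.all(label_img_bgr == (b, g, r), axis=-1)] = 255
        return mask

    label = cv2.imread("semantic_label.png")  # hypothetical input path
    cv2.imwrite("plane_mask.png", planar_mask_from_semantics(label))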

Plane segmentation

We provide a pre-trained plane instance segmentation model, based on the Plane-Recover model. We retrained their model, with an added inter-plane constraint, on their SYNTHIA training data and two additional sequences (00, 01) from the ScanNet dataset. The model was trained on a single Titan X (Maxwell) GPU for about 700K iterations. We also provide the training script.

We run the model offline, after extracting and processing the input RGB images from their ROS bagfiles. Follow the steps given below to run the pre-trained model on your custom dataset (requires CUDA 9.0),

Create Environment

Run the following commands to create a suitable conda environment,

cd plane_segmentation/
conda create --name plane_seg --file requirements.txt
conda activate plane_seg

Run inference

Now extract images from your dataset to a test folder, resize them to 192×320 (height × width; a short resize sketch is given below), and run the following,

python inference.py --dataset=<PATH_TO_DATASET> --output_dir=<PATH_TO_OUTPUT_DIRECTORY> --test_list=<TEST_DATA_LIST.txt FILE> --ckpt_file=<MODEL> --use_preprocessed=true 

TEST_DATA_LIST.txt is a file that points to every image within the test dataset; an example can be found here. PATH_TO_DATASET is the path to the parent directory of the test folder.
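If your extracted images are not already at the required resolution, a minimal resize sketch (hypothetical folder names; note that cv2.resize takes (width, height)):

    import glob, os
    import cv2

    src_dir, dst_dir = "test/raw", "test/images"  # hypothetical folders
    os.makedirs(dst_dir, exist_ok=True)
    for path in sorted(glob.glob(os.path.join(src_dir, "*.png"))):
        img = cv2.imread(path)
        # cv2.resize expects (width, height), i.e. (320, 192) for 192x320 images.
        img = cv2.resize(img, (320, 192), interpolation=cv2.INTER_AREA)
        cv2.imwrite(os.path.join(dst_dir, os.path.basename(path)), img)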

The inference results are stored in three folders: plane_sgmts (predicted masks in grayscale), plane_sgmts_vis (predicted masks in color), and plane_sgmts_modified (modified grayscale masks; feed this output to the CRF inference).

Run CRF inference

We also use a dense CRF model (from PyDenseCRF) to further refine the output masks. To run,

python crf_inference.py <rgb_image_dir> <labels_dir> <output_dir>

where labels_dir is the path to the plane_sgmts_modified folder.
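Internally, this kind of refinement follows the usual PyDenseCRF recipe; the sketch below uses the library's typical example settings, which are not necessarily the exact parameters of crf_inference.py:

    import numpy as np
    import pydensecrf.densecrf as dcrf
    from pydensecrf.utils import unary_from_labels

    def refine(rgb, labels, n_labels):
        """Refine an integer label mask with a dense CRF over the RGB image."""
        h, w = labels.shape
        d = dcrf.DenseCRF2D(w, h, n_labels)
        # Unary potentials from the (noisy) predicted labels.
        d.setUnaryEnergy(unary_from_labels(labels, n_labels,
                                           gt_prob=0.7, zero_unsure=False))
        # Smoothness and appearance kernels (typical example settings).
        d.addPairwiseGaussian(sxy=3, compat=3)
        d.addPairwiseBilateral(sxy=80, srgb=13,
                               rgbim=np.ascontiguousarray(rgb), compat=10)
        Q = d.inference(5)
        return np.argmax(Q, axis=0).reshape(h, w).astype(np.uint8)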

We then write these output mask images back into the original bagfile on a separate topic for use with RP-VIO.
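As an illustration of this last step, rewriting a bagfile with an extra mask topic might look like the following sketch (hypothetical paths, topic names, and mask lookup, not the authors' script):

    import cv2
    import rosbag
    from cv_bridge import CvBridge

    bridge = CvBridge()
    with rosbag.Bag("sequence.bag", "r") as inbag, \
         rosbag.Bag("sequence-mask.bag", "w") as outbag:
        for topic, msg, t in inbag.read_messages():
            outbag.write(topic, msg, t)       # copy every original message
            if topic == "/cam0/image_raw":    # hypothetical image topic
                # Look up the refined mask exported for this frame by timestamp.
                mask_path = "masks/%d.png" % msg.header.stamp.to_nsec()
                mask = cv2.imread(mask_path, cv2.IMREAD_GRAYSCALE)
                mask_msg = bridge.cv2_to_imgmsg(mask, encoding="mono8")
                mask_msg.header = msg.header  # reuse the image timestamp
                outbag.write("/mask", mask_msg, t)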

RPVIO-Sim Dataset


For an effective evaluation of the capabilities of modern VINS systems, we generate a highly dynamic visual-inertial dataset using AirSim, with dynamic characters present throughout the sequences (including during initialization) and with sufficient IMU excitation. Dynamic characters are progressively added, keeping everything else fixed, from no characters in the static sequence up to eight characters in the C8 sequence. All six generated sequences, in rosbag format along with their groundtruth files, are available via Zenodo.

Each rosbag contains RGB images published on the /image topic at 20 Hz, IMU measurements published on the /imu topic at ~1000 Hz (which we sub-sample to 200 Hz for our evaluations), and plane-instance mask images published on the /mask topic at 20 Hz. The groundtruth trajectory is saved as a txt file in TUM format. The parameters for the camera and IMU used in our dataset are as follows,

[Table: camera and IMU parameters]
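For reference, the 1000 Hz to 200 Hz sub-sampling can be reproduced offline by keeping every fifth /imu message; a minimal sketch using the rosbag Python API (hypothetical file names, not the authors' tooling):

    import rosbag

    # Keep every 5th /imu message (~1000 Hz -> ~200 Hz); copy all other topics.
    count = 0
    with rosbag.Bag("C8.bag", "r") as inbag, \
         rosbag.Bag("C8-imu200.bag", "w") as outbag:
        for topic, msg, t in inbag.read_messages():
            if topic == "/imu":
                count += 1
                if count % 5 != 0:
                    continue
            outbag.write(topic, msg, t)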

To quantify the dynamic nature of our generated sequences, we compute the percentage of dynamic pixels out of all the pixels present in every image. We report these values in the following table,

[Table: percentage of dynamic pixels per sequence]
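These percentages are straightforward to recompute from per-frame masks. A sketch, assuming for illustration that dynamic pixels are the non-zero pixels of grayscale dynamic-object masks (check the dataset's actual mask convention before relying on this):

    import glob
    import cv2
    import numpy as np

    total, dynamic = 0, 0
    for path in glob.glob("dynamic_masks/*.png"):  # hypothetical mask folder
        mask = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        total += mask.size
        dynamic += int(np.count_nonzero(mask))
    print("dynamic pixels: %.2f%%" % (100.0 * dynamic / total))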

TO-DO

  • Provide Unreal Engine environment
  • Provide AirSim recording scripts

Acknowledgement

Our code is built upon VINS-Mono. Its implementations of feature tracking, IMU preintegration, IMU state initialization, the reprojection factor, and marginalization are used as such. Our contributions include planar feature tracking, planar homography-based initialization, and the planar homography factor. All these changes (corresponding to a slightly older version) are available as a git patch file.

For our simulated dataset, we imported several high-quality assets from the FlightGoggles project into Unreal Engine before integrating the environment with AirSim. The dynamic characters were downloaded from Mixamo.

rp-vio's People

Contributors: karnikram

rp-vio's Issues

plane segmentation

Hi, I tried to use the plane segmentation on the ADVIO dataset you provide; however, it doesn't perform like the masks you provide after CRF inference.
I used the pretrained model you provide, and the raw images are already resized. But there are still some hard-coded values in inference.py, like plane_num.
I don't know what I did wrong; could you please help me?

Not enough features or parallax problem

Hi, thanks for sharing your remarkable work!
I want to run the RP-VIO algorithm on the KAIST Urban dataset, which is a highway scene, and I segmented only the road. However, the code throws the warning "Not enough features or parallax; Move device around" all the time, as in the following screenshot, and the output of the system is an empty *.csv file. Could you give me some advice on fixing this problem?
[screenshot]
Looking forward to your reply!

Problem with the inference.py

Hi, when I run python inference.py, I got this:

Traceback (most recent call last):
  File "inference.py", line 14, in <module>
    from RecoverPlane import RecoverPlane
ModuleNotFoundError: No module named 'RecoverPlane'

How can I fix this error?

AirSim recording scripts

Hi, thanks for sharing your great work!
I am interested in the AirSim recording scripts.
Do you have any plans to add them?

no trajectory displayed

Sorry to bother you again: rviz does not show the trajectory when I run the ADVIO dataset downloaded from your official website. What could be the reason?
[screenshot]

How do you align the data of the D435i accelerometer and gyroscope to the /imu0 topic?

Hi,

I hope you can answer my question. Thank you for your open-source code and datasets; your work has been a great help to my research. I'd like to ask you another question.

For the market1-1 sequence from the OpenLORIS-Scene dataset, how do you align the data from the D435i accelerometer and gyroscope to the single topic /imu0?

Also, could you release the market1-2 and market1-3 sequences with aligned /imu0 data? I want to test on them as well.

Thank you very much!

> Hi,

Hi,
I ran VINS-Mono with your config file three times and got RMSEs of 1.55, 2.22, and 1.65.
The errors from VINS-Mono are higher than RP-VIO's.

Have you run the segmentation part?
When I run python inference.py, I get this:

from RecoverPlane import RecoverPlane
ModuleNotFoundError: No module named 'RecoverPlane'

Have you met this error before?

Originally posted by @Gatsby23 in #2 (comment)

ERROR while running: calculate Homography

Hello, thanks for sharing.
I ran the dataset 12-mask.bag and got this error:
$ roslaunch rpvio_estimator advio_12.launch bagfile_path:=/media/li/DATA/DATASET/RP-VIO/ADVIO/12-mask.bag
OpenCV Error: Unknown error code -28 (The input arrays should have at least 4 corresponding point sets to calculate Homography) in findHomography
Have you ever encountered this situation?
[screenshot]
By the way, I have tried other datasets and hit the same problem.
The only modification I made to the code was adding 'std::' in 'map<int, std::array<double,3>> para_N;' because of 'error: template argument 2 is invalid' (#14), as mentioned there.

errors when running codes

Hi! Thanks for sharing your remarkable work!

I ran into some errors when running the code on the ADVIO and VIODE datasets, following the guides. Could you give some advice on solving these problems?
On ADVIO: [screenshot]
On VIODE: [screenshot]
The rviz visualization seems correct: [screenshot]

Looking forward to your reply! Thanks!

night city case problem

Hi, sorry to bother you.
I tried to run the city_night sequence, using the camera config from the original VIODE repo:
https://github.com/kminoda/VIODE/blob/master/calibration_files/cam0_pinhole.yaml
But the performance is bad; do you know if I did anything wrong? Thanks.
Here is my config:

%YAML:1.0

#common parameters
imu_topic: "/imu0"
image_topic: "/cam0/image_raw"
mask_topic: "/cam0/segmentation"
output_path: "/home/zhouxin/output/"

#camera calibration
model_type: PINHOLE
camera_name: camera
image_width: 752
image_height: 480
distortion_parameters:
   k1: 0
   k2: 0
   p1: 0
   p2: 0
projection_parameters:
   fx: 376.0
   fy: 376.0
   cx: 376.0
   cy: 240.0

estimate_extrinsic: 0
#Rotation from camera frame to imu frame, imu^R_cam
extrinsicRotation: !!opencv-matrix
   rows: 3
   cols: 3
   dt: d
   data: [1, 0, 0, 0, 1, 0, 0, 0, 1]

#Translation from camera frame to imu frame, imu^T_cam
extrinsicTranslation: !!opencv-matrix
   rows: 3
   cols: 1
   dt: d
   data: [0, 0, 0]

#feature tracker parameters
max_cnt: 120        # max feature number in feature tracking
min_dist: 30        # min distance between two features
freq: 10            # frequency (Hz) of publishing the tracking result. At least 10 Hz for good estimation. If set to 0, the frequency is the same as the raw image
F_threshold: 1.0    # RANSAC threshold (pixel)
show_track: 1       # publish tracking image as topic
flow_back: 1        # perform forward and backward optical flow to improve feature tracking accuracy

#optimization parameters
max_solver_time: 0.04   # max solver iteration time (ms), to guarantee real time
max_num_iterations: 8   # max solver iterations, to guarantee real time
keyframe_parallax: 10.0 # keyframe selection threshold (pixel)

#imu parameters: the more accurate the parameters you provide, the better the performance
acc_n: 0.2      # accelerometer measurement noise standard deviation
gyr_n: 0.05     # gyroscope measurement noise standard deviation
acc_w: 0.02     # accelerometer bias random walk noise standard deviation
gyr_w: 4.0e-5   # gyroscope bias random walk noise standard deviation
g_norm: 9.81007 # gravity magnitude

#unsynchronization parameters
estimate_td: 0  # online estimation of the time offset between camera and IMU
td: 0.0         # initial value of time offset, unit: s. read image clock + td = real image clock (IMU clock)

#rolling shutter parameters
rolling_shutter: 0    # 0: global shutter camera, 1: rolling shutter camera
rolling_shutter_tr: 0 # unit: s. rolling shutter readout time per frame (from data sheet)

#visualization parameters
save_image: 1              # save images in the pose graph for visualization purposes; set to 0 to disable
visualize_imu_forward: 0   # output IMU forward propagation to achieve low-latency, high-frequency results
visualize_camera_size: 0.4 # size of the camera marker in RVIZ

Run Real time with your own camera

Hi, thanks for sharing your work.
I want to run RP-VIO in real time on a Pixhawk drone.
How can I provide the segmented plane masks from the camera? I am using MYNT EYE and RealSense D435i cameras.
Thanks!

Problem with the inference.py

Hi, sorry to bother you. When I installed the packages needed for the Plane-Recover library, I also got this:

/home/robotics/.local/lib/python3.6/site-packages/scipy/__init__.py:147: UserWarning: NumPy 1.14.5 or above is required for this version of SciPy (detected version 1.13.1)
  UserWarning)
Traceback (most recent call last):
  File "inference.py", line 14, in <module>
    from RecoverPlane import RecoverPlane
ModuleNotFoundError: No module named 'RecoverPlane'

I found that RecoverPlane is defined in the RecoverPlane_perpendicular.py file, but I don't know how to make inference.py find this class.

Topics synchronization

Hi,

Thank you for the incredible work. I would like to ask how you synchronized all the topics, since my recorded rosbag, which includes semantic masks, RGB images, and IMU data, fails to synchronize and displays "imu message in disorder". Thank you.

undefined reference

undefined reference to cv::imshow(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, cv::_InputArray const&)

Question about the plane segmentation

Hi, I'm new to ground-plane segmentation. About the inference part, how do I write the test list passed in the command, like in the readme.md:

 --test_list=<TEST_DATA_LIST.txt FILE> 

For example, if I want to segment the ground plane in the left images of KITTI, what should I do? Could you please give me an example? I would greatly appreciate your help.

ADVIO segmentation problem

Hi, I tried to segment all the sequences in the ADVIO dataset.
I successfully segmented some cases like 10, 13, and 23, and the RMSE is great.
But I notice the segmentation model does not work well in brightly lit corridors, like in case 17.
I also find that distortion badly influences the results: in case 12, the segmentation results are better than the results when I don't use the distortion coefficients.
Do you have any good solutions to these two problems, like better distortion coefficients or models?
Thanks a lot!
Here are my results:
https://1drv.ms/u/s!AgIBb31yefjshg6ObUe0bhNB6OJR?e=Ej8Vol

Problem with the pretrained models:

Hi, after installing the third-party dependencies, I get this error:

2021-06-28 22:43:38.525918: W tensorflow/core/framework/op_kernel.cc:1192] Invalid argument: Unsuccessful TensorSliceReader constructor: Failed to get matching files on pre_trained_model/synthia_498000: Not found: pre_trained_model

My computer is a ThinkPad P17 laptop, and the GPU is an RTX 3000. Is this error related to the GPU?

unable to run

Hello, thanks for sharing. I ran into the following situation; may I ask what the cause of this is?
[screenshot]

I used the original VINS-Mono and got different results from your paper. Why is that?

I used the ol_market1_config.yaml file from your rp-vio project to run VINS-Mono directly on the OpenLORIS market1-1 dataset, and got higher accuracy than your paper reports (the paper says 2.45, but I got an Absolute Trajectory RMSE of 0.88). Why?

The evaluation tool I use is EVO, and the link is [https://github.com/MichaelGrupp/evo]

The command is evo_ape tum groundtruth.txt vins_result_no_loop.txt -va -p --plot_mode=xyz --align --correct_scale

Looking forward to your answer

Plane Segmentation requirements.txt

Hi, I am trying to run the plane segmentation, but when I try to build the environment with "conda create --name plane_seg --file requirements.txt", I get a message that the packages conflict. What should I do to solve this? Thank you for your reply!
