This is an ongoing PyTorch implementation of PoseNet, built on the Pix2Pix code.
- Linux
- Python 3.5.2
- CPU or NVIDIA GPU + CUDA CuDNN
- Install PyTorch and dependencies from http://pytorch.org
- Install torchvision from source.
- Clone this repo:
git clone https://github.com/hazirbas/posenet-pytorch
cd posenet-pytorch
pip install -r requirements.txt
- Download a Cambridge Landmarks dataset (e.g. KingsCollege) into the datasets/ folder.
- Compute the image mean:
python util/compute_image_mean.py --dataroot datasets/KingsCollege --height 256 --width 455 --save_resized_imgs
- Train a model:
CUDA_VISIBLE_DEVICES='0' python train.py --dataroot ./datasets/KingsCollege --name posenet/KingsCollege/beta500 --beta 500 --niter 500
- To view training errors and loss plots, run
python -m visdom.server
and click the URL http://localhost:8097. Checkpoints are saved under ./checkpoints/posenet/KingsCollege/beta500/.
- Test the model:
CUDA_VISIBLE_DEVICES='0' python test.py --dataroot ./datasets/KingsCollege --name posenet/KingsCollege/beta500
The test errors will be saved to a text file under ./results/posenet/KingsCollege/beta500/.
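The image-mean step above can be illustrated with a short NumPy sketch. This is not the code in util/compute_image_mean.py; the function name `compute_mean_image` is hypothetical, and the sketch assumes the images have already been loaded and resized to a common shape (for example 256 x 455, as in the command above).

```python
import numpy as np


def compute_mean_image(images):
    """Return the per-pixel mean of equally sized H x W x C images.

    Hypothetical helper for illustration; the repository's script may
    compute and store the mean differently.
    """
    total = None
    count = 0
    for img in images:
        arr = np.asarray(img, dtype=np.float64)  # avoid uint8 overflow
        if total is None:
            total = np.zeros_like(arr)
        elif arr.shape != total.shape:
            raise ValueError("all images must share the same shape")
        total += arr
        count += 1
    if count == 0:
        raise ValueError("no images given")
    return total / count


# Usage with synthetic data standing in for resized dataset images.
rng = np.random.default_rng(0)
fake_images = [
    rng.integers(0, 256, size=(256, 455, 3), dtype=np.uint8) for _ in range(4)
]
mean_image = compute_mean_image(fake_images)
print(mean_image.shape)  # (256, 455, 3)
```

Accumulating in float64 keeps the running sum from overflowing the 8-bit pixel range; the resulting mean image would typically be subtracted from each input before it is fed to the network.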
If you would like to initialize the model with pretrained weights, download the places-googlenet.pickle file into the pretrained_models/ folder:
wget https://vision.in.tum.de/webarchive/hazirbas/posenet-pytorch/places-googlenet.pickle
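The `--beta` flag in the training command presumably corresponds to the weight β in the PoseNet paper's loss, which adds a Euclidean position term to β times a Euclidean term on the normalized quaternion. The NumPy sketch below illustrates that formula only; it is not this repository's PyTorch code, the name `pose_loss` is hypothetical, and normalizing both quaternions to unit length is an assumption of the sketch.

```python
import numpy as np


def pose_loss(pos_pred, quat_pred, pos_true, quat_true, beta=500.0):
    """Mean over the batch of ||x_pred - x|| + beta * ||q_pred - q||.

    Inputs are (N, 3) positions and (N, 4) quaternions. Both quaternions
    are normalized to unit length first (an assumption of this sketch).
    """
    pos_pred = np.asarray(pos_pred, dtype=np.float64)
    pos_true = np.asarray(pos_true, dtype=np.float64)
    quat_pred = np.asarray(quat_pred, dtype=np.float64)
    quat_true = np.asarray(quat_true, dtype=np.float64)

    quat_pred = quat_pred / np.linalg.norm(quat_pred, axis=-1, keepdims=True)
    quat_true = quat_true / np.linalg.norm(quat_true, axis=-1, keepdims=True)

    pos_term = np.linalg.norm(pos_pred - pos_true, axis=-1)
    ori_term = np.linalg.norm(quat_pred - quat_true, axis=-1)
    return float(np.mean(pos_term + beta * ori_term))


# One sample: position off by 5 units, orientation identical up to scale.
loss = pose_loss([[0, 0, 0]], [[2, 0, 0, 0]], [[3, 4, 0]], [[1, 0, 0, 0]])
print(loss)  # 5.0
```

Because the quaternion term is bounded while position errors scale with the size of the scene, β balances the two terms, which is consistent with the different beta values listed per dataset.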
We use the training scheme defined in PoseLSTM. The best model for each dataset is selected by its median position and orientation (quaternion) errors.
Dataset | beta | TFPoseNet | PyPoseNet | pymodel |
---|---|---|---|---|
KingsCollege | 500 | 1.92m 5.40° | 1.72m 5.40° | epoch445 |
OldHospital | 1500 | 2.31m 5.38° | 2.40m 5.71° | epoch375 |
ShopFacade | 100 | 1.46m 8.08° | 1.26m 7.55° | epoch350 |
StMarysChurch | 250 | 2.65m 8.48° | 2.54m 9.62° | epoch500 |
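
The errors in the table are reported as a median position error in metres and a median orientation error in degrees. The sketch below shows one common way to compute such errors, using the angle between unit quaternions; it is an assumption that test.py follows the same convention, and the function names are hypothetical.

```python
import numpy as np


def pose_errors(pos_pred, quat_pred, pos_true, quat_true):
    """Per-sample position error and orientation error in degrees.

    Inputs are (N, 3) positions and (N, 4) quaternions. The angle is
    2 * arccos(|<q_pred, q_true>|) on unit quaternions, a common
    convention assumed here.
    """
    pos_pred = np.asarray(pos_pred, dtype=np.float64)
    pos_true = np.asarray(pos_true, dtype=np.float64)
    q1 = np.asarray(quat_pred, dtype=np.float64)
    q2 = np.asarray(quat_true, dtype=np.float64)

    q1 = q1 / np.linalg.norm(q1, axis=1, keepdims=True)
    q2 = q2 / np.linalg.norm(q2, axis=1, keepdims=True)

    pos_err = np.linalg.norm(pos_pred - pos_true, axis=1)
    # abs() treats q and -q as the same rotation; clip guards arccos.
    dot = np.clip(np.abs(np.sum(q1 * q2, axis=1)), 0.0, 1.0)
    ang_err = np.degrees(2.0 * np.arccos(dot))
    return pos_err, ang_err


def median_errors(pos_pred, quat_pred, pos_true, quat_true):
    """Median position error and median orientation error in degrees."""
    pos_err, ang_err = pose_errors(pos_pred, quat_pred, pos_true, quat_true)
    return float(np.median(pos_err)), float(np.median(ang_err))


# Three synthetic samples with position errors of 1, 2 and 3 units.
identity = [1.0, 0.0, 0.0, 0.0]
med_pos, med_ang = median_errors(
    [[1, 0, 0], [0, 2, 0], [0, 0, 3]],
    [identity, identity, identity],
    [[0, 0, 0], [0, 0, 0], [0, 0, 0]],
    [identity, identity, identity],
)
print(med_pos)
```

Using the median rather than the mean makes the reported figure less sensitive to a few badly localized test images.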
@inproceedings{PoseNet15,
title={PoseNet: A Convolutional Network for Real-Time 6-DOF Camera Relocalization},
author={Alex Kendall and Matthew Grimes and Roberto Cipolla},
booktitle={ICCV},
year={2015}
}
@inproceedings{PoseLSTM17,
author = {Florian Walch and Caner Hazirbas and Laura Leal-Taixé and Torsten Sattler and Sebastian Hilsenbeck and Daniel Cremers},
title = {Image-based localization using LSTMs for structured feature correlation},
month = {October},
year = {2017},
booktitle = {ICCV},
eprint = {1611.07890},
url = {https://github.com/NavVisResearch/NavVis-Indoor-Dataset},
}
Code is inspired by pytorch-CycleGAN-and-pix2pix.