IGEV-Stereo & IGEV-MVS
This repository contains the source code for our paper:
Iterative Geometry Encoding Volume for Stereo Matching
CVPR 2023
Gangwei Xu, Xianqi Wang, Xiaohuan Ding, Xin Yang
Demos
Pretrained models can be downloaded from Google Drive.
You can demo a trained model on pairs of images. To predict stereo for Middlebury, run
python demo.py --restore_ckpt ./pretrained_models/sceneflow/sceneflow.pth
Comparison with RAFT-Stereo
Method | KITTI 2012 (3-noc) | KITTI 2015 (D1-all) | Memory (G) | Runtime (s) |
---|---|---|---|---|
RAFT-Stereo | 1.30 % | 1.82 % | 1.02 | 0.38 |
IGEV-Stereo | 1.12 % | 1.59 % | 0.66 | 0.18 |
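The relative gains implied by the table can be worked out directly. The sketch below uses only the figures listed above; the dictionary keys and the helper name `relative_reduction` are illustrative and not part of this repository.

```python
# Figures copied from the comparison table above.
RAFT_STEREO = {"kitti2012_3noc": 1.30, "kitti2015_d1all": 1.82,
               "memory_g": 1.02, "runtime_s": 0.38}
IGEV_STEREO = {"kitti2012_3noc": 1.12, "kitti2015_d1all": 1.59,
               "memory_g": 0.66, "runtime_s": 0.18}


def relative_reduction(baseline: float, ours: float) -> float:
    """Return how much lower `ours` is than `baseline`, in percent."""
    return 100.0 * (baseline - ours) / baseline


if __name__ == "__main__":
    for key in RAFT_STEREO:
        pct = relative_reduction(RAFT_STEREO[key], IGEV_STEREO[key])
        print(f"{key}: {pct:.1f}% lower than RAFT-Stereo")
```

With these numbers, IGEV-Stereo comes out roughly 14% lower on KITTI 2012 (3-noc), 13% lower on KITTI 2015 (D1-all), 35% lower in memory, and 53% lower in runtime.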
Environment
- NVIDIA RTX 3090
- Python 3.8
- PyTorch 1.12
Create a virtual environment and activate it.
conda create -n IGEV_Stereo python=3.8
conda activate IGEV_Stereo
Dependencies
conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch -c nvidia
pip install opencv-python
pip install scikit-image
pip install tensorboard
pip install matplotlib
pip install tqdm
pip install timm==0.5.4
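After installing, it can help to confirm that every package is importable before starting a long run. The following is a minimal sketch, not part of this repository; the helper name `find_missing` is hypothetical, and the mapping from package name to import name reflects the usual module names for these packages.

```python
import importlib.util
from typing import Dict, List

# Package name (as installed above) -> module name used in `import`.
REQUIRED_MODULES: Dict[str, str] = {
    "pytorch": "torch",
    "torchvision": "torchvision",
    "torchaudio": "torchaudio",
    "opencv-python": "cv2",
    "scikit-image": "skimage",
    "tensorboard": "tensorboard",
    "matplotlib": "matplotlib",
    "tqdm": "tqdm",
    "timm": "timm",
}


def find_missing(modules: Dict[str, str]) -> List[str]:
    """Return the package names whose module cannot be located."""
    return [pkg for pkg, mod in modules.items()
            if importlib.util.find_spec(mod) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        import torch  # only imported once we know it is installed
        print("All packages found; CUDA available:", torch.cuda.is_available())
```

Run it inside the activated IGEV_Stereo environment; an empty "missing" list plus `CUDA available: True` suggests the setup matches the environment described above.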
Required Data
To evaluate/train IGEV-Stereo, you will need to download the required datasets.
By default stereo_datasets.py will search for the datasets in these locations.
├── /data
    ├── sceneflow
        ├── frames_finalpass
        ├── disparity
    ├── KITTI
        ├── KITTI_2012
            ├── training
            ├── testing
            ├── vkitti
        ├── KITTI_2015
            ├── training
            ├── testing
            ├── vkitti
    ├── Middlebury
        ├── trainingH
        ├── trainingH_GT
    ├── ETH3D
        ├── two_view_training
        ├── two_view_training_gt
    ├── DTU_data
        ├── dtu_train
        ├── dtu_test
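A quick way to catch a misplaced dataset is to check the layout before evaluating or training. The sketch below is not part of this repository: the helper name `missing_dirs` is hypothetical, and the relative paths assume that each dataset's subfolders are nested under its name (for example `sceneflow/frames_finalpass`) with `/data` as the root.

```python
from pathlib import Path
from typing import List

# Relative paths assumed from the directory tree above.
EXPECTED_DIRS: List[str] = [
    "sceneflow/frames_finalpass",
    "sceneflow/disparity",
    "KITTI/KITTI_2012/training",
    "KITTI/KITTI_2012/testing",
    "KITTI/KITTI_2015/training",
    "KITTI/KITTI_2015/testing",
    "Middlebury/trainingH",
    "Middlebury/trainingH_GT",
    "ETH3D/two_view_training",
    "ETH3D/two_view_training_gt",
    "DTU_data/dtu_train",
    "DTU_data/dtu_test",
]


def missing_dirs(root: Path, expected: List[str]) -> List[str]:
    """Return the expected relative paths that are not directories under root."""
    return [rel for rel in expected if not (root / rel).is_dir()]


if __name__ == "__main__":
    absent = missing_dirs(Path("/data"), EXPECTED_DIRS)
    if absent:
        print("Missing dataset folders:")
        for rel in absent:
            print("  ", rel)
    else:
        print("All expected dataset folders were found.")
```

Only the datasets you actually plan to use need to be present, so a reported missing folder is a problem only if the corresponding command depends on it.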
Evaluation
To evaluate on Scene Flow, Middlebury, or ETH3D, run
python evaluate_stereo.py --restore_ckpt ./pretrained_models/sceneflow/sceneflow.pth --dataset sceneflow
or
python evaluate_stereo.py --restore_ckpt ./pretrained_models/sceneflow/sceneflow.pth --dataset middlebury_H
or
python evaluate_stereo.py --restore_ckpt ./pretrained_models/sceneflow/sceneflow.pth --dataset eth3d
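To run all three evaluations in one go, the commands above can be driven from a small script. This is a minimal sketch rather than a tool shipped with the repository: `build_command`, `run_all`, and the `--run` switch are hypothetical names, while the script name, flags, checkpoint path, and dataset names are the ones shown above.

```python
import subprocess
import sys
from typing import List

CHECKPOINT = "./pretrained_models/sceneflow/sceneflow.pth"
DATASETS = ["sceneflow", "middlebury_H", "eth3d"]


def build_command(dataset: str, checkpoint: str = CHECKPOINT) -> List[str]:
    """Build the evaluate_stereo.py invocation for one dataset."""
    return [sys.executable, "evaluate_stereo.py",
            "--restore_ckpt", checkpoint,
            "--dataset", dataset]


def run_all(datasets: List[str], dry_run: bool = True) -> None:
    """Print each command, or execute it when dry_run is False."""
    for name in datasets:
        cmd = build_command(name)
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops at the first evaluation that fails.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Without --run the script only prints the commands it would execute.
    run_all(DATASETS, dry_run="--run" not in sys.argv[1:])
```

Saved under a name of your choice in the repository root, the script prints the three commands by default and executes them in order when called with `--run`.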
Training
To train on Scene Flow, run
python train_stereo.py
To train on KITTI, run
python train_stereo.py --restore_ckpt ./pretrained_models/sceneflow/sceneflow.pth --dataset kitti
Submission
For submission to the KITTI benchmark, run
python save_disp.py
MVS training and evaluation
To train on DTU, run
python train_mvs.py
To evaluate on DTU, run
python evaluate_mvs.py
Citation
If you find our work useful in your research, please consider citing our paper:
@inproceedings{xu2023iterative,
title={Iterative Geometry Encoding Volume for Stereo Matching},
author={Xu, Gangwei and Wang, Xianqi and Ding, Xiaohuan and Yang, Xin},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
year={2023}
}
Acknowledgements
This project is heavily based on RAFT-Stereo. We thank the original authors for their excellent work.