Camera pose estimation in multi-view environments: From virtual scenarios to the real world

https://doi.org/10.1016/j.imavis.2021.104182
Open access under a Creative Commons license

Highlights

  • A domain adaptation strategy to efficiently train network architectures

  • Multi-view scenarios to estimate the relative camera pose

  • Transferring learned knowledge from the virtual to the real world

  • Evaluation of similarity of the scenarios and pose of the camera

  • Experimental results on six synthetic image datasets

Abstract

This paper presents a domain adaptation strategy to efficiently train network architectures for estimating the relative camera pose in multi-view scenarios. The network architectures are fed with a pair of simultaneously acquired images; hence, to improve the accuracy of the solutions, and given the lack of large datasets with pairs of overlapping images, a domain adaptation strategy is proposed. The strategy consists of transferring the knowledge learned from synthetic images to real-world scenarios. To this end, the networks are first trained using pairs of synthetic images, captured at the same time by a pair of cameras in a virtual environment; the learned weights are then transferred to the real-world case, where the networks are retrained with a few real images. Different virtual 3D scenarios are generated to evaluate how the accuracy of the result depends on the similarity between the virtual and real scenarios—similarity both in the geometry of the objects contained in the scene and in the relative pose between the camera and those objects. Experimental results and comparisons show that the accuracy of all the evaluated networks for estimating the camera pose improves when the proposed domain adaptation strategy is used, highlighting the importance of the similarity between the virtual and real scenarios.
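The pipeline described above—a Siamese architecture fed with a pair of simultaneously acquired images, pretrained on synthetic pairs and then fine-tuned on a few real images—can be sketched as follows. This is a minimal illustrative implementation, not the paper's actual architecture: the branch layers, the 7-D pose parameterization (3-D translation plus unit quaternion), and the choice to freeze the shared branch during real-world fine-tuning are all assumptions made for the sketch.

```python
import torch
import torch.nn as nn

class SiameseRelPose(nn.Module):
    """Hypothetical Siamese network for relative camera pose regression.
    Both images pass through one shared convolutional branch; the two
    feature vectors are concatenated and regressed to a 7-D relative
    pose (3-D translation + 4-D quaternion)."""
    def __init__(self):
        super().__init__()
        self.branch = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),  # -> (N, 32)
        )
        self.head = nn.Sequential(
            nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 7),
        )

    def forward(self, img_a, img_b):
        fa = self.branch(img_a)           # shared weights: same branch
        fb = self.branch(img_b)           # applied to both views
        return self.head(torch.cat([fa, fb], dim=1))

net = SiameseRelPose()
# ... training on pairs of synthetic images would happen here ...

# Domain adaptation step: keep the weights learned in the virtual
# domain, freeze the shared branch, and retrain only the regression
# head on a few real image pairs (freezing choice is an assumption).
for p in net.branch.parameters():
    p.requires_grad = False
optimizer = torch.optim.Adam(net.head.parameters(), lr=1e-4)

# Forward pass on one (dummy) pair of simultaneously acquired images.
img_a = torch.randn(1, 3, 64, 64)
img_b = torch.randn(1, 3, 64, 64)
pose = net(img_a, img_b)
print(pose.shape)  # torch.Size([1, 7])
```

Sharing one branch for both views is what makes the architecture Siamese: the same feature extractor, and therefore the same learned representation, is applied to both simultaneously acquired images before the pose is regressed.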

Keywords

Relative camera pose estimation
Domain adaptation
Siamese architecture
Synthetic data
Multi-view environments
