
Self-Supervised Learning for Monocular Depth Estimation on Minimally Invasive Surgery Scenes

  • Beihang University
  • Shandong University
  • SUNY Buffalo

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Self-supervised learning algorithms that compute depth maps from monocular videos have achieved remarkable performance on urban scenes and have been applied extensively. These techniques still face significant challenges, however, when applied directly to endoscopic videos, because of brightness variations from frame to frame and inadequate representation learning during the training phase. Inspired by the use of optical flow for motion alignment between adjacent frames, we design an AFNet with a structural stability loss and a residual-based smoothness loss to learn the appearance flow across adjacent frames, which handles the brightness inconsistency issue effectively. In addition, we propose a novel self-attention mechanism, named the feature scaling module, to alleviate the inadequate representation learning problem. In a comparison on the SCARED dataset against current state-of-the-art self-supervised methods developed for urban videos, the proposed model surpasses existing methods by a large margin.

Original language: English
Title of host publication: 2021 IEEE International Conference on Robotics and Automation, ICRA 2021
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 7159-7165
Number of pages: 7
ISBN (Electronic): 9781728190778
DOIs
State: Published - 2021
Event: 2021 IEEE International Conference on Robotics and Automation, ICRA 2021 - Xi'an, China
Duration: 30 May 2021 to 5 Jun 2021

Publication series

Name: Proceedings - IEEE International Conference on Robotics and Automation
Volume: 2021-May
ISSN (Print): 1050-4729

Conference

Conference: 2021 IEEE International Conference on Robotics and Automation, ICRA 2021
Country/Territory: China
City: Xi'an
Period: 30/05/21 to 5/06/21
