GAN-Based virtual-to-real image translation for urban scene semantic segmentation

  • Beihang University
  • Beijing Innovation Center for Mobility Intelligent
  • China Transinfo Technology Corp

Research output: Contribution to journal, Article, Peer-reviewed

Abstract

Semantic image segmentation requires large amounts of pixel-wise labeled training data. Creating such data generally requires labor-intensive manual annotation. Extracting training data from video games is therefore a practical alternative, because pixel-wise annotation of game scenes can be automated with near-perfect accuracy. However, experiments show that models trained on raw video-game data cannot be directly applied to real-world scenes because of the domain shift problem. In this paper, we propose a domain-adaptive network based on CycleGAN that translates scenes from a virtual domain to a real domain in both the pixel and feature spaces. Our contributions are threefold: 1) we propose a dynamic perceptual network to improve the quality of the generated images in the feature space, making the translated images more conducive to semantic segmentation; 2) we introduce a novel weighted self-regularization loss to prevent semantic changes in translated images; and 3) we design a discrimination mechanism to coordinate multiple subnetworks and improve the overall training efficiency. We devise a series of metrics to evaluate the quality of the translated images in our experiments on the public GTA-V (a video game dataset, i.e., the virtual domain) and Cityscapes (a real-world dataset, i.e., the real domain) datasets, and we achieve notably improved results, demonstrating the efficacy of the proposed model.
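
For readers unfamiliar with CycleGAN, on which the proposed network is based, the following is a minimal sketch of the standard cycle-consistency term: an image translated to the other domain and back should match the original under an L1 distance. It does not reproduce the paper's dynamic perceptual network, weighted self-regularization loss, or discrimination mechanism, whose exact forms are not given in the abstract. The names `l1_distance`, `cycle_consistency_loss`, `g_v2r`, and `g_r2v`, and the flattened-list image representation, are illustrative assumptions.

```python
from typing import Callable, List, Sequence

# Assumption for illustration: an image is a flat list of pixel values.
Image = List[float]
Generator = Callable[[Sequence[float]], Image]


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean absolute difference between two equally sized images."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("images must be non-empty and of equal size")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(
    virtual: Sequence[float],
    real: Sequence[float],
    g_v2r: Generator,  # hypothetical virtual-to-real generator
    g_r2v: Generator,  # hypothetical real-to-virtual generator
) -> float:
    """Standard CycleGAN cycle-consistency term for one image pair."""
    # virtual -> real -> virtual should reconstruct the virtual image.
    forward = l1_distance(g_r2v(g_v2r(virtual)), virtual)
    # real -> virtual -> real should reconstruct the real image.
    backward = l1_distance(g_v2r(g_r2v(real)), real)
    return forward + backward


if __name__ == "__main__":
    # Toy stand-ins for trained generators: a brightness shift and its inverse.
    brighten = lambda img: [p + 0.1 for p in img]
    darken = lambda img: [p - 0.1 for p in img]
    loss = cycle_consistency_loss([0.2, 0.4], [0.6, 0.8], brighten, darken)
    print(f"cycle-consistency loss: {loss:.6f}")
```

In a real training setup the two generators would be neural networks and this term would be combined with adversarial losses; the toy functions here only show how the two reconstruction directions are summed.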

Original language: English
Pages (from-to): 127-135
Number of pages: 9
Journal: Neurocomputing
Volume: 394
DOI
Publication status: Published - 21 Jun 2020
