
Exploring Spatiotemporal Consistency of Features for Video Translation in Consumer Internet of Things

  • Haichuan Tang*
  • Zhenjie Yu
  • Shuang Li

  *Corresponding author for this work

Research output: Contribution to journal, Article, peer-review

Abstract

Video data has emerged as a primary source of input in contemporary Consumer Internet of Things (CIoT) systems and is a significant driver of their development. However, because capture devices are so diverse, videos are highly heterogeneous in color, texture, and lighting conditions, which poses challenges for video manipulation and analysis. Moreover, different information processing terminals accept only a limited range of data types, which creates demand for translating heterogeneous videos. In this paper, we propose a novel method named Structure and Motion Consistency Network (SMCN). It optimizes the model at the feature level, making it more efficient at extracting invariant spatiotemporal information from different types of video data. Specifically, it fuses the structure information, i.e., the mean and standard deviation of features at each spatial position across channels, and re-injects it to refine spatial consistency; it also maximizes the motion mutual information of features from adjacent frames to improve the temporal consistency of intermediate features. We conducted experiments on the common video translation dataset Viper and the infrared-to-visible video translation dataset IRVI. The experiments indicate that SMCN outperforms state-of-the-art methods and that the lightweight module can be applied to other models in a plug-and-play manner, showing significant advantages in addressing heterogeneous video data transformation.
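
The structure statistics named in the abstract (the mean and standard deviation of features at each spatial position, computed across channels) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the (C, H, W) layout, the `eps` constant, and the simple scale-and-shift re-injection are all assumptions made for illustration. The fusion step and the motion mutual information objective are omitted, because the abstract does not specify them.

```python
import numpy as np


def structure_stats(feat, eps=1e-5):
    """Per-position mean and std across channels.

    feat: array of shape (C, H, W) (assumed layout).
    Returns two arrays of shape (1, H, W).
    """
    mean = feat.mean(axis=0, keepdims=True)
    # eps keeps the std strictly positive (hypothetical stabiliser).
    std = np.sqrt(feat.var(axis=0, keepdims=True) + eps)
    return mean, std


def remove_structure(feat, mean, std):
    """Normalise each spatial position using its channel statistics."""
    return (feat - mean) / std


def reinject_structure(feat, mean, std):
    """Scale and shift features with previously extracted statistics."""
    return feat * std + mean


# Toy features standing in for an early and a later layer of a generator.
rng = np.random.default_rng(0)
early = rng.normal(size=(8, 4, 4))
later = rng.normal(size=(8, 4, 4))

mean, std = structure_stats(early)
normalized = remove_structure(early, mean, std)
# Re-inject the early layer's structure statistics into the later features.
refined = reinject_structure(later, mean, std)
```

In this sketch, `mean` and `std` carry one value per spatial position, so they act as a compact map of where structure sits in the frame. Re-injecting them into later features is one plausible way to read "re-injects it to refine spatial consistency".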

Original language: English
Pages (from-to): 3077-3087
Number of pages: 11
Journal: IEEE Transactions on Consumer Electronics
Volume: 70
Issue number: 1
State: Published - 1 Feb 2024
Externally published: Yes

Keywords

  • Generative adversarial networks
  • intermediate features
  • Internet of Things
  • spatiotemporal consistency
  • unpaired video translation
