
Learning Continuous Spatiotemporal Implicit Neural Fields for Unsupervised Video Denoising

  • Xiaowan Hu
  • Henan Liu
  • Ce Zheng
  • Xinyang Li
  • Mai Xu*
  • *Corresponding author for this work
  • Beihang University
  • Tsinghua University

Research output: Contribution to journal › Article › peer-review

Abstract

Video denoising is fundamental to low-level vision and real-world imaging, yet existing self-supervised methods remain fragile under severe noise and complex motion. Most approaches still rely on spatially and temporally discrete grid-based representations. Blind-spot networks enforce J-invariance by masking center pixels, but only within a limited receptive field. Recurrent models build temporal dependencies on discretized frame sequences and noise-sensitive optical flow, which leads to error accumulation and motion artifacts. We address this representational bottleneck by reformulating self-supervised video denoising as learning a continuous spatiotemporal implicit field. Building on coordinate-based implicit neural representations, we propose a unified video denoising model with a spatiotemporal implicit neural field (SINF). In the spatial domain, a blind-spot implicit spatial field maps coordinates directly to pixel-level representations, enabling globally informed texture recovery beyond receptive-field limits. In the temporal domain, an implicit temporal embedding with periodic activations encodes motion continuously over time, while a time-aware spatial graph module refines cross-frame alignment. Together, these components remodel discretized video signals into a continuous spatiotemporal intensity field, enabling more robust pixel-wise associations than coarse optical flow. Extensive experiments on synthetic and real noisy video benchmarks demonstrate that SINF achieves state-of-the-art performance.
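
As a rough illustration of the general idea behind a coordinate-based implicit field with periodic activations, the sketch below maps a continuous (x, y, t) coordinate to a single intensity value using a small sine-activated MLP. It is not the SINF architecture: the blind-spot mechanism, the temporal embedding, and the graph module are omitted, and the network is untrained. The names `make_field` and `query`, the layer sizes, the frequency scale `omega`, and the SIREN-style initialization are all assumptions made for illustration.

```python
import math
import random


def init_layer(n_in, n_out, rng, first=False, omega=30.0):
    # SIREN-style uniform initialization (an assumption; the abstract does not specify one).
    bound = 1.0 / n_in if first else math.sqrt(6.0 / n_in) / omega
    weights = [[rng.uniform(-bound, bound) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def linear(layer, x):
    # Plain affine map: one dot product plus bias per output unit.
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def make_field(hidden=16, depth=2, seed=0):
    # Hypothetical helper: builds an untrained field taking 3 inputs (x, y, t).
    rng = random.Random(seed)
    sizes = [3] + [hidden] * depth
    layers = [init_layer(sizes[i], sizes[i + 1], rng, first=(i == 0)) for i in range(depth)]
    head = init_layer(hidden, 1, rng)
    return layers, head


def query(field, x, y, t, omega=30.0):
    # Evaluate the field at any real-valued coordinate, not only at grid points.
    layers, head = field
    h = [x, y, t]
    for layer in layers:
        h = [math.sin(omega * z) for z in linear(layer, h)]  # periodic activation
    return linear(head, h)[0]


field = make_field()
on_frame = query(field, 0.25, -0.5, 0.0)       # a time that coincides with a frame
between_frames = query(field, 0.25, -0.5, 0.5)  # a time between two frames
```

Because the field takes real-valued coordinates, it can be queried at times that fall between recorded frames, which is the property the abstract contrasts with discretized frame sequences.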

Keywords

  • implicit neural representation
  • self-supervised learning
  • spatiotemporal modeling
  • video denoising
