TY - GEN
T1 - Temporal Enhanced Hybrid Neural Representation for Video Compression
AU - Wang, Jinxiang
AU - Liu, Yangdong
AU - Zhu, Shiping
AU - Feng, Cheng
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Implicit neural representation methods are employed to model each video, and they can be broadly categorized into two groups: index-based methods and hybrid methods. Index-based NeRVs generate embeddings solely from frame indices, lacking specific information about the video content. Conversely, hybrid NeRVs generate embeddings solely from video content, disregarding the positive impact of temporal cues during the fitting process. To address these limitations, we propose a novel approach called Temporal Enhanced Hybrid Neural Representation for Videos (TNeRV). TNeRV incorporates temporal modulation and diversity exploration to enhance the fitting process of the decoder. First, we introduce the Temporal Diversity Exploration (TDE) block to generate video-diversity embeddings in addition to the video-specific embeddings, enabling the decoder to accurately perceive and adapt to temporal changes within the video. Next, we design the Temporal Modulation Fusion (TMF) block, which combines the two types of embeddings and integrates temporal cues to improve the fitting performance of the decoder. Finally, we conduct a comprehensive evaluation of TNeRV against state-of-the-art methods on video regression and video compression tasks, demonstrating that TNeRV outperforms existing implicit methods.
AB - Implicit neural representation methods are employed to model each video, and they can be broadly categorized into two groups: index-based methods and hybrid methods. Index-based NeRVs generate embeddings solely from frame indices, lacking specific information about the video content. Conversely, hybrid NeRVs generate embeddings solely from video content, disregarding the positive impact of temporal cues during the fitting process. To address these limitations, we propose a novel approach called Temporal Enhanced Hybrid Neural Representation for Videos (TNeRV). TNeRV incorporates temporal modulation and diversity exploration to enhance the fitting process of the decoder. First, we introduce the Temporal Diversity Exploration (TDE) block to generate video-diversity embeddings in addition to the video-specific embeddings, enabling the decoder to accurately perceive and adapt to temporal changes within the video. Next, we design the Temporal Modulation Fusion (TMF) block, which combines the two types of embeddings and integrates temporal cues to improve the fitting performance of the decoder. Finally, we conduct a comprehensive evaluation of TNeRV against state-of-the-art methods on video regression and video compression tasks, demonstrating that TNeRV outperforms existing implicit methods.
KW - Hybrid neural representation for videos
KW - temporal diversity exploration
KW - temporal modulation
UR - https://www.scopus.com/pages/publications/85197678918
U2 - 10.1109/PCS60826.2024.10566352
DO - 10.1109/PCS60826.2024.10566352
M3 - Conference contribution
AN - SCOPUS:85197678918
T3 - 2024 Picture Coding Symposium, PCS 2024 - Proceedings
BT - 2024 Picture Coding Symposium, PCS 2024 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2024 Picture Coding Symposium, PCS 2024
Y2 - 12 June 2024 through 14 June 2024
ER -