
Anomaly-aware self-supervised feature learning for weakly supervised video anomaly detection

  • Beihang University
  • National Computer Network Emergency Response Technical Team/Coordination Center of China

Research output: Contribution to journal › Article › peer-review

Abstract

Weakly supervised video anomaly detection (WSVAD) aims to achieve frame-level anomaly detection using training data with only video-level labels. Most WSVAD methods focus on developing various anomaly detectors and employ models pretrained on general large-scale databases (e.g., Kinetics-400) for video feature extraction. However, such models are pretrained in a supervised manner to recognize significantly distinct human actions, making their features sub-optimal for the WSVAD task, which involves detecting more complex events including subtle human and non-human motions. Unlike previous efforts, this work addresses WSVAD from the more fundamental perspective of feature learning: taking the attributes of anomalies into account, we propose an anomaly-aware method based on self-supervised learning (SSL). Specifically, we design a series of pretext tasks, including temporal order verification, speed prediction, arrow of time prediction, and abrupt change detection, to pretrain the model directly on the anomaly datasets, yielding features tailored for the WSVAD task. Moreover, existing methods adapt the pretrained features to anomaly detection using only the top-k anomaly-scored samples; to address this limitation, we present a hard instance mining strategy (HIMS) that additionally explores valuable cues from the unused non-top-k samples, enhancing the discriminability between normal and abnormal instances. Experimental results show that our method outperforms state-of-the-art counterparts on the UCF-Crime, ShanghaiTech, and XD-Violence benchmark datasets.
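
To make the four pretext tasks named in the abstract more concrete, the following is a minimal sketch of how self-supervised (clip, label) pairs could be generated from an unlabeled video. It is an illustration under stated assumptions, not the authors' implementation: the function names, the clip length, the stride set `SPEEDS`, and the `min_gap` used to simulate an abrupt change are all hypothetical, and frames are stood in for by integer indices.

```python
import random

SPEEDS = (1, 2, 4)  # assumed playback strides; not taken from the paper


def sample_clip(frames, start, length, stride=1):
    """Return `length` frames from `start`, taking every `stride`-th frame."""
    return [frames[start + i * stride] for i in range(length)]


def temporal_order_sample(frames, rng, length=8):
    """Temporal order verification: label 1 if in order, 0 if shuffled."""
    start = rng.randrange(len(frames) - length + 1)
    clip = sample_clip(frames, start, length)
    if rng.random() < 0.5:
        return clip, 1
    identity = list(range(length))
    order = identity[:]
    while order == identity:  # ensure the negative really is out of order
        rng.shuffle(order)
    return [clip[i] for i in order], 0


def speed_sample(frames, rng, length=8):
    """Speed prediction: label is the index of the sampling stride."""
    label = rng.randrange(len(SPEEDS))
    stride = SPEEDS[label]
    span = (length - 1) * stride + 1  # frames covered by the strided clip
    start = rng.randrange(len(frames) - span + 1)
    return sample_clip(frames, start, length, stride), label


def arrow_of_time_sample(frames, rng, length=8):
    """Arrow of time prediction: label 1 if forward, 0 if reversed."""
    start = rng.randrange(len(frames) - length + 1)
    clip = sample_clip(frames, start, length)
    if rng.random() < 0.5:
        return clip, 1
    return clip[::-1], 0


def abrupt_change_sample(frames, rng, length=8, min_gap=16):
    """Abrupt change detection: label 1 if the clip contains a temporal jump.

    The jump is simulated (an assumption) by splicing the second half of the
    clip from a position `min_gap` frames later in the same video.
    """
    half = length // 2
    start = rng.randrange(len(frames) - length - min_gap + 1)
    if rng.random() < 0.5:
        return sample_clip(frames, start, length), 0
    jump = start + half + min_gap
    first = sample_clip(frames, start, half)
    second = sample_clip(frames, jump, length - half)
    return first + second, 1


if __name__ == "__main__":
    rng = random.Random(0)
    video = list(range(64))  # stand-in for 64 decoded frames
    for make in (temporal_order_sample, speed_sample,
                 arrow_of_time_sample, abrupt_change_sample):
        clip, label = make(video, rng)
        print(make.__name__, clip, label)
```

In a real pipeline the integer indices would be replaced by decoded frames, and a shared video backbone with one small classification head per task would presumably be trained on these labels before the features are adapted to anomaly detection.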

Original language: English
Article number: 104379
Journal: Computer Vision and Image Understanding
Volume: 257
State: Published - Jun 2025

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 16 - Peace, Justice and Strong Institutions

Keywords

  • Self-supervised representation learning
  • Video anomaly detection
  • Weakly supervised learning
