TY - GEN
T1 - A generic framework for event detection in various video domains
AU - Zhang, Tianzhu
AU - Xu, Changsheng
AU - Zhu, Guangyu
AU - Liu, Si
AU - Lu, Hanqing
PY - 2010
Y1 - 2010
N2 - Event detection is essential to the extensively studied area of video analysis and understanding. Although various approaches have been proposed for event detection, there is no generic event detection framework that can be applied across video domains (e.g. sports, news, movies, surveillance). In this paper, we present a generic event detection approach based on semi-supervised learning and Internet vision. Concretely, a Graph-based Semi-Supervised Multiple Instance Learning (GSSMIL) algorithm is proposed to jointly explore small-scale expert-labeled videos and large-scale unlabeled videos to train event models that detect video event boundaries. The expert-labeled videos are obtained from the analysis and alignment of well-structured video-related text (e.g. movie scripts, web-casting text, closed captions). The unlabeled data are obtained by querying related events from a video search engine (e.g. YouTube) in order to provide more distributional information for event modeling. A critical issue in constructing the graph for GSSMIL is weight assignment, where the weight of an edge specifies the similarity between two data points. To tackle this problem, we propose a novel Multiple Instance Learning Induced Similarity (MILIS) measure by learning instance-sensitive classifiers. We perform thorough experiments in three popular video domains: movies, sports and news. The results, compared with the state of the art, are promising and demonstrate that our proposed approach is effective.
AB - Event detection is essential to the extensively studied area of video analysis and understanding. Although various approaches have been proposed for event detection, there is no generic event detection framework that can be applied across video domains (e.g. sports, news, movies, surveillance). In this paper, we present a generic event detection approach based on semi-supervised learning and Internet vision. Concretely, a Graph-based Semi-Supervised Multiple Instance Learning (GSSMIL) algorithm is proposed to jointly explore small-scale expert-labeled videos and large-scale unlabeled videos to train event models that detect video event boundaries. The expert-labeled videos are obtained from the analysis and alignment of well-structured video-related text (e.g. movie scripts, web-casting text, closed captions). The unlabeled data are obtained by querying related events from a video search engine (e.g. YouTube) in order to provide more distributional information for event modeling. A critical issue in constructing the graph for GSSMIL is weight assignment, where the weight of an edge specifies the similarity between two data points. To tackle this problem, we propose a novel Multiple Instance Learning Induced Similarity (MILIS) measure by learning instance-sensitive classifiers. We perform thorough experiments in three popular video domains: movies, sports and news. The results, compared with the state of the art, are promising and demonstrate that our proposed approach is effective.
KW - broadcast video
KW - event detection
KW - internet
KW - multiple instance learning
KW - semi-supervised learning
KW - web-casting text
UR - https://www.scopus.com/pages/publications/78650993078
U2 - 10.1145/1873951.1873967
DO - 10.1145/1873951.1873967
M3 - Conference contribution
AN - SCOPUS:78650993078
SN - 9781605589336
T3 - MM'10 - Proceedings of the ACM Multimedia 2010 International Conference
SP - 103
EP - 112
BT - MM'10 - Proceedings of the ACM Multimedia 2010 International Conference
T2 - 18th ACM International Conference on Multimedia, ACM Multimedia 2010, MM'10
Y2 - 25 October 2010 through 29 October 2010
ER -