TY - GEN
T1 - Deep people counting in extremely dense crowds
AU - Wang, Chuan
AU - Zhang, Hua
AU - Yang, Liang
AU - Liu, Si
AU - Cao, Xiaochun
N1 - Publisher Copyright: © 2015 ACM.
PY - 2015/10/13
Y1 - 2015/10/13
N2 - People counting in extremely dense crowds is an important step for video surveillance and anomaly warning. The problem becomes especially challenging due to the lack of training samples, severe occlusions, cluttered scenes and variation of perspective. Existing methods either resort to auxiliary human and face detectors or use crowd density estimation as a surrogate. Most of them rely on hand-crafted features, such as SIFT and HOG, and thus are prone to fail when density grows or training samples are scarce. In this paper we propose an end-to-end deep convolutional neural network (CNN) regression model for counting people in images of extremely dense crowds. Our method has the following characteristics. First, it is a deep model built on a CNN to automatically learn effective features for counting. Second, to weaken the influence of background such as buildings and trees, we purposely enrich the training data with expanded negative samples whose ground-truth count is set to zero. These negative samples enhance robustness. Extensive experimental results show that our method outperforms the state of the art in terms of the mean and variance of absolute difference.
AB - People counting in extremely dense crowds is an important step for video surveillance and anomaly warning. The problem becomes especially challenging due to the lack of training samples, severe occlusions, cluttered scenes and variation of perspective. Existing methods either resort to auxiliary human and face detectors or use crowd density estimation as a surrogate. Most of them rely on hand-crafted features, such as SIFT and HOG, and thus are prone to fail when density grows or training samples are scarce. In this paper we propose an end-to-end deep convolutional neural network (CNN) regression model for counting people in images of extremely dense crowds. Our method has the following characteristics. First, it is a deep model built on a CNN to automatically learn effective features for counting. Second, to weaken the influence of background such as buildings and trees, we purposely enrich the training data with expanded negative samples whose ground-truth count is set to zero. These negative samples enhance robustness. Extensive experimental results show that our method outperforms the state of the art in terms of the mean and variance of absolute difference.
KW - Convolutional neural networks (CNN)
KW - Crowd analysis
KW - People counting
UR - https://www.scopus.com/pages/publications/84962920483
U2 - 10.1145/2733373.2806337
DO - 10.1145/2733373.2806337
M3 - Conference contribution
AN - SCOPUS:84962920483
T3 - MM 2015 - Proceedings of the 2015 ACM Multimedia Conference
SP - 1299
EP - 1302
BT - MM 2015 - Proceedings of the 2015 ACM Multimedia Conference
PB - Association for Computing Machinery, Inc
T2 - 23rd ACM International Conference on Multimedia, MM 2015
Y2 - 26 October 2015 through 30 October 2015
ER -