Joint discriminative representation learning for end-to-end person search

  • Pengcheng Zhang
  • Xiaohan Yu
  • Xiao Bai*
  • Chen Wang
  • Jin Zheng
  • Xin Ning
  • *Corresponding author for this work
  • Beihang University
  • Macquarie University
  • Griffith University Queensland
  • CAS - Institute of Semiconductors

Research output: Contribution to journal › Article › peer-review

Abstract

Person search simultaneously detects and retrieves a query person from uncropped scene images. Existing methods are either two-step or end-to-end. The former employ two standalone models for the two sub-tasks, while the latter conduct person search with a unified model. Despite encouraging progress, most existing end-to-end methods focus on balancing the model between the detection and retrieval sub-tasks while neglecting to enhance the learned representation for retrieval, which leaves their accuracy below that of two-step approaches. To this end, we propose a novel hierarchical framework that jointly optimizes instance-aware and part-aware embeddings to enable discriminative representation learning. Specifically, we develop a region-of-interest cosegment (ROICoseg) module that captures part-aware information without requiring extra annotations, enabling fine-grained discriminative representation. On top of that, a Contextual Instance Batch Sampling (CIBS) method is introduced that uses contextual information to construct training batches, thus facilitating effective instance-aware representation learning. We further introduce the first cross-door person search dataset (CDPS), in which a target person is retrieved from outdoor cameras using an image captured indoors, or vice versa. Extensive experiments show that our proposed model achieves competitive performance on CUHK-SYSU and outperforms state-of-the-art end-to-end methods on the more challenging PRW and CDPS.

Original language: English
Article number: 110053
Journal: Pattern Recognition
Volume: 147
Publication status: Published - Mar 2024