Improving Text-Based Person Retrieval by Excavating All-Round Information Beyond Color

  • Aichun Zhu*
  • Zijie Wang
  • Jingyi Xue
  • Xili Wan
  • Jing Jin
  • Tian Wang
  • Hichem Snoussi
  • *Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Text-based person retrieval is the process of searching a massive visual resource library for images of a particular pedestrian, based on a textual query. Existing approaches often suffer from a problem of color (CLR) over-reliance, which can result in suboptimal person retrieval performance by distracting the model from other important visual cues such as texture and structure information. To handle this problem, we propose a novel framework to Excavate All-round Information Beyond Color for the task of text-based person retrieval, which is therefore termed EAIBC. The EAIBC architecture includes four branches, namely an RGB branch, a grayscale (GRS) branch, a high-frequency (HFQ) branch, and a CLR branch. Furthermore, we introduce a mutual learning (ML) mechanism to facilitate communication and learning among the branches, enabling them to take full advantage of all-round information in an effective and balanced manner. We evaluate the proposed method on three benchmark datasets: CUHK-PEDES, ICFG-PEDES, and RSTPReid. The experimental results demonstrate that EAIBC significantly outperforms existing methods and achieves state-of-the-art (SOTA) performance in supervised, weakly supervised, and cross-domain settings.
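
The abstract does not spell out how the mutual learning (ML) mechanism is computed. As a rough illustration only, the sketch below assumes a common formulation of mutual learning, in which each branch turns its text-to-image similarity scores into a probability distribution and the branches are pulled toward one another by averaging the pairwise KL divergence. The function names, the branch keys, and the score values are all hypothetical and are not taken from the paper.

```python
import math


def softmax(scores):
    """Turn raw similarity scores into a probability distribution."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def mutual_learning_loss(branch_scores):
    """Average KL(p_a || p_b) over all ordered pairs of distinct branches.

    Assumption: this pairwise-KL form is a generic mutual learning
    objective, not necessarily the loss used by EAIBC.
    Requires at least two branches.
    """
    probs = {name: softmax(s) for name, s in branch_scores.items()}
    names = list(probs)
    pairs = [(a, b) for a in names for b in names if a != b]
    return sum(kl_divergence(probs[a], probs[b]) for a, b in pairs) / len(pairs)


# Hypothetical similarity scores of one text query against three gallery
# images, as produced by each of the four branches named in the abstract.
scores = {
    "rgb": [2.0, 0.5, 0.1],
    "grs": [1.5, 0.7, 0.2],
    "hfq": [1.2, 0.9, 0.3],
    "clr": [2.5, 0.2, 0.1],
}
loss = mutual_learning_loss(scores)
```

Under this assumed formulation the loss is zero when all branches agree on the ranking distribution and grows as they diverge, which is one way the branches could be kept from leaning on any single cue such as color.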

Original language: English
Pages (from-to): 5097-5111
Number of pages: 15
Journal: IEEE Transactions on Neural Networks and Learning Systems
Volume: 36
Issue number: 3
State: Published - 2025

Keywords

  • Color (CLR) information
  • cross-modal retrieval
  • frequency
  • person reidentification (ReID)
  • text-based person retrieval
