Abstract
Text-based person retrieval is the task of searching a large image gallery for images of a particular pedestrian, given a textual query. Existing approaches often over-rely on color (CLR), which distracts the model from other important visual cues such as texture and structure and leads to suboptimal retrieval performance. To handle this problem, we propose a novel framework to Excavate All-round Information Beyond Color for text-based person retrieval, termed EAIBC. The EAIBC architecture includes four branches: an RGB branch, a grayscale (GRS) branch, a high-frequency (HFQ) branch, and a CLR branch. Furthermore, we introduce a mutual learning (ML) mechanism to facilitate communication and learning among the branches, enabling them to take full advantage of all-round information in an effective and balanced manner. We evaluate the proposed method on three benchmark datasets: CUHK-PEDES, ICFG-PEDES, and RSTPReid. The experimental results demonstrate that EAIBC significantly outperforms existing methods and achieves state-of-the-art (SOTA) performance in supervised, weakly supervised, and cross-domain settings.
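The abstract does not say how the inputs to the grayscale and high-frequency branches are produced. As a rough illustration only, the sketch below derives two such views from an RGB image, assuming ITU-R BT.601 luma weights for grayscale and a simple FFT high-pass filter for the high-frequency view. The function names `to_grayscale` and `high_frequency` and the `cutoff` parameter are hypothetical and are not taken from the paper.

```python
import numpy as np


def to_grayscale(rgb):
    """Collapse an (H, W, 3) RGB array to (H, W) using BT.601 luma weights.

    Assumption: the paper may use a different grayscale conversion.
    """
    weights = np.array([0.299, 0.587, 0.114])
    return rgb @ weights


def high_frequency(gray, cutoff=0.1):
    """Zero out spatial frequencies below `cutoff` (cycles per pixel).

    Assumption: a hard circular high-pass mask in the Fourier domain;
    the filter actually used by the authors is not stated in the abstract.
    """
    h, w = gray.shape
    fy = np.fft.fftfreq(h)[:, None]   # vertical frequencies
    fx = np.fft.fftfreq(w)[None, :]   # horizontal frequencies
    radius = np.sqrt(fx ** 2 + fy ** 2)

    spectrum = np.fft.fft2(gray)
    spectrum[radius < cutoff] = 0     # remove low frequencies, including the mean
    return np.real(np.fft.ifft2(spectrum))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((64, 32, 3))   # stand-in for a pedestrian crop
    gray = to_grayscale(image)
    edges = high_frequency(gray)
    print(gray.shape, edges.shape)
```

Both views discard color, so a model trained on them has to rely on texture and structure, which is the kind of cue the abstract says color over-reliance suppresses.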
| Original language | English |
|---|---|
| Pages (from-to) | 5097-5111 |
| Number of pages | 15 |
| Journal | IEEE Transactions on Neural Networks and Learning Systems |
| Volume | 36 |
| Issue number | 3 |
| State | Published - 2025 |
Keywords
- Color (CLR) information
- cross-modal retrieval
- frequency
- person reidentification (ReID)
- text-based person retrieval
Fingerprint
Dive into the research topics of 'Improving Text-Based Person Retrieval by Excavating All-Round Information Beyond Color'. Together they form a unique fingerprint.