Abstract
Owing to the complexity of real-world environments, self-localization remains a critical yet unresolved challenge for individuals with visual impairments during travel. In assistive technology settings, visual appearance variations such as seasonal changes, illumination changes, viewpoint changes, and dynamic occlusions significantly degrade place recognition performance. This paper proposes a novel assistive visual localization method to address these challenges. To extract landmark-related features from images with appearance variations, dual constraints of place classification and feature distillation are proposed, based on large-scale place recognition and human matting datasets. Additionally, online sequential matching is employed for place recognition, leveraging the temporal consistency embedded in multi-frame sequences to further eliminate erroneous localization results. Evaluated on the large-scale SF-XL dataset augmented with human matting, the proposed image feature model achieves a 3% improvement in Recall@1 over state-of-the-art approaches using similar backbone architectures, indicating better image retrieval performance under assistive occlusion scenarios. More importantly, in real-world validation on self-collected assistive datasets, the proposed visual localization pipeline incorporating sequential matching achieves F1 scores above 0.85 and shows advantages over existing sequential place recognition methods. The implementation code of the proposed algorithm, along with a real-world testing dataset for assistive localization, is released at https://github.com/chengricky/AssistivePlace.
| Original language | English |
|---|---|
| Article number | 104623 |
| Journal | Computer Vision and Image Understanding |
| Volume | 263 |
| State | Published - Jan 2026 |
Keywords
- Appearance variations
- Assistive visual localization
- Image feature
- Place recognition
Title: Place recognition for visual assistive localization under challenging visual appearance variations