CAFI-Pillars: Infusing Geometry Priors for Pillar-Based 3D Detectors Through Centroid-Aware Feature Interaction

  • Chao Wang
  • Zhiwei Liu*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Pillar-based feature learning has proven highly efficient for 3D object detection and has advanced research on environmental perception for autonomous vehicles. However, aggressive downsampling during pillarization leaves pillar vectors without explicit geometric cues. To address this limitation, we infuse additional geometric information as prior knowledge for pillar-based feature abstraction. Our approach, termed CAFI-Pillars, employs a centroid-aware feature interaction (CAFI) unit as its core component, which explores a combined representation of pillar-wise semantic features and point-wise geometric cues. Concretely, CAFI-Pillars adopts a novel two-pathway topology consisting of a semantic branch and a geometric branch. Intra-branch feature abstraction proceeds in a sparse-to-dense manner, with a hybrid CNN-Transformer (HCT) detection neck applied to the densified features to enrich the diversity of feature representation. Meanwhile, inter-branch feature completion is facilitated by a task-guided bilateral feature fusion (TGBFF) module, which serves as a bridge for feature sharing and exchange. These techniques enable CAFI-Pillars to obtain competitive results on the Waymo Open Dataset (WOD) and KITTI 3D detection benchmarks, highlighting its engineering potential for environmental perception in autonomous vehicles.
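
As a rough illustration of the geometric information that pillarization can discard, the sketch below groups points into bird's-eye-view pillars and computes each pillar's centroid together with every point's offset from it. This is not the CAFI unit from the paper: the function name `pillar_centroids`, the `pillar_size` value, and the choice of centroid offsets as the geometric cue are assumptions made purely for illustration.

```python
from collections import defaultdict
from math import floor


def pillar_centroids(points, pillar_size=0.5):
    """Group (x, y, z) points into square BEV pillars.

    Returns {pillar_index: (centroid, offsets)}, where offsets holds each
    point's displacement from its pillar centroid.

    Hypothetical helper for illustration only; it is not the authors' method.
    """
    pillars = defaultdict(list)
    for x, y, z in points:
        # A pillar spans the full height, so only x and y select the cell.
        key = (floor(x / pillar_size), floor(y / pillar_size))
        pillars[key].append((x, y, z))

    result = {}
    for key, pts in pillars.items():
        n = len(pts)
        centroid = tuple(sum(p[i] for p in pts) / n for i in range(3))
        # Point-wise cue: where each point sits relative to the centroid.
        offsets = [tuple(p[i] - centroid[i] for i in range(3)) for p in pts]
        result[key] = (centroid, offsets)
    return result


if __name__ == "__main__":
    demo = [(0.25, 0.25, 0.0), (0.125, 0.25, 1.0), (0.75, 0.25, 0.5)]
    for index, (centroid, offsets) in sorted(pillar_centroids(demo).items()):
        print(index, centroid, offsets)
```

A pillar encoder that keeps only one pooled vector per cell loses these per-point offsets, which is the kind of explicit geometric cue the abstract says is missing from pillar vectors.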

Original language: English
Pages (from-to): 2399-2408
Number of pages: 10
Journal: IEEE Transactions on Intelligent Vehicles
Volume: 9
Issue number: 1
State: Published - 1 Jan 2024

Keywords

  • 3D object detection
  • Autonomous vehicles
  • environmental perception
  • feature complementarity
  • pillar-based feature learning
