
Efficient 3D object annotation via vision-derived pseudo-LiDAR and Vision Language Model (VLM) validation

  • Yalong Ma
  • Ziying Yao
  • Xuan Liu
  • Zhongxia Xiong
  • Xiaozheng He
  • Xinkai Wu*
  • *Corresponding author for this work
  • Beihang University
  • Rensselaer Polytechnic Institute

Research output: Journal contribution, article, peer-reviewed

Abstract

To advance autonomous driving, accurate 3D object annotation is crucial for target recognition, environment perception, and high-precision map construction. However, producing high-quality 3D annotated data is costly and time-consuming. Annotating 3D objects in sparse point cloud data is particularly labor-intensive and error-prone. To address this challenge, this paper proposes an efficient automated annotation pipeline that integrates pseudo-point cloud generation with validation by a vision language model (VLM). The approach supplements sparse point cloud data, generates pseudo-labels, and uses a VLM to validate and filter annotations, thereby creating a closed-loop automated system. Experiments on a real-world dataset collected by an autonomous vehicle demonstrate significant improvements in annotation accuracy and efficiency.
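
The abstract does not say how the pseudo-point cloud is produced. As a rough illustration only, one common way to obtain vision-derived pseudo-LiDAR points is to back-project a dense depth map through a pinhole camera model. The sketch below assumes that setup; the function name `backproject_depth`, the intrinsics `fx`, `fy`, `cx`, `cy`, and the `max_depth` cutoff are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def backproject_depth(
    depth: Sequence[Sequence[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
    max_depth: float = 80.0,  # assumed cutoff in metres, not from the paper
) -> List[Point]:
    """Back-project a depth map into camera-frame 3D points.

    depth[v][u] is the depth in metres at pixel column u, row v.
    Pixels with non-positive depth or depth beyond max_depth are skipped.
    """
    points: List[Point] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0.0 or z > max_depth:
                continue  # invalid or too-distant depth estimate
            # Pinhole model: x = (u - cx) * z / fx, y = (v - cy) * z / fy
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


# Toy 2x2 depth map with one invalid and one out-of-range pixel.
toy_depth = [[2.0, 0.0], [100.0, 4.0]]
toy_points = backproject_depth(toy_depth, fx=2.0, fy=2.0, cx=0.0, cy=0.0)
```

Points produced this way could be merged with the sparse sensor returns before pseudo-labels are generated. How the paper actually densifies the point cloud and how the VLM validation step is wired in are not described in this record.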

Original language: English
Article number: 105429
Journal: Transportation Research Part C: Emerging Technologies
Volume: 182
DOI
Publication status: Published - Jan 2026
