
Imitation Learning Based on Visual-text Fusion for Robotic Sorting Tasks

  • Beihang University

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

In this paper, we propose an imitation learning method based on visual-text fusion for manipulation tasks. The manipulation is abstracted into text instructions; the method learns the semantic concepts in those instructions and combines them with spatial features for visual inference to predict the manipulation. The construction process and demonstration content of the expert demonstration dataset are described in detail, with a focus on how the operation task is decomposed through text. In addition, we present the learning process and the network structure of the functional modules to highlight the fusion of text features with visual features. The effectiveness of the method is verified by a simulated learning experiment on a multi-step manipulation task. The results show that the behavioral strategy achieves a 92.19% task completion rate on known objects and 80.03% on unknown objects. These results indicate that introducing text makes it possible to decompose the operational task in terms of abstract semantics, which reduces the difficulty of learning. Meanwhile, the behavioral strategy can perform accurate spatial location inference based on text features, thereby achieving accurate action prediction.

Original language: English
Title of host publication: Proceedings - 2022 International Conference on Frontiers of Artificial Intelligence and Machine Learning, FAIML 2022
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 157-163
Number of pages: 7
ISBN (Electronic): 9781665473644
State: Published - 2022
Event: 2022 International Conference on Frontiers of Artificial Intelligence and Machine Learning, FAIML 2022 - Virtual, Online, China
Duration: 19 Jul 2022 to 21 Jul 2022

Publication series

Name: Proceedings - 2022 International Conference on Frontiers of Artificial Intelligence and Machine Learning, FAIML 2022

Conference

Conference: 2022 International Conference on Frontiers of Artificial Intelligence and Machine Learning, FAIML 2022
Country/Territory: China
City: Virtual, Online
Period: 19/07/22 to 21/07/22

Keywords

  • imitation learning
  • language grounding for robotics
  • vision-based manipulation
  • visual-text fusion
