
Improving Driver Gaze Prediction with Reinforced Attention

  • Kai Lv
  • Hao Sheng*
  • Zhang Xiong
  • Wei Li
  • Liang Zheng
  • *Corresponding author for this work
  • Beihang University
  • Australian National University

Research output: Contribution to journal › Article › peer-review

Abstract

We consider the task of driver gaze prediction: estimating where a driver's gaze should be focused, given raw video of the outside environment. In practice, we output a probability map that gives the normalized probability that each point in a scene is the object of the driver's attention. Most existing methods (e.g., Coarse-to-Fine and Multi-branch) take an image or a video as input and directly output the fixation map. While successful, these methods often produce highly scattered predictions, which makes them unreliable for real-world use. Motivated by this observation, we propose the reinforced attention (RA) model as a regulatory mechanism to increase prediction density. Our method is built directly on top of existing methods, making it complementary to current approaches. Specifically, we first use Multi-branch to obtain an initial fixation map. Then, RA is trained with deep reinforcement learning to learn a location prediction policy, producing the reinforced attention. Finally, we combine the fixation map and the reinforced attention by a mask-guided multiplication to obtain the final gaze prediction. Experimental results show that our framework improves the accuracy of gaze prediction and achieves state-of-the-art performance on the DR(eye)VE dataset.
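The final combination step can be illustrated with a minimal sketch. The abstract does not specify how the mask is built or how the product is rescaled, so everything below is an assumption made for illustration: the RA output is taken to be a single predicted location, the hypothetical `box_mask` helper turns it into a square window with a small weight outside, and the product is renormalized so the result is again a probability map. The numbers are stand-in values, not data from the paper.

```python
def normalize(prob_map):
    """Scale a 2-D map (list of rows) so that its entries sum to 1."""
    total = sum(sum(row) for row in prob_map)
    if total <= 0:
        raise ValueError("map has no positive mass to normalize")
    return [[v / total for v in row] for row in prob_map]


def box_mask(height, width, center, radius, outside=0.1):
    """Hypothetical mask: 1.0 in a square window around `center`,
    `outside` everywhere else. Assumed here, not taken from the paper."""
    cy, cx = center
    return [
        [1.0 if abs(y - cy) <= radius and abs(x - cx) <= radius else outside
         for x in range(width)]
        for y in range(height)
    ]


def mask_guided_multiply(fixation_map, mask):
    """Elementwise product of fixation map and mask, then renormalized."""
    product = [[f * m for f, m in zip(f_row, m_row)]
               for f_row, m_row in zip(fixation_map, mask)]
    return normalize(product)


# A scattered 3x4 initial fixation map (stand-in values).
initial = normalize([
    [1.0, 1.0, 1.0, 1.0],
    [1.0, 4.0, 1.0, 1.0],
    [1.0, 1.0, 1.0, 2.0],
])

# Assume the learned policy predicted row 1, column 1 as the gaze location.
mask = box_mask(3, 4, center=(1, 1), radius=1)
final = mask_guided_multiply(initial, mask)
```

In this toy case, probability mass moves from the far-right column toward the window around the predicted location, which is one way to read "increasing prediction density"; a soft (for example Gaussian) mask would be an equally plausible choice under the same assumptions.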

Original language: English
Pages (from-to): 4198-4207
Number of pages: 10
Journal: IEEE Transactions on Multimedia
Volume: 23
State: Published - 2021

Keywords

  • Gaze prediction
  • deep learning
  • driver attention
  • reinforcement learning
  • video processing
