
Tower of Knowledge for scene interpretation: A survey

  • Beihang University

Research output: Contribution to journal > Article > peer-review

Abstract

The past few decades have witnessed a wealth of promising work on making machines interpret the scenes around us. However, scene interpretation is still in its infancy in comparison with human cognition. Human language, a highly developed output of human cognition, can therefore be seen as an important cue for scene interpretation. In this paper, we survey Tower of Knowledge (ToK) approaches, which take advantage of human language for scene interpretation. The core of ToK approaches is a multi-layer architecture, the ToK architecture, which aims to establish the information flow of scene interpretation. In general, the ToK architecture can be applied to scene interpretation by exploiting either its vertical or its horizontal connections. First, we focus on the approaches that exploit the vertical connections in the ToK architecture. In such approaches, the optimal label is assigned to each identified object in a scene by verifying whether the object has the right characteristics to fulfil the functions that a label implies. Second, we discuss the approaches that utilise the horizontal connections of the ToK architecture to interpret a scene according to the asymmetric spatial relationships between the objects. Finally, reflecting on what has been achieved so far, we consider what the future may hold for ToK.

Original language: English
Pages (from-to): 42-48
Number of pages: 7
Journal: Pattern Recognition Letters
Volume: 48
State: Published - 15 Oct 2014

Keywords

  • Computer vision
  • Scene interpretation
  • Tower of Knowledge
