TY - JOUR
T1 - Tower of Knowledge for scene interpretation
T2 - A survey
AU - Xu, Mai
AU - Wang, Zulin
AU - Petrou, Maria
PY - 2014/10/15
Y1 - 2014/10/15
N2 - The past few decades have witnessed a wealth of promising work on making machines interpret the scenes around us. However, scene interpretation is still in its infancy in comparison with human cognition. As such, human language, a highly developed output of human cognition, can be seen as an important cue for scene interpretation. In this paper we survey Tower of Knowledge (ToK) approaches, which take advantage of human language for scene interpretation. The core of ToK approaches is a multi-layer architecture, the ToK architecture, which aims to establish the information flow of scene interpretation. In general, the ToK architecture can be applied to scene interpretation by exploiting either its vertical or its horizontal connections. First, we focus on approaches that use the vertical connections in the ToK architecture. In such approaches, the optimal label is assigned to each identified object in a scene by verifying whether the object has the right characteristics to fulfil the functions that the label implies. Second, we discuss approaches that use the horizontal connections of the ToK architecture to interpret a scene according to the asymmetric spatial relationships between objects. Finally, reflecting on what has been achieved so far, we consider what the future may hold for ToK.
AB - The past few decades have witnessed a wealth of promising work on making machines interpret the scenes around us. However, scene interpretation is still in its infancy in comparison with human cognition. As such, human language, a highly developed output of human cognition, can be seen as an important cue for scene interpretation. In this paper we survey Tower of Knowledge (ToK) approaches, which take advantage of human language for scene interpretation. The core of ToK approaches is a multi-layer architecture, the ToK architecture, which aims to establish the information flow of scene interpretation. In general, the ToK architecture can be applied to scene interpretation by exploiting either its vertical or its horizontal connections. First, we focus on approaches that use the vertical connections in the ToK architecture. In such approaches, the optimal label is assigned to each identified object in a scene by verifying whether the object has the right characteristics to fulfil the functions that the label implies. Second, we discuss approaches that use the horizontal connections of the ToK architecture to interpret a scene according to the asymmetric spatial relationships between objects. Finally, reflecting on what has been achieved so far, we consider what the future may hold for ToK.
KW - Computer vision
KW - Scene interpretation
KW - Tower of Knowledge
UR - https://www.scopus.com/pages/publications/84906781752
U2 - 10.1016/j.patrec.2014.02.009
DO - 10.1016/j.patrec.2014.02.009
M3 - Article
AN - SCOPUS:84906781752
SN - 0167-8655
VL - 48
SP - 42
EP - 48
JO - Pattern Recognition Letters
JF - Pattern Recognition Letters
ER -