Abstract
Semantic labeling of indoor scenes has advanced considerably with the wide availability of affordable RGB-D sensors. However, multi-class recognition remains challenging, especially for “small” objects. In this paper, a novel semantic labeling model based on aggregated features and contextual information is proposed. Given an RGB-D image, the proposed model first creates a hierarchical segmentation using an adapted gPb/UCM algorithm. Then, a support vector machine is trained to predict initial labels using aggregated features, which fuse small-scale appearance features, mid-scale geometric features, and large-scale scene features. Finally, a joint multi-label conditional random field model that exploits both spatial and attributive contextual relations is constructed to refine the initial semantic and attribute predictions. Experimental results on the public NYU v2 dataset demonstrate that the proposed model outperforms existing state-of-the-art methods on the challenging task of labeling the 40 dominant classes, and the model also performs well on the more recent SUN RGB-D dataset. In particular, the prediction accuracy for “small” classes is improved significantly.
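The last two stages of the pipeline described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names `aggregate_features` and `refine_labels` are hypothetical, the per-scale descriptors are plain lists of numbers, and the refinement step uses iterated conditional modes on a simple Potts-style pairwise energy as a stand-in for the paper's joint multi-label conditional random field (which also models attributes and is not reproduced here). The SVM that would produce the unary costs is omitted.

```python
from typing import Dict, List, Sequence, Tuple


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = sum(v * v for v in vec) ** 0.5
    return [v / norm for v in vec] if norm > 0 else list(vec)


def aggregate_features(appearance: Sequence[float],
                       geometric: Sequence[float],
                       scene: Sequence[float]) -> List[float]:
    """Fuse small-, mid-, and large-scale descriptors by normalizing each
    one separately and concatenating them (an assumed fusion scheme)."""
    return l2_normalize(appearance) + l2_normalize(geometric) + l2_normalize(scene)


def refine_labels(unary: List[Dict[str, float]],
                  edges: List[Tuple[int, int]],
                  smoothness: float = 0.5,
                  max_sweeps: int = 10) -> List[str]:
    """Refine initial per-region labels with iterated conditional modes.

    unary[i][label] is the cost of giving region i that label (lower is better).
    edges lists pairs of spatially adjacent regions; each disagreeing pair
    adds `smoothness` to the energy (a Potts-style pairwise term).
    """
    neighbors: Dict[int, List[int]] = {i: [] for i in range(len(unary))}
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)

    # Initial prediction: the cheapest label for each region taken on its own.
    labels = [min(costs, key=costs.get) for costs in unary]

    for _ in range(max_sweeps):
        changed = False
        for i, costs in enumerate(unary):
            def local_energy(label: str) -> float:
                disagreements = sum(1 for j in neighbors[i] if labels[j] != label)
                return costs[label] + smoothness * disagreements

            best = min(costs, key=local_energy)
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:
            break
    return labels


# Three regions in a chain; the middle one is weakly mislabeled on its own.
unary_costs = [
    {"wall": 0.1, "floor": 0.9},
    {"wall": 0.6, "floor": 0.5},
    {"wall": 0.1, "floor": 0.9},
]
refined = refine_labels(unary_costs, edges=[(0, 1), (1, 2)])
fused = aggregate_features([3.0, 4.0], [0.0, 0.0], [2.0])
```

In this toy case the middle region's weak preference for `floor` is overridden by its two confident `wall` neighbors. A real system would replace the hand-written costs with classifier scores and the Potts term with learned contextual potentials.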
| Original language | English |
|---|---|
| Pages (from-to) | 1587-1600 |
| Number of pages | 14 |
| Journal | Visual Computer |
| Volume | 33 |
| Issue number | 12 |
| State | Published - 1 Dec 2017 |
Keywords
- Aggregated features
- Conditional random field
- Joint optimizing model
- Object attribute
- Semantic scene understanding
Fingerprint
Dive into the research topics of 'Learning aggregated features and optimizing model for semantic labeling'. Together they form a unique fingerprint.