News title classification with support from auxiliary long texts

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Short text classification is limited by the intrinsic brevity of the input, which makes its vector space model representation sparse. Traditional classifiers such as SVM are highly sensitive to the feature space, so their performance in short-text applications is often unsatisfactory. Using external information to better represent the input data is expected to improve results. In this paper, we target the problem of news title classification, a typical and essential member of the short text family, and propose an approach that employs external information from long texts to address the sparseness problem. Restricted Boltzmann Machines are then utilised to select features, and classification is finally performed using a Support Vector Machine. An experimental study on Reuters-21578 and the Sogou Chinese news corpus demonstrates the effectiveness of the proposed method.
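The sparseness argument can be made concrete with a toy bag-of-words example. This is a minimal sketch, not the authors' method: the abstract does not say how long texts are combined with titles, so the augmentation rule below (adding the term counts of the most similar long text) is an assumption. The sample sentences and the helper names `bag_of_words`, `density`, and `cosine` are likewise invented for illustration.

```python
from collections import Counter
import math


def bag_of_words(text):
    """Lowercased, whitespace-tokenised term counts."""
    return Counter(text.lower().split())


def density(vec, vocabulary):
    """Fraction of vocabulary terms with a non-zero count in vec."""
    return sum(1 for term in vocabulary if vec[term] > 0) / len(vocabulary)


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Toy data, invented for illustration only.
long_texts = [
    "the central bank raised interest rates to curb inflation as markets fell",
    "the striker scored twice as the home team won the cup final match",
]
title = "bank raises rates"

vocabulary = sorted({t for doc in long_texts + [title] for t in doc.split()})

title_vec = bag_of_words(title)
long_vecs = [bag_of_words(doc) for doc in long_texts]

# Assumed augmentation rule: add the counts of the most similar long text.
best = max(long_vecs, key=lambda v: cosine(title_vec, v))
augmented_vec = title_vec + best

print("title density:    ", density(title_vec, vocabulary))
print("augmented density:", density(augmented_vec, vocabulary))
```

The title alone activates only a small fraction of the vocabulary, while the augmented vector activates considerably more. A denser representation of this kind is what a later feature-selection and classification stage would consume.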

Original language: English
Title of host publication: Neural Information Processing - 21st International Conference, ICONIP 2014, Proceedings
Editors: Chu Kiong Loo, Keem Siah Yap, Kok Wai Wong, Andrew Teoh, Kaizhu Huang
Publisher: Springer Verlag
Pages: 581-588
Number of pages: 8
ISBN (Electronic): 9783319126395
State: Published - 2014
Event: 21st International Conference on Neural Information Processing, ICONIP 2014 - Kuching, Malaysia
Duration: 3 Nov 2014 → 6 Nov 2014

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 8835
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 21st International Conference on Neural Information Processing, ICONIP 2014
Country/Territory: Malaysia
City: Kuching
Period: 3/11/14 → 6/11/14
