Abstract
Sentiment analysis of public views and spam detection in social media text messages are two challenging data analysis tasks because the text is short and informal. This paper investigates the performance of learning classifier systems (LCS), a family of rule-based machine learning techniques, in sentiment analysis of Twitter messages and movie reviews, and in spam detection on SMS and email data sets. In this study, an existing LCS technique is extended with a novel encoding scheme for classifier rules that handles the sparseness of the feature vectors, which are generated using the term frequency-inverse document frequency (TF-IDF) of word n-grams and sentiment lexicons. The results show that the proposed encoding scheme smoothed the learning process and produced consistently good results in all experiments conducted in this study.
| Original language | English |
|---|---|
| Pages (from-to) | 7281-7291 |
| Number of pages | 11 |
| Journal | Soft Computing |
| Volume | 22 |
| Issue number | 21 |
| State | Published - 1 Nov 2018 |
Keywords
- High-dimensional
- Learning classifier systems
- Sentiment analysis
- Spam detection
- Sparseness
Fingerprint
Dive into the research topics of 'Sentiment analysis and spam detection in short informal text using learning classifier systems'. Together they form a unique fingerprint.