Solving social media text classification problems using code fragment-based XCSR

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Sentiment analysis and spam detection of social media text messages are two challenging data analysis tasks because their feature vectors are sparse and high-dimensional. Learning classifier systems (LCS) are rule-based evolutionary computing systems with limited capability to handle real-valued, sparse, high-dimensional big data sets. LCS techniques use interval-based representations to handle real-valued feature vectors. In the work presented here, the interval-based representation is replaced by genetic programming-based tree-like structures to classify high-dimensional real-valued text feature vectors. Multiple experiments are conducted on different social media text data sets, i.e., tweets, movie reviews, Amazon and Yelp reviews, and SMS and email spam messages, to evaluate the proposed scheme. Real-valued feature vectors are generated from these data sets using term frequency-inverse document frequency (TF-IDF) and/or sentiment lexicon-based features. Results show that the new encoding scheme outperforms interval-based representations on both small and large social media text data sets.
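
To make the contrast in the abstract concrete, the following is a minimal Python sketch of the two kinds of rule condition: an interval-based condition with one (lower, upper) pair per feature, and a small tree-like condition that reads only a few features. It is an illustration under stated assumptions, not the authors' implementation. The names `interval_matches`, `Leaf`, `Node`, `evaluate` and `tree_matches` are hypothetical, and the rule that a positive tree output counts as a match is assumed here purely for demonstration.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple, Union

# Interval-based condition: one (lower, upper) pair per feature.
Interval = Tuple[float, float]


def interval_matches(condition: Sequence[Interval], x: Sequence[float]) -> bool:
    """Return True if every feature value lies inside its interval."""
    return all(lo <= v <= hi for (lo, hi), v in zip(condition, x))


@dataclass
class Leaf:
    index: int  # position of the feature this leaf reads


@dataclass
class Node:
    op: Callable[[float, float], float]  # binary function applied to the children
    left: "Tree"
    right: "Tree"


Tree = Union[Leaf, Node]


def evaluate(tree: Tree, x: Sequence[float]) -> float:
    """Recursively evaluate a tree-like condition on a feature vector."""
    if isinstance(tree, Leaf):
        return x[tree.index]
    return tree.op(evaluate(tree.left, x), evaluate(tree.right, x))


def tree_matches(tree: Tree, x: Sequence[float]) -> bool:
    # Assumption for illustration only: a positive output counts as a match.
    return evaluate(tree, x) > 0.0


# A toy sparse feature vector (for example TF-IDF weights); values are made up.
x = [0.0, 0.42, 0.0, 0.10]

# The interval rule must carry an interval for every feature, even unused ones.
interval_rule = [(0.0, 1.0), (0.3, 0.6), (0.0, 1.0), (0.0, 1.0)]

# The tree rule reads only features 1 and 3: x[1] - x[3].
tree_rule = Node(lambda a, b: a - b, Leaf(1), Leaf(3))

print(interval_matches(interval_rule, x))
print(tree_matches(tree_rule, x))
```

In this sketch the interval condition grows with the number of features, while the tree condition's size depends only on how many features it references, which is one plausible reason a tree encoding could suit sparse, high-dimensional text vectors.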

Original language: English
Title of host publication: Proceedings - 2017 International Conference on Tools with Artificial Intelligence, ICTAI 2017
Publisher: IEEE Computer Society
Pages: 485-492
Number of pages: 8
ISBN (Electronic): 9781538638767
State: Published - 2 Jul 2017
Event: 29th IEEE International Conference on Tools with Artificial Intelligence, ICTAI 2017 - Boston, United States
Duration: 6 Nov 2017 to 8 Nov 2017

Publication series

Name: Proceedings - International Conference on Tools with Artificial Intelligence, ICTAI
Volume: 2017-November
ISSN (Print): 1082-3409

Conference

Conference: 29th IEEE International Conference on Tools with Artificial Intelligence, ICTAI 2017
Country/Territory: United States
City: Boston
Period: 6/11/17 to 8/11/17

Keywords

  • Learning Classifier Systems
  • Sentiment Analysis
  • Spam Detection
  • Text Classification
