Optimizing XCSR for Text Classification

  • Muhammad Hassan Arif
  • Jianxin Li*
  • Muhammad Iqbal
  • Hao Peng

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

XCS, an evolutionary computing technique, can classify data using both bit-string and real-valued representations. 'Real-valued XCS' (XCSR) commonly uses the min-max interval-based representation (MMR) for continuous-valued data sets. Text data sets can be represented using a bag-of-words-based real-valued representation, e.g. the term frequency-inverse document frequency (TF-IDF) of features. In this work we use XCSR, for the first time, to classify short informal social media text messages from two major domains: spam detection and sentiment analysis. We perform spam detection on SMS and email messages, and sentiment analysis on reviews and tweets. Feature vectors extracted from short text messages are very sparse, and XCSR with MMR cannot handle sparse data sets well. We propose XCSR#, which uses MMR with explicit 'don't care' intervals to handle sparse social media data sets. The experimental results indicate that introducing the explicit 'don't care' intervals improved performance, with a statistically significant effect specifically on the spam detection data sets. Further, XCSR# produced more accurate and more general rules than XCSR.
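
The idea of interval conditions with explicit 'don't care' entries can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how XCSR# encodes a 'don't care' interval, so representing it as `None` is an assumption, and the function name `matches` and the sample feature values are hypothetical.

```python
from typing import List, Optional, Tuple

# One condition entry per feature: a (lower, upper) interval, or None.
# Using None as the explicit "don't care" marker is an assumption of this sketch.
Interval = Optional[Tuple[float, float]]


def matches(condition: List[Interval], features: List[float]) -> bool:
    """Return True if every feature lies inside its interval or is ignored."""
    if len(condition) != len(features):
        raise ValueError("condition and feature vector must have equal length")
    for interval, value in zip(condition, features):
        if interval is None:
            # Explicit don't care: this feature is not tested at all.
            continue
        lower, upper = interval
        if not (lower <= value <= upper):
            return False
    return True


# Hypothetical sparse TF-IDF vector: only the third term occurs in the message.
features = [0.0, 0.0, 0.42, 0.0]

# A rule that tests only the third feature and ignores the rest.
sparse_rule = [None, None, (0.3, 0.6), None]

# A rule whose third interval excludes the observed value.
narrow_rule = [None, None, (0.5, 0.6), None]

print(matches(sparse_rule, features))
print(matches(narrow_rule, features))
```

In this sketch a rule that cares about one term out of many needs only one real interval; without the `None` entries, every other feature would need an interval wide enough to cover its whole range.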

Original language: English
Title of host publication: Proceedings - 11th IEEE International Symposium on Service-Oriented System Engineering, SOSE 2017
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 86-95
Number of pages: 10
ISBN (Electronic): 9781509063208
State: Published - 7 Jun 2017
Event: 11th IEEE International Symposium on Service-Oriented System Engineering, SOSE 2017 - San Francisco, United States
Duration: 6 Apr 2017 to 9 Apr 2017

Publication series

Name: Proceedings - 11th IEEE International Symposium on Service-Oriented System Engineering, SOSE 2017

Conference

Conference: 11th IEEE International Symposium on Service-Oriented System Engineering, SOSE 2017
Country/Territory: United States
City: San Francisco
Period: 6/04/17 to 9/04/17

Keywords

  • Sentiment Analysis
  • Spam Detection
  • Text Classification
  • XCSR
