
Self-optimizing Feature Generation via Categorical Hashing Representation and Hierarchical Reinforcement Crossing

  • Wangyang Ying
  • Dongjie Wang
  • Kunpeng Liu
  • Leilei Sun
  • Yanjie Fu*
  • *Corresponding author for this work
  • Arizona State University
  • University of Central Florida
  • Portland State University

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Feature generation aims to generate new and meaningful features to create a discriminative representation space. A generated feature is meaningful when it is derived from a feature pair with an inherent feature interaction. In the real world, experienced data scientists can identify potentially useful feature-feature interactions and generate meaningful dimensions from an exponentially large search space, in an optimal crossing form and over an optimal generation path. Machines, however, have only limited human-like abilities of this kind. We generalize such learning tasks as self-optimizing feature generation. Self-optimizing feature generation imposes several under-addressed challenges on existing systems: meaningful, robust, and efficient generation. To tackle these challenges, we propose a principled and generic representation-crossing framework for self-optimizing feature generation. To achieve hashing representation, we propose a three-step approach: feature discretization, feature hashing, and descriptive summarization. To achieve reinforcement crossing, we develop a hierarchical reinforcement feature crossing approach. We present extensive experimental results to demonstrate the effectiveness and efficiency of the proposed method. The code is available at https://github.com/yingwangyang/HRC_feature_cross.git.
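
The three-step hashing representation named in the abstract (feature discretization, feature hashing, descriptive summarization) can be illustrated with a minimal Python sketch. This is not the authors' implementation, which is in the linked repository. The function names, the equal-width binning, the modular hash and its constants, and the choice of a bucket-frequency vector as the descriptive summary are all assumptions made here for illustration.

```python
import numpy as np


def discretize(column, n_bins=8):
    """Step 1 (assumed: equal-width binning). Map each value to a bin index in [0, n_bins - 1]."""
    edges = np.linspace(column.min(), column.max(), n_bins + 1)
    # Only interior edges are passed, so the minimum maps to 0 and the maximum to n_bins - 1.
    return np.digitize(column, edges[1:-1])


def hash_bins(bin_ids, n_buckets=4, a=31, b=7, prime=10007):
    """Step 2 (assumed: simple modular hash). Map bin indices to a fixed number of buckets."""
    return (a * bin_ids + b) % prime % n_buckets


def summarize(bucket_ids, n_buckets=4):
    """Step 3 (assumed: bucket frequencies). Describe the hashed column as a fixed-length vector."""
    counts = np.bincount(bucket_ids, minlength=n_buckets)
    return counts / len(bucket_ids)


def represent_feature(column, n_bins=8, n_buckets=4):
    """Chain the three steps into one fixed-length representation of a numeric feature."""
    bins = discretize(column, n_bins)
    buckets = hash_bins(bins, n_buckets)
    return summarize(buckets, n_buckets)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feature = rng.normal(size=1000)  # hypothetical numeric feature column
    print(represent_feature(feature))
```

In this sketch every feature, whatever its original scale or cardinality, ends up as a vector of length `n_buckets`, so features can be compared in a common space before any crossing decision is made. How the paper defines each step may differ.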

Original language: English
Title of host publication: Proceedings - 23rd IEEE International Conference on Data Mining, ICDM 2023
Editors: Guihai Chen, Latifur Khan, Xiaofeng Gao, Meikang Qiu, Witold Pedrycz, Xindong Wu
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 748-757
Number of pages: 10
ISBN (Electronic): 9798350307887
DOIs
State: Published - 2023
Event: 23rd IEEE International Conference on Data Mining, ICDM 2023 - Shanghai, China
Duration: 1 Dec 2023 to 4 Dec 2023

Publication series

Name: Proceedings - IEEE International Conference on Data Mining, ICDM
ISSN (Print): 1550-4786

Conference

Conference: 23rd IEEE International Conference on Data Mining, ICDM 2023
Country/Territory: China
City: Shanghai
Period: 1/12/23 to 4/12/23

Keywords

  • Feature Generation
  • Hierarchical Reinforcement Crossing
  • Self-optimizing
