
Self-optimizing Feature Generation via Categorical Hashing Representation and Hierarchical Reinforcement Crossing

  • Wangyang Ying
  • Dongjie Wang
  • Kunpeng Liu
  • Leilei Sun
  • Yanjie Fu*
  • *Corresponding author for this work
  • Arizona State University
  • University of Central Florida
  • Portland State University

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

Abstract

Feature generation aims to generate new and meaningful features to create a discriminative representation space. A generated feature is meaningful when the generated feature is from a feature pair with inherent feature interaction. In the real world, experienced data scientists can identify potentially useful feature-feature interactions, and generate meaningful dimensions from an exponentially large search space in an optimal crossing form over an optimal generation path. But, machines have limited human-like abilities. We generalize such learning tasks as self-optimizing feature generation. Self-optimizing feature generation imposes several under-addressed challenges on existing systems: meaningful, robust, and efficient generation. To tackle these challenges, we propose a principled and generic representation-crossing framework to solve self-optimizing feature generation. To achieve hashing representation, we propose a three-step approach: feature discretization, feature hashing, and descriptive summarization. To achieve reinforcement crossing, we develop a hierarchical reinforcement feature crossing approach. We present extensive experimental results to demonstrate the effectiveness and efficiency of the proposed method. The code is available at https://github.com/yingwangyang/HRC_feature_cross.git.
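
The three-step representation named in the abstract (feature discretization, feature hashing, descriptive summarization) can be illustrated with a minimal sketch. This is not the authors' implementation; their code is in the repository linked above. The equal-width binning, the MD5-based bucket hash, the five summary statistics, and every function and parameter name (`discretize`, `hash_bucket`, `summarize`, `represent`, `n_bins`, `n_buckets`) are assumptions chosen only to show how a feature column of any length might be reduced to a fixed-length vector.

```python
import hashlib
import statistics


def discretize(values, n_bins=4):
    """Equal-width binning (an assumed scheme): map each value to a bin in [0, n_bins)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # The maximum value would fall into bin n_bins, so clamp it to the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def hash_bucket(feature_name, bin_index, n_buckets=16):
    """Deterministically hash a (feature, bin) token into one of n_buckets."""
    token = f"{feature_name}={bin_index}".encode("utf-8")
    digest = hashlib.md5(token).hexdigest()
    return int(digest, 16) % n_buckets


def summarize(codes):
    """Describe a hashed column with a fixed-length vector of statistics."""
    return [
        statistics.mean(codes),
        statistics.pstdev(codes),
        min(codes),
        statistics.median(codes),
        max(codes),
    ]


def represent(feature_name, values, n_bins=4, n_buckets=16):
    """Chain the three steps: discretize, hash, then summarize."""
    bins = discretize(values, n_bins)
    codes = [hash_bucket(feature_name, b, n_buckets) for b in bins]
    return summarize(codes)


# Hypothetical example column.
ages = [18, 22, 35, 41, 58, 63, 70]
age_bins = discretize(ages)
age_vector = represent("age", ages)
```

Under these assumptions, every feature yields a vector of the same length regardless of how many rows it has, which is one plausible way to give a downstream crossing agent a uniform state representation.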

Original language: English
Title of host publication: Proceedings - 23rd IEEE International Conference on Data Mining, ICDM 2023
Editors: Guihai Chen, Latifur Khan, Xiaofeng Gao, Meikang Qiu, Witold Pedrycz, Xindong Wu
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 748-757
Number of pages: 10
ISBN (Electronic): 9798350307887
DOI
Publication status: Published - 2023
Event: 23rd IEEE International Conference on Data Mining, ICDM 2023 - Shanghai, China
Duration: 1 Dec 2023 to 4 Dec 2023

Publication series

Name: Proceedings - IEEE International Conference on Data Mining, ICDM
ISSN (Print): 1550-4786

Conference

Conference: 23rd IEEE International Conference on Data Mining, ICDM 2023
Country/Territory: China
City: Shanghai
Period: 1/12/23 to 4/12/23
