
A new fuzzy decision tree classification method for mining high-speed data streams based on binary search trees

  • Zhoujun Li*
  • Tao Wang
  • Ruoxue Wang
  • Yuejin Yan
  • Huowang Chen
  • *Corresponding author for this work
  • Peking University
  • National University of Defense Technology
  • Journal of Computer Research and Development

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Decision tree construction is a well-studied problem in data mining. Recently, there has been much interest in mining data streams. Domingos and Hulten presented a one-pass algorithm for decision tree construction. Their system, VFDT, uses the Hoeffding inequality to achieve a probabilistic bound on the accuracy of the constructed tree. Gama et al. extended VFDT in two directions: their system, VFDTc, can deal with continuous data and uses more powerful classification techniques at tree leaves. Peng et al. presented a soft discretization method for handling continuous attributes in data mining. In this paper, we revisit these problems and implement a system, sVFDT, for data stream mining. We make the following contributions: 1) We present a binary search tree (BST) approach for efficiently handling continuous attributes. Its processing time for inserting values is O(n log n), while VFDT's processing time is O(n^2). 2) We improve the method for finding the best split-test point of a given continuous attribute. Compared to the method used in VFDTc, the processing time decreases from O(n log n) to O(n). 3) Compared to VFDTc, sVFDT's number of candidate split tests decreases from O(n) to O(log n). 4) We improve the soft discretization method to increase classification accuracy in data stream mining.
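To make the BST idea in the abstract more concrete, the following is a minimal Python sketch, not the authors' sVFDT implementation. It assumes one unbalanced binary search tree per continuous attribute, keyed by attribute value, with per-node class counts; the names `Node`, `AttributeBST`, `insert`, and `split_candidates` are hypothetical. A single in-order traversal then yields the class distribution on each side of every candidate threshold in one linear pass.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Iterator, Optional, Tuple


@dataclass
class Node:
    value: float
    # Class label -> number of examples seen with exactly this value.
    counts: Counter = field(default_factory=Counter)
    left: Optional["Node"] = None
    right: Optional["Node"] = None


class AttributeBST:
    """Unbalanced BST over the observed values of one continuous attribute.

    Illustrative only: the layout is an assumption, not taken from the paper.
    """

    def __init__(self) -> None:
        self.root: Optional[Node] = None
        self.totals: Counter = Counter()  # class label -> total examples seen

    def insert(self, value: float, label: str) -> None:
        """Insert one (value, label) pair; cost is proportional to tree height."""
        self.totals[label] += 1
        if self.root is None:
            self.root = Node(value, Counter({label: 1}))
            return
        node = self.root
        while True:
            if value == node.value:
                node.counts[label] += 1
                return
            child = "left" if value < node.value else "right"
            nxt = getattr(node, child)
            if nxt is None:
                setattr(node, child, Node(value, Counter({label: 1})))
                return
            node = nxt

    def in_order(self) -> Iterator[Node]:
        """Yield nodes in ascending value order (iterative traversal)."""
        stack, node = [], self.root
        while stack or node is not None:
            while node is not None:
                stack.append(node)
                node = node.left
            node = stack.pop()
            yield node
            node = node.right

    def split_candidates(self) -> Iterator[Tuple[float, Counter, Counter]]:
        """Yield (threshold, counts with value <= threshold, counts above it)."""
        below: Counter = Counter()
        for node in self.in_order():
            below.update(node.counts)
            yield node.value, Counter(below), self.totals - below


tree = AttributeBST()
for v, y in [(3.0, "a"), (1.0, "b"), (2.0, "a"), (3.0, "b")]:
    tree.insert(v, y)
candidates = list(tree.split_candidates())
```

Each tuple in `candidates` could be fed to a split criterion such as information gain. Because this sketch does not rebalance the tree, the logarithmic per-insert cost holds only when the tree stays roughly balanced, for example when values arrive in random order.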

Original language: English
Title of host publication: Frontiers in Algorithmics - First Annual International Workshop, FAW 2007, Proceedings
Publisher: Springer Verlag
Pages: 216-227
Number of pages: 12
ISBN (Print): 9783540738138
State: Published - 2007
Externally published: Yes
Event: 1st International Frontiers in Algorithmics Workshop, FAW 2007 - Lanzhou, China
Duration: 1 Aug 2007 to 3 Aug 2007

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 4613 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 1st International Frontiers in Algorithmics Workshop, FAW 2007
Country/Territory: China
City: Lanzhou
Period: 1/08/07 to 3/08/07

Keywords

  • Binary search tree
  • Continuous attribute
  • Data streams
  • Fuzzy
  • VFDT
