Gene selection for cancer classification in microarray data

  • Lijuan Zhang*
  • Zhoujun Li
  • *Corresponding author for this work

Research output: Contribution to journal, Article, peer-reviewed

Abstract

Microarray data has been widely and successfully applied to cancer classification, where the goal is to predict the diagnostic category of a sample from its gene expression profile. A typical microarray dataset consists of expression levels for a large number of genes (usually thousands or tens of thousands) measured on a relatively small number of samples (often fewer than one hundred). Of these genes, only a small number contribute to cancer classification. Consequently, a basic and important question in cancer classification is how to identify a small subset of informative genes that contribute the most to the classification task. This procedure is usually called gene selection. Gene selection has been widely studied in statistical pattern recognition, machine learning, and data mining. Building on their earlier work, the authors review the field of gene selection: they introduce the background and the two basic concepts of gene selection (gene relevance and relevance measure), categorize existing gene selection methods from statistics, machine learning, and data mining, demonstrate the performance of several representative gene selection algorithms through an empirical study on public microarray data, identify open problems in gene selection, and point out current trends and future directions.
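
As a rough illustration of what a relevance measure does in gene selection, the sketch below ranks genes with a simple signal-to-noise score (absolute difference of class means divided by the sum of class standard deviations) and keeps the top few. This is only one common filter-style score and is not necessarily among the methods reviewed in the article; the function name `rank_genes`, the toy expression matrix, and the labels are all hypothetical and invented for the example.

```python
from statistics import mean, stdev


def rank_genes(expression, labels, top_k):
    """Return (indices of the top_k genes, all scores) under a signal-to-noise score.

    expression: list of samples, each a list of per-gene expression levels.
    labels: one 0/1 class label per sample (each class needs >= 2 samples,
            because the sample standard deviation is undefined otherwise).
    """
    n_genes = len(expression[0])
    scores = []
    for g in range(n_genes):
        pos = [row[g] for row, y in zip(expression, labels) if y == 1]
        neg = [row[g] for row, y in zip(expression, labels) if y == 0]
        spread = stdev(pos) + stdev(neg)
        # A gene with zero spread in both classes gets score 0 to avoid dividing by zero.
        score = abs(mean(pos) - mean(neg)) / spread if spread > 0 else 0.0
        scores.append(score)
    # Higher score = class means further apart relative to within-class noise.
    order = sorted(range(n_genes), key=lambda g: scores[g], reverse=True)
    return order[:top_k], scores


# Toy data (invented): 6 samples x 4 genes; only gene 2 separates the classes.
expression = [
    [5.0, 1.0, 9.0, 3.0],
    [5.2, 1.4, 9.5, 2.0],
    [4.9, 0.8, 8.8, 3.5],
    [5.1, 1.1, 2.0, 2.8],
    [5.0, 1.3, 2.4, 3.1],
    [4.8, 0.9, 1.9, 2.5],
]
labels = [1, 1, 1, 0, 0, 0]

top, scores = rank_genes(expression, labels, top_k=2)
print("selected gene indices:", top)
```

On real microarray data the same idea would be applied to thousands of genes, and the selected subset would then be passed to a classifier; how many genes to keep is a tuning choice that this sketch does not address.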

Original language: English
Pages (from-to): 794-802
Number of pages: 9
Journal: Jisuanji Yanjiu yu Fazhan/Computer Research and Development
Volume: 46
Issue number: 5
State: Published - May 2009

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 3 - Good Health and Well-being

Keywords

  • Cancer classification
  • Gene relevance
  • Gene selection
  • Microarray data
  • Relevance measure
