Class prediction and discovery using gene expression data

RECOMB 2000, p263-272, 2000. Published: 2000.04.10

Donna K. Slonim, Pablo Tamayo, Jill P. Mesirov, Todd R. Golub and Eric S. Lander


Abstract

Classification of patient samples is a crucial aspect of cancer diagnosis and treatment. We present a method for classifying samples by computational analysis of gene expression data. We consider the classification problem in two parts: class discovery and class prediction. Class discovery refers to the process of dividing samples into reproducible classes that have similar behavior or properties, while class prediction places new samples into already known classes. We describe a method for performing class prediction and illustrate its strength by correctly classifying bone marrow and blood samples from acute leukemia patients. We also describe how to use our predictor to validate newly discovered classes, and we demonstrate how this technique could have discovered the key distinctions among leukemias if they were not already known. This proof-of-concept experiment paves the way for a wealth of future work on the molecular classification and understanding of disease.

Keywords: Leukemia, ALL, AML, gene expression, prediction, class discovery, gene marker, molecular classification, supervised, unsupervised.


Supplemental Data

Description | Link/Filename
Same datasets as this paper | http://www-genome.wi.mit.edu/cgi-bin/cancer/publications/pub_paper.cgi?mode=view&paper_id=43
Paper (PDF) | Slonim_et_al_2000.pdf
Paper (PS) | Slonim_et_al_2000.ps