Multi-Class Cancer Diagnosis Using Tumor Gene Expression Signatures

PNAS 98: 15149-15154. Published: 2001.12.17

Sridhar Ramaswamy, Pablo Tamayo, Ryan Rifkin, Sayan Mukherjee, Chen-Hsiang Yeang, Michael Angelo, Christine Ladd, Michael Reich, Eva Latulippe, Jill P. Mesirov, Tomaso Poggio, William Gerald, Massimo Loda, Eric S. Lander, Todd R. Golub



The optimal treatment of cancer patients depends on establishing accurate diagnoses using a complex combination of clinical and histopathologic data. In some instances this is difficult or impossible due to atypical clinical presentation or histopathology. To determine whether the diagnosis of multiple common adult malignancies could be achieved purely by molecular classification, we subjected 218 tumor samples, spanning 14 common tumor types, and 90 normal tissue samples to oligonucleotide microarray gene expression analysis. The expression levels of 16,063 genes and expressed sequence tags were used to evaluate the accuracy of a multi-class classifier based on a Support Vector Machine algorithm. Overall classification accuracy was 78%, far exceeding the accuracy of random classification (9%). Poorly differentiated cancers resulted in low-confidence predictions and could not be accurately classified according to their tissue of origin, indicating that they are molecularly distinct entities with dramatically different gene expression patterns compared to their well-differentiated counterparts. Taken together, these results demonstrate the feasibility of accurate, multi-class molecular cancer classification, and suggest a strategy for future clinical implementation of molecular cancer diagnostics.
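As an illustration of how a multi-class prediction can be assembled from binary Support Vector Machine outputs, the following is a minimal sketch, assuming a one-vs-rest scheme in which one linear classifier per tumor type scores the sample and the highest score wins. The function names, the three class labels, the weights, and the use of the winning score as a confidence value are all hypothetical choices made for this example; the abstract itself does not specify them.

```python
from typing import Dict, List, Tuple


def linear_score(weights: List[float], bias: float, x: List[float]) -> float:
    """Signed output w.x + b of one linear binary classifier (proportional to distance from the hyperplane)."""
    if len(weights) != len(x):
        raise ValueError("weights and sample must have the same length")
    return sum(w * v for w, v in zip(weights, x)) + bias


def predict_one_vs_rest(scores: Dict[str, float]) -> Tuple[str, float]:
    """Return the class with the largest score and that score.

    Assumption: the winning score doubles as a confidence value, so a
    negative or near-zero winner marks a low-confidence call.
    """
    if not scores:
        raise ValueError("need at least one class score")
    best_label = max(scores, key=scores.get)
    return best_label, scores[best_label]


# Hypothetical per-class models: (weights, bias) over three expression features.
models = {
    "Breast": ([1.0, -0.5, 0.0], -0.2),
    "Lung": ([-0.5, 1.0, 0.0], -0.2),
    "Colorectal": ([0.0, -0.5, 1.0], -0.2),
}

sample = [2.0, 0.4, 0.1]  # hypothetical expression values for one tumor
scores = {name: linear_score(w, b, sample) for name, (w, b) in models.items()}
label, confidence = predict_one_vs_rest(scores)
print(label, round(confidence, 2))
```

Under this reading, a sample for which every binary classifier returns a negative score still receives a label, but the low winning score flags the call as low-confidence, which is the kind of behavior the abstract reports for poorly differentiated cancers.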

Keywords: Cancer, Classification, Computational Biology, Diagnosis, Genomics, Microarray


Supplemental Data

Description                        Link/Filename
Manuscript (PDF)                   GCM.pdf
Supplementary Information (PDF)    PNAS_Supplementary_Information.pdf
GCM_Training.res                   GCM_Training.res
GCM_Training.cls                   GCM_Training.cls
GCM_Test.res                       GCM_Test.res
GCM_Test.cls                       GCM_Test.cls
GCM_PD.res                         GCM_PD.res
GCM_PD.cls                         GCM_PD.cls
GCM_Total.res                      GCM_Total.res
GCM_Total.cls                      GCM_Total.cls
CEL files (1/11, 109MB)
CEL files (2/11, 105MB)
CEL files (3/11, 105MB)
CEL files (4/11, 107MB)
CEL files (5/11, 108MB)
CEL files (6/11, 110MB)
CEL files (7/11, 108MB)
CEL files (8/11, 104MB)
CEL files (9/11, 104MB)
CEL files (10/11, 107MB)
CEL files (11/11, 89MB)
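
For readers who want to pair the class labels in a .cls file with the samples in the matching .res file, here is a minimal sketch of a label parser. It assumes the common three-line class-label layout (a header line "<samples> <classes> 1", a "#"-prefixed line of class names, then one integer label per sample); that layout is an assumption and is not documented on this page, so check it against the downloaded files. The function name parse_cls and the example content are hypothetical.

```python
from typing import List, Tuple


def parse_cls(text: str) -> Tuple[List[str], List[int]]:
    """Parse class names and per-sample integer labels from .cls text.

    Assumed layout (not confirmed by this page):
      line 1: "<num_samples> <num_classes> 1"
      line 2: "# <class name 0> <class name 1> ..."
      line 3: one integer label per sample, whitespace-separated
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if len(lines) < 3:
        raise ValueError("expected at least three non-empty lines")
    header = lines[0].split()
    n_samples, n_classes = int(header[0]), int(header[1])
    if not lines[1].startswith("#"):
        raise ValueError("second line should start with '#'")
    class_names = lines[1].lstrip("#").split()
    labels = [int(tok) for tok in lines[2].split()]
    # Cross-check the header counts against what was actually read.
    if len(class_names) != n_classes:
        raise ValueError("class-name count does not match header")
    if len(labels) != n_samples:
        raise ValueError("label count does not match header")
    return class_names, labels


# Hypothetical file content, for illustration only.
example = "4 2 1\n# Breast Lung\n0 0 1 1\n"
names, labels = parse_cls(example)
print([names[i] for i in labels])
```

To use it on a real file, read the file into a string first, for example with open("GCM_Training.cls").read(), and pass the result to parse_cls.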