PNAS 98: 15149-15154. Published: 2001.12.17
Sridhar Ramaswamy, Pablo Tamayo, Ryan Rifkin, Sayan Mukherjee, Chen-Hsiang Yeang, Michael Angelo, Christine Ladd, Michael Reich, Eva Latulippe, Jill P. Mesirov, Tomaso Poggio, William Gerald, Massimo Loda, Eric S. Lander, Todd R. Golub
The optimal treatment of cancer patients depends on establishing accurate diagnoses using a complex combination of clinical and histopathologic data. In some instances this is difficult or impossible due to atypical clinical presentation or histopathology. To determine whether the diagnosis of multiple common adult malignancies could be achieved purely by molecular classification, we subjected 218 tumor samples, spanning 14 common tumor types, and 90 normal tissue samples to oligonucleotide microarray gene expression analysis. The expression levels of 16,063 genes and expressed sequence tags were used to evaluate the accuracy of a multi-class classifier based on a Support Vector Machine algorithm. Overall classification accuracy was 78%, far exceeding the accuracy of random classification (9%). Poorly differentiated cancers resulted in low-confidence predictions and could not be accurately classified according to their tissue of origin, indicating that they are molecularly distinct entities with dramatically different gene expression patterns compared to their well-differentiated counterparts. Taken together, these results demonstrate the feasibility of accurate, multi-class molecular cancer classification, and suggest a strategy for future clinical implementation of molecular cancer diagnostics.
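As a rough illustration of how a multi-class prediction with a confidence value can be built from binary Support Vector Machine outputs, the sketch below assumes a one-vs-all scheme, which the abstract itself does not specify. The function names, the tumor-type labels, the decision values, and the use of the gap between the top two outputs as a confidence measure are all hypothetical choices made for this example, not details or data from the study.

```python
from typing import Dict, List, Tuple


def predict_one_vs_all(scores: Dict[str, float]) -> Tuple[str, float]:
    """Pick the class whose binary classifier gives the largest decision value.

    `scores` maps a tumor-type label to the signed output of that class's
    hypothetical "this class vs. all others" SVM. The returned confidence is
    the gap between the best and second-best outputs, an illustrative choice.
    """
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    (best_label, best_score), (_, runner_up) = ranked[0], ranked[1]
    return best_label, best_score - runner_up


def accuracy(predicted: List[str], actual: List[str]) -> float:
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Made-up decision values for three samples (not data from the study).
samples = [
    {"breast": 1.2, "lung": -0.8, "colorectal": -1.1},
    {"breast": -0.9, "lung": 0.7, "colorectal": -0.4},
    {"breast": -0.1, "lung": -0.2, "colorectal": -0.3},  # ambiguous: small gap
]
truth = ["breast", "lung", "colorectal"]

results = [predict_one_vs_all(s) for s in samples]
labels = [label for label, _ in results]
print(labels, accuracy(labels, truth))
```

In this toy input the third sample has nearly equal outputs for every class, so its confidence gap is small and its predicted label is wrong, which loosely mirrors the low-confidence predictions the abstract reports for poorly differentiated cancers.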