Molecular Classification of Multiple Tumor Types

Bioinformatics 17(Suppl. 1):S316-S322. 2001. Published: 2001.07.20

Chen-Hsiang Yeang, Sridhar Ramaswamy, Pablo Tamayo, Sayan Mukherjee, Ryan M. Rifkin, Michael Angelo, Michael Reich, Eric Lander, Jill P. Mesirov, and Todd Golub


Using gene expression data to classify tumor types is a promising tool in cancer diagnosis. Previous work has shown that several pairs of tumor types can be successfully distinguished by their gene expression patterns. However, simultaneous classification across a heterogeneous set of tumor types has not yet been well studied. We obtained 190 samples from 14 tumor classes and generated a combined expression dataset containing 16063 genes for each of those samples. We performed multi-class classification by combining the outputs of binary classifiers. Three binary classifiers (k-nearest neighbors, weighted voting, and support vector machines) were applied in conjunction with three combination scenarios (one-vs-all, all-pairs, and hierarchical partitioning). The one-vs-all support vector machine algorithm performed best, with a cross-validation error rate of 18.75% and a test error rate of 21.74%. The results demonstrate the feasibility of performing clinically useful classification from samples of multiple tumor types.
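
As a rough illustration of how binary classifier outputs can be combined into a multi-class prediction, the sketch below implements the one-vs-all and all-pairs scenarios named in the abstract. It is not the authors' implementation. The binary scorer is a simple nearest-centroid stand-in (an assumption made for brevity, not the k-nearest neighbors, weighted voting, or support vector machine classifiers used in the paper), the function names are hypothetical, and the toy data are invented for the example. Hierarchical partitioning is not shown.

```python
import math
from collections import Counter
from itertools import combinations


def centroid(rows):
    """Mean vector of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def train_binary(pos, neg):
    """Stand-in binary classifier (assumption: nearest centroid).

    Returns a score function; a positive score favors the `pos` class.
    """
    c_pos, c_neg = centroid(pos), centroid(neg)

    def score(x):
        return math.dist(x, c_neg) - math.dist(x, c_pos)

    return score


def one_vs_all(X, y):
    """Train one scorer per class (class vs. rest); predict the top-scoring class."""
    classes = sorted(set(y))
    scorers = {}
    for c in classes:
        pos = [x for x, lab in zip(X, y) if lab == c]
        neg = [x for x, lab in zip(X, y) if lab != c]
        scorers[c] = train_binary(pos, neg)

    def predict(x):
        return max(classes, key=lambda c: scorers[c](x))

    return predict


def all_pairs(X, y):
    """Train one scorer per pair of classes; predict by majority vote."""
    classes = sorted(set(y))
    scorers = {}
    for a, b in combinations(classes, 2):
        pos = [x for x, lab in zip(X, y) if lab == a]
        neg = [x for x, lab in zip(X, y) if lab == b]
        scorers[(a, b)] = train_binary(pos, neg)

    def predict(x):
        votes = Counter()
        for (a, b), s in scorers.items():
            votes[a if s(x) > 0 else b] += 1
        return votes.most_common(1)[0][0]

    return predict


# Toy data: three well-separated classes with two features each (invented).
X = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5], [10, 0], [10, 1], [11, 0]]
y = ["A", "A", "A", "B", "B", "B", "C", "C", "C"]

ova = one_vs_all(X, y)
pairs = all_pairs(X, y)
print(ova([5.5, 5.5]), pairs([10.5, 0.5]))
```

With k classes, one-vs-all trains k binary classifiers while all-pairs trains k(k-1)/2, so for the 14 tumor classes described here the two scenarios would need 14 and 91 binary classifiers respectively.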

Keywords: cancer classification, cancer genomics, molecular classification, molecular pattern recognition

Pred calls

Supplemental Data

Description    Link/Filename
Paper (PDF)    Bioinformatics_200107.pdf
Same datasets as this paper