Genome Biology 2002, 3(12):research0073.1–0073.4. Published: 2002.11.24
Alena A. Antipova, Pablo Tamayo, and Todd R. Golub
One of the factors limiting the number of genes analyzable on high-density oligonucleotide arrays is that each transcript is probed by multiple oligonucleotide probes of distinct sequence in order to increase the sensitivity and specificity of detection. Over the years, the number of probes per gene has decreased, but no single array covering the entire human genome has yet been reported. To reduce the number of probes required for each gene, a robust, systematic approach for choosing the most representative probes is needed. Here, we introduce a generalizable empirical method for reducing the number of probes per gene while maximizing fidelity to the original array design.
The methodology was tested on a dataset comprising 317 Affymetrix HuGeneFL GeneChips. The performance of the original and reduced probe sets was compared in four cancer classification problems. These comparisons demonstrate that reducing the probe set by 95% does not dramatically affect performance, and thus illustrate the feasibility of substantially reducing probe numbers without significantly compromising the sensitivity and specificity of detection.
The strategy described here is potentially useful for designing small, limited-probe genome-wide arrays for screening applications.
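As a rough illustration of what "choosing the most representative probe" for a gene could look like, the sketch below keeps the single probe whose signal across samples correlates best with the mean of all probes for that gene. This is an assumption made for illustration only and is not the Delta(h) selection procedure of the paper; the function names (`pearson`, `most_representative_probe`) and the toy values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def most_representative_probe(probe_signals):
    """Return the index of the probe that best tracks the all-probe mean.

    probe_signals: list of per-probe lists, one signal value per sample.
    Illustrative criterion only (an assumption, not the published method).
    """
    n_probes = len(probe_signals)
    n_samples = len(probe_signals[0])
    # Reference profile: mean over all probes of the gene, per sample.
    reference = [
        sum(p[s] for p in probe_signals) / n_probes for s in range(n_samples)
    ]
    scores = [pearson(p, reference) for p in probe_signals]
    return max(range(n_probes), key=scores.__getitem__)


# Hypothetical toy gene: 3 probes measured on 4 samples.
toy_gene = [
    [10.0, 20.0, 30.0, 40.0],  # follows the overall trend
    [12.0, 18.0, 33.0, 41.0],  # follows the overall trend
    [25.0, 24.0, 26.0, 25.0],  # nearly flat, carries little information
]
best = most_representative_probe(toy_gene)
print("selected probe index:", best)
```

Applied gene by gene, a rule of this kind would shrink each probe set to one probe; whether the reduced set preserves classification performance would still have to be checked empirically, as the comparison described above does.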
| Contents | File |
|---|---|
| Description of these files | AboutTheseFiles.doc |
| Paper in PDF format | Antipova_et_al_2002.pdf |
| Raw feature data for all the genes on the chips | RawFeatureData.tar.gz |
| Unscaled Delta(h), random Deltas, and Average Difference | UnscaledResFiles.tar.gz |
| Scaled Delta(h), random Deltas, and Average Difference | ScaledResFiles.tar.gz |
| Cls files, idealized expression vectors for class assignments | ClsFiles.tar.gz |
| Expanded Figure 2 | Fig2Features.xls |
| Expanded Table 1, includes classification parameters | Table1Features.xls |
| List of selected Delta(h) probes | ListOfDeltaHprobes.xls |