Abstract: Current systems for classifying cancer group together tumors with important differences in clinical behavior. As might be expected from this manifest diversity in clinical behavior, we have found enormous variation in gene expression patterns among tumors that would classically be grouped together. The variation among tumors in global gene expression patterns is, however, orderly and systematic: it provides a distinctive and reproducible signature for each patient's tumor, and even paints a picture of the biological differences among tumors. Moreover, we have found that variation in expression profiles can highlight unrecognized similarities and differences among tumors, and can provide a basis for systematic clustering of subsets of tumors. We therefore believe that underlying the apparent heterogeneity among cancers that we currently call by the same name, there may be a systematic taxonomy that is not readily apparent from histology or from the small set of markers usually used to define subgroups of tumors. We propose to characterize the molecular variation among cancers of the breast, prostate, brain, and liver by systematically and quantitatively measuring variation in transcript abundance for at least 20,000 different genes in several hundred independent tumor samples of each of these tumor types. We will use multivariate clustering methods to search for ways to group tumors into clusters that are internally coherent in their expression patterns and thus, we hope, in their clinical behavior. Most of the tumor samples now available for the large retrospective studies that will be required to test the clinical utility of the new taxonomic groups we define are not suitable for analysis of gene expression at the RNA level. They are, however, well suited to immunohistochemical characterization.
To make the transition from exploration and discovery of the molecular variation in cancer to testing its connection to clinical behavior, we therefore propose to identify a large set of genes whose expression patterns vary most, and most independently, among the tumors we study, and to raise antibodies against their predicted protein products. These antibodies will be used for immunohistochemistry, to characterize the variation in expression of the corresponding proteins among tumors. These antibody reagents will then be used for retrospective studies aimed at classifying tumors for which the natural history and treatment response are already known, to determine whether a new cancer taxonomy based on gene expression patterns can successfully order these cancers into groups with distinctive and consistent natural histories and patterns of response to treatment. These antibodies will also aid investigations of the molecular pathogenesis of cancer. Some of them may provide a basis for non-invasive screens for early detection of cancers, and others could eventually even be used therapeutically.