Several factors complicate linkage analysis in common, complex diseases such as prostate cancer. Since prostate cancer is a common disease, families may include members who have developed sporadic forms of the disease. These subjects, termed phenocopies, can confound linkage analysis. Also, unlike BRCA-associated breast cancer, which presents relatively early in life 40, or familial adenomatous polyposis, which has a distinctive clinical presentation 41, there is little to clinically or pathologically distinguish prostate cancer densely clustered in families from sporadic disease.
Also, since most sporadic cancers occur relatively late in life, it is difficult to obtain DNA samples and clinical data from more than one generation. Investigation into cancer risk clearly indicates that Mendelian segregation of this phenotype is the exception rather than the rule. The genetic risk in complex disease comprises multiple alleles, with no single allele being fully deterministic for driving tumorigenesis. To identify alleles associated with complex phenotypes, the focus shifted from highly penetrant alleles clustered within families to more common variants present in larger, unrelated populations (Table 1).
Initial efforts to identify modestly penetrant alleles associated with cancer risk relied on resequencing candidate genes predicted to play a role in disease risk. Associations were sought by measuring differences in allele frequencies at polymorphisms between cases and controls.
While convincing findings have been reported for certain common malignancies, such as bladder cancer 42, 43, the candidate gene approach has yielded few associations robustly validated in independent cohorts. In prostate cancer, for example, the gene for the androgen receptor warranted significant attention given its known role in prostate carcinogenesis. However, extensive annotation of variation across the gene in prostate cancer cases and matched controls yielded no inherited variants associated with risk. A less biased approach was needed to identify the alleles associated with complex disease.
Genome-wide association studies (GWAS) scan the genome for polymorphisms, usually single nucleotide polymorphisms (SNPs), which are associated with a trait of interest. GWAS compare allele frequencies among individuals with a phenotype of interest to frequencies among unaffected individuals. Over the past 10 years, several advances made the implementation of GWAS possible. GWAS take an unbiased approach in the search for genetic polymorphisms associated with disease, evaluating a substantial portion of the variation across the genome.
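The case-control comparison of allele frequencies described above can be illustrated with a single-SNP allelic test. The following is a minimal Python sketch, not the method of any particular study: it assumes a 2x2 table of allele counts and a Pearson chi-square test with one degree of freedom, and the function name and the counts in the usage note are hypothetical.

```python
import math


def allelic_association(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Return (odds_ratio, chi_square, p_value) for a 2x2 allele-count table.

    Rows are cases and controls; columns are the tested allele and the
    other allele. All four counts are assumed to be positive.
    """
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    total = case_alt + case_ref + ctrl_alt + ctrl_ref
    row_totals = [case_alt + case_ref, ctrl_alt + ctrl_ref]
    col_totals = [case_alt + ctrl_alt, case_ref + ctrl_ref]

    # Pearson chi-square: sum of (observed - expected)^2 / expected.
    chi_square = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi_square += (table[i][j] - expected) ** 2 / expected

    # Odds of carrying the tested allele in cases relative to controls.
    odds_ratio = (case_alt * ctrl_ref) / (case_ref * ctrl_alt)

    # With 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return odds_ratio, chi_square, p_value
```

For example, `allelic_association(600, 1400, 500, 1500)` compares a hypothetical allele seen on 600 of 2000 case chromosomes with the same allele seen on 500 of 2000 control chromosomes. A real GWAS repeats a test of this kind at every genotyped SNP and typically adjusts for covariates such as ancestry, which this sketch omits.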
While the International HapMap Project has catalogued over 10 million SNPs, it is not necessary to genotype and analyze all SNPs in order to achieve genome-wide coverage for common alleles. Nearby SNPs are co-inherited more often than would be expected by chance.
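The tendency of nearby SNPs to be co-inherited is commonly summarized by a pairwise statistic such as r². The following is a minimal Python sketch, assuming phased haplotype counts are available for two biallelic SNPs; the function name and the example counts are hypothetical and are not drawn from HapMap data.

```python
def ld_r_squared(n_ab, n_a_only, n_b_only, n_neither):
    """Compute r^2 between two biallelic SNPs from haplotype counts.

    n_ab      : haplotypes carrying allele A at SNP 1 and allele B at SNP 2
    n_a_only  : haplotypes carrying A but not B
    n_b_only  : haplotypes carrying B but not A
    n_neither : haplotypes carrying neither A nor B
    """
    total = n_ab + n_a_only + n_b_only + n_neither
    p_ab = n_ab / total
    p_a = (n_ab + n_a_only) / total
    p_b = (n_ab + n_b_only) / total

    # D measures the excess of the A-B haplotype over what independent
    # inheritance of the two alleles would predict.
    d = p_ab - p_a * p_b

    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denominator == 0:
        raise ValueError("r^2 is undefined when either SNP is monomorphic")
    return d * d / denominator
```

An r² of 1 means that one SNP predicts the other perfectly in the sample, so genotyping either one captures both; an r² of 0 means that the two are inherited independently.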
A single SNP can serve as a proxy for much of the variation in the surrounding genetic region, and due to this linkage disequilibrium (LD), the number of genotypes necessary to conduct a GWAS is greatly reduced. LD must be empirically determined and differs across ethnic groups. Nonetheless, testing up to a million independent SNPs raises important statistical considerations. Due to the potential for a large number of false positives, strict statistical thresholds are necessary to identify true positives rather than associations observed merely by chance.
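One common way to set such a threshold is a Bonferroni correction, which divides the desired family-wise error rate by the number of independent tests. The text does not name a specific correction, so the choice of Bonferroni, the error rate of 0.05, and the SNP labels below are assumptions made for illustration only.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold for a family-wise error rate alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def significant_snps(p_values, alpha=0.05, n_tests=None):
    """Return the SNP labels whose p-value passes the corrected threshold.

    p_values : dict mapping a SNP label to its association p-value
    n_tests  : number of independent tests; defaults to len(p_values)
    """
    if n_tests is None:
        n_tests = len(p_values)
    threshold = bonferroni_threshold(alpha, n_tests)
    return [snp for snp, p in p_values.items() if p < threshold]
```

With one million independent tests, `bonferroni_threshold(0.05, 1_000_000)` gives a per-SNP threshold of about 5 × 10⁻⁸, so an association with p = 10⁻⁶ would not be declared significant.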
To reach this statistical threshold, large datasets comprising thousands of cases and controls are necessary. Since , over bona fide risk alleles have been discovered for dozens of cancers, including approximately 40 polymorphisms associated with prostate cancer risk 50-62 see http: An encouraging observation is the reproducibility of most findings in independent cohorts. Odds ratios associated with risk alleles for common, polygenic diseases tend to be modest, generally less than 1.
The power to detect an effect of this size requires very large study populations. Assembling adequately sized cohorts can be extremely challenging. As larger cohorts are collected and GWAS combine data in meta-analyses, more trait-associated variants with smaller minor allele frequencies may emerge.
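The link between a modest odds ratio and cohort size can be made concrete with a standard two-proportion sample size approximation. The sketch below is illustrative only: it assumes an allelic test, a two-sided significance level of 5 × 10⁻⁸, 80% power, and equal numbers of cases and controls, and the allele frequency and odds ratios in the usage note are hypothetical.

```python
import math
from statistics import NormalDist


def case_allele_freq(control_freq, odds_ratio):
    """Allele frequency in cases implied by a control frequency and an allelic odds ratio."""
    odds = odds_ratio * control_freq / (1 - control_freq)
    return odds / (1 + odds)


def alleles_per_group(control_freq, odds_ratio, alpha=5e-8, power=0.8):
    """Approximate number of alleles needed in each group (cases, controls).

    Each person contributes two alleles, so the number of individuals
    per group is roughly half the returned value.
    """
    if odds_ratio == 1:
        raise ValueError("no sample size can detect an odds ratio of exactly 1")
    p0 = control_freq
    p1 = case_allele_freq(p0, odds_ratio)
    p_bar = (p0 + p1) / 2

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)

    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p0 * (1 - p0) + p1 * (1 - p1))
    ) ** 2
    return math.ceil(numerator / (p1 - p0) ** 2)
```

Comparing `alleles_per_group(0.3, 1.1)` with `alleles_per_group(0.3, 1.5)` shows how quickly the required cohort grows as the odds ratio approaches 1.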
Despite the large number of cancer risk loci reported and validated to date, these variants only explain a fraction of the estimated heritability. Where is the rest of the genetic contribution to disease?
There are several explanations for this gap between what has been achieved by GWAS and what remains to be found. Very rare alleles associated with disease may have greater impact. Rather than odds ratios of 1. The 1000 Genomes Project, a catalogue of human genetic variation based on whole-genome sequencing, presents the opportunity to explore this possibility given suitably large cohorts. Another possibility is that genome-wide surveys of structural variants, such as copy number variation, will account for some of the heritability gap.
These variants are poorly represented in the arrays used for most GWAS. Finally, it is possible that gene-gene and gene-environment interactions play a significant role in inherited risk.
The complexities involved in the study of these factors are daunting, but strides are being made. Certain trends have emerged in cancer-related GWAS. There are regions across the genome containing inherited variants for more than one disease.
One of these regions is chromosome 8q24, first identified as a prostate cancer risk locus in both European and African American populations 51. The region includes the well-known oncogene MYC.
Several other prostate cancer GWAS converged on 8q24, and, to date, at least nine SNPs, all independently associated with prostate cancer risk, reside at 8q24 50, 61. Intriguingly, risk markers for breast, colon, and bladder cancer and for chronic lymphocytic leukemia have been discovered at this chromosomal locus. Similarly, chromosome 5p15 harbors multiple risk variants, including SNPs for prostate cancer, glioma, pancreatic cancer, bladder cancer, lung cancer, breast cancer, uterine cancer, melanoma, and basal cell carcinoma 72. The region contains the gene TERT, which is involved in telomerase activity.
Mutations in this gene have been implicated in bone marrow failure syndromes and hematologic malignancies 79. Variants associated with prostate cancer in this region are also associated with type 2 diabetes. However, the effects of the risk allele are in opposite directions for the two phenotypes, raising interesting questions regarding the relationship between prostate tumorigenesis and metabolic processes. Another trend in cancer GWAS is the difference in risk allele discovery across diseases.
GWAS in prostate cancer, for example, have yielded more associated variants than GWAS in other common cancers, such as lung cancer. There are several possible reasons for this. Due to its ubiquity and the relatively good health of men with the disease, large cohorts have been assembled more readily.
Also, prostate cancer has a stronger inherited component compared with other common cancers. Prostate cancers may also be more homogeneous than other cancers. For example, case series of lung cancer, for which fewer than 10 associated variants have been found, may include genetically distinct subtypes of disease, reducing the statistical power to find an association.
Breast cancer GWAS results demonstrate certain polymorphisms that appear specific for estrogen receptor (ER)-positive disease and others for ER-negative disease. GWAS data for populations other than those of European ancestry are generally lacking.
While many risk alleles replicate across ethnic groups, there may be cases where the genetic architecture of disease risk differs. This can have substantial implications in any personalized approach to patient care. Further work across multiple ethnic groups should be pursued in order to have a composite picture of disease risk.
Fine mapping is a method used to home in on the allele or alleles truly responsible for a given phenotype. A strategy used to comprehensively interrogate a newly discovered disease risk locus begins by resequencing the region in a set of cases and controls to ascertain the full complement of germline variants in the population 74, 83. Each variant is then analyzed in a larger set of cases and controls for association with the trait.
Statistical models are used to determine the allele or set of alleles that most exhaustively accounts for the association. The functional consequences of inheriting a risk allele are not readily apparent. Insight into the mechanisms underlying associations between risk loci and cancer will increase understanding of the genes and pathways mediating tumorigenesis.
Inherited variants can influence phenotype in several ways. Because a majority of cancer-related variants reside in non-coding regions, most experience to date comes from examining the role of risk SNPs in gene regulation. It is well established that certain germline variants, referred to as expression quantitative trait loci (eQTLs), can affect transcription locally or at considerable genomic distances 88. Interrogation of two cancer risk loci discovered by GWAS, 8q24 and 10q, illustrates the ways in which the mechanisms of inherited risk may be revealed.
Fine mapping across the risk locus demonstrated that rs is the variant most strongly associated with risk. Decreased levels of PSP94, the protein encoded by MSMB, are associated with prostate cancer risk. Electrophoretic mobility shift assays and luciferase transfection studies showed that genotype at the locus influences MSMB activity 93. Associations between genotype at rs and expression of nearby genes were measured in 84 human prostate tissue specimens. The allele is also associated with decreased MSMB expression in urine, a proposed biomarker. Notably, all 8q24 risk polymorphisms reside in intergenic, non-coding regions of the genome.
As with the 10q11 prostate cancer risk allele and the FGFR2 breast cancer allele, it was hypothesized that the 8q24 risk loci are eQTLs.
However, there does not appear to be an association between risk allele status and MYC expression. Evidence has accumulated implicating 8q24 colon and prostate cancer risk alleles in the activity of genetic enhancers, elements capable of affecting expression of one or more genes over long distances 99. Further evidence suggests that these enhancer elements are in long-range contact with MYC across hundreds of kb. These findings suggest involvement by MYC in prostate cancer risk and may provide a paradigm for investigating other risk regions.
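The eQTL question raised above, whether genotype at a risk SNP is associated with expression of a nearby gene, is often framed as a regression of expression on the number of risk alleles carried. The sketch below is a minimal illustration in plain Python that assumes an additive model fitted by ordinary least squares; the function name and the values in the usage note are hypothetical and are not data from the studies cited here.

```python
def eqtl_slope(risk_allele_dosages, expression_levels):
    """Fit expression = intercept + slope * dosage by ordinary least squares.

    risk_allele_dosages : number of risk alleles per sample (0, 1 or 2)
    expression_levels   : expression measurement for the same samples
    Returns (slope, intercept); a negative slope means expression falls
    as the number of risk alleles rises.
    """
    if len(risk_allele_dosages) != len(expression_levels):
        raise ValueError("dosages and expression levels must be paired")
    n = len(risk_allele_dosages)
    mean_x = sum(risk_allele_dosages) / n
    mean_y = sum(expression_levels) / n

    s_xy = sum(
        (x - mean_x) * (y - mean_y)
        for x, y in zip(risk_allele_dosages, expression_levels)
    )
    s_xx = sum((x - mean_x) ** 2 for x in risk_allele_dosages)
    if s_xx == 0:
        raise ValueError("all samples share one genotype; slope is undefined")

    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

For instance, `eqtl_slope([0, 0, 1, 1, 2, 2], [10, 10, 8, 8, 6, 6])` describes a hypothetical gene whose expression drops by two units per risk allele. A slope near zero, as reported for MYC at 8q24, would indicate no detectable relationship in the tissue sampled. A full analysis would also test the slope for significance and adjust for covariates, which this sketch leaves out.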
The discoveries at 8q24 demonstrate the potential for GWAS results to elucidate the underpinnings of inherited risk. The variants can also lend insight into cancer biology. However, it is less clear whether the newly discovered risk markers have clinical utility. Clinical utility is a measure of the potential benefits of a test relative to its risks and costs.
A biomarker for cancer risk should be affordable, accurate, and easily interpretable by health care providers and patients. As predictors of risk, germline genetic markers have a natural advantage over many current biomarkers because they are static; they are ever-present and do not fluctuate with time or clinical condition.
For example, markers such as PSA only reach clinical attention when prostate cancer has presumably already developed, whereas inherited risk SNPs are testable at any time prior to the presence of disease. These considerations must be balanced, however, against the quantity of information gained by the addition of genetic risk factors.
Much work in this area has involved prostate cancer. Prostate cancer is the second leading cause of cancer-related death among men in the U.S. PSA is widely used as a biomarker for disease, but is imperfect. It is not adequately specific for most men with abnormal levels and does a poor job of distinguishing aggressive from indolent disease. Other variables, such as family history and ethnicity, are predictive, but not clinically useful.
Zheng et al. demonstrated that the risk of prostate cancer correlates with an increasing number of risk alleles.
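A relationship of this kind can be examined by counting the risk alleles each person carries and comparing the odds of disease across count categories. The following Python sketch is a generic illustration under that assumption, not a reconstruction of the analysis by Zheng et al.; the function names and the counts in the usage note are hypothetical.

```python
from collections import Counter


def risk_allele_count(dosages):
    """Total number of risk alleles carried across a panel of SNPs.

    dosages : iterable with one entry (0, 1 or 2) per SNP
    """
    return sum(dosages)


def odds_ratios_by_count(case_counts, control_counts, reference=0):
    """Odds ratio of disease for each risk-allele count versus a reference count.

    case_counts / control_counts : one risk-allele count per individual
    Categories with no cases or no controls are skipped.
    """
    cases = Counter(case_counts)
    controls = Counter(control_counts)
    if cases[reference] == 0 or controls[reference] == 0:
        raise ValueError("reference category needs both cases and controls")

    reference_odds = cases[reference] / controls[reference]
    return {
        count: (cases[count] / controls[count]) / reference_odds
        for count in sorted(controls)
        if cases[count] and controls[count]
    }
```

Given hypothetical cohorts in which cases carry 0, 1, and 2 risk alleles in a 10:20:30 ratio and controls in a 30:30:30 ratio, the function returns odds ratios of roughly 1, 2, and 3 for the three categories, which is the pattern of rising risk with rising allele count that the text describes.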