In recognition of its strong community support, we will refer to chimpanzee chromosomes using the orthologous numbering nomenclature proposed by ref.
We sequenced the genome of a single male chimpanzee (Clint; Yerkes pedigree number C; Supplementary Table S1), a captive-born descendant of chimpanzees from the West Africa subspecies Pan troglodytes verus, using a whole-genome shotgun (WGS) approach 19. The assembly represents a consensus of two haplotypes, with one allele from each heterozygous position arbitrarily represented in the sequence.
Assessment of quality and coverage. Nucleotide-level accuracy is high by several measures. Comparison of the WGS sequence to 1. Comparison of protein-coding regions aligned between the WGS sequence, the recently published sequence of chimpanzee chromosome 21 ref. Structural accuracy is also high based on comparison with finished BACs from the primary donor and other chimpanzees, although the relatively low level of sequence redundancy limits local contiguity.
No misoriented contigs were found. The most problematic regions are those containing recent segmental duplications. These results are consistent with the described limitations of current WGS assembly for regions of segmental duplication. Detailed analysis of these rapidly changing regions of the genome is being performed with more directed approaches. The draft sequence of the chimpanzee genome also facilitates genome-wide studies of genetic diversity among chimpanzees, extending recent work 28, 29, 30. We sequenced and analysed sequence reads from the primary donor, four other West African and three central African chimpanzees (Pan troglodytes troglodytes) to discover polymorphic positions within and between these individuals (Supplementary Table S). A total of 1.
Heterozygosity rates were estimated to be 9. The diversity in West African chimpanzees is similar to that seen for human populations 32 , whereas the level for central African chimpanzees is roughly twice as high.
The observed heterozygosity in Clint is broadly consistent with West African origin, although there are a small number of regions of distinctly higher heterozygosity. These may reflect a small amount of central African ancestry, but more likely reflect undetected regions of segmental duplications present only in chimpanzees. We set out to study the mutational events that have shaped the human and chimpanzee genomes since their last common ancestor.
We explored changes at the level of single nucleotides, small insertions and deletions, interspersed repeats and chromosomal rearrangements. The analysis is nearly definitive for the smallest changes, but is more limited for larger changes, particularly lineage-specific segmental duplications, owing to the draft nature of the genome sequence.
Genome-wide rates. We calculate the genome-wide nucleotide divergence between human and chimpanzee to be 1. The differences between one copy of the human genome and one copy of the chimpanzee genome include both the sites of fixed divergence between the species and some polymorphic sites within each species.
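As a rough illustration of how a per-site divergence figure of this kind could be computed from a pairwise alignment, the following minimal sketch counts mismatches over aligned positions where both sequences carry an unambiguous base. The function names, the toy sequences and the window size are hypothetical and are not taken from the analysis described here, whose actual pipeline is not specified in this text.

```python
def divergence(seq_a, seq_b):
    """Fraction of aligned A/C/G/T positions at which two sequences differ.

    Positions with a gap or an ambiguous base in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    valid = set("ACGT")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in valid and b in valid:
            compared += 1
            if a != b:
                mismatches += 1
    return mismatches / compared if compared else float("nan")


def windowed_divergence(seq_a, seq_b, window):
    """Divergence in consecutive, non-overlapping alignment windows."""
    return [
        divergence(seq_a[i:i + window], seq_b[i:i + window])
        for i in range(0, len(seq_a), window)
    ]
```

Applied to a real alignment with a much larger window, the per-window values would give the kind of regional distribution whose spread is discussed in the text; note that a single pairwise comparison of this sort cannot by itself separate fixed differences from polymorphic sites.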
Nucleotide divergence rates are not constant across the genome, as has been seen in comparisons of the human and murid genomes 16, 17, 24, 35. The average divergence in 1-Mb segments fluctuates with a standard deviation of 0.
Figure caption: The edges of the box correspond to quartiles; the notches to the standard error of the median; and the vertical bars to the range. The X and Y chromosomes are clear outliers, but there is also high local variation within each of the autosomes.
Regional variation in divergence could reflect local variation in either mutation rate or other evolutionary forces.
Among the latter, one important force is genetic drift, which can cause substantial differences in divergence time across loci when comparing closely related species, as the divergence time for orthologues is the sum of two terms: t1, the time since speciation, and t2, the coalescence time for orthologues within the common ancestral population. Other potential evolutionary forces are positive or negative selection. Although it is more difficult to quantify the expected contributions of selection in the ancestral population 41, 42, 43, it is clear that the effects would have to be very strong to explain the large-scale variation observed across mammalian genomes 16. There is tentative evidence from in-depth analysis of divergence and diversity that natural selection is not the major contributor to the large-scale patterns of genetic variability in humans 45, 46. For these reasons, we suggest that the large-scale variation in the human-chimpanzee divergence rate primarily reflects regional variation in mutation rate.
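The two-term divergence time described above (time since speciation plus coalescence time in the ancestral population) can be sketched numerically. This is a minimal sketch under a standard neutral coalescent assumption that is not stated in the text: for two lineages in a randomly mating diploid ancestral population of constant effective size N, the coalescence time is exponentially distributed with mean 2N generations. The function names and all parameter values are hypothetical.

```python
import random


def expected_divergence_time(t1_generations, ancestral_ne):
    """Expected t1 + t2, with E[t2] = 2 * Ne generations (assumed model)."""
    return t1_generations + 2 * ancestral_ne


def sample_divergence_times(t1_generations, ancestral_ne, n_loci, seed=0):
    """Draw per-locus divergence times t1 + t2 under the assumed model.

    t1 is shared by all loci; t2 varies from locus to locus through drift.
    """
    rng = random.Random(seed)
    mean_t2 = 2 * ancestral_ne
    return [
        t1_generations + rng.expovariate(1.0 / mean_t2)
        for _ in range(n_loci)
    ]
```

Under these assumptions the locus-to-locus spread in divergence time comes entirely from t2, which is why drift matters most when the ancestral population was large relative to the time since speciation.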
Chromosomal variation in divergence rate. Variation in divergence rate is evident even at the level of whole chromosomes Fig. The most striking outliers are the sex chromosomes, with a mean divergence of 1. The likely explanation is a higher mutation rate in the male compared with female germ line. The higher mutation rate in the male germ line is generally attributed to the 5-6-fold higher number of cell divisions undergone by male germ cells. We reasoned that this would affect mutations resulting from DNA replication errors (the rate should scale with the number of cell divisions) but not mutations resulting from DNA damage such as deamination of methyl CpG to TpG (the rate should scale with time).
This intermediate value is a composite of the rates of CpG loss and gain, and is consistent with roughly equal rates of CpG to TpG transitions in the male and female germ line 51. Significant variation in divergence rates is also seen among autosomes Fig. Additional factors thus influence the rate of divergence between chimpanzee and human chromosomes.
These factors are likely to act at length scales significantly shorter than a chromosome, because the standard deviation across autosomes 0. We therefore sought to understand local factors that contribute to variation in divergence rate. Contribution of CpG dinucleotides. Sites containing CpG dinucleotides in either species show a substantially elevated divergence rate of The former process is known to occur at a rapid rate per base due to frequent methylation of cytosines in a CpG context and their frequent deamination 53 , 54 , whereas the latter process probably proceeds at a rate more typical of other nucleotide substitutions.
Because of the high rate of CpG substitutions, regional divergence rates would be expected to correlate with regional CpG density. S2 , suggesting that higher-order effects modulate the rates of two very different mutation processes see also ref. Increased divergence in distal regions. The most striking regional pattern is a consistent increase in divergence towards the ends of most chromosomes Fig.
The phenomenon correlates better with physical distance than relative position along the chromosomes and may partially explain why smaller chromosomes tend to have higher divergence (Supplementary Fig. S3; see also ref.). These observations suggest that large-scale chromosomal structure, directly or indirectly, influences regional divergence patterns.
Correlation with chromosome banding. Elucidation of the relative contributions of these and other mechanisms will be important for formulating accurate models for population genetics, natural selection, divergence times and the evolution of genome-wide sequence composition. In regions with recombination rates less than 0.
In regions with recombination rates greater than 2. Correlation with regional variation in the murid genome. Given that sequence divergence shows regional variation in both hominids (human-chimpanzee) and murids (mouse-rat), we asked whether the regional rates are positively correlated between orthologous regions.
Comparative analysis of the human and murid genomes has suggested such a correlation 58 , 59 , 60 , but the chimpanzee sequence provides a direct opportunity to compare independent evolutionary processes between two mammalian clades.
We compared the local divergence rates in hominids and murids across major orthologous segments in the respective genomes Fig. The same general effect is observed (albeit less pronounced) if CpG dinucleotides are excluded Supplementary Fig. Segments that are distal in murids do not show elevated divergence rates, which is consistent with this model, because the recombination rates of distal regions are not as elevated in mouse and rat. Taken together, these observations suggest that sequence divergence rate is influenced by both conserved factors (stable across mammalian evolution) and lineage-specific factors (such as proximity to the telomere or recombination rate, which may change with chromosomal rearrangements).
We next studied the indel events that have occurred in the human and chimpanzee lineages by aligning the genome sequences to identify length differences.
We will refer below to all events as insertions relative to the other genome, although they may represent insertions or deletions relative to the genome of the common ancestor. Nearly all of the human insertions are completely covered, whereas only half of the chimpanzee insertions are completely covered.
Figure caption: Sequences present in chimpanzee but not in human (blue) or present in human but not in chimpanzee (red) are shown. The prominent spike around nucleotides corresponds to SINE insertion events.
These cases include 34 regions that involve exons from known genes, which are discussed in a subsequent section. Although we have no direct measure of large insertions in the chimpanzee genome, it appears likely that the situation is similar.
We next used the catalogue of lineage-specific transposable element copies to compare the activity of transposons in the human and chimpanzee lineages (Table 2). Endogenous retroviruses. HERV-K was found to be active in both lineages, with at least 73 human-specific insertions (7 full length and 66 solo long terminal repeats (LTRs)) and at least 45 chimpanzee-specific insertions (1 full length and 44 solo LTRs).
Against this background, it was surprising to find that the chimpanzee genome has two active retroviral elements (PtERV1 and PtERV2) that are unlike any older elements in either genome; these must have been introduced by infection of the chimpanzee germ line.
The larger family (PtERV1) is more homogeneous and has over copies. PtERV1-like elements are present in the rhesus monkey, olive baboon and African great apes but not in human, orang-utan or gibbon, suggesting separate germline invasions in these species. Higher Alu activity in humans.
Most chimpanzee-specific elements belong to a subfamily (AluYc1) that is very similar to the source gene in the common ancestor. By contrast, most human-specific Alu elements belong to two new subfamilies (AluYa5 and AluYb8) that have evolved since the chimpanzee-human divergence and differ substantially from the ancestral source gene. It seems likely that the resurgence of Alu elements in humans is due to these potent new source genes.
However, based on an examination of available finished sequence, the baboon shows a 1. Possible explanations include: gene conversion by nearby older elements; processed pseudogenes arising from a spurious transcription of an older element; precise excision from the chimpanzee genome; or high local mutation rate.
In any case, the presence of such anomalies suggests that caution is warranted in the use of single-repeat elements as homoplasy-free phylogenetic markers. The latter distribution is consistent with the fact that Alu retrotransposition is mediated by L1 ref.
Murid genomes revealed no change in SINE distribution with age. With the availability of the chimpanzee genome, it is possible to classify the youngest Alu copies more accurately and thus begin to distinguish these possibilities. The figure is similar to figure 23 of ref.
Equal activity of L1 in both species. In principle, incomplete reverse transcription could result in insertions of the flanking sequence only (without any L1 sequence), mobilizing gene elements such as exons, but we found no evidence of this.
Retrotransposed gene copies. The L1 machinery also mediates retrotransposition of host messenger RNAs, resulting in many intronless (processed) pseudogenes in the human genome 75, 76. We identified lineage-specific retrotransposed gene copies in human and in chimpanzee (Supplementary Table S). As expected 78, ribosomal protein genes constitute the largest class in both species.
The second largest class in chimpanzee corresponds to zinc finger C2H2 genes, which are not a major class in the human genome. The third most active element since speciation has been SVA, which created about 1, copies in each lineage.
This element is of particular interest because each copy carries a sequence that satisfies the definition of a CpG island 81 and contains potential transcription factor binding sites; the dispersion of 1, SVA copies could therefore be a source of regulatory differences between chimpanzee and human (Supplementary Table S). At least three human genes contain SVA insertions near their promoters (Supplementary Table S21), one of which has been found to be differentially expressed between the two species 82, 83, but additional investigations will be required to determine whether the SVA insertion directly caused this difference.
Homologous recombination between interspersed repeats. Human—chimpanzee comparison also makes it possible to study homologous recombination between nearby repeat elements as a source of genomic deletions. Similarly, we found 26 and 48 instances involving adjacent L1 copies and 8 and 22 instances involving retroviral LTRs in human and chimpanzee, respectively. None of the repeat-mediated deletions removed an orthologous exon of a known human gene in chimpanzee.
The genome comparison allows one to estimate the dependency of homologous recombination on divergence and distance. The first three points magenta involve recombination between left or right arms of one Alu inserted into another. The high number of occurrences at a distance of — nucleotides is due to the preference of integration in the A-rich tail; exclusion of this point does not change the parameters of the equation.
Finally, we examined the chimpanzee genome sequence for information about large-scale genomic alterations. Cytogenetic studies have shown that human and chimpanzee chromosomes differ by one chromosomal fusion, at least nine pericentric inversions, and in the content of constitutive heterochromatin. Human chromosome 2 resulted from a fusion of two ancestral chromosomes that remained separate in the chimpanzee lineage (chromosomes 2A and 2B in the revised nomenclature 18, formerly chimpanzee chromosomes 12 and 13); the precise fusion point has been mapped and its duplication structure described in detail 85. In accord with this, alignment of the human and chimpanzee genome sequences shows a break in continuity at this point.
We searched the chimpanzee genome sequence for the precise locations of the 18 breakpoints corresponding to the 9 pericentric inversions (Supplementary Table S). By mapping paired-end sequences from chimpanzee large insert clones to the human genome, we were able to identify 13 of the breakpoints within the assembly from discordant end alignments.
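The idea of finding inversion breakpoints from discordant end alignments can be illustrated with a small classifier. This is only a sketch of the general technique, not the procedure used here: it assumes each clone contributes two end reads mapped to the human genome, that a clone from a collinear region maps with its ends facing each other at a distance close to the insert size, and that a clone spanning an inversion breakpoint maps with both ends on the same strand. The `EndPair` record, the category labels and the insert-size bounds are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class EndPair:
    clone_id: str
    chrom_a: str
    pos_a: int
    strand_a: str  # "+" or "-"
    chrom_b: str
    pos_b: int
    strand_b: str  # "+" or "-"


def classify_pair(pair, min_insert, max_insert):
    """Label one clone's end alignments as concordant or a discordant type."""
    if pair.chrom_a != pair.chrom_b:
        return "interchromosomal"
    if pair.strand_a == pair.strand_b:
        # Same-strand ends are the signature expected across an inversion.
        return "inversion_candidate"
    left, right = sorted(
        [(pair.pos_a, pair.strand_a), (pair.pos_b, pair.strand_b)]
    )
    if left[1] != "+":
        # Ends point away from each other instead of towards each other.
        return "everted"
    span = right[0] - left[0]
    if span < min_insert or span > max_insert:
        return "size_discordant"
    return "concordant"
```

In practice a breakpoint call would rest on several independent clones giving the same discordant signal at the same location, which is also why the approach weakens in duplicated regions where end reads map ambiguously.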
The positions of five breakpoints on chromosomes 4, 5 and 12 were tested by fluorescence in situ hybridization (FISH) analysis and all were confirmed. Also, the positions of three previously mapped inversion breakpoints on chromosomes 15 and 18 matched closely those found in the assembly 87. The paired-end analysis works well in regions of unique sequence, which constitute the bulk of the genome, but is less effective in regions of recent duplication owing to ambiguities in mapping of the paired-end sequences.
However, both smaller inversions and more recent segmental duplications will require further investigations. We next sought to use the chimpanzee sequence to study the role of natural selection in the evolution of human protein-coding genes. Genome-wide comparisons can shed light on many central issues, including: the magnitude of positive and negative selection; the variation in selection across different lineages, chromosomes, gene families and individual genes; and the complete loss of genes within a lineage.
We began by identifying a set of 13, pairs of human and chimpanzee genes with unambiguous orthology for which it was possible to generate high-quality sequence alignments covering virtually the entire coding region ref. The list contains a large fraction of the entire complement of human genes, although it under-represents gene families that have undergone recent local expansion such as olfactory receptors and immunoglobulins.
To facilitate comparison with the murid lineage, we also compiled a set of 7, human, chimpanzee, mouse and rat genes with unambiguous orthology and high-quality sequence alignments (Supplementary Table S). To assess the rate of evolution for each gene, we estimated KA, the number of coding base substitutions that result in amino acid change as a fraction of all such possible sites (the non-synonymous substitution rate).
Because the background mutation rate varies across the genome, it is crucial to normalize KA for comparisons between genes. Classically, the background rate is estimated by KS, the synonymous substitution rate (coding base substitutions that, because of codon redundancy, do not result in amino acid change). KA and KS were also estimated for each lineage separately using mouse and rat as outgroups Fig.
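The normalization described above amounts to dividing one per-site rate by another. The sketch below shows that arithmetic only, assuming the counts of substitutions and of possible sites have already been obtained from an alignment; it applies no correction for multiple substitutions at a site and is not the estimation procedure used in this study, which is not detailed here. The function names and the example counts are hypothetical.

```python
def substitution_rate(substitutions, sites):
    """Substitutions as a fraction of the sites at which they could occur."""
    if sites <= 0:
        raise ValueError("number of sites must be positive")
    return substitutions / sites


def ka_ks(nonsyn_subs, nonsyn_sites, syn_subs, syn_sites):
    """Return (KA, KS, KA/KS) from raw counts, without multiple-hit correction."""
    ka = substitution_rate(nonsyn_subs, nonsyn_sites)
    ks = substitution_rate(syn_subs, syn_sites)
    if ks > 0:
        ratio = ka / ks
    elif ka > 0:
        ratio = float("inf")  # no synonymous changes observed
    else:
        ratio = float("nan")  # no changes of either kind observed
    return ka, ks, ratio
```

For example, `ka_ks(2, 1000, 3, 300)` describes a hypothetical gene whose non-synonymous rate is about one fifth of its synonymous rate. With only a handful of substitutions per gene, such per-gene ratios are noisy, which is why pooling genes into sets gives firmer conclusions.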
The branch lengths are proportional to the absolute rates of amino acid divergence. Evolutionary constraint on amino acid sites within the hominid lineage. The median numbers of non-synonymous and synonymous substitutions per gene are two and three, respectively.
The close similarity of human and chimpanzee genes necessarily limits the ability to make strong inferences about individual genes, but there is abundant data to study important sets of genes. The value is much lower than some recent estimates based on limited sequence data ranging as high as 0.
Because synonymous mutations are not entirely neutral (see below), the actual proportion of amino acid alterations with deleterious consequences may be higher. Evolutionary constraint on synonymous sites within hominid lineage.
We next explored the evolutionary constraints on synonymous sites, specifically fourfold degenerate sites. Because such sites have no effect on the encoded protein, they are often considered to be selectively neutral in mammals. We re-examined this assumption by comparing the divergence at fourfold degenerate sites with the divergence at nearby intronic sites.
This result resolves recent conflicting reports based on limited data sets 45, 89 by showing that such sites are indeed under constraint. The constraint does not seem to result from selection on the usage of preferred codons, which has been detected in lower organisms 90 such as bacteria 91, yeast 92 and flies. This pattern strongly suggests that the action of purifying selection at synonymous sites is direct rather than indirect, and that other signals, for example those involved in splice site selection, may be embedded in the coding sequence and therefore constrain synonymous sites.
Mean divergence around exon boundaries at non-CpG, exonic, fourfold degenerate sites and intronic sites, relative to the closest mRNA splice junction. Comparison with murids. This would predict that genes would be under stronger purifying selection in murids than hominids, owing to their presumed larger population size. This is slightly lower than the value of 0.
Excess amino acid divergence may be explained by either increased adaptive evolution or relaxation of evolutionary constraints. As shown in the next section, the latter seems to be the principal explanation. Relaxed constraints in human evolution.
Because alleles under positive selection spread rapidly through a population, they will be found less frequently as common human polymorphisms than as human—chimpanzee differences 8. This would imply a huge quantitative role for positive selection in human evolution. With the availability of extensive data for both human polymorphism and human—chimpanzee divergence, we repeated this analysis using the same set of genes for both estimates.
Although some of the amino acid substitutions in human and chimpanzee evolution must surely reflect positive selection, the results indicate that the proportion of changes fixed by positive selection seems to be much lower than the previous estimate 8. Because the previous results involved comparison to Old World monkeys, it is possible that they reflect strong positive selection earlier in primate evolution; however, we suspect that they reflect the fact that relatively few genes were studied and that different genes were used to study polymorphism and divergence.
Relaxed negative selection pressures thus primarily explain the excess amino acid divergence in hominid genes relative to murids. We next sought to study variation in the evolutionary rate of genes within the hominid lineage by searching for unusually high or low levels of constraint for genes and sets of genes.
We searched for individual genes that have accumulated amino acid substitutions faster than expected given the neutral substitution rate; we considered these genes as potentially being under strong positive selection.
A total of of the 13, human-chimpanzee orthologues 4. Nonetheless, this set of genes may be enriched for genes that are under positive selection. The most extreme outliers include glycophorin C, which mediates one of the Plasmodium falciparum invasion pathways in human erythrocytes; granulysin, which mediates antimicrobial activity against intracellular pathogens such as Mycobacterium tuberculosis; as well as genes that have previously been shown to be undergoing adaptive evolution, such as the protamines and semenogelins involved in reproduction and the Mas-related gene family involved in nociception. With similar follow-up studies on candidates from this list, one may be able to draw conclusions about positive selection on other individual genes.
In subsequent sections, we examine the rate of divergence for sets of related genes with the aim of detecting subtler signals of accelerated evolution. We explored how the rate of evolution varies regionally across the genome. Several studies of mammalian gene evolution have noted that the rate of amino acid substitution shows local clustering, with proteins encoded by nearby genes evolving at correlated rates 16. Variation across chromosomes. A subsequent study of a collection of chimpanzee ESTs gave contradictory results. With our larger data set, we re-examined this issue and found no evidence of accelerated evolution on chromosomes with major rearrangements, even if we considered each rearrangement separately (Supplementary Table S). The higher mean seems to reflect a skewed distribution at both high and low values, with the median value 0.
The excess of low values may reflect greater purifying selection at some genes, owing to the hemizygosity of chromosome X in males. The excess of high values may reflect increased adaptive selection also resulting from hemizygosity, if a considerable proportion of advantageous alleles are recessive. Variation in local gene clusters. We next searched for genomic neighbourhoods with an unusually high density of rapidly evolving genes.
A total of 16 such neighbourhoods were found, which greatly exceeds random expectation (Table 4). Repeating the analysis with larger windows (25, 50 and orthologues) did not identify additional rapidly diverging regions.
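A window-based neighbourhood search of this general kind could look like the sketch below. It is an assumption-laden illustration rather than the method used here: it takes one evolutionary-rate value per gene in chromosomal order, slides a window of a fixed number of genes along that list, and reports windows whose median exceeds a cut-off. The choice of the median as the summary statistic, the window size and the threshold are all hypothetical, and the comparison against random expectation mentioned in the text is not implemented.

```python
from statistics import median


def rapid_windows(rates, window, threshold):
    """Start indices of gene windows whose median rate exceeds `threshold`.

    `rates` holds one value per gene, ordered by position on a chromosome.
    """
    hits = []
    for start in range(len(rates) - window + 1):
        if median(rates[start:start + window]) > threshold:
            hits.append(start)
    return hits
```

Overlapping hits, such as consecutive start indices, would normally be merged into a single neighbourhood before counting, and the count would then be compared with the same statistic computed on shuffled gene orders.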
In nearly all cases, the regions contain local clusters of phylogenetically and functionally related genes. The rapid diversification of gene families, postulated by ref. Most of the clusters are associated with functional categories such as host defence and chemosensation (see below). Examples include the epidermal differentiation complex encoding proteins that help form the cornified layer of the skin barrier (Supplementary Fig. S8), the WAP-domain cluster encoding secreted protease inhibitors with antibacterial activity, and the Siglec cluster encoding CD33-related genes.
Rapid evolution in these clusters does not seem to be unique to either human or chimpanzee. We next studied variation in the evolutionary rate of functional categories of genes, based on the Gene Ontology (GO) classification. Rapidly and slowly evolving categories within the hominid lineage.
We started by searching for sets of functionally related genes with exceptionally high or low constraint in humans and chimpanzees.
The rapidly evolving categories within the hominid lineage are primarily related to immunity and host defence, reproduction, and olfaction, which are the same categories known to be undergoing rapid evolution within the broader mammalian lineage, as well as more distantly related species 15, 16. Hominids thus seem to be typical of mammals in this respect (but see below).
These include a wide range of processes including intracellular signalling, metabolism, neurogenesis and synaptic transmission, which are evidently under stronger-than-average purifying selection.
More generally, genes expressed in the brain show significantly stronger average constraint than genes expressed in other tissues. Differences between hominid and murid lineages. Having found gene categories that show substantial variation in absolute evolutionary rate within hominids, we next examined variation in relative rates between murids and hominids. Owing to the hierarchical nature of GO, the categories do not all represent independent data points. A non-redundant list of significant categories is provided in Table 8 and a complete list in Supplementary Table S. These are dominated by functions and processes related to host defence, such as immune response and lymphocyte activation.
Examples include genes encoding interleukins and various T-cell surface antigens (Cd4, Cd8, Cd). Combined with the recent observation that genes involved in host defence have undergone gene family expansion in murids 16, 17, this suggests that the immune system has undergone extensive lineage-specific innovation in murids. Additional categories that also show relative acceleration in murids include chromatin-associated proteins and proteins involved in DNA repair.
Notably, some outliers include genes with brain-related functions, compatible with a recent finding. Potential positive selection on spermatogenesis genes in the hominids was also recently noted. However, as above, it is possible that these categories could have more sites for slightly deleterious mutations and thus be more affected by population size differences.
Sequence information from more species and from individuals within species will be necessary to distinguish between the possible explanations. Differences between the human and chimpanzee lineage. One of the most interesting questions is perhaps whether certain categories have undergone accelerated evolution in humans relative to chimpanzees, because such genes might underlie unique aspects of human evolution.
As was done for hominids and murids above, we compared non-synonymous divergence for each category to search for relative acceleration in either lineage Fig.
Genes with accelerated divergence in human include homeotic, forkhead and other transcription factors that have key roles in early development.
However, given the small number of changes involved, additional data will be required to confirm this trend. There was no excess of accelerated categories on the chimpanzee lineage. The variance of these estimates is larger than that seen in the hominid—murid comparison owing to the small number of lineage-specific substitutions. Owing to the hierarchical nature of the GO ontology, the categories do not all represent independent data points.
A complete list of categories is provided in Supplementary Table S. We also compared human genes with and without disease associations, including mental retardation, for differences in mutation rate when compared to chimpanzee. We thus find minimal evidence of acceleration unique to either the human or chimpanzee lineage across broad functional categories. This is not simply due to general lack of power resulting from the small number of changes since the divergence of human and chimpanzee, because one can detect acceleration of categories in either hominid relative to either murid.
But the outliers are largely the same for both human and chimpanzee, indicating that the fraction of amino acid mutations that have contributed to human- and chimpanzee-specific patterns of evolution must be small relative to the fraction that have contributed to a common hominid and, to a large extent, mammalian pattern of evolution.
It was recently reported 10 that several functional categories are enriched for genes with evidence of positive selection in the human lineage or the chimpanzee lineage, and that these categories are largely different between the two lineages.
These results and ours differ in ways that will require further investigation. With the potential exception of some developmental regulators, the categories that ref. This suggests that positive selection and relaxation of constraints may be correlated, or alternatively, that the results of ref.
Data from additional primates, as well as advances in analytical methods, will be necessary to distinguish between these alternatives.
At present, strong evidence of positive selection unique to the human lineage is thus limited to a handful of genes. Our analysis above largely omitted genes belonging to large gene families, because gene family expansion makes it difficult to define orthologues across hominids and murids. One of the largest such families, the olfactory receptors, is known to be undergoing rapid divergence in primates. Directed study of these genes in the draft assembly has suggested that more than functional human olfactory receptors are likely to be under no evolutionary constraint. Our analysis also omitted the majority of very recently duplicated genes owing to their lower coverage in the current chimpanzee assembly.
However, recent human-specific duplications can be readily identified from the finished human genome sequence, and have previously been shown to be highly enriched for the same categories found to have high absolute rates of evolution in orthologues here; that is, olfaction, immunity and reproduction. Whereas most genes have undergone only subtle substitutions in their amino acid sequence, a few dozen have suffered more marked changes.
We found a total of 53 known or predicted human genes that are either deleted entirely (36) or partially (17) in chimpanzee (Supplementary Table S). We have so far tested and confirmed 15 of these cases by polymerase chain reaction (PCR) or Southern blotting.
Some genes may have been missed in this count owing to limitations of the draft genome sequence. In addition, some genes may have suffered chain termination mutations or altered reading frames in chimpanzee, but accurate identification of these will require higher-quality sequence. The sensitivity of the reciprocal analysis of genes disrupted in human is currently limited by the small number of independently predicted gene models for the chimpanzee. Some of the gene disruptions may be related to interesting biological differences between the species, as discussed below.
Given the substantial number of neutral mutations, only a small subset of the observed gene differences is likely to be responsible for the key phenotypic changes in morphology, physiology and behavioural complexity between humans and chimpanzees. Determining which differences are in this evolutionarily important subset and inferring their functional consequences will require additional types of evidence, including information from clinical observations and model systems. We describe some novel examples of genetic changes for which plausible functional or physiological consequences can be suggested.
Mouse and human are known to differ with respect to an important mediator of apoptosis, a caspase. The protein triggers apoptosis in response to perturbed calcium homeostasis in mice, but humans seem to lack this activity owing to several mutations in the orthologous gene that together affect the protein produced by all known splice forms; the mutations include a premature stop codon and a disruption of the SHG box required for enzymatic activity of caspases.
By contrast, the chimpanzee gene encodes an intact open reading frame and SHG box, indicating that the functional loss occurred in the human lineage. Intriguingly, loss-of-function mutations in mice confer increased resistance to amyloid-induced neuronal apoptosis without causing obvious developmental or behavioural defects. The loss of function in humans may contribute to the human-specific pathology of Alzheimer's disease, which involves amyloid-induced neurotoxicity and deranged calcium homeostasis.
Inflammatory response. Human and chimpanzee show a notable difference with respect to important mediators of immune and inflammatory responses.
Parasite resistance. The APOL1 protein is associated with the high-density lipoprotein fraction in serum and has recently been proposed to be the lytic factor responsible for resistance to certain subspecies of Trypanosoma brucei, the parasite that causes human sleeping sickness and the veterinary disease nagana. The loss of the APOL1 gene in chimpanzees could thus explain the observation that human, gorilla and baboon possess the trypanosome lytic factor, whereas the chimpanzee does not.
Sialic acid biology related proteins. Sialic acids are cell-surface sugars that mediate many biological functions. Of 54 genes involved in sialic acid biology, 47 were suitable for analysis. We confirmed and extended findings on several that have undergone human-specific changes, including disruptions, deletions and domain-specific functional changes. Lineage-specific changes were found in a complement factor H (HF1) sialic acid binding domain associated with human disease. Human SIGLEC11 has undergone gene conversion with a nearby pseudogene, correlating with acquisition of human-specific brain expression and altered binding properties.
We next sought to identify putative functional differences between the species by searching for instances in which a human disease-causing allele appears to be the wild-type allele in the chimpanzee.
Starting from 12, catalogued disease variants in 1, human genes, we identified 16 cases in which the altered sequence in a disease allele matched the chimpanzee sequence, and had plausible support in the literature (Table 7; see also Supplementary Table S). Upon re-sequencing in seven chimpanzees, 15 cases were confirmed homozygous in all individuals, whereas one (PON1 IV) appears to be a shared polymorphism (Supplementary Table S). Six cases represent de novo human mutations associated with simple mendelian disorders.
Similar cases have also been found in comparisons of more distantly related mammals, as well as between insects, and have been interpreted as a consequence of a relatively high rate of compensatory mutations. If compensatory mutations are more likely to be fixed by positive selection than by neutral drift, then the variants identified here might point towards adaptive differences between humans and chimpanzees.
The remaining ten cases represent common human polymorphisms that have been reported to be associated with complex traits, including coronary artery disease and diabetes mellitus. In all of these cases we confirmed that the disease-associated allele in humans is indeed the ancestral allele by showing that it is carried not only by chimpanzee but also by outgroups such as the macaque.
These ancestral alleles may thus have become human-specific risk factors due to changes in human physiology or environment, and the polymorphisms may represent ongoing adaptations.
The current results must be interpreted with caution, because few complex disease associations have been firmly established. The fact that the human disease allele is the wild-type allele in chimpanzee may actually indicate that some of the putative associations are spurious and not causal. However, this approach can be expected to become increasingly fruitful as the quality and completeness of the disease mutation databases improve.
The chimpanzee has a special role in informing studies of human population genetics, a field that is undergoing rapid expansion and acquiring new relevance to human medical genetics. The chimpanzee sequence allows recognition of those human alleles that represent the ancestral state and the derived state. It also allows estimates of local mutation rates, which serve as an important baseline in searching for signs of natural selection.
For the remaining cases, no assignment could be made because of the following: the orthologous chimpanzee base differed from both human alleles 1.
The first two cases arise presumably because a second mutation occurred in the chimpanzee lineage. It should be possible to resolve most of these cases by examining a close outgroup such as gorilla or orang-utan. Mutations in the chimpanzee may also lead to the erroneous assignment of human alleles as derived alleles. The estimated error rate for typical SNPs is 0.
For these, a non-negligible fraction may have arisen by two independent deamination events within an ancestral CpG dinucleotide; such dinucleotides are well-known mutational hotspots (ref. 51; also see above).
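The assignment rule described above can be sketched in a few lines of Python. This is only an illustration, not the authors' pipeline: the function name `polarize_snp` is hypothetical, and treating a missing chimpanzee base as "no assignment" is an assumption, since the full list of unassignable cases is not recoverable from the text. The sketch also inherits the caveat noted above: a second mutation on the chimpanzee lineage would lead to an erroneous assignment.

```python
from typing import Optional, Tuple


def polarize_snp(
    human_alleles: Tuple[str, str], chimp_base: Optional[str]
) -> Optional[Tuple[str, str]]:
    """Return (ancestral, derived) for a biallelic human SNP, or None.

    The human allele that matches the orthologous chimpanzee base is
    taken as ancestral. No assignment is made when the chimpanzee base
    differs from both human alleles, or (an assumption of this sketch)
    when no chimpanzee base is available.
    """
    first, second = (allele.upper() for allele in human_alleles)
    if chimp_base is None or first == second:
        return None  # nothing to compare against, or not a real SNP
    chimp = chimp_base.upper()
    if chimp == first:
        return (first, second)
    if chimp == second:
        return (second, first)
    return None  # chimpanzee base differs from both human alleles


if __name__ == "__main__":
    print(polarize_snp(("A", "G"), "G"))
    print(polarize_snp(("A", "G"), "T"))
```

Resolving the `None` cases with a closer outgroup such as gorilla or orang-utan, as suggested above, would amount to calling the same function again with the outgroup base.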
As expected, ancestral alleles tend to have much higher frequencies than derived alleles (Supplementary Fig.). Nonetheless, a significant proportion of derived alleles have high frequencies: 9. An elegant result in population genetics states that, for a randomly interbreeding population of constant size, the probability that an allele is ancestral is equal to its frequency. We explored the extent to which this simple theoretical expectation fits the human population.
Note that because each variant yields a derived and an ancestral allele, the data are necessarily symmetrical about 0. The data lie near the predicted line, but the observed slope 0. The most likely explanation is the presence of bottlenecks during human history, which tend to flatten the distribution of allele frequencies. This suggests that measurements of the slope in different human groups may shed light on population-specific bottlenecks.
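The theoretical expectation that the probability of an allele being ancestral equals its frequency can be checked with a short calculation. The sketch below relies on an assumption not stated in the text: the standard neutral result that, in a constant-size randomly mating population, the density of derived alleles at frequency x is proportional to 1/x. The function name `prob_ancestral` is hypothetical.

```python
from fractions import Fraction


def prob_ancestral(p):
    """Probability that an allele observed at frequency p is ancestral.

    Assumes the neutral, constant-size expectation that derived alleles
    at frequency x occur with density proportional to 1/x.
    """
    p = Fraction(p)
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    # Case 1: this allele is the derived one, sitting at frequency p.
    weight_this_is_derived = 1 / p
    # Case 2: the other allele is derived, sitting at frequency 1 - p,
    # which makes this allele the ancestral one.
    weight_this_is_ancestral = 1 / (1 - p)
    return weight_this_is_ancestral / (
        weight_this_is_derived + weight_this_is_ancestral
    )


if __name__ == "__main__":
    for freq in (Fraction(1, 10), Fraction(1, 4), Fraction(9, 10)):
        print(freq, prob_ancestral(freq))
```

The two weights simplify algebraically to p, so under these assumptions a plot of the proportion of ancestral alleles against allele frequency has slope 1. Demographic events such as bottlenecks change the frequency spectrum and can therefore flatten the observed line.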
The pattern of human genetic variation holds substantial information about selection events that have shaped our species. Notably, the chimpanzee genome provides crucial baseline information required for accurate assessment of both signatures. The size of the interval affected by a selective sweep is expected to scale roughly with s , the selective advantage due to the mutation. We began by identifying regions in which the observed human diversity rate was much lower than the expectation based on the observed divergence rate with chimpanzee.
The human diversity rate was measured as the number of occurrences from a database of 1. The comparison with the chimpanzee eliminates regions in which low diversity simply reflects a low mutation rate in the region.
Six genomic regions stand out as clear outliers that show significantly reduced diversity relative to divergence (Table 8; see also Supplementary Fig.). Within each region, we focused on the 1-Mb interval with the greatest discrepancy between diversity and divergence and compared it to 1-Mb regions throughout the genome. The regions differ notably with respect to gene content, ranging from one containing 57 annotated genes (chromosome 22) to another with no annotated genes whatsoever (chromosome 4).
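The logic of comparing human diversity against human-chimpanzee divergence can be illustrated with a toy window scan. This is a minimal sketch and not the statistical procedure used in the analysis: the function name `low_diversity_windows`, the per-window count inputs and the `ratio_cutoff` threshold are all hypothetical, and no significance test is performed.

```python
from typing import List, Sequence


def low_diversity_windows(
    snp_counts: Sequence[int],
    divergence_counts: Sequence[int],
    ratio_cutoff: float = 0.5,
) -> List[int]:
    """Return indices of windows with far fewer SNPs than divergence predicts.

    The expected SNP count in a window is its human-chimpanzee divergence
    count scaled by the genome-wide ratio of SNPs to divergent sites, so a
    window with a low local mutation rate is not flagged merely for having
    few SNPs.
    """
    if len(snp_counts) != len(divergence_counts):
        raise ValueError("inputs must have one entry per window")
    total_divergence = sum(divergence_counts)
    if total_divergence == 0:
        raise ValueError("no divergent sites in any window")
    genome_ratio = sum(snp_counts) / total_divergence

    flagged = []
    for index, (snps, divergence) in enumerate(zip(snp_counts, divergence_counts)):
        if divergence == 0:
            continue  # no baseline available for this window
        expected_snps = genome_ratio * divergence
        if snps < ratio_cutoff * expected_snps:
            flagged.append(index)
    return flagged


if __name__ == "__main__":
    snps = [100, 100, 10, 50]
    divergence = [1000, 1000, 1000, 500]
    print(low_diversity_windows(snps, divergence))
```

In the toy data, the last window has half as many SNPs as the first two but also half the divergence, so it is not flagged; only the third window, with normal divergence and very few SNPs, stands out.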
We have no evidence to implicate any individual functional element as a target of recent selection at this point, but the regions contain a number of interesting candidates for follow-up studies. Intriguingly, the chromosome 4 gene desert, which flanks a proto-cadherin gene and is conserved across vertebrates (ref. 15), has been implicated in two independent studies as being associated with obesity.
On the other hand, the number of genetic differences between a human and a chimp is about 10 times more than between any two humans.
The researchers discovered that a few classes of genes are changing unusually quickly in both humans and chimpanzees compared with other mammals.
These classes include genes involved in perception of sound, transmission of nerve signals, production of sperm and cellular transport of electrically charged molecules called ions. Researchers suspect the rapid evolution of these genes may have contributed to the special characteristics of primates, but further studies are needed to explore the possibilities. The genomic analyses also showed that humans and chimps appear to have accumulated more potentially deleterious mutations in their genomes over the course of evolution than have mice, rats and other rodents.
While such mutations can cause diseases that may erode a species' overall fitness, they may have also made primates more adaptable to rapid environmental changes and enabled them to achieve unique evolutionary adaptations, researchers said. Despite the many similarities found between human and chimp genomes, the researchers emphasized that important differences exist between the two species.
About 35 million DNA base pairs differ between the shared portions of the two genomes, each of which, like most mammalian genomes, contains about 3 billion base pairs. In addition, there are another 5 million sites that differ because of an insertion or deletion in one of the lineages, along with a much smaller number of chromosomal rearrangements. Most of these differences lie in what is believed to be DNA of little or no function. However, as many as 3 million of the differences may lie in crucial protein-coding genes or other functional areas of the genome.
"The genetic changes that distinguish humans from chimps will likely be a very small fraction of this set," said the study's lead author, Tarjei S.
Among the genetic changes that researchers will be looking for are those that may be related to the human-specific features of walking upright on two feet, a greatly enlarged brain and complex language skills. Although the statistical signals are relatively weak, a few classes of genes appear to be evolving more rapidly in humans than in chimps.
The single strongest outlier involves genes that code for transcription factors, which are molecules that regulate the activity of other genes and that play key roles in embryonic development. In keeping with previous studies comparing much smaller portions of the chimp and human genomes, the new comparison shows incredible similarity between the genomes.
What is more, only a handful of genes present in humans are absent or partially deleted in chimps. But the degree of genome similarity alone is far from the whole story. For example, the mouse species Mus musculus and Mus spretus have genomes that differ from each other to a similar degree and yet they appear far more similar than chimps and humans.
Domestic dogs, however, vary wildly in appearance as a result of selective breeding and yet their genome sequences are So most of the differences between chimp and human genomes will turn out to be neither beneficial nor detrimental, in evolutionary terms. The real challenge then will be finding the changes that played a major role in the evolution of chimps and humans since the two lineages split, 5 to 8 million years ago.
Nothing obvious has leapt out of the initial analysis.