Ensembl imports variations, including single nucleotide polymorphisms (SNPs) and insertion-deletion mutations (indels), together with their flanking sequences, from various sources. The flanking sequences are aligned to the reference sequence. Variation positions are calculated from these alignments, along with any effects on transcripts in the region.

The majority of variations are obtained from NCBI dbSNP. For human, other sources include Affy GeneChip Arrays, the European Genome-phenome Archive (EGA), and whole genome alignments of individual sequences from the Venter, Watson and Celera individuals to the reference sequence. Sources for other species include Sanger resequencing projects for mouse and alignments of sequences from the STAR consortium for rat. Ancestral alleles from dbSNP were determined through a comparison study of human and chimp DNA (see reference).
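The idea of placing a variation by aligning its flanking sequence to the reference can be illustrated with a toy example. The sketch below is not Ensembl's pipeline: it replaces real sequence alignment with exact string matching, and the function name `locate_variation`, its parameters, and the example sequences are all hypothetical. It only shows how a position falls out once both flanks have been found on the reference.

```python
def locate_variation(reference, flank_5p, flank_3p, max_allele_len=50):
    """Toy placement of a variation on a reference sequence.

    Assumption: flanks match the reference exactly (real pipelines use
    alignment that tolerates mismatches). Returns a list of
    (start, end) pairs in 1-based inclusive coordinates, one per place
    where the 5' flank is followed, within max_allele_len bases, by the
    3' flank.
    """
    if not flank_5p or not flank_3p:
        raise ValueError("both flanking sequences must be non-empty")

    hits = []
    hit = reference.find(flank_5p)
    while hit != -1:
        # 0-based index of the first base after the 5' flank
        allele_start = hit + len(flank_5p)
        # Try reference allele lengths 0, 1, 2, ... until the 3' flank fits
        for allele_len in range(max_allele_len + 1):
            if reference.startswith(flank_3p, allele_start + allele_len):
                hits.append((allele_start + 1, allele_start + allele_len))
                break
        hit = reference.find(flank_5p, hit + 1)
    return hits


# Hypothetical 26-base reference with a single-base variation at position 15
reference = "ACGTACGTTTGACCAGTAGGCATCGA"
print(locate_variation(reference, "TTGACC", "GTAGG"))   # [(15, 15)]
```

In this sketch a variation with no reference bases between its flanks (an insertion relative to the reference) comes out with end equal to start minus one, and a flank pair that matches more than once yields several candidate positions, which a caller would need to resolve.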

Future directions: Ensembl will be incorporating data from the 1000 Genomes Project, including structural variation data, once this has been submitted to dbSNP. Ensembl is also working towards expanding its phenotype data resources and their links to genotypes.

Variations and sources can be viewed in the browser through pages such as:

  • Gene: Variation table and Variation image (for all variations in a gene)
  • Transcript: Population comparison and Comparison image (for all variations in a transcript across different sequences)
  • Location: Region in Detail (variations can be displayed using “Configure this page” on the left. The menu allows display of information from Ensembl databases along with external sources in DAS format, such as DGV loci.)

Clicking on any variation on an Ensembl page will open a Variation tab with information about the flanking sequence and source for the selected variation. The left of this tab provides links to linkage disequilibrium (LD) plots, to phenotype information (for human) from EGA and NHGRI, and to the Ensembl genes and transcripts that include the variation. You may also view multiple genome alignments of various species, with the variation highlighted. Ancestral sequences are included in this display.

Variation information can also be accessed using BioMart (gene or variation database) and the Perl API (variation databases).


