Perl API Documentation

Databases and Application Programme Interfaces (APIs)

Ensembl uses MySQL relational databases to store its information. A comprehensive set of Application Programme Interfaces (APIs) serves as a middle layer between the underlying database schemas and more specific application programmes. The APIs aim to encapsulate the database layout by providing efficient high-level access to data tables and by isolating applications from changes to the data layout.

Ensembl's API is written in Perl. Installation instructions and full documentation of all modules are available online.

The main Ensembl databases are introduced below. Data releases for these databases can be obtained from the Ensembl FTP site.

Core databases and APIs

The species-specific Core databases store genome sequences and most of the annotation information. This includes the gene, transcript and protein models annotated by the Ensembl automated genome analysis and annotation pipeline.

Core databases also store assembly information, cDNA and protein alignments, external references, markers and repeat-region data sets.

More about the Core databases and APIs...

OtherFeatures databases and APIs

Species-specific OtherFeatures databases hold an independent EST gene set provided for all well-characterised species with a suitable amount of biological evidence.

OtherFeatures databases use the same schema as the Core databases, so the Core schema descriptions and API access apply to them equally.

More about the OtherFeatures databases and APIs...

Compara database and APIs

The Compara multi-species database stores the results of genome-wide species comparisons, which are recalculated for each release.

The comparative genomics data set includes pairwise whole-genome alignments and synteny regions. The comparative proteomics data set contains orthologue predictions and protein family clusters.

More about the Compara database and API...

Variation databases and APIs

Genetic variation information is organised in a set of species-specific Variation databases.

More about the Variation databases and APIs...

FuncGen databases and APIs

The FuncGen databases store genome-wide functional genomics and regulatory information. These data are used to produce the Ensembl 'Regulatory Build'.

More about the FuncGen databases and APIs...

Ensembl Registry

The Registry system lets you tell your programs where to find the Ensembl databases and how to connect to them. It has been implemented for the Ensembl Core and Compara APIs.

More about the Registry...

Ensembl Software Support

Ensembl is an open project, and we encourage correspondence and discussion on any aspect of Ensembl. Please see the Ensembl Contacts page for ways to get in touch with us.

If you are interested in undertaking a short-term collaborative project, our "Geek for a Week" scheme allows developers and researchers to work alongside Ensembl team members. More information...