#
# EnsEMBL module for Bio::EnsEMBL::Funcgen::CoordSystem
#
=head1 NAME
Bio::EnsEMBL::Funcgen::CoordSystem
=head1 SYNOPSIS
my $db = Bio::EnsEMBL::Funcgen::DBSQL::DBAdaptor->new(...);
my $csa = $db->get_CoordSystemAdaptor();
#
# Get default chromosome coord system for the 39_36a DB:
#
my $cs = $csa->fetch_by_name_schema_build_version('chromosome', '39_36a');
my $str = join ':', $cs->name(),$cs->version(),$cs->dbID();
print "$str\n";
=head1 DESCRIPTION
This has been adapted from the core CoordSystem object to accommodate the multi-assembly
aspects of the eFG schema, namely handling the schema_build of the referenced core DB.
This is a simple object which contains a few coordinate system attributes:
name, internal identifier, version and schema_build. A coordinate system is
uniquely defined by its name and version and which DB it came from i.e. schema_build.
A version of a coordinate system applies to all sequences within a coordinate system.
This should not be confused with individual sequence versions.
Take for example the Human assembly. The version 'NCBI33' applies to
all chromosomes in the NCBI33 assembly (that is the entire 'chromosome'
coordinate system). The 'clone' coordinate system in the same database would
have no version however. Although the clone sequences have their own sequence
versions, there is no version which applies to the entire set of clones.
Coordinate system objects are immutable. Their name and version, and other
attributes may not be altered after they are created.
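The identity rule described above can be illustrated without the Ensembl libraries. The following is a minimal standalone sketch, not this module's implementation: plain hashes stand in for CoordSystem objects, the helper names (make_cs, add_build, same_cs) are hypothetical, and the version and schema_build values are illustrative only.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a CoordSystem: name and version identify it,
# and core_cache records each schema_build it has been seen in.
sub make_cs {
    my ($name, $version) = @_;
    die "A name argument is required\n" if !$name;
    $version = '' if !defined $version;    # an undefined version becomes ''
    return { name => $name, version => $version, core_cache => {} };
}

# Record per-build details under the schema_build key.
sub add_build {
    my ($cs, $schema_build, %info) = @_;
    $cs->{core_cache}{$schema_build} = {%info};
    return;
}

# Two coordinate systems match when both name and version agree.
sub same_cs {
    my ($x, $y) = @_;
    return ($x->{name} eq $y->{name} && $x->{version} eq $y->{version}) ? 1 : 0;
}

my $chr = make_cs('chromosome', 'NCBI36');
add_build($chr, '39_36a', RANK => 1, CORE_COORD_SYSTEM_ID => 1);
add_build($chr, '43_36e', RANK => 1, CORE_COORD_SYSTEM_ID => 1);

my $clone = make_cs('clone');    # no version applies to the whole clone set

print same_cs($chr, make_cs('chromosome', 'NCBI36')), "\n";    # prints 1
print same_cs($chr, $clone), "\n";                             # prints 0
print join(',', sort keys %{ $chr->{core_cache} }), "\n";      # prints 39_36a,43_36e
```

In this sketch a single object carries one name and version while accumulating an entry per schema_build, which is the role the core_cache hash plays in this module.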
=head1 AUTHOR
This module was written by Nathan Johnson, but was based heavily on the core module
authored by Graham McVicker.
=head1 CONTACT
Post questions to the EnsEMBL development list ensembl-dev@ebi.ac.uk
=head1 METHODS
=cut
use strict;
use warnings;
package Bio::EnsEMBL::Funcgen::CoordSystem;
use Bio::EnsEMBL::Storable;
use Bio::EnsEMBL::Utils::Argument qw(rearrange);
use Bio::EnsEMBL::Utils::Exception qw(throw);
use vars qw(@ISA);
@ISA = qw(Bio::EnsEMBL::Storable);
=head2 new
Arg [..] : List of named arguments:
-NAME - The name of the coordinate system
-VERSION - (optional) The version of the coordinate system.
Note that if the version passed in is undefined,
it will be set to the empty string in the
resulting CoordSystem object.
-RANK - The rank of the coordinate system. The highest
level coordinate system should have rank 1, the
second highest rank 2 and so on. An example of
a high level coordinate system is 'chromosome'; an
example of a lower level coordinate system is
'clone'.
-SCHEMA_BUILD - The schema and data build version of the DB of
origin.
-TOP_LEVEL - (optional) Sets whether this is a top-level coord
system. Default = 0. This should only be set to
true if you are creating an artificial toplevel
coordsystem by the name of 'toplevel'
-SEQUENCE_LEVEL - (optional) Sets whether this is a sequence
level coordinate system. Default = 0
-DEFAULT - (optional)
Whether this is the default version of the
coordinate systems of this name. Default = 0
-DBID - (optional) The internal identifier of this
coordinate system
-ADAPTOR - (optional) The adaptor which provides database
interaction for this object
Example : $cs = Bio::EnsEMBL::Funcgen::CoordSystem->new(-NAME => 'chromosome',
-VERSION => 'NCBI33',
-RANK => 1,
-DBID => 1,
-SCHEMA_BUILD => '39_36a',
-ADAPTOR => $adaptor,
-DEFAULT => 1,
-SEQUENCE_LEVEL => 0);
Description: Creates a new CoordSystem object representing a coordinate
system.
Returntype : Bio::EnsEMBL::Funcgen::CoordSystem
Exceptions : throws if no name is supplied
Caller : general
Status : Stable
=cut
sub new {
my $caller = shift;
my $class = ref($caller) || $caller;
my $self = $class->SUPER::new(@_);
#Can we just handle schema_build here and call super->new for the rest?
#We will also have to handle the top/default levels issues with multiple DBs
#my ($name, $version, $sbuild, $top_level, $sequence_level, $default, $rank) =
# rearrange(['NAME','VERSION', 'SCHEMA_BUILD','TOP_LEVEL', 'SEQUENCE_LEVEL',
# 'DEFAULT', 'RANK'], @_);
my ($name, $version) = rearrange(['NAME','VERSION'], @_);
throw('A name argument is required') if(! $name);
$version = '' if(!defined($version));
#$top_level = ($top_level) ? 1 : 0;
#$sequence_level = ($sequence_level) ? 1 : 0;
#$default = ($default) ? 1 : 0;
#$rank ||= 0;
#if($top_level) {
# if($rank) {
# throw('RANK argument must be 0 if TOP_LEVEL is 1');
# }
# if($name) {
# if($name ne 'toplevel') {
# throw('The NAME argument must be "toplevel" if TOP_LEVEL is 1')
# }
# } else {
# $name = 'toplevel';
# }
# if($sequence_level) {
# throw("SEQUENCE_LEVEL argument must be 0 if TOP_LEVEL is 1");
# }
# $default = 0;
# } else {
# if(!$rank) {
# throw("RANK argument must be non-zero if not toplevel CoordSystem");
# }
# if($name eq 'toplevel') {
# throw("Cannot name coord system 'toplevel' unless TOP_LEVEL is 1");
# }
# }
# if($rank !~ /^\d+$/) {
# throw('The RANK argument must be a positive integer');
# }
$self->{'core_cache'} = {};
$self->{'version'} = $version;
$self->{'name'} = $name;
#$self->{'schema_build'} = $sbuild;
#$self->{'top_level'} = $top_level;
#$self->{'sequence_level'} = $sequence_level;
#$self->{'default'} = $default;
#$self->{'rank'} = $rank;
return $self;
}
=head2 add_core_coord_system_info
Arg [1] : mandatory hash:
-RANK => $rank,
-SEQUENCE_LEVEL => $seq_lvl,
-DEFAULT => $default,
-SCHEMA_BUILD => $sbuild,
-CORE_COORD_SYSTEM_ID => $ccs_id,
-IS_STORED => $stored_status,
Example : $cs->add_core_coord_system_info(
-RANK => $rank,
-SEQUENCE_LEVEL => $seq_lvl,
-DEFAULT => $default,
-SCHEMA_BUILD => $sbuild,
-CORE_COORD_SYSTEM_ID => $ccs_id,
-IS_STORED => 1,
);
Description: Setter for core coord system information
Returntype : none
Exceptions : throws if:
rank not 0 when toplevel
name not 'toplevel' when toplevel
sequence level and top level
no schema_build defined
no core_coord_system_id defined
rank missing or 0 when not toplevel
name 'toplevel' when not toplevel
Caller : Bio::EnsEMBL::Funcgen::DBSQL::CoordSystemAdaptor and ?
Status : at risk - replace with add_core_CoordSystem? implement top level?
#this does not check name and version!
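The argument checks listed above can be sketched as a standalone function. This is an illustration only, assuming a hypothetical helper named level_error that takes the same keys plus a NAME key and returns an error message instead of throwing; it does not use or replace the method documented here.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the documented checks; returns an error
# message, or undef when the combination of arguments is acceptable.
sub level_error {
    my (%arg) = @_;
    my $name = $arg{NAME};
    my $rank = $arg{RANK} || 0;

    return 'Must provide a schema_build'         if !$arg{SCHEMA_BUILD};
    return 'Must provide a core_coord_system_id' if !$arg{CORE_COORD_SYSTEM_ID};

    if ($arg{TOP_LEVEL}) {
        return 'RANK must be 0 if TOP_LEVEL is 1' if $rank;
        return 'NAME must be "toplevel" if TOP_LEVEL is 1'
            if $name ne 'toplevel';
        return 'SEQUENCE_LEVEL must be 0 if TOP_LEVEL is 1'
            if $arg{SEQUENCE_LEVEL};
    }
    else {
        return 'RANK must be non-zero if not toplevel' if !$rank;
        return "Cannot name coord system 'toplevel' unless TOP_LEVEL is 1"
            if $name eq 'toplevel';
    }

    return 'RANK must be an integer' if $rank !~ /^\d+$/;
    return;
}

my %ok = (NAME => 'chromosome', RANK => 1,
          SCHEMA_BUILD => '39_36a', CORE_COORD_SYSTEM_ID => 1);

my $err = level_error(%ok);
print defined $err ? "$err\n" : "ok\n";

$err = level_error(%ok, RANK => 0);
print defined $err ? "$err\n" : "ok\n";
```

Returning a message rather than throwing keeps the sketch free of the Ensembl exception utilities; the order of the checks follows the list above.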
=cut
sub add_core_coord_system_info {
my ($self) = shift;
my ($sbuild, $top_level, $sequence_level, $default, $rank, $stored, $ccs_id) =
rearrange(['SCHEMA_BUILD','TOP_LEVEL', 'SEQUENCE_LEVEL',
'DEFAULT', 'RANK', 'IS_STORED', 'CORE_COORD_SYSTEM_ID'], @_);
throw('Must provide a schema_build') if ! $sbuild;
throw('Must provide a core_coord_system_id') if ! $ccs_id;
#$top_level = ($top_level) ? 1 : 0;
$sequence_level = ($sequence_level) ? 1 : 0;
$default = ($default) ? 1 : 0;
$stored ||=0;
$rank ||= 0;
if($top_level) {
if($rank) {
throw('RANK argument must be 0 if TOP_LEVEL is 1');
}
if($self->name()) {
if($self->name() ne 'toplevel') {
throw('The NAME argument must be "toplevel" if TOP_LEVEL is 1')
}
} else {
throw('toplevel not yet implemented');
#$name = 'toplevel';
}
if($sequence_level) {
throw("SEQUENCE_LEVEL argument must be 0 if TOP_LEVEL is 1");
}
$default = 0;
} else {
if(!$rank) {
throw("RANK argument must be non-zero if not toplevel CoordSystem");
}
if($self->name() eq 'toplevel') {
throw("Cannot name coord system 'toplevel' unless TOP_LEVEL is 1");
}
}
if($rank !~ /^\d+$/) {
throw('The RANK argument must be a positive integer');
}
$self->{'core_cache'}{$sbuild} = {(
RANK => $rank,
SEQUENCE_LEVEL => $sequence_level,
DEFAULT => $default,
CORE_COORD_SYSTEM_ID => $ccs_id,
IS_STORED => $stored,
)};
return;
}
#remove all but schema_build and equals?
#depends on how we handle levels
=head2 name
Arg [1] : none
Example : print $coord_system->name();
Description: Getter for the name of this coordinate system
Returntype : string
Exceptions : none
Caller : general
Status : Stable
=cut
sub name {
my $self = shift;
return $self->{'name'};
}
=head2 schema_build
Example : print $coord_system->schema_build();
Description: Getter for the schema_build of this coordinate system
Returntype : string
Exceptions : none
Caller : general
Status : deprecated
=cut
sub schema_build {
my $self = shift;
throw('schema_build deprecated, use contains_schema_build');
return $self->{'schema_build'};
}
=head2 get_latest_schema_build
Example : my $db_schema_build = $coord_system->get_latest_schema_build();
Description: Getter for the most recent schema_build of this coordinate system
Returntype : string
Exceptions : none
Caller : general
Status : at risk
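Choosing the most recent build depends on how the schema_build strings are ordered. The following standalone sketch assumes builds have the form release_assembly (for example '43_36e') with a leading numeric release; the helper name latest_schema_build is hypothetical and the build values are illustrative.

```perl
use strict;
use warnings;

# Hypothetical helper: order builds by their leading numeric release, then by
# the full string, and return the last (most recent) one.
sub latest_schema_build {
    my @builds = @_;
    my @sorted = sort {
        my ($ra) = $a =~ /^(\d+)/;
        my ($rb) = $b =~ /^(\d+)/;
        ($ra || 0) <=> ($rb || 0) or $a cmp $b;
    } @builds;
    return $sorted[-1];
}

print latest_schema_build('39_36a', '43_36e', '100_37'), "\n";    # prints 100_37
```

A plain string sort would place '100_37' before '39_36a', so comparing the release number numerically matters once releases reach three digits.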
=cut
sub get_latest_schema_build {
my $self = shift;
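# Order by numeric release, then by full string; the latest build sorts last.
my @builds = sort { (($a =~ /^(\d+)/)[0] || 0) <=> (($b =~ /^(\d+)/)[0] || 0)
or $a cmp $b } keys %{$self->{'core_cache'}};
return $builds[-1];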
}
=head2 contains_schema_build
Example : if ($coord_system->contains_schema_build('43_36e')){..do some coord system things ..};
Description: Returns true if the CoordSystem maps to the corresponding core CoordSystem
Returntype : Boolean
Exceptions : throws if schema_build not defined
Caller : general
Status : at risk
=cut
sub contains_schema_build {
my ($self, $schema_build) = @_;
throw('Must pass a schema_build') if ! $schema_build;
return (exists $self->{'core_cache'}{$schema_build}) ? 1 : 0;
}
=head2 version
Arg [1] : none
Example : print $coord->version();
Description: Getter for the version of this coordinate system. This
will return an empty string if no version is defined for this
coordinate system.
Returntype : string
Exceptions : none
Caller : general
Status : Stable
=cut
sub version {
my $self = shift;
return $self->{'version'};
}
=head2 equals
Arg [1] : Bio::EnsEMBL::Funcgen::CoordSystem $cs
The coord system to compare to for equality.
Example : if($coord_sys->equals($other_coord_sys)) { ... }
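Description: Compares 2 coordinate systems and returns true if they are
equivalent. The definition of equivalent is sharing the same
name and version; the coord system passed in must also be a
default coord system, otherwise false is returned.
Returntype : 0 or 1
Exceptions : throws if the argument is not a Bio::EnsEMBL[::Funcgen]::CoordSystem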
Caller : general
Status : At risk
=cut
sub equals {
my $self = shift;
my $cs = shift;
if(!$cs || !ref($cs) ||
(! $cs->isa('Bio::EnsEMBL::Funcgen::CoordSystem') &&
! $cs->isa('Bio::EnsEMBL::CoordSystem'))){
throw('Argument must be a Bio::EnsEMBL[::Funcgen]::CoordSystem');
}
#need to add check on schema_build here
#all schema_builds should have been added by BaseFeatureAdaptor during import
#warn $self->{'version'}." eq ".$cs->version()." && ".$self->{'name'}." eq ".$cs->name();#." && ".$self->adaptor->db->_get_schema_build($cs->adaptor())." eq ".$self->schema_build()."\n";
#this fails if we are using two different versions with the same cs's
#can we just restrict it to name and version here, then check for schema_build and add if not present?
#where is equals being called?
if(($self->version() eq $cs->version()) &&
($self->name() eq $cs->name())){
#we need to make sure these are default CS, otherwise we can get into trouble with
#re-used or mismatched seq_region_ids between DBs with different default assemblies
if(! $cs->is_default()){
warn 'You are trying to use a non-default CoordSystem '.$cs->version().' which will have different seq_region_ids to a default CoordSystem of the same version';
return 0;
}
elsif (! $self->contains_schema_build($self->adaptor->db->_get_schema_build($cs->adaptor()))) {
warn 'You are using a schema_build which has no CoordSystem stored for '.$cs->version.'. Defaulting to closest name version match';
}
return 1;
}
return 0;
}
=head2 is_top_level
Arg [1] : none
Example : if($coord_sys->is_top_level()) { ... }
Description: Returns true if this is the toplevel pseudo coordinate system.
The toplevel coordinate system is not a real coordinate system
which is stored in the database, but it is a placeholder that
can be used to request transformations or retrievals to/from
the highest defined coordinate system in a given region.
Returntype : 0 or 1
Exceptions : none
Caller : general
Status : at risk - not implemented yet
=cut
sub is_top_level {
my $self = shift;
throw('Not yet implemented, need to test against the core cache using dnadb/schema_build');
return $self->{'top_level'};
}
=head2 is_sequence_level
Arg [1] : none
Example : if($coord_sys->is_sequence_level()) { ... }
Description: Returns true if this is a sequence level coordinate system
Returntype : 0 or 1
Exceptions : none
Caller : general
Status : at risk - not yet implemented
=cut
sub is_sequence_level {
my $self = shift;
throw('Not yet implemented, need to test against core cache using dnadb/schema_build');
return $self->{'sequence_level'};
}
=head2 is_default
Arg [1] : none
Example : if($coord_sys->is_default()) { ... }
Description: Returns true if this coordinate system is the default
version of the coordinate system of this name.
Returntype : 0 or 1
Exceptions : none
Caller : general
Status : at risk - not yet implemented
=cut
sub is_default {
my $self = shift;
throw('Not yet implemented, need to test against core cache using dnadb/schema_build');
return $self->{'default'};
}
=head2 rank
Arg [1] : none
Example : if($cs1->rank() < $cs2->rank()) {
print $cs1->name(), " is a higher level coord system than ",
$cs2->name(), "\n";
}
Description: Returns the rank of this coordinate system. A lower number
is a higher coordinate system. The highest level coordinate
system has a rank of 1 (e.g. 'chromosome'). The toplevel
pseudo coordinate system has a rank of 0.
Returntype : int
Exceptions : none
Caller : general
Status : at risk - not yet implemented
=cut
sub rank {
my $self = shift;
throw('Not yet implemented, need to test against core cache using dnadb/schema_build');
return $self->{'rank'};
}
1;