Bio::EnsEMBL::Funcgen::DBSQL
BaseFeatureAdaptor
Summary
Bio::EnsEMBL::Funcgen::DBSQL::BaseFeatureAdaptor - A base class for all
Funcgen FeatureAdaptors; redefines some methods to use the Funcgen DB
Package variables
No package variables defined.
Included modules
Inherit
Synopsis
Abstract class - should not be instantiated. Implementation of
abstract methods must be performed by subclasses.
Description
This is a base adaptor for Funcgen feature adaptors. This base class is simply a way
of eliminating code duplication through the implementation of methods
common to all Funcgen feature adaptors.
Methods
Methods description
_pre_store
Arg [1] : Bio::EnsEMBL::Feature
Example : $fs = $a->fetch_all_by_Slice_constraint($slc, 'perc_ident > 5');
Description: Helper function containing some common feature storing
  functionality. Given a Feature, this returns a copy of the feature
  (or the same feature if no changes are needed) that is relative to
  the start of the seq_region it is on. The seq_region_id of that
  seq_region is also returned. This method also ensures that the
  database knows which coordinate systems this feature is stored in.
  It supersedes the core method in order to trust the slice the
  feature has been generated on, i.e. from the dnadb. It also handles
  the multi-coordsys aspect, generating new coord_system_ids as
  appropriate.
Returntype : Bio::EnsEMBL::Feature and the seq_region_id it is mapped to
Exceptions : thrown if $slice is not defined
Caller : Bio::EnsEMBL::"Type"FeatureAdaptors
Status : At risk
build_seq_region_cache
Arg [1] : optional - Bio::EnsEMBL::Slice
  the slice from which to obtain features
Example : $self->build_seq_region_cache();
Description: Builds the seq_region_id translation caches
Returntype : None
Exceptions : thrown if optional Slice argument is not valid
Caller : self
Status : At risk - should be private _build_seq_region_cache?
  Change arg to DBAdaptor? Or remove if we are building the full cache?
fetch_all_by_Slice_constraint
Arg [1] : Bio::EnsEMBL::Slice $slice
  the slice from which to obtain features
Arg [2] : (optional) string $constraint
  An SQL query constraint (i.e. part of the WHERE clause)
Arg [3] : (optional) string $logic_name
  the logic name of the type of features to obtain
Example : $fs = $a->fetch_all_by_Slice_constraint($slc, 'perc_ident > 5');
Description: Returns a listref of features created from the database
  which are on the Slice defined by $slice and fulfill the SQL
  constraint defined by $constraint. If logic name is defined, only
  features with an analysis of type $logic_name will be returned.
Returntype : listref of Bio::EnsEMBL::SeqFeatures in Slice coordinates
Exceptions : thrown if $slice is not defined
Caller : Bio::EnsEMBL::Slice
Status : Stable
fetch_all_by_external_name
Arg [1] : String $external_name
  An external identifier of the feature to be obtained
Arg [2] : (optional) String $external_db_name
  The name of the external database from which the identifier
  originates.
Example : my @features = @{ $adaptor->fetch_all_by_external_name( 'NP_065811.1') };
Description: Retrieves all features which are associated with an
  external identifier such as a GO term, Swissprot identifier, etc.
  Usually there will only be a single feature returned in the list
  reference, but not always. Features are returned in their native
  coordinate system, i.e. the coordinate system in which they are
  stored in the database. If they are required in another coordinate
  system, the Feature::transfer or Feature::transform method can be
  used to convert them. If no features with the external identifier
  are found, a reference to an empty list is returned.
Returntype : arrayref of Bio::EnsEMBL::Feature objects
Exceptions : none
Caller : general
Status : at risk
fetch_by_display_label
Arg [1] : String $label - display label of feature to fetch
Example : my $feat = $adaptor->fetch_by_display_label("BRCA2");
Description: Returns the feature which has the given display label, or
  undef if there is none. If there is more than one, only the first
  is reported.
Returntype : Bio::EnsEMBL::Funcgen::Feature
Exceptions : none
Caller : general
Status : At risk
generic_fetch
Arg [1] : (optional) string $constraint
  An SQL query constraint (i.e. part of the WHERE clause)
Arg [2] : (optional) Bio::EnsEMBL::AssemblyMapper $mapper
  A mapper object used to remap features as they are retrieved from
  the database
Arg [3] : (optional) Bio::EnsEMBL::Slice $slice
  A slice that features should be remapped to
Example : $fts = $a->generic_fetch('contig_id in (1234, 1235)', 'Swall');
Description: Wrapper method for the core BaseAdaptor; builds the
  seq_region cache for features
Returntype : ARRAYREF of Bio::EnsEMBL::SeqFeature in contig coordinates
Exceptions : none
Caller : FeatureAdaptor classes
Status : at risk
Methods code
sub _pre_store
{ my $self = shift;
my $feature = shift;
if(!ref($feature) || !$feature->isa('Bio::EnsEMBL::Feature')) {
throw('Expected Feature argument.');
}
$self->_check_start_end_strand($feature->start(),$feature->end(),
$feature->strand());
my $db = $self->db();
my $slice = $feature->slice();
if(!ref($slice) || !$slice->isa('Bio::EnsEMBL::Slice')) {
throw('Feature must be attached to Slice to be stored.');
}
if($slice->start != 1 || $slice->strand != 1) {
throw("You must generate your feature on a slice starting at 1 with strand 1");
}
my $cs = $slice->coord_system;
my $csa = $self->db->get_FGCoordSystemAdaptor();
my $fg_cs = $csa->validate_and_store_coord_system($cs);
$fg_cs = $csa->fetch_by_name($cs->name(), $cs->version());
my ($tab) = $self->_tables();
my $tabname = $tab->[0];
my $mcc = $db->get_MetaCoordContainer();
$mcc->add_feature_type($fg_cs, $tabname, $feature->length);
$self->build_seq_region_cache($slice);
my $seq_region_id = $self->get_seq_region_id_by_Slice($slice, undef, 1);
if(! $seq_region_id){
$seq_region_id = $self->get_seq_region_id_by_Slice($slice, $fg_cs);
my $schema_build = $self->db->_get_schema_build($slice->adaptor->db());
my $sql;
my @args = ($slice->seq_region_name(), $fg_cs->dbID(), $slice->get_seq_region_id(), $schema_build);
if($seq_region_id) {
$sql = 'insert into seq_region(seq_region_id, name, coord_system_id, core_seq_region_id, schema_build) values (?,?,?,?,?)';
unshift(@args, $seq_region_id);
}
else{
$sql = 'insert into seq_region(name, coord_system_id, core_seq_region_id, schema_build) values (?,?,?,?)';
}
my $sth = $self->prepare($sql);
eval{$sth->execute(@args);};
if(!$@){
$seq_region_id = $sth->{'mysql_insertid'};
}
}
return ($feature, $seq_region_id);
}
sub _remap
{ my ($features, $mapper, $slice) = @_;
if(@$features && (!$features->[0]->isa('Bio::EnsEMBL::Feature') ||
$features->[0]->slice == $slice)) {
return $features;
}
my @out;
my $slice_start = $slice->start();
my $slice_end = $slice->end();
my $slice_strand = $slice->strand();
my $slice_cs = $slice->coord_system();
my ($seq_region, $start, $end, $strand);
my $slice_seq_region = $slice->seq_region_name();
foreach my $f (@$features) {
my $fslice = $f->slice();
if(!$fslice) {
throw("Feature does not have attached slice.\n");
}
my $fseq_region = $fslice->seq_region_name();
my $fseq_region_id = $fslice->get_seq_region_id();
my $fcs = $fslice->coord_system();
if(!$slice_cs->equals($fcs)) {
($seq_region, $start, $end, $strand) =
$mapper->fastmap($fseq_region_id,$f->start(),$f->end(),$f->strand(),$fcs);
next if(!defined $start);
} else {
$start = $f->start();
$end = $f->end();
$strand = $f->strand();
$seq_region = $f->slice->seq_region_name();
}
next if ($start > $slice_end) || ($end < $slice_start) ||
($slice_seq_region ne $seq_region);
if($slice_strand == -1) {
$f->move( $slice_end - $end + 1, $slice_end - $start + 1, $strand * -1 );
} else {
$f->move( $start - $slice_start + 1, $end - $slice_start + 1, $strand );
}
$f->slice($slice);
push @out,$f;
}
return \@out;
}
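The coordinate arithmetic in the two move() calls of _remap can be tried in isolation. The following is a minimal sketch, assuming plain integers instead of Feature and Slice objects; to_slice_coords is a hypothetical helper for illustration and is not part of the Ensembl API.

```perl
use strict;
use warnings;

# Hypothetical helper (not an Ensembl method): convert seq_region
# coordinates to slice-relative coordinates, mirroring the arithmetic
# passed to $f->move() in _remap.
sub to_slice_coords {
    my ($start, $end, $strand, $slice_start, $slice_end, $slice_strand) = @_;

    # Features entirely outside the slice are skipped, as in _remap.
    return if $start > $slice_end || $end < $slice_start;

    if ($slice_strand == -1) {
        # Reverse-strand slice: count from the slice end and flip the strand.
        return ($slice_end - $end + 1, $slice_end - $start + 1, $strand * -1);
    }

    # Forward-strand slice: shift so that the slice start becomes position 1.
    return ($start - $slice_start + 1, $end - $slice_start + 1, $strand);
}

my @fwd = to_slice_coords(150, 200, 1, 101, 300, 1);
my @rev = to_slice_coords(150, 200, 1, 101, 300, -1);
print "forward: @fwd\n";    # forward: 50 100 1
print "reverse: @rev\n";    # reverse: 101 151 -1
```

On a reverse-strand slice the start and end swap roles, which is why the new start is computed from the old end.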
sub _slice_fetch
{ my $self = shift;
my $slice = shift;
my $orig_constraint = shift;
my $slice_start = $slice->start();
my $slice_end = $slice->end();
my $slice_strand = $slice->strand();
my $slice_cs = $slice->coord_system();
my @tabs = $self->_tables;
my ($tab_name, $tab_syn) = @{$tabs[0]};
my $mcc = $self->db->get_MetaCoordContainer();
my @feat_css=();
my $mca = $self->db->get_MetaContainer();
my $value_list = $mca->list_value_by_key( $tab_name."build.level" );
if( @$value_list and $slice->is_toplevel()) {
push @feat_css, $slice_cs;
}
else{
@feat_css = @{$mcc->fetch_all_CoordSystems_by_feature_type($tab_name)};
}
my $asma = $self->db->get_AssemblyMapperAdaptor();
my @features;
COORD_SYSTEM: foreach my $feat_cs (@feat_css) {
my $mapper;
my @coords;
my @ids;
if($feat_cs->equals($slice_cs)) {
my $max_len = $self->_max_feature_length() ||
$mcc->fetch_max_length_by_CoordSystem_feature_type($feat_cs,$tab_name);
my $constraint = $orig_constraint;
my $sr_id = $self->get_seq_region_id_by_Slice($slice, $feat_cs);
$constraint .= " AND " if($constraint);
$constraint .=
"${tab_syn}.seq_region_id = $sr_id AND " .
"${tab_syn}.seq_region_start <= $slice_end AND " .
"${tab_syn}.seq_region_end >= $slice_start";
if($max_len) {
my $min_start = $slice_start - $max_len;
$constraint .=
" AND ${tab_syn}.seq_region_start >= $min_start";
}
my $fs = $self->generic_fetch($constraint,undef,$slice);
$fs = _remap($fs, $mapper, $slice);
push @features, @$fs;
}
}
return \@features;
}
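The WHERE-clause fragment that _slice_fetch assembles for a coordinate system can be reproduced with plain strings. This is a sketch under stated assumptions: overlap_constraint is a hypothetical helper, the table synonym 'af' and the numeric values are made up, and all inputs are assumed to be trusted integers, as they are when they come from Slice objects.

```perl
use strict;
use warnings;

# Hypothetical helper (not an Ensembl method): build the overlap constraint
# in the same way as _slice_fetch does for a matching coordinate system.
sub overlap_constraint {
    my ($syn, $sr_id, $slice_start, $slice_end, $max_len, $constraint) = @_;
    $constraint = '' unless defined $constraint;

    # Append to any caller-supplied constraint.
    $constraint .= ' AND ' if $constraint;
    $constraint .= "${syn}.seq_region_id = $sr_id AND "
                 . "${syn}.seq_region_start <= $slice_end AND "
                 . "${syn}.seq_region_end >= $slice_start";

    # With a known maximum feature length, no overlapping feature can start
    # more than $max_len before the slice, so a lower bound can be added.
    if ($max_len) {
        my $min_start = $slice_start - $max_len;
        $constraint .= " AND ${syn}.seq_region_start >= $min_start";
    }
    return $constraint;
}

print overlap_constraint('af', 7, 1000, 2000, 500, 'af.score > 5'), "\n";
```

The extra lower bound on seq_region_start narrows the range the database has to scan; this reading of its purpose is an interpretation of the code rather than something the source states.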
sub build_seq_region_cache
{ my ($self, $slice) = @_;
if(defined $slice){
throw('Optional argument must be a Bio::EnsEMBL::Slice') if(! ( ref($slice) && $slice->isa('Bio::EnsEMBL::Slice')));
}
my $dnadb = (defined $slice) ? $slice->adaptor->db() : $self->db->dnadb();
my $schema_build = $self->db->_get_schema_build($dnadb);
my $sql = 'select sr.core_seq_region_id, sr.seq_region_id from seq_region sr';
my @args = ($schema_build);
if($self->is_multispecies()) {
$sql.= ', coord_system cs where sr.coord_system_id = cs.coord_system_id and cs.species_id=? and';
unshift(@args, $self->species_id());
}
else {
$sql.= ' where';
}
$sql.=' sr.schema_build =?';
$self->{'seq_region_cache'} = {};
$self->{'core_seq_region_cache'} = {};
my $sth = $self->prepare($sql);
$sth->execute(@args);
while(my $ref = $sth->fetchrow_arrayref()) {
$self->{seq_region_cache}->{$ref->[0]} = $ref->[1];
$self->{core_seq_region_cache}->{$ref->[1]} = $ref->[0];
}
$sth->finish();
return;
}
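build_seq_region_cache fills two hashes that translate between core and Funcgen seq_region_ids in both directions. A minimal sketch of that two-way lookup, assuming the query rows are already in memory; build_caches and the ID values are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper (not an Ensembl method). Each row is assumed to be
# [core_seq_region_id, funcgen_seq_region_id], the column order selected
# by the SQL in build_seq_region_cache.
sub build_caches {
    my (@rows) = @_;
    my (%core_to_fg, %fg_to_core);
    for my $row (@rows) {
        my ($core_id, $fg_id) = @$row;
        $core_to_fg{$core_id} = $fg_id;    # role of 'seq_region_cache'
        $fg_to_core{$fg_id}   = $core_id;  # role of 'core_seq_region_cache'
    }
    return (\%core_to_fg, \%fg_to_core);
}

my ($seq_region_cache, $core_seq_region_cache)
    = build_caches([131550, 1], [131551, 2]);

print $seq_region_cache->{131550}, "\n";    # 1
print $core_seq_region_cache->{2}, "\n";    # 131551
```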
sub fetch_all_by_Gene_FeatureSets
{ my ($self, $gene, $fsets, $dblinks) = @_;
if(! ( ref($gene) && $gene->isa('Bio::EnsEMBL::Gene'))){
throw("You must pass a valid Bio::EnsEMBL::Gene object");
}
my @features = @{$self->fetch_all_by_stable_Storable_FeatureSets($gene, $fsets)};
if($dblinks){
foreach my $transcript(@{$gene->get_all_Transcripts}){
push @features, @{$self->fetch_all_by_Transcript_FeatureSets($transcript, $fsets, $dblinks)};
}
}
return \@features;
}
sub fetch_all_by_Slice_constraint
{ my($self, $slice, $constraint, $logic_name) = @_;
my @result;
if(!ref($slice) || !$slice->isa("Bio::EnsEMBL::Slice")) {
throw("Bio::EnsEMBL::Slice argument expected.");
}
$constraint ||= '';
my $fg_cs = $self->db->get_FGCoordSystemAdaptor->fetch_by_name(
$slice->coord_system->name(),
$slice->coord_system->version()
);
if(! defined $fg_cs){
warn "No CoordSystem present for ".$slice->coord_system->name().":".$slice->coord_system->version();
return \@result;
}
$self->build_seq_region_cache($slice);
my @tables = $self->_tables;
my (undef, $syn) = @{$tables[0]};
$constraint = $self->_logic_name_to_constraint($constraint, $logic_name);
return [] if(!defined($constraint));
my $key = uc(join(':', $slice->name, $constraint, $self->db->_get_schema_build($slice->adaptor->db())));
if(exists($self->{'_slice_feature_cache'}->{$key})) {
return $self->{'_slice_feature_cache'}->{$key};
}
my $sa = $slice->adaptor();
my @proj = @{$sa->fetch_normalized_slice_projection($slice)};
if(@proj == 0) {
throw('Could not retrieve normalized Slices. Database contains ' .
'incorrect assembly_exception information.');
}
my $sr_id = $slice->get_seq_region_id();
@proj = grep { $_->to_Slice->get_seq_region_id() != $sr_id } @proj;
my $segment = bless([1,$slice->length(),$slice ],
'Bio::EnsEMBL::ProjectionSegment');
push( @proj, $segment );
my @bounds;
my $ent_slice = $sa->fetch_by_seq_region_id($sr_id);
$ent_slice = $ent_slice->invert() if($slice->strand == -1);
my @ent_proj = @{$sa->fetch_normalized_slice_projection($ent_slice)};
shift @ent_proj;
@bounds = map {$_->from_start - $slice->start() + 1} @ent_proj;
foreach my $seg (@proj) {
my $offset = $seg->from_start();
my $seg_slice = $seg->to_Slice();
my $features = $self->_slice_fetch($seg_slice, $constraint);
if($seg_slice->name() ne $slice->name()) {
FEATURE:
foreach my $f (@$features) {
if($offset != 1) {
$f->{'start'} += $offset-1;
$f->{'end'} += $offset-1;
}
foreach my $bound (@bounds) {
if($f->{'start'} < $bound && $f->{'end'} >= $bound) {
next FEATURE;
}
}
$f->{'slice'} = $slice;
push @result, $f;
}
}
else {
push @result, @$features;
}
}
$self->{'_slice_feature_cache'}->{$key} = \@result;
return \@result;
}
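The inner loop of fetch_all_by_Slice_constraint shifts features fetched from a projected segment by the segment offset and discards any feature that spans a projection boundary. A minimal sketch of that step, assuming features are plain hashes with start and end keys; shift_and_filter is a hypothetical helper, and unlike the original it copies each feature instead of modifying it in place.

```perl
use strict;
use warnings;

# Hypothetical helper (not an Ensembl method): apply the segment offset and
# drop features that straddle any boundary, as in the FEATURE loop above.
sub shift_and_filter {
    my ($features, $offset, $bounds) = @_;
    my @kept;
    FEATURE:
    for my $f (@$features) {
        my %g = %$f;    # work on a copy (the original updates $f directly)
        if ($offset != 1) {
            $g{start} += $offset - 1;
            $g{end}   += $offset - 1;
        }
        for my $bound (@$bounds) {
            # A feature starting before a bound and ending on or after it
            # crosses that boundary and is skipped.
            next FEATURE if $g{start} < $bound && $g{end} >= $bound;
        }
        push @kept, \%g;
    }
    return \@kept;
}

my $features = [ { start => 1, end => 10 }, { start => 95, end => 105 } ];
my $kept = shift_and_filter($features, 1, [101]);
print scalar(@$kept), " feature(s) kept\n";    # 1 feature(s) kept
```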
sub fetch_all_by_Transcript_FeatureSets
{ my ($self, $transc, $fsets, $dblinks) = @_;
if(! ( ref($transc) && $transc->isa('Bio::EnsEMBL::Transcript'))){
throw("You must pass a valid Bio::EnsEMBL::Transcript object");
}
my @features = @{$self->fetch_all_by_stable_Storable_FeatureSets($transc, $fsets)};
if($dblinks){
my $translation = $transc->translation;
push @features, @{$self->fetch_all_by_stable_Storable_FeatureSets($translation, $fsets)} if $translation;
}
return \@features;
}
sub fetch_all_by_external_name
{ my ( $self, $external_name, $external_db_name ) = @_;
my $entryAdaptor = $self->db->get_DBEntryAdaptor();
my (@ids);
my @tmp = split/::/, ref($self);
my $feature_type = pop @tmp;
$feature_type =~ s/FeatureAdaptor//;
my $xref_method = 'list_'.lc($feature_type).'_feature_ids_by_extid';
if(! $entryAdaptor->can($xref_method)){
warn "Does not yet accomodate $feature_type feature external names";
return;
}
else{
@ids = $entryAdaptor->$xref_method($external_name, $external_db_name);
}
return $self->fetch_all_by_dbID_list(\@ids);
}
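fetch_all_by_external_name derives the DBEntryAdaptor method to call from the class name of the adaptor itself. The string handling can be sketched on its own; xref_method_for is a hypothetical helper, and the class name passed to it is used only as an example input.

```perl
use strict;
use warnings;

# Hypothetical helper (not an Ensembl method): derive the xref lookup method
# name from an adaptor class name, following the same steps as
# fetch_all_by_external_name.
sub xref_method_for {
    my ($class) = @_;
    my @parts = split /::/, $class;
    my $feature_type = pop @parts;            # last package component
    $feature_type =~ s/FeatureAdaptor//;      # strip the adaptor suffix
    return 'list_' . lc($feature_type) . '_feature_ids_by_extid';
}

print xref_method_for('Bio::EnsEMBL::Funcgen::DBSQL::RegulatoryFeatureAdaptor'), "\n";
# list_regulatory_feature_ids_by_extid
```

Because the name is built dynamically, the original code checks it with can() before calling it, so adaptors without a matching DBEntryAdaptor method only produce a warning.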
sub fetch_all_by_stable_Storable_FeatureSets
{ my ($self, $obj, $fsets) = @_;
my ($extdb_name);
my $dbe_adaptor = $self->db->get_DBEntryAdaptor;
if(ref($obj) && $obj->isa('Bio::EnsEMBL::Storable') && $obj->can('stable_id')){
my @tmp = split/::/, ref($obj);
my $obj_type = pop @tmp;
my $group = $obj->adaptor->db->group;
if (! defined $group){
throw('You must pass a stable Bio::EnsEMBL::Feature with an attached DBAdaptor with the group attribute set');
}
$extdb_name = 'ensembl_'.$group.'_'.$obj_type;
}
else{
throw('Must pass a stable Bio::EnsEMBL::Feature, you passed a '.$obj);
}
if(ref($fsets) ne 'ARRAY' || scalar(@$fsets) == 0){
throw('Must define an array of Bio::EnsEMBL::FeatureSets to extend xref Slice bound. You passed: '.$fsets);
}
my %feature_set_types;
foreach my $fset(@$fsets){
$self->db->is_stored_and_valid('Bio::EnsEMBL::Funcgen::FeatureSet', $fset);
$feature_set_types{$fset->type} ||= [];
push @{$feature_set_types{$fset->type}}, $fset;
}
my @features;
foreach my $fset_type(keys %feature_set_types){
my $adaptor_type = ucfirst($fset_type).'FeatureAdaptor';
next if ref($self) !~ /$adaptor_type/;
my %feature_set_ids;
map $feature_set_ids{$_->dbID} = 1, @{$feature_set_types{$fset_type}};
my $cnt = 0;
foreach my $efg_feature(@{$self->fetch_all_by_external_name($obj->stable_id, $extdb_name)}){
next if ! exists $feature_set_ids{$efg_feature->feature_set->dbID};
push @features, $efg_feature;
}
}
return \@features;
}
sub fetch_by_display_label
{ my $self = shift;
my $label = shift;
my @tables = $self->_tables();
my $constraint = "x.display_label = '$label'";
my ($feature) = @{ $self->generic_fetch($constraint) };
return $feature;
}
sub generic_fetch
{ my $self = shift;
$self->build_seq_region_cache();
return $self->SUPER::generic_fetch(@_);
}
sub get_core_seq_region_id
{ my ($self, $fg_sr_id) = @_;
my $core_sr_id = $self->{'core_seq_region_cache'}{$fg_sr_id};
if(! defined $core_sr_id && exists $self->{'_tmp_core_seq_region_cache'}{$fg_sr_id}){
$self->{'core_seq_region_cache'}{$fg_sr_id} = $self->{'_tmp_core_seq_region_cache'}{$fg_sr_id};
delete $self->{'_tmp_core_seq_region_cache'}{$fg_sr_id};
$core_sr_id = $self->{'core_seq_region_cache'}{$fg_sr_id};
}
return $core_sr_id;
}
sub get_seq_region_id_by_Slice
{ my ($self, $slice, $fg_cs, $test_present) = @_;
if(! ($slice && ref($slice) && $slice->isa("Bio::EnsEMBL::Slice"))){
throw('You must provide a valid Bio::EnsEMBL::Slice');
}
my ($core_sr_id, $fg_sr_id);
if( $slice->adaptor() ) {
$core_sr_id = $slice->adaptor()->get_seq_region_id($slice);
}
else {
$core_sr_id = $self->db()->get_SliceAdaptor()->get_seq_region_id($slice);
}
if (exists $self->{'seq_region_cache'}{$core_sr_id}){
$fg_sr_id = $self->{'seq_region_cache'}{$core_sr_id};
}
if(! $fg_sr_id && ref($fg_cs)){
if( ! $fg_cs->isa('Bio::EnsEMBL::Funcgen::CoordSystem')){
throw('Must pass as valid Bio::EnsEMBL::Funcgen:CoordSystem to retrieve seq_region_ids for forwards compatibility, passed '.$fg_cs);
}
my $sql = 'select seq_region_id from seq_region where coord_system_id =? and name =?';
my $sth = $self->prepare($sql);
$sth->execute($fg_cs->dbID(), $slice->seq_region_name());
($fg_sr_id) = $sth->fetchrow_array();
$sth->finish();
$self->{'_tmp_core_seq_region_cache'} = {(
$fg_sr_id => $core_sr_id
)};
}
elsif(! $fg_sr_id && ! $test_present) {
my $schema_build = $self->db->_get_schema_build($slice->adaptor->db());
my $core_cs = $slice->coord_system;
my $sql = 'select distinct(seq_region_id) from seq_region sr, coord_system cs where sr.coord_system_id=cs.coord_system_id and sr.name=? and cs.name =?';
my @args = ($slice->seq_region_name(), $core_cs->name());
if($core_cs->is_top_level()) {
$sql.= ' and cs.version =?';
push(@args, $core_cs->version());
}
if($self->is_multispecies()) {
$sql.=' and cs.species_id=?';
push(@args, $self->species_id());
}
my $sth = $self->prepare($sql);
$sth->execute(@args);
($fg_sr_id) = $sth->fetchrow_array();
$sth->finish();
if(! $fg_sr_id){
throw('Cannot find previously stored seq_region for: '.$core_cs->name.':'.$core_cs->version.':'.$slice->seq_region_name.
"\nYou need to update your eFG seq_regions to match your core DB using: update_DB_for_release.pl\n");
}
warn 'Defaulting to previously store seq_region for: '.$core_cs->name.':'.
$core_cs->version.':'.$slice->seq_region_name.
"\nYou need to update your eFG seq_regions to match your core DB using: update_DB_for_release.pl\n";
}
return $fg_sr_id;
}
General documentation
fetch_all_by_stable_Feature_FeatureSets
Arg [1] : string - seq_region_name i.e. chromosome name.
Arg [2] : string - seq_region_start of current slice bound
Arg [3] : string - seq_region_end of current slice bound.
Arg [4] : Bio::EnsEMBL::Gene|Transcript|Translation
Arg [5] : arrayref - Bio::EnsEMBL::Funcgen::FeatureSet
Example : ($start, $end) = $self->_set_bounds_by_regulatory_feature_xref
($trans_chr, $start, $end, $transcript, $fsets);
Description: Internal method to set an xref Slice bounds given a list of
FeatureSets.
Returntype : List - ($start, $end);
Exceptions : throws if incorrect args provided
Caller : self
Status : at risk