Bio::EnsEMBL::Compara::DBSQL
AlignSliceAdaptor
Summary
Bio::EnsEMBL::Compara::DBSQL::AlignSliceAdaptor - An AlignSlice can be used to map genes from one species onto another. This adaptor fetches from the database all the data needed for an AlignSlice.
Package variables
No package variables defined.
Included modules
Inherit
Synopsis
use Bio::EnsEMBL::Registry;
## Load adaptors using the Registry
Bio::EnsEMBL::Registry->load_all();
## Fetch the query slice
my $query_slice_adaptor = Bio::EnsEMBL::Registry->get_adaptor(
"Homo sapiens", "core", "Slice");
my $query_slice = $query_slice_adaptor->fetch_by_region(
"chromosome", "14", 50000001, 50010001);
## Fetch the method_link_species_set
my $mlss_adaptor = Bio::EnsEMBL::Registry->get_adaptor(
"Compara26", "compara", "MethodLinkSpeciesSet");
my $method_link_species_set = $mlss_adaptor->fetch_by_method_link_type_registry_aliases(
"BLASTZ_NET", ["Homo sapiens", "Rattus norvegicus"]);
## Fetch the align_slice
my $align_slice_adaptor = Bio::EnsEMBL::Registry->get_adaptor(
"Compara26",
"compara",
"AlignSlice"
);
my $align_slice = $align_slice_adaptor->fetch_by_Slice_MethodLinkSpeciesSet(
$query_slice,
$method_link_species_set,
"expanded"
);
Description
No description!
Methods
Methods description
_combine_genomic_align_trees
  Arg[1]      : listref $species_order
  Arg[2]      : Bio::EnsEMBL::Compara::GenomicAlignTree $this_tree
  Arg[3]      : Bio::EnsEMBL::Compara::GenomicAlignTree $next_tree
  Example     :
  Description : This method tries to accommodate the nodes in $next_tree
                into $species_order. It uses several approaches. If there
                is information available about left and right node IDs, it
                will use it to link the nodes. Alternatively, it will rely
                on the species names to do its best. When a new species
                name appears in the $next_tree, it will try to insert it in
                the right position.
  Returntype  : none
  Exceptions  : none
  Caller      : $object->methodname
fetch_by_GenomicAlignBlock
  Arg[1]      : Bio::EnsEMBL::Compara::GenomicAlignBlock $genomic_align_block
  Arg[2]      : [optional] boolean $expanded (def. FALSE)
  Arg[3]      : [optional] boolean $solve_overlapping (def. FALSE)
  Example     : my $align_slice =
                    $align_slice_adaptor->fetch_by_GenomicAlignBlock(
                        $genomic_align_block);
  Description : Uses this genomic_align_block to create an AlignSlice.
                Setting $expanded to anything other than 0 or "" creates
                the AlignSlice in "expanded" mode, in which gaps are
                allowed in the reference species in order to accommodate
                insertions from other species.
                By default, overlapping alignments are ignored. Setting the
                solve_overlapping option to TRUE reconciles the alignments
                by means of a fake alignment.
  Returntype  : Bio::EnsEMBL::Compara::AlignSlice
  Exceptions  : thrown if arg[1] is not a
                Bio::EnsEMBL::Compara::GenomicAlignBlock
  Exceptions  : thrown if $genomic_align_block has no
                method_link_species_set
  Caller      : $object->methodname
fetch_by_Slice_MethodLinkSpeciesSet
  Arg[1]      : Bio::EnsEMBL::Slice $query_slice
  Arg[2]      : Bio::EnsEMBL::Compara::MethodLinkSpeciesSet
                $method_link_species_set
  Arg[3]      : [optional] boolean $expanded (def. FALSE)
  Arg[4]      : [optional] boolean $solve_overlapping (def. FALSE)
  Arg[5]      : [optional] Bio::EnsEMBL::Slice $target_slice
  Example     : my $align_slice =
                    $align_slice_adaptor->fetch_by_Slice_MethodLinkSpeciesSet(
                        $query_slice, $method_link_species_set);
  Description : Fetches from the database all the data needed for the
                AlignSlice corresponding to the $query_slice and the given
                $method_link_species_set. Setting $expanded to anything
                other than 0 or "" creates the AlignSlice in "expanded"
                mode, in which gaps are allowed in the reference species in
                order to accommodate insertions from other species.
                By default, overlapping alignments are ignored. Setting the
                solve_overlapping option to TRUE reconciles the alignments
                by means of a fake alignment.
                To restrict the AlignSlice to alignments with a given
                genomic region, specify a target_slice. All alignments
                which do not match this slice will be ignored.
  Returntype  : Bio::EnsEMBL::Compara::AlignSlice
  Exceptions  : thrown if wrong arguments are given
  Caller      : $object->methodname
flush_cache
  Arg[1]      : none
  Example     : $align_slice_adaptor->flush_cache()
  Description : Destroy the cache
  Returntype  : none
  Exceptions  : none
  Caller      : $object->methodname
new
  Arg         :
  Example     :
  Description : Creates a new AlignSliceAdaptor object
  Returntype  : Bio::EnsEMBL::Compara::DBSQL::AlignSliceAdaptor
  Exceptions  : none
  Caller      : Bio::EnsEMBL::Registry->get_adaptor
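The $expanded and $solve_overlapping arguments are tested as ordinary Perl booleans, so it can help to see which values count as true. The following is a minimal, self-contained sketch rather than adaptor code: the helper name mode_label is hypothetical, and the label "condensed" for the default mode is an assumption based on the "cond" tag used in the cache keys.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of AlignSliceAdaptor: reports the mode
# that a given $expanded value would select under Perl's truth rules.
sub mode_label {
    my ($expanded) = @_;
    return $expanded ? "expanded" : "condensed";
}

# undef, 0, "0" and "" are false; every other value is true,
# including the strings "expanded" and "0.0".
for my $value (undef, 0, "0", "", 1, "expanded", "0.0") {
    my $shown = defined $value ? "'$value'" : "undef";
    printf "%-12s => %s\n", $shown, mode_label($value);
}
```

Passing the string "expanded", as the Synopsis does, therefore works simply because any non-empty string other than "0" is true.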
Methods code
sub _combine_genomic_align_trees
{ my ($species_order, $this_tree, $next_tree) = @_;
my $species_counter = 0;
my $existing_node_ids; my $existing_right_node_ids;
my $next_species_names; my $existing_species_names;
foreach my $this_genomic_align_node (@{$next_tree->get_all_sorted_genomic_align_nodes}) {
my $this_node_id = $this_genomic_align_node->node_id;
$existing_node_ids->{$this_node_id} = 1;
push(@$next_species_names, $this_genomic_align_node->genomic_align_group->genome_db->name)
if ($this_genomic_align_node->genomic_align_group and
$this_genomic_align_node->genomic_align_group->genome_db->name ne "Ancestral sequences");
}
foreach my $species_def (@$species_order) {
my $right_node_id = $species_def->{right_node_id};
$existing_right_node_ids->{$right_node_id} = 1 if ($right_node_id);
push(@$existing_species_names, $species_def->{genome_db}->name);
}
foreach my $this_genomic_align_node (@{$next_tree->get_all_sorted_genomic_align_nodes}) {
next if (!@{$this_genomic_align_node->get_all_GenomicAligns});
my $this_genomic_align = $this_genomic_align_node->get_all_GenomicAligns->[0];
my $this_genome_db = $this_genomic_align->genome_db;
my $this_node_id = $this_genomic_align_node->node_id;
my $this_right_node_id = _get_right_node_id($this_genomic_align_node);
my $these_genomic_align_ids = [];
foreach my $each_genomic_align (@{$this_genomic_align_node->get_all_GenomicAligns}) {
push (@$these_genomic_align_ids, $each_genomic_align->dbID);
}
my $match = 0;
while (!$match and $species_counter < @$species_order) {
my $species_genome_db = $species_order->[$species_counter]->{genome_db};
my $species_right_node_id = $species_order->[$species_counter]->{right_node_id};
$match = 1;
if (defined($species_right_node_id) and $species_right_node_id == $this_node_id) {
$species_order->[$species_counter]->{right_node_id} = $this_right_node_id;
push (@{$species_order->[$species_counter]->{genomic_align_ids}}, @$these_genomic_align_ids);
} elsif (defined($species_right_node_id) and exists($existing_node_ids->{$species_right_node_id})) {
splice(@$species_order, $species_counter, 0, {
genome_db => $this_genome_db,
right_node_id => $this_right_node_id,
genomic_align_ids => [@$these_genomic_align_ids],
});
} elsif ($this_genome_db->name eq $species_genome_db->name
and (!defined($species_right_node_id) or
!defined($existing_node_ids->{$species_right_node_id}))
) {
$species_order->[$species_counter]->{right_node_id} = $this_right_node_id;
push (@{$species_order->[$species_counter]->{genomic_align_ids}}, @$these_genomic_align_ids);
} elsif (!defined($existing_right_node_ids->{$this_node_id})
and !grep {$_ eq $this_genome_db->name} @$existing_species_names) {
splice(@$species_order, $species_counter, 0, {
genome_db => $this_genome_db,
right_node_id => $this_right_node_id,
genomic_align_ids => [@$these_genomic_align_ids],
});
} else {
$match = 0;
}
$species_counter++;
shift(@$existing_species_names);
}
if (!$match) {
push(@$species_order, {
genome_db => $this_genome_db,
right_node_id => $this_right_node_id,
genomic_align_ids => [@$these_genomic_align_ids],
});
$species_counter++;
}
shift(@$next_species_names);
}
return; }
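The node-ID linking idea can be illustrated with plain hashes. This is a deliberately simplified sketch, not the method above: it keeps only the first branch (a row whose right_node_id equals the incoming node ID is extended) and the final fallback (append a new row), and it omits the species-name matching and mid-list insertion. The sub name link_next_nodes and the hash layout are hypothetical.

```perl
use strict;
use warnings;

# Simplified stand-in for the linking step: rows and nodes are plain
# hashes (hypothetical structure), not GenomicAlignTree objects.
sub link_next_nodes {
    my ($species_order, $next_nodes) = @_;
    for my $node (@$next_nodes) {
        # Prefer a row whose right-hand neighbour is exactly this node.
        my ($row) = grep {
            defined $_->{right_node_id}
                && $_->{right_node_id} == $node->{node_id}
        } @$species_order;
        if ($row) {
            # Extend the row: its neighbour becomes this node's neighbour.
            $row->{right_node_id} = $node->{right_node_id};
            push @{ $row->{node_ids} }, $node->{node_id};
        } else {
            # No link available: start a new row at the end.
            push @$species_order, {
                name          => $node->{name},
                right_node_id => $node->{right_node_id},
                node_ids      => [ $node->{node_id} ],
            };
        }
    }
    return;
}

my @order = (
    { name => "human", right_node_id => 11, node_ids => [1] },
    { name => "mouse", right_node_id => 12, node_ids => [2] },
);
my @next = (
    { node_id => 11, name => "human", right_node_id => undef },
    { node_id => 12, name => "mouse", right_node_id => 22 },
    { node_id => 13, name => "dog",   right_node_id => undef },
);
link_next_nodes(\@order, \@next);
print join(", ", map {
    sprintf("%s:[%s]", $_->{name}, join(" ", @{ $_->{node_ids} }))
} @order), "\n";
```

In this toy input the human and mouse rows are extended through their node links, while the unlinked dog node becomes a new row.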
sub _get_right_node_id
{ my ($this_genomic_align_node) = @_;
my $use_right = 1;
$use_right = 1 - $use_right if (!$this_genomic_align_node->root->get_original_strand);
my $neighbour_node;
if ($use_right) {
$neighbour_node = $this_genomic_align_node->right_node;
} else {
$neighbour_node = $this_genomic_align_node->left_node;
}
if ($neighbour_node) {
return $neighbour_node->node_id;
}
return undef;
}
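As a rough illustration of the strand handling, the sketch below uses plain hashes in place of tree nodes. The sub name neighbour_id and the hash keys are hypothetical, and $original_strand stands in for the value of $node->root->get_original_strand.

```perl
use strict;
use warnings;

# Hypothetical stand-in: a node is a hash with optional left and right
# neighbours; $original_strand plays the role of get_original_strand.
sub neighbour_id {
    my ($node, $original_strand) = @_;
    # When the tree has been reversed, left and right swap roles.
    my $neighbour = $original_strand ? $node->{right} : $node->{left};
    return $neighbour ? $neighbour->{id} : undef;
}

my $node = { id => 5, left => { id => 4 }, right => { id => 6 } };
print neighbour_id($node, 1), "\n";    # original strand: right-hand neighbour
print neighbour_id($node, 0), "\n";    # reversed: left-hand neighbour
```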
sub fetch_by_GenomicAlignBlock
{ my ($self, $genomic_align_block, $expanded, $solve_overlapping) = @_;
throw("[$genomic_align_block] is not a Bio::EnsEMBL::Compara::GenomicAlignBlock")
unless (UNIVERSAL::isa($genomic_align_block, "Bio::EnsEMBL::Compara::GenomicAlignBlock"));
my $method_link_species_set = $genomic_align_block->method_link_species_set();
throw("GenomicAlignBlock [$genomic_align_block] has no MethodLinkSpeciesSet")
unless ($method_link_species_set);
my $reference_genomic_align = $genomic_align_block->reference_genomic_align;
if (!$reference_genomic_align) {
$genomic_align_block->reference_genomic_align($genomic_align_block->get_all_GenomicAligns->[0]);
$reference_genomic_align = $genomic_align_block->reference_genomic_align;
}
my $reference_slice = $reference_genomic_align->get_Slice();
my $key;
if ($genomic_align_block->dbID) {
$key = "gab_".$genomic_align_block->dbID.":".($expanded?"exp":"cond").
":".($solve_overlapping?"fake-overlap":"non-overlap");
} else {
$key = "gab_".$genomic_align_block.":".($expanded?"exp":"cond").
":".($solve_overlapping?"fake-overlap":"non-overlap");
}
return $self->{'_cache'}->{$key} if (defined($self->{'_cache'}->{$key}));
my $align_slice = Bio::EnsEMBL::Compara::AlignSlice->new(
-adaptor => $self,
-reference_Slice => $reference_slice,
-Genomic_Align_Blocks => [$genomic_align_block],
-method_link_species_set => $method_link_species_set,
-expanded => $expanded,
-solve_overlapping => $solve_overlapping,
-preserve_blocks => 1,
);
$self->{'_cache'}->{$key} = $align_slice;
return $align_slice; }
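Both fetch methods build a cache key from an identifier plus the two flags, and return the cached object when one exists. A minimal sketch of that pattern, assuming a plain hash in place of $self->{'_cache'}; the names cache_key, fetch_cached, %cache and $builds are hypothetical:

```perl
use strict;
use warnings;

my %cache;         # stands in for $self->{'_cache'}
my $builds = 0;    # counts how often the expensive step runs

# Builds the same style of key as the fetch methods.
sub cache_key {
    my ($id, $expanded, $solve_overlapping) = @_;
    return join(":", $id,
        ($expanded          ? "exp"          : "cond"),
        ($solve_overlapping ? "fake-overlap" : "non-overlap"));
}

sub fetch_cached {
    my ($id, $expanded, $solve_overlapping) = @_;
    my $key = cache_key($id, $expanded, $solve_overlapping);
    return $cache{$key} if defined $cache{$key};
    $builds++;     # the real code would construct an AlignSlice here
    return $cache{$key} = { key => $key };
}

fetch_cached("gab_42", 1, 0);
fetch_cached("gab_42", "expanded", 0);    # same key as above: cache hit
fetch_cached("gab_42", 0, 0);             # different mode: new entry
print "$builds builds; keys: ", join(", ", sort keys %cache), "\n";
```

Because the flags are reduced to two tags each, any two true values for $expanded share one cache entry.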
sub fetch_by_Slice_MethodLinkSpeciesSet
{ my ($self, $reference_slice, $method_link_species_set, $expanded, $solve_overlapping, $target_slice) = @_;
throw("[$reference_slice] is not a Bio::EnsEMBL::Slice")
unless ($reference_slice and ref($reference_slice) and
$reference_slice->isa("Bio::EnsEMBL::Slice"));
throw("[$method_link_species_set] is not a Bio::EnsEMBL::Compara::MethodLinkSpeciesSet")
unless ($method_link_species_set and ref($method_link_species_set) and
$method_link_species_set->isa("Bio::EnsEMBL::Compara::MethodLinkSpeciesSet"));
my $key = $reference_slice->name.":".$method_link_species_set->dbID.":".($expanded?"exp":"cond").
":".($solve_overlapping?"fake-overlap":"non-overlap");
if (defined($target_slice)) {
throw("[$target_slice] is not a Bio::EnsEMBL::Slice")
unless ($target_slice and ref($target_slice) and
$target_slice->isa("Bio::EnsEMBL::Slice"));
$key .= ":".$target_slice->name();
}
return $self->{'_cache'}->{$key} if (defined($self->{'_cache'}->{$key}));
my $genomic_align_block_adaptor = $self->db->get_GenomicAlignBlockAdaptor;
my $genomic_align_blocks = $genomic_align_block_adaptor->fetch_all_by_MethodLinkSpeciesSet_Slice(
$method_link_species_set,
$reference_slice
);
if (defined($target_slice)) {
my $target_dnafrag = $self->db->get_DnaFragAdaptor->fetch_by_Slice($target_slice);
if (!$target_dnafrag) {
throw("Cannot get a DnaFrag for the target Slice");
}
for (my $i = 0; $i < @$genomic_align_blocks; $i++) {
my $this_genomic_align_block = $genomic_align_blocks->[$i];
my $hits_the_target_slice = 0;
foreach my $this_genomic_align (@{$this_genomic_align_block->get_all_non_reference_genomic_aligns}) {
if ($this_genomic_align->dnafrag->dbID == $target_dnafrag->dbID and
$this_genomic_align->dnafrag_start <= $target_slice->end and
$this_genomic_align->dnafrag_end >= $target_slice->start) {
$hits_the_target_slice = 1;
last;
}
}
if (!$hits_the_target_slice) {
splice(@$genomic_align_blocks, $i, 1);
$i--;
}
}
}
my $genomic_align_trees; # stays undef unless the alignment is tree-based; push() below autovivifies the array
my $species_order;
if ($method_link_species_set->method_link_class =~ /GenomicAlignTree/ and @$genomic_align_blocks) {
my $genomic_align_tree_adaptor = $self->db->get_GenomicAlignTreeAdaptor;
foreach my $this_genomic_align_block (@$genomic_align_blocks) {
my $this_genomic_align_tree = $genomic_align_tree_adaptor->
fetch_by_GenomicAlignBlock($this_genomic_align_block);
push(@$genomic_align_trees, $this_genomic_align_tree);
}
my $last_node_id = undef;
my $tree_order;
foreach my $this_genomic_align_tree (@$genomic_align_trees) {
if ($last_node_id) {
$tree_order->{$this_genomic_align_tree->node_id}->{prev} = $last_node_id;
$tree_order->{$last_node_id}->{next} = $this_genomic_align_tree;
}
$last_node_id = $this_genomic_align_tree->node_id;
}
foreach my $this_genomic_align_node (@{$genomic_align_trees->[0]->get_all_sorted_genomic_align_nodes}) {
next if (!@{$this_genomic_align_node->get_all_GenomicAligns});
my $this_genomic_align = $this_genomic_align_node->get_all_GenomicAligns->[0];
my $genome_db = $this_genomic_align->genome_db;
my $this_node_id = $this_genomic_align_node->node_id;
my $right_node_id = _get_right_node_id($this_genomic_align_node);
my $genomic_align_ids = [];
foreach my $each_genomic_align (@{$this_genomic_align_node->get_all_GenomicAligns}) {
push (@$genomic_align_ids, $each_genomic_align->dbID);
}
push(@$species_order,
{
genome_db => $genome_db,
right_node_id => $right_node_id,
genomic_align_ids => $genomic_align_ids,
});
}
$| = 1;
foreach my $this_genomic_align_tree (@$genomic_align_trees) {
my $next_genomic_align_tree = $tree_order->{$this_genomic_align_tree->node_id}->{next};
next if (!$next_genomic_align_tree);
_combine_genomic_align_trees($species_order, $this_genomic_align_tree, $next_genomic_align_tree);
}
}
my $align_slice = Bio::EnsEMBL::Compara::AlignSlice->new(
-adaptor => $self,
-reference_Slice => $reference_slice,
-Genomic_Align_Blocks => $genomic_align_blocks,
-Genomic_Align_Trees => $genomic_align_trees,
-species_order => $species_order,
-method_link_species_set => $method_link_species_set,
-expanded => $expanded,
-solve_overlapping => $solve_overlapping,
);
$self->{'_cache'}->{$key} = $align_slice;
return $align_slice; }
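The target-slice restriction keeps a block only if at least one non-reference alignment lies on the target DnaFrag and overlaps the target coordinates. The sketch below expresses the same test with grep instead of the in-place splice loop. It is an alternative formulation under assumed data, not the adaptor's code: the hash layout and the sub name overlaps are hypothetical.

```perl
use strict;
use warnings;

# Two closed intervals overlap when each one starts no later than the
# other one ends.
sub overlaps {
    my ($start, $end, $target_start, $target_end) = @_;
    return ($start <= $target_end and $end >= $target_start) ? 1 : 0;
}

# Hypothetical blocks: each has a list of non-reference alignments.
my @blocks = (
    { id => 1, aligns => [ { dnafrag => 7, start => 100, end => 200 } ] },
    { id => 2, aligns => [ { dnafrag => 7, start => 900, end => 950 } ] },
    { id => 3, aligns => [ { dnafrag => 8, start => 150, end => 160 } ] },
);
my %target = (dnafrag => 7, start => 180, end => 300);

# Keep a block if any of its alignments hits the target region.
my @kept = grep {
    my $block = $_;
    grep {
        $_->{dnafrag} == $target{dnafrag}
            and overlaps($_->{start}, $_->{end},
                         $target{start}, $target{end})
    } @{ $block->{aligns} };
} @blocks;

print join(",", map { $_->{id} } @kept), "\n";
```

Building a new list avoids adjusting the loop index after each removal, at the cost of not modifying the original array in place.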
sub flush_cache
{ my ($self) = @_;
foreach my $align_slice (values (%{$self->{'_cache'}})) {
$align_slice->DESTROY;
}
undef $self->{'_cache'}; }
sub new
{ my $class = shift;
my $self = $class->SUPER::new(@_);
return $self; }
General documentation
This module inherits attributes and methods from Bio::EnsEMBL::DBSQL::BaseAdaptor
Copyright (c) 2004. EnsEMBL Team
You may distribute this module under the same terms as Perl itself
This module is part of the EnsEMBL project
Questions can be posted to the ensembl-dev mailing list:
ensembl-dev@ebi.ac.uk
The rest of the documentation details each of the object methods. Internal methods are usually preceded with a _