Bio::EnsEMBL::Compara::DBSQL
ConservationScoreAdaptor
Toolbar
Summary
Bio::EnsEMBL::Compara::DBSQL::ConservationScoreAdaptor - Object adaptor to access data in the conservation_score table
Package variables
Privates (from "my" definitions)
$_pack_size = 4
$_score_index = 0
$_bucket;
$_no_score_value = undef
$PACKED = 1
$_pack_type = "f"
Included modules
Inherit
Synopsis
Connecting to the database using the Registry
use Bio::EnsEMBL::Registry;
my $reg = "Bio::EnsEMBL::Registry";
$reg->load_registry_from_db(-host=>"ensembldb.ensembl.org", -user=>"anonymous");
my $conservation_score_adaptor = $reg->get_adaptor(
"Multi", "compara", "ConservationScore");
Store data in the database
$conservation_score_adaptor->store($conservation_score);
To retrieve score data from the database using the default display_size
$conservation_scores = $conservation_score_adaptor->fetch_all_by_MethodLinkSpeciesSet_Slice($method_link_species_set, $slice);
To retrieve one score per base in the slice
$conservation_scores = $conservation_score_adaptor->fetch_all_by_MethodLinkSpeciesSet_Slice($method_link_species_set, $slice, $slice->end-$slice->start+1);
Print the scores
foreach my $score (@$conservation_scores) {
printf("position %d observed %.4f expected %.4f difference %.4f\n", $score->position, $score->observed_score, $score->expected_score, $score->diff_score);
}
A simple example script for extracting scores from a slice can be found in ensembl-compara/scripts/examples/getConservationScores.pl
Description
This module is used to access data in the conservation_score table.
Each score is represented by a Bio::EnsEMBL::Compara::ConservationScore. The position and an observed, expected score and a difference score (expected-observed) is stored for each column in a multiple alignment. Not all bases in an alignment have a score (for example, if there is insufficient coverage) and termed here as 'uncalled'.
In order to speed up processing of the scores over large regions, the scores are stored in the database averaged over window_sizes of 1 (no averaging), 10, 100 and 500. When retrieving the scores, the most appropriate window_size is estimated from the length of the alignment or slice and the number of scores requested, given by the display_size. There is no need to specify the window_size directly. If the number of scores requested (display_size) is smaller than the alignment length or slice length, the scores will be either averaged if display_type = "AVERAGE" or the maximum value taken if display_type = "MAX". Scores in uncalled regions are not returned. If a score for each column in an alignment is required, the display_size should be set to be the same size as the alignment length or slice length.
Methods
Methods description
_fetch_all_by_GenomicAlignBlockId_WindowSize | code | | next | Top |
Arg 1 : integer $genomic_align_block_id Arg 2 : integer $window_size Arg 3 : (opt) boolean $packed (default 0) Example : my $conservation_scores = $conservation_score_adaptor->_fetch_all_by_GenomicAlignBlockId(23134); Description: Retrieve the corresponding Bio::EnsEMBL::Compara::ConservationScore objects. Returntype : ref. to an array of Bio::EnsEMBL::Compara::ConservationScore objects. If $packed is true, return the scores in a packed format given by $_pack_size and $_pack_type. Exceptions : none Caller : general |
Arg 1 : listref of Bio::EnsEMBL::Compara::ConservationScore objects $scores Example : my ($min, $max) = _find_min_max_score($scores); Description: find the min and max scores used for y axis scaling Returntype : (float, float) Exceptions : Caller : general Status : At risk |
Arg 1 : listref of Bio::EnsEMBL::Compara::ConservationScore objects $scores Arg 2 : int $num_scores (number of scores in the array) Arg 3 : int $score_lengths number of scores in each row of the array Arg 4 : int $pos (position to find) Arg 5 : int $win_size (window size) Example : $exp_scores = _unpack_scores($scores); Description: find the score index (row) that contains $pos in alignment coords Returntype : int Exceptions : none Caller : general Status : At risk |
Arg 1 : string $cigar_line (cigar string from current alignment block) Arg 2 : int $start_region (start of genomic_align_block (chr coords)) Arg 3 : int $end_region (end of genomic_align_block (chr coords)) Arg 4 : int $start_slice (start of slice (chr coords) Arg 5 : int $end_slice (end of slice (chr coords)) Arg 6 : listref of Bio::EnsEMBL::Compara::ConservationScore objects $scores Arg 7 : int $genomic_align_block_id (genomic align block id of current alignment block) Arg 8 : int $genomic_align_block_length (length of current alignment block) Arg 9 : string $display_type (either AVERAGE or MAX (plot average or max value)) Arg 10 : int $win_size (window size used) Arg 11 : listref of Bio::EnsEMBL::Compara::ConservationScore objects $scores in slice coords
Example : $scores = $self->_get_aligned_scores_from_cigar_line($genomic_align->cigar_line, $genomic_align->dnafrag_start, $genomic_align->dnafrag_end, $slice->start, $slice->end, $conservation_scores, $genomic_align_block->dbID, $genomic_align_block->length, $display_type, $window_size, $scores);
Description: Convert conservation scores from alignment coordinates into species specific chromosome (slice) coordinates for an alignment genomic_align_block
Returntype : listref of Bio::EnsEMBL::Compara::ConservationScore objects $scores
Exceptions : none
Caller : general
Status : At risk |
Arg 1 : string $cigar_line (cigar string from current alignment block) Arg 2 : int $start_region (start of genomic_align_block (chr coords)) Arg 3 : int $end_region (end of genomic_align_block (chr coords)) Arg 4 : int $start_slice (start of slice (chr coords) Arg 5 : int $end_slice (end of slice (chr coords)) Arg 6 : listref of Bio::EnsEMBL::Compara::ConservationScore objects $scores Arg 7 : int $genomic_align_block_id (genomic align block id of current alignment block) Arg 8 : int $genomic_align_block_length (length of current alignment block) Arg 9 : string $display_type (either AVERAGE or MAX (plot average or max value)) Arg 10 : int $win_size (window size used) Arg 11 : listref of Bio::EnsEMBL::Compara::ConservationScore objects $scores in slice coords
Example : $scores = $self->_get_aligned_scores_from_cigar_line_fast($genomic_align->cigar_line, $genomic_align->dnafrag_start, $genomic_align->dnafrag_end, $slice->start, $slice->end, $conservation_scores, $genomic_align_block->dbID, $genomic_align_block->length, $display_type, $window_size, $scores);
Description: Faster method to than _get_aligned_scores_from_cigar_line. This
method does not bin the scores and can be used if only require
one score per base in the alignment
Returntype : listref of Bio::EnsEMBL::Compara::ConservationScore objects $scores
Exceptions : none
Caller : general
Status : At risk |
Arg 1 : listref of Bio::EnsEMBL::Compara::ConservationScore objects $scores Arg 2 : int $align_start (start position in alignment coords) Arg 3 : int $align_end (end position in alignment coords) Arg 4 : string $display_type (either AVERAGE or MAX (plot average or max value)) Arg 5 : int $win_size (window size used) Arg 6 : ref to Bio::EnsEMBL::Compara::GenomicAlignBlock object Example : $scores = $self->_get_alignment_scores($conservation_scores, 1, 100000, "AVERAGE", 10, $genomic_align_block); Description: get scores for an alignment in alignment coordinates Returntype : listref of Bio::EnsEMBL::Compara::ConservationScore objects $scores in alignment coordinates Exceptions : none Caller : general Status : At risk |
Arg 1 : ref to Bio::EnsEMBL::Compara::MethodLinkSpeciesSet object Arg 2 : ref to Bio::EnsEMBL::Slice object Example : my $light_genomic_aligns = $self->_get_all_ref_genomic_aligns($ma_mlss, $slice); Description: Retrieve from the database some genomic_align information relating to only the slice species. Returntype : listref of hash containing a subset of genomic_align fields Exceptions : none Caller : general Status : At risk |
Arg 1 : listref of Bio::EnsEMBL::Compara::ConservationScore objects $scores Arg 2 : boolean $packed (0 if not packed, 1 if packed) Example : $conservation_scores = _reverse($conservation_scores); Description: print scores (unpack first if necessary) Returntype : none Exceptions : none Caller : general Status : At risk |
Arg 1 : listref of Bio::EnsEMBL::Compara::ConservationScore objects $scores Arg 2 : int $genomic_align_block_length (number of scores) Example : $conservation_scores = _reverse($conservation_scores); Description: reverse the conservation scores for complemented sequences Returntype : listref of Bio::EnsEMBL::Compara::ConservationScore objects Exceptions : Caller : general Status : At risk |
Arg 1 : string $scores Example : $exp_scores = _unpack_scores($scores); Description: unpack score values retrieved from a database Returntype : space delimited string of floats Exceptions : none Caller : general Status : At risk |
Arg 1 : Bio::EnsEMBL::Compara::GenomicAlignBlock $genomic_align_block Arg 2 : (opt) integer $align_start (default 1) Arg 3 : (opt) integer $align_end (default $genomic_align_block->length) Arg 4 : (opt) integer $slice_length (default $genomic_align_block->length) Arg 5 : (opt) integer $display_size (default 700) Arg 6 : (opt) string $display_type (one of "AVERAGE" or "MAX") (default "AVERAGE") Arg 7 : (opt) integer $window_size Example : my $conservation_scores = $conservation_score_adaptor->fetch_all_by_GenomicAlignBlock($genomic_align_block, $align_start, $align_end, $slice_length, $slice_length); Description: Retrieve the corresponding Bio::EnsEMBL::Compara::ConservationScore objects. Each conservation score object contains a position in alignment coordinates, the observed_score, the expected_score and the diff_score (conservation score) calculated as (expected_score - observed_score). The $align_start and $align_end parameters give the start and end of a region within a genomic_align_block and should be in alignment coordinates. The $slice_length is the total length of the region to be displayed and may span several individual genomic align blocks. It is used to automatically calculate the window_size. Display_size is the number of scores that will be returned. If the $slice_length is larger than the $display_size, the scores will either be averaged if the display_type is "AVERAGE" or the maximum taken if display_type is "MAXIMUM". Window_size defines which set of pre-averaged scores to use. Valid values are 1, 10, 100 or 500. There is no need to define the window_size because the program will select the most appropriate window_size to use based on the slice_length and the display_size. Alignment positions which have no scores are not returned. The min and max y axis values for the array of conservation score objects are set in the first conservation score object (index 0). Returntype : ref. to an array of Bio::EnsEMBL::Compara::ConservationScore objects. Caller : object::methodname Status : At risk |
Arg 1 : Bio::EnsEMBL::Compara::MethodLinkSpeciesSet $method_link_species_set Arg 2 : Bio::EnsEMBL::Slice $slice Arg 3 : (opt) integer $display_size (default 700) Arg 4 : (opt) string $display_type (one of "AVERAGE" or "MAX") (default "AVERAGE") Arg 5 : (opt) integer $window_size Exceptions : warning if window_size is not valid Example : my $conservation_scores = $conservation_score_adaptor->fetch_all_by_MethodLinkSpeciesSet_Slice($method_link_species_set, $slice, $slice->end-$slice->start+1); Description: Retrieve the corresponding Bio::EnsEMBL::Compara::ConservationScore objects. Each conservation score object contains a position in slice coordinates, the observed_score, the expected_score and the diff_score (or conservation score) calculated as the (expected_score - observed_score). The method_link_species_set is obtained using the method_link type of "GERP_CONSERVATION_SCORE". For example, this could be obtained for the 10 way PECAN alignment, using: my $mlss = $mlss_adaptor->fetch_by_method_link_type_registry_aliases("GERP_CONSERVATION_SCORE", ["human", "chimp", "rhesus", "cow", "dog", "mouse", "rat", "opossum", "platypus", "chicken"]);
Display_size defines the number of scores that will be returned. If the slice length is larger than the display_size, the scores
will either be averaged if the display_type is "AVERAGE" or the
maximum taken if display_type is "MAXIMUM".
Window_size defines which set of pre-averaged scores to use.
Valid values are 1, 10, 100 or 500 although there is no need to
define the window_size because the program will select the most
appropriate window_size to use based on the slice length and the
display_size for example, a slice length of 1000000 and
display_size of 1000 will automatically use a window_size of 500.
Slice positions which have no scores are not returned.
The min and max y axis values for the array of
conservation score objects are set in the first conservation
score object (index 0).
Returntype : ref. to an array of Bio::EnsEMBL::Compara::ConservationScore objects.
Caller : object::methodname
Status : At risk |
Arg [1] : Bio::EnsEMBL::Compara::ConservationScore $cs Example : $csa->store($cs); Description: Stores a conservation score object in the compara database if it has not been stored already. Returntype : none Exceptions : thrown if $genomic_align_block is not a Bio::EnsEMBL::Compara::GenomicAlignBlock object Exceptions : thrown if the argument is not a Bio::EnsEMBL::Compara:ConservationScore Caller : general Status : At risk |
Methods code
_add_to_bucket | description | prev | next | Top |
sub _add_to_bucket
{
my ($self, $display_type, $exp_score, $diff_score, $chr_pos, $start_slice, $num_buckets, $genomic_align_block_id, $win_size) = @_;
my $p = 0;
my $s;
my $final_exp_score;
my $final_diff_score;
my $filled_bucket = 0;
if (defined($diff_score) && $diff_score == 0) {
$diff_score = $_no_score_value;
}
if ($_bucket->{cnt} == 0) {
$_bucket->{start_pos} = $chr_pos - $start_slice + 1;
$_bucket->{start_seq_region_pos} = $chr_pos;
if ($display_type eq "AVERAGE") {
$_bucket->{exp_score} = 0;
$_bucket->{diff_score} = 0;
} else {
$_bucket->{exp_score} = $exp_score;
$_bucket->{diff_score} = $diff_score;
}
}
my $end_pos = $chr_pos - $start_slice + 1;
my $end_seq_region_pos = $chr_pos;
if ($display_type eq "AVERAGE") {
if (defined $_no_score_value) {
if ($diff_score != $_no_score_value) {
$_bucket->{exp_score} += $exp_score;
$_bucket->{diff_score} += $diff_score;
$_bucket->{called}++;
}
} else {
if (defined $diff_score) {
$_bucket->{exp_score} += $exp_score;
$_bucket->{diff_score} += $diff_score;
$_bucket->{called}++;
}
}
$_bucket->{cnt}++;
my $num_align_scores = floor($end_pos/ $_bucket->{size});
if ($num_align_scores > $_bucket->{current}) {
$_bucket->{current} = $num_align_scores;
$p = int(($end_pos + $_bucket->{start_pos})/2); $s = int(($end_seq_region_pos + $_bucket->{start_seq_region_pos})/2); if ($_bucket->{called} == 0) {
$final_exp_score = $_no_score_value;
$final_diff_score = $_no_score_value;
} else {
$final_exp_score = $_bucket->{exp_score}/$_bucket->{cnt}; $final_diff_score = $_bucket->{diff_score}/$_bucket->{cnt}; }
$filled_bucket = 1;
}
} else {
if (!defined $_bucket->{diff_score} && defined($diff_score)) {
$_bucket->{diff_score} = $diff_score;
$_bucket->{exp_score} = $exp_score;
}
if (defined($diff_score) && $_bucket->{diff_score} < $diff_score) {
$_bucket->{diff_score} = $diff_score;
$_bucket->{exp_score} = $exp_score;
}
$_bucket->{cnt}++;
if ($end_pos >= ($_bucket->{size} * ($num_buckets+1))) {
$p = int(($end_pos + $_bucket->{start_pos})/2); $s = int(($end_seq_region_pos + $_bucket->{start_seq_region_pos})/2);
$final_exp_score = $_bucket->{exp_score};
$final_diff_score = $_bucket->{diff_score};
$filled_bucket = 1;
}
}
if ($filled_bucket) {
my $aligned_score = Bio::EnsEMBL::Compara::ConservationScore->new_fast(
{'adaptor' => $self,
'genomic_align_block_id' => $genomic_align_block_id,
'window_size' => $win_size,
'position' => ($p or 1),
'seq_region_pos' => $s,
'diff_score' => $final_diff_score,
'expected_score' => $final_exp_score}
);
$_bucket->{exp_score} = 0;
$_bucket->{diff_score} = 0;
$_bucket->{cnt} = 0;
$_bucket->{called} = 0;
$filled_bucket = 0;
return $aligned_score;
}
return 0; } |
sub _fetch_all_by_GenomicAlignBlockId_WindowSize
{ my ($self, $genomic_align_block_id, $window_size, $packed) = @_;
my $conservation_scores = [];
my $exp_scores;
my $diff_scores;
if (!defined $packed) {
$packed = 0;
}
my $sql = qq{
SELECT
genomic_align_block_id,
window_size,
position,
expected_score,
diff_score
FROM
conservation_score
WHERE
genomic_align_block_id = ?
AND
window_size = ?
};
my $sth = $self->prepare($sql);
$sth->execute($genomic_align_block_id, $window_size);
my $conservation_score;
while (my @values = $sth->fetchrow_array()) {
if (!$packed) {
$exp_scores = _unpack_scores($values[3]);
$diff_scores = _unpack_scores($values[4]);
} else {
$exp_scores = $values[3];
$diff_scores = $values[4];
}
$conservation_score = Bio::EnsEMBL::Compara::ConservationScore->new_fast(
{'adaptor' => $self,
'genomic_align_block_id' => $values[0],
'window_size' => $values[1],
'position' => ($values[2] or 1),
'expected_score' => $exp_scores,
'diff_score' => $diff_scores,
'packed' => $packed});
push(@$conservation_scores, $conservation_score);
}
my @sorted_scores = sort {$a->{position} <=> $b->{position}} @$conservation_scores;
return\@ sorted_scores; } |
sub _find_min_max_score
{ my ($scores) = @_;
my $min;
my $max;
foreach my $score (@$scores) {
if (defined $score->diff_score) {
unless (defined $min) {
$min = $score->diff_score;
$max = $score->diff_score;
}
if ($min > $score->diff_score) {
$min = $score->diff_score;
}
if ($max < $score->diff_score) {
$max = $score->diff_score;
}
}
}
return ($min, $max); } |
sub _find_score_index
{ my ($scores, $num_scores, $score_lengths, $pos, $win_size) = @_;
my $i;
my $length;
if ($pos < $scores->[0]->{position} && $pos > ($scores->[0]->{position} - $win_size)) {
return 0;
}
for ($i = $_score_index; $i < $num_scores; $i++) {
my $this_position = $scores->[$i]->{position};
$length = ($score_lengths->[$i] - 1) * $win_size;
if ($pos >= $this_position && $pos <= $this_position + $length) {
$_score_index = $i;
return ($i);
}
if ($pos < ($this_position + $length)) {
$_score_index = $i;
return -1;
}
}
return -1; } |
sub _get_aligned_scores_from_cigar_line
{ my ($self, $cigar_line, $start_region, $end_region, $start_slice, $end_slice, $scores, $genomic_align_block_id, $genomic_align_block_length, $display_type, $win_size, $aligned_scores) = @_;
return undef if (!$cigar_line);
my $num_aligned_scores = scalar(@$aligned_scores);
my @cig = ( $cigar_line =~ /(\d*[GMDIX])/g );
my $align_start = 1;
my $align_end = $genomic_align_block_length;
my $aligned_score;
my $cs_index; my $num_scores = scalar(@$scores);
my $total_pos;
my $total_chr_pos = $start_region;
my $current_pos; my $chr_pos = $start_region; my $prev_position = 0;
my $cigType; my $cigLength;
my $i;
my $csBlockCnt; my $diff_score; my @diff_scores;
my $exp_score; my @exp_scores;
my $chr_start = $start_region;
my $chr_end = $end_region;
if ($start_slice > $start_region) {
$chr_start = $start_slice;
}
if ($end_slice < $end_region) {
$chr_end = $end_slice;
}
my $score_lengths;
for (my $j = 0; $j < $num_scores; $j++) {
my $length = 0;
if (defined($scores->[$j]->diff_score)) {
if ($PACKED) {
$length = length($scores->[$j]->diff_score)/$_pack_size; } else {
my @split_scores = split ' ', $scores->[$j]->diff_score;
$length = scalar(@split_scores);
}
}
push (@$score_lengths, $length);
}
my $prev_chr_pos = $_bucket->{start_seq_region_pos};
for (my $i = $prev_chr_pos+$win_size; $i < $chr_start; $i+=$win_size) {
$aligned_score = _add_to_bucket($self, $display_type, $_no_score_value, $_no_score_value, $i, $start_slice, scalar(@$aligned_scores), $genomic_align_block_id, $win_size);
if ($aligned_score) {
push(@$aligned_scores, $aligned_score);
}
}
while ($total_chr_pos <= $chr_start) {
my $cigElem = $cig[$i++];
$cigType = substr( $cigElem, -1, 1 );
$cigLength = substr( $cigElem, 0 ,-1 );
$cigLength = 1 unless ($cigLength =~ /^\d+$/);
$current_pos += $cigLength;
$total_pos += $cigLength;
if( $cigType eq "M" ) {
$total_chr_pos += $cigLength;
}
}
my $start_offset = $total_chr_pos - $chr_start;
if ($cigType eq "M") {
$align_start = (int(($total_pos - $start_offset + $win_size)/$win_size) * $win_size); }
$chr_pos = $chr_start;
for ($current_pos = $align_start; $current_pos <= $align_end && $chr_pos < $chr_end; $current_pos += $win_size) {
$cs_index = _find_score_index($scores, $num_scores, $score_lengths, $current_pos, $win_size);
unless ($cs_index == -1) {
$csBlockCnt = int(($current_pos - $scores->[$cs_index]->{position})/$win_size);
my $value;
if ($PACKED) {
$value = substr $scores->[$cs_index]->expected_score, $csBlockCnt*$_pack_size, $_pack_size;
$exp_score = unpack($_pack_type, $value);
$value = substr $scores->[$cs_index]->diff_score, $csBlockCnt*$_pack_size, $_pack_size;
$diff_score = unpack($_pack_type, $value);
} else {
@exp_scores = split ' ', $scores->[$cs_index]->exp_score;
$exp_score = $exp_scores[$csBlockCnt];
@diff_scores = split ' ', $scores->[$cs_index]->diff_score;
$diff_score = $diff_scores[$csBlockCnt];
}
}
while ($total_pos < $current_pos && $chr_pos < $chr_end) {
my $cigElem = $cig[$i++];
$cigType = substr( $cigElem, -1, 1 );
$cigLength = substr( $cigElem, 0 ,-1 );
$cigLength = 1 unless ($cigLength =~ /^\d+$/);
$total_pos += $cigLength;
if( $cigType eq "M" ) {
$total_chr_pos += $cigLength;
}
}
if ($cigType eq "M") {
$chr_pos = $total_chr_pos - ($total_pos - $current_pos + 1);
} else {
$chr_pos = $total_chr_pos - 1;
}
if ($cigType eq "M") {
if ($cs_index == -1) {
$aligned_score = _add_to_bucket($self, $display_type, $_no_score_value,$_no_score_value, $chr_pos, $start_slice, scalar(@$aligned_scores), $genomic_align_block_id, $win_size);
if ($aligned_score) {
push(@$aligned_scores, $aligned_score);
}
} else {
$aligned_score = _add_to_bucket($self, $display_type, $exp_score, $diff_score, $chr_pos, $start_slice, scalar(@$aligned_scores), $genomic_align_block_id, $win_size);
if ($aligned_score) {
push(@$aligned_scores, $aligned_score);
}
}
} else {
if ($prev_position != $chr_pos) {
if ($cs_index == -1) {
$aligned_score = _add_to_bucket($self, $display_type, $_no_score_value, $_no_score_value, $chr_pos, $start_slice, scalar(@$aligned_scores), $genomic_align_block_id, $win_size);
if ($aligned_score) {
push(@$aligned_scores, $aligned_score);
}
} else {
$aligned_score = _add_to_bucket($self, $display_type, $exp_score, $diff_score, $chr_pos, $start_slice, scalar(@$aligned_scores), $genomic_align_block_id, $win_size);
if ($aligned_score) {
push(@$aligned_scores, $aligned_score);
}
}
}
}
$prev_position = $chr_pos;
}
return $aligned_scores; } |
sub _get_aligned_scores_from_cigar_line_fast
{ my ($self, $cigar_line, $start_region, $end_region, $start_slice, $end_slice, $scores, $genomic_align_block_id, $genomic_align_block_length, $display_type, $win_size, $aligned_scores) = @_;
return undef if (!$cigar_line);
my $num_aligned_scores = scalar(@$aligned_scores);
my @cig = ( $cigar_line =~ /(\d*[GMDIX])/g );
my $align_start = 1;
my $align_end = $genomic_align_block_length;
my $aligned_score;
my $cs_index; my $num_scores = scalar(@$scores);
my $total_pos;
my $total_chr_pos = $start_region;
my $current_pos; my $chr_pos = $start_region; my $prev_position = 0;
my $cigType; my $cigLength;
my $i;
my $csBlockCnt; my $diff_score; my @diff_scores;
my $exp_score; my @exp_scores;
my $chr_start = $start_region;
my $chr_end = $end_region;
if ($start_slice > $start_region) {
$chr_start = $start_slice;
}
if ($end_slice < $end_region) {
$chr_end = $end_slice;
}
my $score_lengths;
for (my $j = 0; $j < $num_scores; $j++) {
my $length = 0;
if (defined($scores->[$j]->diff_score)) {
if ($PACKED) {
$length = length($scores->[$j]->diff_score)/$_pack_size; } else {
my @split_scores = split ' ', $scores->[$j]->diff_score;
$length = scalar(@split_scores);
}
}
push (@$score_lengths, $length);
}
while ($total_chr_pos <= $chr_start) {
my $cigElem = $cig[$i++];
$cigType = substr( $cigElem, -1, 1 );
$cigLength = substr( $cigElem, 0 ,-1 );
$cigLength = 1 unless ($cigLength =~ /^\d+$/);
$current_pos += $cigLength;
$total_pos += $cigLength;
if( $cigType eq "M" ) {
$total_chr_pos += $cigLength;
}
}
my $start_offset = $total_chr_pos - $chr_start;
if ($cigType eq "M") {
$align_start = (int(($total_pos - $start_offset + $win_size)/$win_size) * $win_size); }
$chr_pos = $chr_start;
for ($current_pos = $align_start; $current_pos <= $align_end && $chr_pos < $chr_end; $current_pos += $win_size) {
$cs_index = _find_score_index($scores, $num_scores, $score_lengths, $current_pos, $win_size);
unless ($cs_index == -1) {
$csBlockCnt = int(($current_pos - $scores->[$cs_index]->{position})/$win_size);
my $value;
if ($PACKED) {
$value = substr $scores->[$cs_index]->expected_score, $csBlockCnt*$_pack_size, $_pack_size;
$exp_score = unpack($_pack_type, $value);
$value = substr $scores->[$cs_index]->diff_score, $csBlockCnt*$_pack_size, $_pack_size;
$diff_score = unpack($_pack_type, $value);
} else {
@exp_scores = split ' ', $scores->[$cs_index]->exp_score;
$exp_score = $exp_scores[$csBlockCnt];
@diff_scores = split ' ', $scores->[$cs_index]->diff_score;
$diff_score = $diff_scores[$csBlockCnt];
}
}
while ($total_pos < $current_pos && $chr_pos < $chr_end) {
my $cigElem = $cig[$i++];
$cigType = substr( $cigElem, -1, 1 );
$cigLength = substr( $cigElem, 0 ,-1 );
$cigLength = 1 unless ($cigLength =~ /^\d+$/);
$total_pos += $cigLength;
if( $cigType eq "M" ) {
$total_chr_pos += $cigLength;
}
}
if ($cigType eq "M") {
$chr_pos = $total_chr_pos - ($total_pos - $current_pos + 1);
} else {
$chr_pos = $total_chr_pos - 1;
}
if ($cigType eq "M") {
if ($cs_index != -1) {
if (defined($diff_score) && $diff_score == 0) {
$diff_score = $_no_score_value;
$exp_score = $_no_score_value;
}
$aligned_score = Bio::EnsEMBL::Compara::ConservationScore->new_fast(
{'adaptor' => $self,
'genomic_align_block_id' => $genomic_align_block_id,
'window_size' => $win_size,
'position' => $chr_pos - $start_slice + 1,
'seq_region_pos' => $chr_pos,
'diff_score' => $diff_score,
'expected_score' => $exp_score}
);
push(@$aligned_scores, $aligned_score);
}
} else {
if ($prev_position != $chr_pos) {
if ($cs_index != -1) {
$aligned_score = Bio::EnsEMBL::Compara::ConservationScore->new_fast(
{'adaptor' => $self,
'genomic_align_block_id' => $genomic_align_block_id,
'window_size' => $win_size,
'position' => $chr_pos - $start_slice + 1,
'seq_region_pos' => $chr_pos,
'diff_score' => $diff_score,
'expected_score' => $exp_score}
);
push(@$aligned_scores, $aligned_score);
}
}
}
$prev_position = $chr_pos;
}
return $aligned_scores; } |
sub _get_alignment_scores
{ my ($self, $conservation_scores, $align_start, $align_end, $display_type, $window_size, $genomic_align_block) = @_;
my $num_rows = scalar(@$conservation_scores);
my @exp_scores;
my $exp_score;
my @diff_scores;
my $diff_score;
my $aligned_scores = [];
my $pos;
my $genomic_align = $genomic_align_block->reference_genomic_align;
my $i = 0;
my $total_chr_pos = $genomic_align->dnafrag_start;
my $total_pos;
my $start_uncalled_region = 0;
my $end_uncalled_region = 0;
my $score_lengths;
my $start_offset = 0;
my $end_offset = 0;
my $start = -1;
my $end = -1;
for (my $j = 0; $j < $num_rows; $j++) {
my $length = 0;
if (defined($conservation_scores->[$j]->diff_score)) {
if ($PACKED) {
$length = length($conservation_scores->[$j]->diff_score)/$_pack_size; } else {
my @split_scores = split ' ', $conservation_scores->[$j]->diff_score;
$length = scalar(@split_scores);
}
}
$length = ($length-1) * $window_size;
if ($start == -1 && $align_start < $conservation_scores->[0]->{position}) {
$start = 0;
$start_offset = 0;
}
if ($start == -1 && $align_start >= $conservation_scores->[$j]->{position} && $align_start <= $conservation_scores->[$j]->{position} + $length) {
$start= $j;
$start_offset= ($align_start - $conservation_scores->[$j]->{position})/$window_size; }
if ($start == -1 && $align_start < ($conservation_scores->[$j]->{position})) {
$start= $j;
$start_offset = 0;
$start_uncalled_region = 1;
}
if ($align_end >= $conservation_scores->[$j]->{position} && $align_end <= $conservation_scores->[$j]->{position} + $length) {
$end= $j;
$end_offset= int(($align_end - $conservation_scores->[$j]->{position})/$window_size); last;
}
if ($align_end < ($conservation_scores->[$j]->{position})) {
$end= $j-1;
$end_offset = 0;
$end_uncalled_region = 1;
last;
}
}
if ($end == -1) {
$end = $num_rows-1;
$end_offset = int(($align_end - $conservation_scores->[$end]->{position})/$window_size); }
my $genomic_align_block_id = $genomic_align_block->dbID;
for (my $i = $start; $i <= $end; $i++) {
my $num_scores = 0;
if (defined($conservation_scores->[$i]->diff_score)) {
if ($PACKED) {
$num_scores = length($conservation_scores->[$i]->diff_score)/$_pack_size; } else {
@exp_scores = split ' ', $conservation_scores->[$i]->exp_score;
@diff_scores = split ' ', $conservation_scores->[$i]->diff_score;
$num_scores = scalar(@diff_scores);
}
}
if ($i == $end && !$end_uncalled_region) {
if ($end_offset+1 < $num_scores) {
$num_scores = $end_offset+1;
}
}
$pos = $conservation_scores->[$i]->{position};
for (my $j = int($start_offset); $j < $num_scores; $j++) {
$pos += ($start_offset*$window_size);
$start_offset = 0;
if ($PACKED) {
my $value;
$value = substr $conservation_scores->[$i]->expected_score, $j*$_pack_size, $_pack_size;
$exp_score = unpack($_pack_type, $value);
$value = substr $conservation_scores->[$i]->diff_score, $j*$_pack_size, $_pack_size;
$diff_score = unpack($_pack_type, $value);
} else {
$exp_score = $exp_scores[$j];
$diff_score = $diff_scores[$j];
}
my $aligned_score = 0;
$aligned_score = _add_to_bucket($self, $display_type, $exp_score, $diff_score, $pos - $align_start + 1, 1, scalar(@$aligned_scores), $genomic_align_block_id, $window_size);
if ($aligned_score) {
push(@$aligned_scores, $aligned_score);
}
$pos+=$window_size;
}
my $next_pos;
if ($i < $end) {
$next_pos = $conservation_scores->[$i+1]->{position};
} else {
$next_pos = $align_end+1;
}
for (my $j = $pos; $j < $next_pos; $j+=$window_size) {
my $aligned_score = _add_to_bucket($self, $display_type, $_no_score_value, $_no_score_value, ($j - $align_start + 1), 1, scalar(@$aligned_scores), $genomic_align_block_id, $window_size);
if ($aligned_score) {
push(@$aligned_scores, $aligned_score);
last;
}
}
}
$i = 0;
while ($i < scalar(@$aligned_scores)) {
if (!defined($_no_score_value) &&
!defined($aligned_scores->[$i]->diff_score)) {
splice @$aligned_scores, $i, 1;
} elsif (defined($_no_score_value) &&
$aligned_scores->[$i]->diff_score == $_no_score_value) {
splice @$aligned_scores, $i, 1;
} else {
$i++;
}
}
return $aligned_scores; } |
sub _get_all_ref_genomic_aligns
{ my ($self, $mlss, $slice) = @_;
my $light_genomic_aligns = [];
my $slice_adaptor = $slice->adaptor();
if(!$slice_adaptor) {
warning("Slice has no attached adaptor. Cannot get Compara alignments.");
return $light_genomic_aligns;
}
my $gdb_a = $self->db->get_GenomeDBAdaptor();
my $meta_container = $slice->adaptor->db->get_MetaContainer();
my $primary_species_name = $gdb_a->get_species_name_from_core_MetaContainer($meta_container);
my ($highest_cs) = @{$slice_adaptor->db->get_CoordSystemAdaptor->fetch_all()};
my $primary_species_assembly = $highest_cs->version();
my $genome_db_adaptor = $self->db->get_GenomeDBAdaptor;
my $genome_db = $genome_db_adaptor->fetch_by_name_assembly(
$primary_species_name,
$primary_species_assembly);
my $dnafrag_adaptor = $self->db->get_DnaFragAdaptor;
my $dnafrag = $dnafrag_adaptor->fetch_by_GenomeDB_and_name(
$genome_db, $slice->seq_region_name);
next if (!$dnafrag);
my $max_alignment_length = $mlss->max_alignment_length;
my $lower_bound = $slice->start - $max_alignment_length;
my $sql = qq{
SELECT
genomic_align_block_id,
dnafrag_start,
dnafrag_end,
dnafrag_strand,
cigar_line,
length
FROM
genomic_align
LEFT JOIN
genomic_align_block
USING
(genomic_align_block_id)
WHERE
genomic_align.method_link_species_set_id = ?
AND dnafrag_id = ?
AND dnafrag_start <= ?
AND dnafrag_end >= ?
AND dnafrag_start >= ?
};
my $sth = $self->prepare($sql);
$sth->execute($mlss->dbID, $dnafrag->dbID, $slice->end, $slice->start, $lower_bound);
my ($genomic_align_block_id, $dnafrag_start, $dnafrag_end, $dnafrag_strand, $cigar_line, $length);
$sth->bind_columns(\$genomic_align_block_id,\$ dnafrag_start,\$ dnafrag_end,\$ dnafrag_strand,\$ cigar_line,\$ length);
while ($sth->fetch) {
my $light_genomic_align = {
genomic_align_block_id => $genomic_align_block_id,
dnafrag_start => $dnafrag_start,
dnafrag_end => $dnafrag_end,
dnafrag_strand => $dnafrag_strand,
cigar_line => $cigar_line,
length => $length};
push @$light_genomic_aligns, $light_genomic_align;
}
return $light_genomic_aligns;
}
1; } |
sub _print_scores
{ my ($scores, $packed) = @_;
my $num_scores = scalar(@$scores);
my $cnt;
my ($start, $end);
my $i;
my @values;
my $total_scores = 0;
print "num scores $num_scores\n";
for ($cnt = 0; $cnt < $num_scores; $cnt++) {
if ($packed) {
$end = (length($scores->[$cnt]->expected_score) / 4); } else {
@values = split ' ', $scores->[$cnt]->diff_score;
$end = scalar(@values);
}
print "row $cnt length $end\n";
$total_scores += $end;
for ($i = 0; $i < $end; $i++) {
my $score;
if ($packed) {
my $value = substr $scores->[$cnt]->expected_score, $i*$_pack_size, $_pack_size;
$score = unpack($_pack_type, $value);
} else {
$score = $values[$i];
}
print "$i score $score\n ";
}
}
print "Total $total_scores\n"; } |
sub _reverse
{ my ($scores, $genomic_align_block_length) = @_;
foreach my $s (@$scores) {
$s->reverse($genomic_align_block_length);
}
my @rev = reverse @$scores;
return\@ rev; } |
sub _unpack_scores
{ my ($scores) = @_;
if (!defined $scores) {
return "";
}
my $num_scores = length($scores)/$_pack_size;
my $score = "";
for (my $i = 0; $i < $num_scores * $_pack_size; $i+=$_pack_size) {
my $value = substr $scores, $i, $_pack_size;
$score .= unpack($_pack_type, $value) . " ";
}
return $score; } |
sub fetch_all_by_GenomicAlignBlock
{ my ($self, $genomic_align_block, $start, $end, $slice_length,
$display_size, $display_type, $window_size) = @_;
my $scores = [];
if (!defined $display_size) {
$display_size = 700;
}
if (!defined $display_type) {
$display_type = "AVERAGE";
}
if (!defined $start) {
$start = 1;
}
if (!defined $end) {
$end = $genomic_align_block->length;
}
if (!defined $slice_length) {
$slice_length = $genomic_align_block->length;
}
my $align_start = $start;
my $align_end = $end;
if (defined $genomic_align_block->{'restricted_aln_start'}) {
$align_start += $genomic_align_block->{'restricted_aln_start'} - 1;
}
if (defined $genomic_align_block->{'restricted_aln_end'}) {
$align_end = $genomic_align_block->{'restricted_aln_end'} - ($genomic_align_block->length-$align_end);
}
my $bucket_size = ($slice_length)/$display_size;
my @window_sizes = (1, 10, 100, 500);
my $found = 0;
if (defined $window_size) {
foreach my $win_size (@window_sizes) {
if ($win_size == $window_size) {
$found = 1;
last;
}
}
if (!$found) {
warning("Invalid window_size $window_size");
return $scores;
}
}
if (!defined $window_size) {
$window_size = $window_sizes[scalar(@window_sizes)-1];
for (my $i = 1; $i < scalar(@window_sizes); $i++) {
if ($bucket_size < $window_sizes[$i]) {
$window_size = $window_sizes[$i-1];
last;
}
}
}
$_bucket = {diff_score => 0,
start_pos => 0,
end_pos => 0,
start_seq_region_pos => 0,
end_seq_region_pos => 0,
called => 0,
cnt => 0,
size => $bucket_size,
current => 0};
my $reference_genomic_align = $genomic_align_block->reference_genomic_align;
if (!$reference_genomic_align) {
$genomic_align_block->reference_genomic_align($genomic_align_block->get_all_GenomicAligns->[0]);
}
my $gab_id;
if (defined $genomic_align_block->{'dbID'}) {
$gab_id = $genomic_align_block->{'dbID'};
} else {
$gab_id = $genomic_align_block->{'original_dbID'};
}
my $conservation_scores = $self->_fetch_all_by_GenomicAlignBlockId_WindowSize($gab_id, $window_size, $PACKED);
if (scalar(@$conservation_scores) == 0) {
return $scores;
}
if ($genomic_align_block->get_original_strand == 0) {
$conservation_scores = _reverse($conservation_scores);
}
$_score_index = 0;
$scores = $self->_get_alignment_scores($conservation_scores, $align_start,
$align_end, $display_type, $window_size,
$genomic_align_block);
if (scalar(@$scores) == 0) {
return $scores;
}
my $all_scores;
foreach my $this_conservation_score (@$scores) {
$this_conservation_score->position($this_conservation_score->position + $start - 1);
push (@$all_scores, $this_conservation_score);
}
my ($min_y_axis, $max_y_axis) = _find_min_max_score($all_scores);
if ((scalar @$all_scores) > 0) {
$all_scores->[0]->y_axis_min($min_y_axis);
$all_scores->[0]->y_axis_max($max_y_axis);
}
return ($all_scores); } |
sub fetch_all_by_MethodLinkSpeciesSet_Slice
{ my ($self, $method_link_species_set, $slice, $display_size, $display_type, $window_size) = @_;
my $scores = [];
my $key = "gerp_" . $method_link_species_set->dbID;
my $ma_mlss_id = $self->db->get_MetaContainer->list_value_by_key($key);
my $ma_mlss;
if (@$ma_mlss_id) {
$ma_mlss = $self->db->get_MethodLinkSpeciesSet->fetch_by_dbID($ma_mlss_id->[0]);
} else {
return $scores;
}
my $light_genomic_aligns = $self->_get_all_ref_genomic_aligns($ma_mlss, $slice);
if (scalar(@$light_genomic_aligns == 0)) {
return $scores;
}
if (!defined $display_size) {
$display_size = 700;
}
if (!defined $display_type) {
$display_type = "AVERAGE";
}
my $bucket_size = ($slice->end-$slice->start+1)/$display_size;
my @window_sizes = (1, 10, 100, 500);
my $found = 0;
if (defined $window_size) {
foreach my $win_size (@window_sizes) {
if ($win_size == $window_size) {
$found = 1;
last;
}
}
if (!$found) {
warning("Invalid window_size $window_size");
return $scores;
}
}
if (!defined $window_size) {
$window_size = $window_sizes[scalar(@window_sizes)-1];
for (my $i = 1; $i < scalar(@window_sizes); $i++) {
if ($bucket_size < $window_sizes[$i]) {
$window_size = $window_sizes[$i-1];
last;
}
}
}
$_bucket = {diff_score => 0,
start_pos => 0,
end_pos => 0,
start_seq_region_pos => $slice->start,
end_seq_region_pos => $slice->end,
called => 0,
cnt => 0,
size => $bucket_size,
current => 0};
foreach my $light_genomic_align (@$light_genomic_aligns) {
my $genomic_align_block_id = $light_genomic_align->{genomic_align_block_id};
my $cigar_line = $light_genomic_align->{cigar_line};
my $dnafrag_start = $light_genomic_align->{dnafrag_start};
my $dnafrag_end = $light_genomic_align->{dnafrag_end};
my $dnafrag_strand = $light_genomic_align->{dnafrag_strand};
my $gab_length = $light_genomic_align->{length};
my $conservation_scores = $self->_fetch_all_by_GenomicAlignBlockId_WindowSize($genomic_align_block_id, $window_size, $PACKED);
if (scalar(@$conservation_scores) == 0) {
next;
}
$_score_index = 0;
if ($dnafrag_strand == -1) {
$cigar_line = join("", reverse ($cigar_line=~(/(\d*[GDMIX])/g)));
$conservation_scores = _reverse($conservation_scores, $gab_length);
$dnafrag_strand = 1;
}
if ($display_size == ($slice->end - $slice->start + 1)) {
$scores = _get_aligned_scores_from_cigar_line_fast($self, $cigar_line, $dnafrag_start, $dnafrag_end, $slice->start, $slice->end, $conservation_scores, $genomic_align_block_id, $gab_length, $display_type, $window_size, $scores);
} else {
$scores = _get_aligned_scores_from_cigar_line($self, $cigar_line, $dnafrag_start, $dnafrag_end, $slice->start, $slice->end, $conservation_scores, $genomic_align_block_id, $gab_length, $display_type, $window_size, $scores);
}
}
if ($slice->strand == -1) {
@$scores = reverse(@$scores);
my $max_pos = $scores->[0]->position;
foreach my $score (@$scores) {
$score->position($max_pos - $score->position + 1);
}
}
if (scalar(@$scores) == 0) {
return $scores;
}
my $i = 0;
while ($i < scalar(@$scores)) {
if (!defined($_no_score_value) &&
!defined($scores->[$i]->diff_score)) {
splice @$scores, $i, 1;
} elsif (defined($_no_score_value) &&
$scores->[$i]->diff_score == $_no_score_value) {
splice @$scores, $i, 1;
} else {
$i++;
}
}
my ($min_y_axis, $max_y_axis) = _find_min_max_score($scores);
if ((scalar @$scores) > 0) {
$scores->[0]->y_axis_min($min_y_axis);
$scores->[0]->y_axis_max($max_y_axis);
}
return ($scores); } |
fetch_all_by_MethodLinkSpeciesSet_SliceOLD | description | prev | next | Top |
sub fetch_all_by_MethodLinkSpeciesSet_SliceOLD
{ my ($self, $method_link_species_set, $slice, $display_size, $display_type, $window_size) = @_;
my $scores = [];
my $key = "gerp_" . $method_link_species_set->dbID;
my $ma_mlss_id = $self->db->get_MetaContainer->list_value_by_key($key);
my $ma_mlss;
if (@$ma_mlss_id) {
$ma_mlss = $self->db->get_MethodLinkSpeciesSet->fetch_by_dbID($ma_mlss_id->[0]);
} else {
return $scores;
}
my $genomic_align_block_adaptor = $self->db->get_GenomicAlignBlockAdaptor;
my $genomic_align_blocks = $genomic_align_block_adaptor->fetch_all_by_MethodLinkSpeciesSet_Slice($ma_mlss, $slice);
if (scalar(@$genomic_align_blocks == 0)) {
return $scores;
}
if (!defined $display_size) {
$display_size = 700;
}
if (!defined $display_type) {
$display_type = "AVERAGE";
}
my $bucket_size = ($slice->end-$slice->start+1)/$display_size;
my @window_sizes = (1, 10, 100, 500);
my $found = 0;
if (defined $window_size) {
foreach my $win_size (@window_sizes) {
if ($win_size == $window_size) {
$found = 1;
last;
}
}
if (!$found) {
warning("Invalid window_size $window_size");
return $scores;
}
}
if (!defined $window_size) {
$window_size = $window_sizes[scalar(@window_sizes)-1];
for (my $i = 1; $i < scalar(@window_sizes); $i++) {
if ($bucket_size < $window_sizes[$i]) {
$window_size = $window_sizes[$i-1];
last;
}
}
}
$_bucket = {diff_score => 0,
start_pos => 0,
end_pos => 0,
start_seq_region_pos => $slice->start,
end_seq_region_pos => $slice->end,
called => 0,
cnt => 0,
size => $bucket_size,
current => 0};
foreach my $genomic_align_block (@$genomic_align_blocks) {
my $genomic_align = $genomic_align_block->reference_genomic_align;
my $conservation_scores = $self->_fetch_all_by_GenomicAlignBlockId_WindowSize($genomic_align_block->dbID, $window_size, $PACKED);
if (scalar(@$conservation_scores) == 0) {
next;
}
if ($genomic_align_block->get_original_strand == 0) {
$conservation_scores = _reverse($conservation_scores, $genomic_align_block->length);
}
$_score_index = 0;
if ($display_size == ($slice->end - $slice->start + 1)) {
$scores = _get_aligned_scores_from_cigar_line_fast($self, $genomic_align->cigar_line, $genomic_align->dnafrag_start, $genomic_align->dnafrag_end, $slice->start, $slice->end, $conservation_scores, $genomic_align_block->dbID, $genomic_align_block->length, $display_type, $window_size, $scores);
} else {
$scores = _get_aligned_scores_from_cigar_line($self, $genomic_align->cigar_line, $genomic_align->dnafrag_start, $genomic_align->dnafrag_end, $slice->start, $slice->end, $conservation_scores, $genomic_align_block->dbID, $genomic_align_block->length, $display_type, $window_size, $scores);
}
}
if (scalar(@$scores) == 0) {
return $scores;
}
my $i = 0;
while ($i < scalar(@$scores)) {
if (!defined($_no_score_value) &&
!defined($scores->[$i]->diff_score)) {
splice @$scores, $i, 1;
} elsif (defined($_no_score_value) &&
$scores->[$i]->diff_score == $_no_score_value) {
splice @$scores, $i, 1;
} else {
$i++;
}
}
my ($min_y_axis, $max_y_axis) = _find_min_max_score($scores);
if ((scalar @$scores) > 0) {
$scores->[0]->y_axis_min($min_y_axis);
$scores->[0]->y_axis_max($max_y_axis);
}
return ($scores); } |
sub store
{ my ($self,$cs) = @_;
unless(defined $cs && ref $cs &&
$cs->isa('Bio::EnsEMBL::Compara::ConservationScore') ) {
$self->throw("Must have conservation score arg [$cs]");
}
my $genomic_align_block = $cs->genomic_align_block;
my $window_size = $cs->window_size;
my $position = $cs->{position};
unless($genomic_align_block && $window_size && $position) {
$self->throw("conservation score must have a genomic_align_block, window_size and position");
}
if (!$genomic_align_block->isa("Bio::EnsEMBL::Compara::GenomicAlignBlock")) {
throw("[$genomic_align_block] is not a Bio::EnsEMBL::Compara::GenomicAlignBlock");
}
my $genomic_align_block_id = $genomic_align_block->dbID;
my $exp_packed;
my $diff_packed;
if (!$cs->packed) {
my @exp_scores = split ' ',$cs->expected_score;
my @diff_scores = split ' ',$cs->diff_score;
for (my $i = 0; $i < scalar(@exp_scores); $i++) {
$exp_packed .= pack($_pack_type, $exp_scores[$i]);
$diff_packed .= pack($_pack_type, $diff_scores[$i]);
}
} else {
$exp_packed = $cs->expected_score;
$diff_packed = $cs->diff_score;
}
my $sql = "INSERT into conservation_score (genomic_align_block_id,window_size,position,expected_score, diff_score) ".
" VALUES ('$genomic_align_block_id','$window_size', '$position', ?, ?)";
my $sth = $self->prepare($sql);
$sth->execute($exp_packed, $diff_packed);
$cs->adaptor($self);
}
} |
General documentation
This modules is part of the EnsEMBL project (
)
Questions can be posted to the ensembl-dev mailing list:
ensembl-dev@ebi.ac.uk
The rest of the documentation details each of the object methods. Internal methods are usually preceded with a _
Arg 1 : string $display_type (either AVERAGE or MAX (plot average or max value))
Arg 2 : float $exp_score (expected score to be added to bucket)
Arg 3 : float $diff_score (difference score to be added to bucket)
Arg 4 : int $chr_pos (position in slice of reference species)
Arg 5 : int $start_slice (start position of slice)
Arg 6 : int $num_buckets (number of buckets used so far)
Arg 7 : int $genomic_align_block_id (genomic_align_block_id of
alignment block)
Arg 8 : int $win_size window size used
Example : $aligned_score = _add_to_bucket($self, "AVERAGE", $exp_score, $diff_score, $chr_pos, $start_slice, scalar(@$aligned_scores), $genomic_align_block_id, $win_size);
Description: Add scores to bucket until it is full (given by size) and then
average the called scores or take the maximum (given by
display_type). Once the bucket is full, create a new
conservation score object
Returntype : ref to Bio::EnsEMBL::Compara::ConservationScore object if the
bucket if full or 0 if it isn't full yet
Exceptions : none
Caller : general
Status : At risk