Bio::Tools
Genscan
Summary
Bio::Tools::Genscan - Results of one Genscan run
Package variables
No package variables defined.
Included modules
Inherit
Synopsis
use Bio::Tools::Genscan;

$genscan = Bio::Tools::Genscan->new(-file => 'result.genscan');
# filehandle:
$genscan = Bio::Tools::Genscan->new( -fh => \*INPUT );

# parse the results
# note: this class is-a Bio::Tools::AnalysisResult which implements
# Bio::SeqAnalysisParserI, i.e., $genscan->next_feature() is the same
while($gene = $genscan->next_prediction()) {
    # $gene is an instance of Bio::Tools::Prediction::Gene, which inherits
    # from Bio::SeqFeature::Gene::Transcript.
    #
    # $gene->exons() returns an array of
    # Bio::Tools::Prediction::Exon objects
    # all exons:
    @exon_arr = $gene->exons();

    # initial exons only
    @init_exons = $gene->exons('Initial');
    # internal exons only
    @intrl_exons = $gene->exons('Internal');
    # terminal exons only
    @term_exons = $gene->exons('Terminal');
    # singleton exons:
    ($single_exon) = $gene->exons();
}

# essential if you gave a filename at initialization (otherwise the file
# will stay open)
$genscan->close();
Description
The Genscan module provides a parser for Genscan gene structure prediction
output. It parses one gene prediction into a Bio::SeqFeature::Gene::Transcript-
derived object.
This module also implements the Bio::SeqAnalysisParserI interface, and thus
can be used wherever such an object fits. See
Bio::SeqAnalysisParserI.
Methods
Methods description
 Title   : _add_prediction()
 Usage   : $obj->_add_prediction($gene)
 Function: internal
 Example :
 Returns :

 Title   : _has_cds()
 Usage   : $obj->_has_cds()
 Function: Whether or not the result contains the predicted CDSs, too.
 Example :
 Returns : TRUE or FALSE

 Title   : _parse_predictions()
 Usage   : $obj->_parse_predictions()
 Function: Parses the prediction section. Automatically called by
           next_prediction() if not yet done.
 Example :
 Returns :

 Title   : _prediction()
 Usage   : $gene = $obj->_prediction()
 Function: internal
 Example :
 Returns :

 Title   : _predictions_parsed
 Usage   : $obj->_predictions_parsed
 Function: internal
 Example :
 Returns : TRUE or FALSE

 Title   : _read_fasta_seq()
 Usage   : ($id,$seqstr) = $obj->_read_fasta_seq();
 Function: Simple but specialised FASTA format sequence reader. Uses
           $self->_readline() to retrieve input, and is able to strip off
           the trailing description lines.
 Example :
 Returns : An array of two elements.

 Usage    : $genscan->analysis_method();
 Purpose  : Inherited method. Overridden to ensure that the name matches
            /genscan/i.
 Returns  : String
 Argument : n/a

 Title   : next_feature
 Usage   : while($gene = $genscan->next_feature()) {
               # do something
           }
 Function: Returns the next gene structure prediction of the Genscan result
           file. Call this method repeatedly until FALSE is returned.

           The returned object is actually a SeqFeatureI implementing object.
           This method is required for classes implementing the
           SeqAnalysisParserI interface, and is merely an alias for
           next_prediction() at present.
 Example :
 Returns : A Bio::Tools::Prediction::Gene object.
 Args    :

 Title   : next_prediction
 Usage   : while($gene = $genscan->next_prediction()) {
               # do something
           }
 Function: Returns the next gene structure prediction of the Genscan result
           file. Call this method repeatedly until FALSE is returned.
 Example :
 Returns : A Bio::Tools::Prediction::Gene object.
 Args    :
Methods code
sub _add_prediction {
    my ($self, $gene) = @_;

    if(! exists($self->{'_preds'})) {
        $self->{'_preds'} = [];
    }
    push(@{$self->{'_preds'}}, $gene);
}

sub _has_cds {
    my ($self, $val) = @_;

    $self->{'_has_cds'} = $val if $val;
    if(! exists($self->{'_has_cds'})) {
        $self->{'_has_cds'} = 0;
    }
    return $self->{'_has_cds'};
}

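Because the accessor above assigns only when its argument is true, it behaves as a one-way switch: once set, passing 0 does not clear it. The following is a minimal sketch of that behaviour, assuming a hypothetical stand-alone package Demo::Flag (not part of Bioperl) whose flag() method copies the accessor body.

```perl
use strict;
use warnings;

# Hypothetical stand-in class; only the accessor body mirrors _has_cds().
package Demo::Flag;

sub new { return bless {}, shift }

sub flag {
    my ($self, $val) = @_;
    $self->{'_flag'} = $val if $val;    # false values are ignored
    if(! exists($self->{'_flag'})) {
        $self->{'_flag'} = 0;
    }
    return $self->{'_flag'};
}

package main;

my $obj         = Demo::Flag->new();
my $initial     = $obj->flag();     # never set, so it defaults to 0
my $after_set   = $obj->flag(1);    # switched on
my $after_clear = $obj->flag(0);    # 0 is ignored, the flag stays on
print "$initial $after_set $after_clear\n";
```

The same pattern appears in _predictions_parsed(), so both flags can only be reset by re-initializing the object state.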
sub _initialize_state {
    my ($self,@args) = @_;

    $self->SUPER::_initialize_state(@args);
    $self->{'_preds_parsed'} = 0;
    $self->{'_has_cds'} = 0;
    $self->{'_preds'} = [];
    $self->{'_seqstack'} = [];
}

sub _parse_predictions {
    my ($self) = @_;
    my %exontags = ('Init' => 'Initial',
                    'Intr' => 'Internal',
                    'Term' => 'Terminal',
                    'Sngl' => '');
    my $gene;
    my $seqname;

    while(defined($_ = $self->_readline())) {
        if(/^\s*(\d+)\.(\d+)/) {
            my $prednr = $1;
            my $signalnr = $2;
            if(! defined($gene)) {
                $gene = Bio::Tools::Prediction::Gene->new(
                    '-primary' => "GenePrediction$prednr",
                    '-source' => 'Genscan');
            }
            chomp();
            my @flds = split(' ', $_);
            my $predobj;
            my $is_exon = grep {$_ eq $flds[1];} (keys(%exontags));
            if($is_exon) {
                $predobj = Bio::Tools::Prediction::Exon->new();
            } else {
                $predobj = Bio::SeqFeature::Generic->new();
            }
            $predobj->source_tag('Genscan');
            $predobj->score($flds[$#flds]);
            $predobj->strand((($flds[2] eq '+') ? 1 : -1));
            my ($start, $end) = @flds[(3,4)];
            if($predobj->strand() == 1) {
                $predobj->start($start);
                $predobj->end($end);
            } else {
                $predobj->end($start);
                $predobj->start($end);
            }
            if($is_exon) {
                $predobj->start_signal_score($flds[8]);
                $predobj->end_signal_score($flds[9]);
                $predobj->coding_signal_score($flds[10]);
                $predobj->significance($flds[11]);
                $predobj->primary_tag($exontags{$flds[1]} . 'Exon');
                $predobj->is_coding(1);
                my $cod_offset;
                if($predobj->strand() == 1) {
                    $cod_offset = $flds[6] - (($predobj->start()-1) % 3);
                    $cod_offset += 3 if($cod_offset < 1);
                } else {
                    $cod_offset = $flds[6] - (($predobj->end()-3) % 3);
                    $cod_offset -= 3 if($cod_offset >= 0);
                    $cod_offset = -$cod_offset;
                }
                $predobj->frame(3 - $cod_offset);
                $gene->add_exon($predobj, $exontags{$flds[1]});
            } elsif($flds[1] eq 'PlyA') {
                $predobj->primary_tag("PolyAsite");
                $gene->poly_A_site($predobj);
            } elsif($flds[1] eq 'Prom') {
                $predobj->primary_tag("Promoter");
                $gene->add_promoter($predobj);
            }
            next;
        }
        if(/^\s*$/ && defined($gene)) {
            $gene->seq_id($seqname);
            $self->_add_prediction($gene);
            $gene = undef;
            next;
        }
        if(/^(GENSCAN)\s+(\S+)/) {
            $self->analysis_method($1);
            $self->analysis_method_version($2);
            next;
        }
        if(/^Sequence\s+(\S+)\s*:/) {
            $seqname = $1;
            next;
        }
        if(/^Parameter matrix:\s+(\S+)/i) {
            $self->analysis_subject($1);
            next;
        }
        if(/^Predicted coding/) {
            $self->_has_cds(1);
            next;
        }
        /^>/ && do {
            $self->_pushback($_);
            last;
        };
    }
    $self->_predictions_parsed(1);
}

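The per-line handling in _parse_predictions() can be tried out without Bioperl. The sketch below is an illustration only: parse_exon_line() is a hypothetical helper, the two input lines are made up to follow the column layout the parser expects, and the result is a plain hash rather than a Bio::Tools::Prediction::Exon object. The coordinate swap and the frame arithmetic are copied from the listing above.

```perl
use strict;
use warnings;

# Hypothetical helper: splits one exon line the way _parse_predictions()
# does and returns plain values instead of Bioperl feature objects.
sub parse_exon_line {
    my ($line) = @_;
    chomp($line);
    my @flds = split(' ', $line);

    my $strand = ($flds[2] eq '+') ? 1 : -1;
    # On the minus strand the two coordinate columns are swapped, so that
    # start <= end holds on both strands (as the original code arranges).
    my ($start, $end) = ($strand == 1) ? @flds[3,4] : @flds[4,3];

    # Same arithmetic as in the listing above.
    my $cod_offset;
    if($strand == 1) {
        $cod_offset = $flds[6] - (($start - 1) % 3);
        $cod_offset += 3 if($cod_offset < 1);
    } else {
        $cod_offset = $flds[6] - (($end - 3) % 3);
        $cod_offset -= 3 if($cod_offset >= 0);
        $cod_offset = -$cod_offset;
    }

    return {
        type   => $flds[1],
        strand => $strand,
        start  => $start,
        end    => $end,
        frame  => 3 - $cod_offset,
        score  => $flds[$#flds],    # last column
    };
}

# Made-up example lines, one per strand.
my $plus  = parse_exon_line(" 1.01 Init +   1234   1300   67  0  1   82   91    56 0.999   7.31\n");
my $minus = parse_exon_line(" 2.03 Term -   5000   4900  101  1  2   77   45    88 0.950   6.02\n");

printf "%s %d..%d strand %d frame %d\n", @{$_}{qw(type start end strand frame)}
    for ($plus, $minus);
```

Note that split(' ', ...) discards the leading whitespace of each line, which is why the gene.exon number ends up in $flds[0].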
sub _prediction {
    my ($self) = @_;

    return undef unless(exists($self->{'_preds'}) && @{$self->{'_preds'}});
    return shift(@{$self->{'_preds'}});
}

sub _predictions_parsed {
    my ($self, $val) = @_;

    $self->{'_preds_parsed'} = $val if $val;
    if(! exists($self->{'_preds_parsed'})) {
        $self->{'_preds_parsed'} = 0;
    }
    return $self->{'_preds_parsed'};
}

sub _read_fasta_seq {
    my ($self) = @_;
    my ($id, $seq);
    local $/ = ">";

    my $entry = $self->_readline();
    if($entry) {
        $entry =~ s/^>//;
        while($entry !~ />$/) {
            last unless $_ = $self->_readline();
            $entry .= $_;
        }
        $entry =~ s/\n\n.*$//s;
        if($entry =~ /^(\S+)\n([^>]+)/) {
            $id = $1;
            $seq = $2;
        } else {
            $self->throw("Can't parse Genscan predicted sequence entry");
        }
        $seq =~ s/\s//g;
    }
    return ($id, $seq);
}

1;

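The string handling in _read_fasta_seq() can be sketched in isolation. The helper below, parse_fasta_entry(), is hypothetical: it takes an entry that has already been read instead of calling $self->_readline(), and it uses die where the module calls throw(). The sample entry is made up.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the substitutions in _read_fasta_seq().
sub parse_fasta_entry {
    my ($entry) = @_;
    $entry =~ s/^>//;          # drop the leading record marker
    $entry =~ s/\n\n.*$//s;    # drop everything after the first blank line
    if($entry =~ /^(\S+)\n([^>]+)/) {
        my ($id, $seq) = ($1, $2);
        $seq =~ s/\s//g;       # join the sequence lines
        return ($id, $seq);
    }
    die "Can't parse Genscan predicted sequence entry\n";
}

# Made-up entry: header, two sequence lines, a blank line, trailing text.
my $entry = ">SEQ1|GENSCAN_predicted_peptide_1|6_aa\nMKT\nLLV\n\nExplanation\n";
my ($id, $seq) = parse_fasta_entry($entry);
print "$id $seq\n";
```

The header pattern /^(\S+)\n/ only matches when the ID is followed directly by a newline, so a header line carrying a free-text description would be rejected by this sketch, as it would by the original.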
sub analysis_method {
    my ($self, $method) = @_;

    if($method && ($method !~ /genscan/i)) {
        $self->throw("method $method not supported in " . ref($self));
    }
    return $self->SUPER::analysis_method($method);
}

sub next_feature {
    my ($self,@args) = @_;

    return $self->next_prediction(@args);
}

sub next_prediction {
    my ($self) = @_;
    my $gene;

    $self->_parse_predictions() unless $self->_predictions_parsed();
    $gene = $self->_prediction();
    if($gene) {
        my ($id, $seq);
        my $seqobj = pop(@{$self->{'_seqstack'}});
        if(! $seqobj) {
            ($id, $seq) = $self->_read_fasta_seq();
            if($id && $seq) {
                $seqobj = Bio::PrimarySeq->new('-seq' => $seq,
                                               '-display_id' => $id,
                                               '-alphabet' => "protein");
            }
        }
        if($seqobj) {
            $gene->primary_tag() =~ /[^0-9]([0-9]+)$/;
            my $prednr = $1;
            if($seqobj->display_id() !~ /_predicted_\w+_$prednr\|/) {
                push(@{$self->{'_seqstack'}}, $seqobj);
            } else {
                $gene->predicted_protein($seqobj);
                if($self->_has_cds()) {
                    ($id, $seq) = $self->_read_fasta_seq();
                    $seqobj = Bio::PrimarySeq->new('-seq' => $seq,
                                                   '-display_id' => $id,
                                                   '-alphabet' => "dna");
                    $gene->predicted_cds($seqobj);
                }
            }
        }
    }
    return $gene;
}
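next_prediction() pairs a predicted sequence with a gene by comparing the number at the end of the gene's primary tag with the number embedded in the sequence ID. The sketch below isolates that comparison, assuming a hypothetical helper seq_matches_gene() and a made-up sequence ID; the two regular expressions are the ones used in the listing above.

```perl
use strict;
use warnings;

# Hypothetical helper: true if a predicted-sequence ID belongs to the gene
# whose primary tag ends in the same prediction number.
sub seq_matches_gene {
    my ($primary_tag, $display_id) = @_;
    my ($prednr) = $primary_tag =~ /[^0-9]([0-9]+)$/;
    return 0 unless defined $prednr;
    return ($display_id =~ /_predicted_\w+_$prednr\|/) ? 1 : 0;
}

my $id = 'SEQ1|GENSCAN_predicted_peptide_12|464_aa';    # made-up ID
my $match    = seq_matches_gene('GenePrediction12', $id);
my $mismatch = seq_matches_gene('GenePrediction1',  $id);
print "$match $mismatch\n";
```

The trailing \| in the pattern is what keeps gene 1 from matching the ID of gene 12. In the module, a sequence that does not match is pushed back onto _seqstack so that a later call can claim it.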
General documentation
User feedback is an integral part of the evolution of this and other
Bioperl modules. Send your comments and suggestions preferably to one
of the Bioperl mailing lists. Your participation is much appreciated.
bioperl-l@bioperl.org - General discussion
http://bio.perl.org/MailList.html - About the mailing lists
Report bugs to the Bioperl bug tracking system to help us keep track
of the bugs and their resolution. Bug reports can be submitted via email
or the web:
bioperl-bugs@bio.perl.org
http://bugzilla.bioperl.org/
The rest of the documentation details each of the object methods. Internal methods are usually prefixed with an underscore (_).