Bio::Index::AbstractSeq
Summary
Bio::Index::AbstractSeq - Base class for sequence file indexers
Package variables
No package variables defined.
Included modules
Inherit
Synopsis
# Make a new sequence file indexing package
package MyShinyNewIndexer;
use Bio::Index::AbstractSeq;
@ISA = ('Bio::Index::AbstractSeq');
# Now provide the necessary methods...
Description
Provides a common base class for indexes that span
multiple sequence files and are built using the
Bio::Index::Abstract system. It also implements the
Bio::DB::SeqI interface.
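The override contract described above can be sketched without BioPerl installed. This is a minimal illustration, not the module's actual code: StandInAbstractSeq is a hypothetical stand-in for the real base class, and the 'Fasta' return value is only an example. A real subclass would likely also have to supply whatever indexing methods Bio::Index::Abstract requires, which are not covered here.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the real base class, so the sketch runs
# without BioPerl. Only the "derived classes must override" contract
# is modelled here.
package StandInAbstractSeq;

sub new {
    my ($class) = @_;
    return bless { _seqio_cache => [] }, $class;
}

sub _file_format {
    my ($self) = @_;
    my $pkg = ref($self);
    die "Class '$pkg' must provide a file format method correctly\n";
}

# Hypothetical derived indexer; the 'Fasta' value is illustrative.
package MyShinyNewIndexer;
our @ISA = ('StandInAbstractSeq');

sub _file_format { return 'Fasta' }

package main;

# The derived class answers; the base class refuses.
my $format = MyShinyNewIndexer->new->_file_format;
my $ok     = eval { StandInAbstractSeq->new->_file_format; 1 };
my $error  = $@;

print "format=$format\n";
print "base class refused: $error" unless $ok;
```

The base class fails loudly instead of returning a default, so a subclass that forgets the override is caught the first time a file is opened.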
Methods
Methods description
Title : _file_format
Usage : $self->_file_format
Function: Gives the file format of the indexed files. Derived classes should override this method; the base implementation throws an exception.
Example :
Returns : The file format name (in derived classes)
Args : none
Title : _get_SeqIO_object
Usage : $index->_get_SeqIO_object( $file )
Function: Returns a Bio::SeqIO object for the file
Example : $seqio = $index->_get_SeqIO_object( 0 )
Returns : Bio::SeqIO object
Args : File number (an integer)
Title : fetch
Usage : $index->fetch( $id )
Function: Returns a Bio::Seq object from the index
Example : $seq = $index->fetch( 'dJ67B12' )
Returns : Bio::Seq object
Args : ID
Title : get_PrimarySeq_stream
Usage : $stream = $index->get_PrimarySeq_stream
Function: Makes a Bio::DB::SeqStreamI compliant object which provides a single method, next_primary_seq
Returns : Bio::DB::SeqStreamI
Args : none
Title : get_Seq_by_acc
Usage : $seq = $db->get_Seq_by_acc($acc)
Function: Retrieves a sequence object, identically to ->fetch, but here behaving as a Bio::DB::BioSeqI
Returns : new Bio::Seq object
Args : string representing the accession number
Title : get_Seq_by_id
Usage : $seq = $db->get_Seq_by_id($id)
Function: Retrieves a sequence object, identically to ->fetch, but here behaving as a Bio::DB::BioSeqI
Returns : new Bio::Seq object
Args : string representing the id
Title : get_Seq_by_primary_id
Usage : $seq = $db->get_Seq_by_primary_id($primary_id_string);
Function: Gets a Bio::Seq object by the primary id. The primary id has to come from $db->get_all_primary_ids; there is no other way to get (or guess) the primary_ids in a database.
Alternatively, Bio::PrimarySeqI objects can be obtained
via get_PrimarySeq_stream, and the primary_id field
of those objects gives the ids to use here.
Returns : A Bio::Seq object
Args : primary id (as a string)
Throws : "acc does not exist" exception
Title : get_all_primary_ids
Usage : @ids = $seqdb->get_all_primary_ids()
Function: Gives an array of all the primary_ids of the sequence objects in the database. These may be ids (display style), accession numbers, or something else completely different; they *are not* meaningful outside of this database implementation.
Example :
Returns : an array of strings
Args : none
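The listed implementation of get_all_primary_ids keys each ID by its file number and byte offset, so several IDs that point at the same record collapse to a single entry. A minimal sketch of that idea follows, using a plain hash as a hypothetical stand-in for the dbm index. The IDs and offsets are made up for illustration, and the keys are sorted here only to make the result deterministic (the module iterates with each, which gives no ordering guarantee).

```perl
use strict;
use warnings;

# Hypothetical index contents: ID => [file number, byte offset].
# 'dJ67B12' and 'AL000001' are assumed to name the same record.
my %index = (
    '__FILE_0' => [ 0, 0 ],      # bookkeeping key, skipped below
    'dJ67B12'  => [ 0, 120 ],
    'AL000001' => [ 0, 120 ],
    'dJ94E24'  => [ 0, 980 ],
);

my %bytepos;
for my $id (sort keys %index) {
    next if $id =~ /^__/;                 # internal entries are not sequence IDs
    my ($file, $begin) = @{ $index{$id} };
    $bytepos{"$file:$begin"} = $id;       # a later ID at the same position overwrites an earlier one
}

# One ID survives per distinct record position.
my @primary_ids = sort values %bytepos;
print scalar(@primary_ids), " distinct records: @primary_ids\n";
```

Which of several aliases survives depends on iteration order, which is one reason the returned IDs should be treated as opaque handles.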
Methods code
sub _file_format {
    my ($self,@args) = @_;
    my $pkg = ref($self);
    $self->throw("Class '$pkg' must provide a file format method correctly");
}
sub _get_SeqIO_object {
    my( $self, $i ) = @_;
    unless ($self->{'_seqio_cache'}[$i]) {
        my $fh = $self->_file_handle($i);
        my $seqio = Bio::SeqIO->new( -Format => $self->_file_format,
                                     -fh     => $fh);
        $self->{'_seqio_cache'}[$i] = $seqio;
    }
    return $self->{'_seqio_cache'}[$i];
}
sub fetch {
    my( $self, $id ) = @_;
    my $db = $self->db();
    my $seq;
    if (my $rec = $db->{ $id }) {
        my ($file, $begin) = $self->unpack_record( $rec );
        my $seqio = $self->_get_SeqIO_object( $file );
        my $fh = $seqio->_fh();
        $begin-- if( $^O =~ /mswin/i);
        seek($fh, $begin, 0);
        $seq = $seqio->next_seq();
    }
    $seq->primary_id($seq->display_id()) if( defined $seq && ref($seq) &&
                                             $seq->isa('Bio::PrimarySeqI') );
    return $seq;
}
sub get_PrimarySeq_stream {
    my $self = shift;
    my $num = $self->_file_count() || 0;
    my @file;
    for (my $i = 0; $i < $num; $i++) {
        my( $file, $stored_size ) = $self->unpack_record( $self->db->{"__FILE_$i"} );
        push(@file,$file);
    }
    my $out = Bio::SeqIO::MultiFile->new( '-format' => $self->_file_format,
                                          '-files'  => \@file );
    return $out;
}
sub get_Seq_by_acc {
    my ($self,$id) = @_;
    return $self->fetch($id);
}
sub get_Seq_by_id {
    my ($self,$id) = @_;
    return $self->fetch($id);
}
sub get_Seq_by_primary_id {
    my ($self,$id) = @_;
    return $self->fetch($id);
}
sub get_all_primary_ids {
    my ($self,@args) = @_;
    my $db = $self->db;
    my( %bytepos );
    while (my($id, $rec) = each %$db) {
        if( $id =~ /^__/ ) {
            next;
        }
        my ($file, $begin) = $self->unpack_record( $rec );
        $bytepos{"$file:$begin"} = $id;
    }
    return values %bytepos;
}
sub new {
    my ($class, @args) = @_;
    my $self = $class->SUPER::new(@args);
    $self->{'_seqio_cache'} = [];
    return $self;
}
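The fetch method listed above looks up an ID, obtains a file number and byte offset, seeks to that offset, and lets Bio::SeqIO parse the next record. The sketch below illustrates only the offset-and-seek part with core Perl. The in-memory %offset hash, the fetch_record helper, and the two-record FASTA data are all hypothetical stand-ins, not part of the module.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write a small two-record FASTA file (illustrative data) and note
# where each record starts.
my ($out, $path) = tempfile(UNLINK => 1);
binmode $out;
my %offset;                       # hypothetical stand-in for the dbm index
for my $rec ( [ seqA => 'ACGT' ], [ seqB => 'GGCC' ] ) {
    my ($id, $residues) = @$rec;
    $offset{$id} = tell($out);    # byte position where this record begins
    print {$out} ">$id\n$residues\n";
}
close $out or die "close failed: $!";

# Look up an ID, seek to its stored offset, and read one record.
# Returns nothing for an unknown ID, much as fetch returns an
# undefined value when the ID is not in the index.
sub fetch_record {
    my ($id) = @_;
    my $begin = $offset{$id};
    return unless defined $begin;
    open my $in, '<', $path or die "open failed: $!";
    binmode $in;
    seek($in, $begin, 0) or die "seek failed: $!";
    my $header   = <$in>;
    my $residues = <$in>;
    close $in;
    chomp($header, $residues);
    return "$header $residues";
}

my $hit  = fetch_record('seqB');
my $miss = fetch_record('nope');
print "$hit\n";
```

Because only the start offset is stored, retrieval cost does not depend on where the record sits in the file. The module itself keeps one cached Bio::SeqIO object per file instead of reopening the file on every call as this sketch does.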
General documentation
User feedback is an integral part of the evolution of this
and other Bioperl modules. Send your comments and suggestions preferably
to one of the Bioperl mailing lists.
Your participation is much appreciated.
bioperl-l@bioperl.org - General discussion
http://bioperl.org/MailList.shtml - About the mailing lists
Report bugs to the Bioperl bug tracking system to help us keep track
of the bugs and their resolution.
Bug reports can be submitted via email or the web:
bioperl-bugs@bio.perl.org
http://bugzilla.bioperl.org/
The rest of the documentation details each of the object methods. Internal methods are usually preceded with a _.
See also: Bio::Index::Abstract, the module from which
Bio::Index::AbstractSeq inherits. It provides dbm
indexing for flat files (which are not necessarily
sequence files).