Bio::EnsEMBL::Analysis::RunnableDB Pseudogene_DB
Summary
Bio::EnsEMBL::Analysis::RunnableDB::Pseudogene_DB.pm
Package variables
No package variables defined.
Included modules
Bio::EnsEMBL::Analysis::Config::Databases qw ( DATABASES DNA_DBNAME )
Bio::EnsEMBL::Analysis::Config::Pseudogene
Bio::EnsEMBL::Analysis::Runnable::Pseudogene
Bio::EnsEMBL::Analysis::RunnableDB::BaseGeneBuild
Bio::EnsEMBL::DBSQL::DBConnection
Bio::EnsEMBL::Pipeline::DBSQL::FlagAdaptor
Bio::EnsEMBL::Pipeline::Flag
Bio::EnsEMBL::Utils::Argument qw ( rearrange )
Bio::EnsEMBL::Utils::Exception qw ( verbose throw warning stack_trace )
Inherit
Bio::EnsEMBL::Analysis::RunnableDB::BaseGeneBuild
Synopsis
my $runnabledb = Bio::EnsEMBL::Analysis::RunnableDB::Pseudogene_DB->new(
-db => $db_adaptor,
-input_id => $slice_id,
-analysis => $analysis,
);
$runnabledb->fetch_input();
$runnabledb->run();
my @array = @{$runnabledb->output};
$runnabledb->write_output();
Description
This object wraps Bio::EnsEMBL::Analysis::Runnable::Pseudogene.pm.
It opens connections to 2 dbs:
1 for repeat sequences (GB_DB)
1 for fetching genes from (GB_FINAL)
It fetches all the genes on the slice and all the repeats associated with each gene,
collects alignment feature evidence for single exon genes, and passes them to the
runnable.
This module forms the base class of the pseudogene analysis for the gene build.
It identifies obvious pseudogenes and also flags genes that look
interesting for analysis by either:
Bio::EnsEMBL::Analysis::RunnableDB::Spliced_elsewhere or
Bio::EnsEMBL::Analysis::RunnableDB::PSILC
PSILC works on any gene with a PFAM domain and attempts to predict whether
it is a pseudogene. It underpredicts pseudogenes but is useful for sequences
that look suspicious but have no overt evidence of being pseudogenes.
Spliced_elsewhere tests for retrotransposition and tends to be run over single
exon genes.
Configuration for all three of these modules is here:
Bio::EnsEMBL::Analysis::Config::Pseudogene
Methods
BLESSED_BIOTYPES (no description)
DEBUG (no description)
INDETERMINATE (no description)
ORTH1 (no description)
ORTH2 (no description)
PSILC_BLAST_DB (no description)
PSILC_CHUNK (no description)
PSILC_LOGIC_NAME (no description)
PSILC_ORTH1_DBHOST (no description)
PSILC_ORTH1_DBNAME (no description)
PSILC_ORTH1_DBPORT (no description)
PSILC_ORTH2_DBHOST (no description)
PSILC_ORTH2_DBNAME (no description)
PSILC_ORTH2_DBPORT (no description)
PSILC_SUBJECT_DBHOST (no description)
PSILC_SUBJECT_DBNAME (no description)
PSILC_SUBJECT_DBPORT (no description)
PSILC_WORK_DIR (no description)
PS_ALIGNED_GENOMIC (no description)
PS_BIOTYPE (no description)
PS_CHUNK (no description)
PS_FRAMESHIFT_INTRON_LENGTH (no description)
PS_INPUT_DATABASE (no description)
PS_MAX_EXON_COVERAGE (no description)
PS_MAX_INTRON_COVERAGE (no description)
PS_MAX_INTRON_LENGTH (no description)
PS_MIN_EXONS (no description)
PS_MULTI_EXON_DIR (no description)
PS_NUM_FRAMESHIFT_INTRONS (no description)
PS_NUM_REAL_INTRONS (no description)
PS_OUTPUT_DATABASE (no description)
PS_PERCENT_ID_CUTOFF (no description)
PS_PSEUDO_TYPE (no description)
PS_P_VALUE_CUTOFF (no description)
PS_REPEAT_TYPE(1) (no description)
PS_REPEAT_TYPE(2) (no description)
PS_REPEAT_TYPES (no description)
PS_RETOTRANSPOSED_COVERAGE (no description)
PS_SPAN_RATIO (no description)
PS_SPECIES_LIMIT (no description)
PS_WRITE_IGNORED_GENES (no description)
REP_TRANSCRIPT (no description)
RETROTRANSPOSED (no description)
RETRO_TYPE (no description)
SINGLE_EXON (no description)
SPLICED_ELSEWHERE_LOGIC_NAME (no description)
SUBJECT (no description)
_remove_transcript_from_gene (no description)
fetch_input
gene_db
genes
get_all_repeat_blocks
ignored_genes
lazy_load
make_runnable (no description)
new (no description)
pseudo_genes
real_genes
rep_db
repeat_blocks
run
store_ids
transcript_to_keep
write_output
Methods description
fetch_input
Title : fetch_input
Usage : $self->fetch_input
Function: Fetches input data for Pseudogene.pm from the database
Returns : none
Args : none
gene_db
Arg [1] : Bio::EnsEMBL::DBSQL::DBAdaptor
Description: get/set gene db adaptor
Returntype : Bio::EnsEMBL::DBSQL::DBAdaptor
Exceptions : none
Caller : general
genes
Arg [1] : array ref
Description: get/set the set of genes to run over
Returntype : array ref to Bio::EnsEMBL::Gene objects
Exceptions : throws if not a Bio::EnsEMBL::Gene
Caller : general
get_all_repeat_blocks
Args : array ref of repeat features
Description: merges overlapping repeats into blocks for each gene
Returntype : array ref of Bio::EnsEMBL::Feature blocks
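The merging step can be sketched without any Ensembl objects. The following is a minimal illustration only: plain hashes stand in for repeat features, `merge_into_blocks` is a hypothetical name, and the filtering by PS_REPEAT_TYPES that the real method performs is left out.

```perl
use strict;
use warnings;

# Hypothetical stand-in for repeat features: plain hashes with start/end/strand.
sub merge_into_blocks {
    my ($repeats) = @_;
    my @blocks;
    my $current;
    for my $repeat (sort { $a->{start} <=> $b->{start} } @{$repeats}) {
        # Clamp non-positive starts to 1 (on a copy, the input is not modified).
        my $start = $repeat->{start} < 1 ? 1 : $repeat->{start};
        if (defined $current && $current->{end} >= $start) {
            # Overlaps the open block: extend it if this repeat reaches further.
            $current->{end} = $repeat->{end} if $repeat->{end} > $current->{end};
        }
        else {
            # No overlap: open a new block.
            $current = { start => $start, end => $repeat->{end}, strand => $repeat->{strand} };
            push @blocks, $current;
        }
    }
    return \@blocks;
}

my $blocks = merge_into_blocks([
    { start => 50,  end => 120, strand => 1 },
    { start => -5,  end => 60,  strand => -1 },
    { start => 200, end => 250, strand => 1 },
    { start => 120, end => 130, strand => 1 },
]);
printf "%d-%d\n", $_->{start}, $_->{end} for @{$blocks};
```

With this input the first three repeats (by start position) collapse into one block and the last one stays separate. As in the method's code, a block takes the strand of the first repeat that opened it.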
ignored_genes
Arg [1] : Bio::EnsEMBL::Gene
Description: get/set for genes that pseudogene does not check
Returntype : Bio::EnsEMBL::Gene
Exceptions : none
Caller : general
lazy_load
Arg [1] : Bio::EnsEMBL::Gene
Description: forces lazy loading of transcripts etc.
Returntype : Bio::EnsEMBL::Gene
Exceptions : none
Caller : general
pseudo_genes
Arg [1] : Bio::EnsEMBL::Gene
Description: get/set for pseudogenes
Returntype : Bio::EnsEMBL::Gene
Exceptions : none
Caller : general
real_genes
Arg [1] : Bio::EnsEMBL::Gene
Description: get/set for 'functional' genes
Returntype : Bio::EnsEMBL::Gene
Exceptions : none
Caller : general
rep_db
Arg [1] : Bio::EnsEMBL::DBSQL::DBAdaptor
Description: get/set repeat db adaptor
Returntype : Bio::EnsEMBL::DBSQL::DBAdaptor
Exceptions : none
Caller : general
repeat_blocks
Arg [1] : hash ref (fetch_input passes a hash of repeat blocks keyed by gene)
Description: get/set the repeat blocks to run over
run
Args : none
Description: overrides the RunnableDB run method to allow gene objects to be validated
before running the runnable
Returntype : scalar
store_ids
Arg [1] : array ref of gene dbIDs
Arg [2] : logic name of the analysis the genes are flagged for
Description: stores gene dbIDs of genes being held back for further analyses
Returntype : scalar
Exceptions : throws if the analysis cannot be found in the database
Caller : general
transcript_to_keep
Args : Bio::EnsEMBL::Transcript object
Description: removes the translation provided it is not a blessed transcript
Returntype : scalar
write_output
Args : none
Description: writes gene array into the db named by PS_OUTPUT_DATABASE, as defined in Bio::EnsEMBL::Analysis::Config::Databases
Exceptions : warns if it cannot write a gene
Returntype : none
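The "warn instead of throw" behaviour amounts to wrapping each store call in its own eval. The sketch below is an assumption-laden illustration, not the module's code: `store_gene` is a hypothetical stand-in for the gene adaptor, genes are plain hashes, and success is tested through the return value of eval rather than through `$@`.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a gene adaptor: dies on genes marked as bad.
sub store_gene {
    my ($gene) = @_;
    die "cannot store $gene->{id}\n" if $gene->{bad};
    return 1;
}

my @genes = ( { id => 'g1' }, { id => 'g2', bad => 1 }, { id => 'g3' } );
my (@stored, @failed);

foreach my $gene (@genes) {
    # Trap the exception so that one failure does not abort the whole batch.
    my $ok = eval { store_gene($gene); 1 };
    if ($ok) {
        push @stored, $gene->{id};
    }
    else {
        warn "UNABLE TO WRITE GENE:\n$@";
        push @failed, $gene->{id};
    }
}

print "stored: @stored; failed: @failed\n";
```

The consequence of this design is that a run can finish with some genes missing from the output database, so the warnings need to be checked afterwards.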
Methods code
BLESSED_BIOTYPES
sub BLESSED_BIOTYPES {
  my ($self, $arg) = @_;
  if($arg){
    $self->{'BLESSED_BIOTYPES'} = $arg;
  }
  return $self->{'BLESSED_BIOTYPES'};
}
DEBUG
sub DEBUG {
  my ($self, $arg) = @_;
  if($arg){
    $self->{'DEBUG'} = $arg;
  }
  return $self->{'DEBUG'};
}
INDETERMINATE
sub INDETERMINATE {
  my ($self, $arg) = @_;
  if($arg){
    $self->{'INDETERMINATE'} = $arg;
  }
  return $self->{'INDETERMINATE'};
}
ORTH1
sub ORTH1 {
  my ($self, $arg) = @_;
  if($arg){
    $self->{'ORTH1'} = $arg;
  }
  return $self->{'ORTH1'};
}
ORTH2
sub ORTH2 {
  my ($self, $arg) = @_;
  if($arg){
    $self->{'ORTH2'} = $arg;
  }
  return $self->{'ORTH2'};
}
PSILC_BLAST_DB
sub PSILC_BLAST_DB {
  my ($self, $arg) = @_;
  if($arg){
    $self->{'PSILC_BLAST_DB'} = $arg;
  }
  return $self->{'PSILC_BLAST_DB'};
}
PSILC_CHUNK
sub PSILC_CHUNK {
  my ($self, $arg) = @_;
  if($arg){
    $self->{'PSILC_CHUNK'} = $arg;
  }
  return $self->{'PSILC_CHUNK'};
}
PSILC_LOGIC_NAME
sub PSILC_LOGIC_NAME {
  my ($self, $arg) = @_;
  if($arg){
    $self->{'PSILC_LOGIC_NAME'} = $arg;
  }
  return $self->{'PSILC_LOGIC_NAME'};
}
PSILC_ORTH1_DBHOST
sub PSILC_ORTH1_DBHOST {
  my ($self, $arg) = @_;
  if($arg){
    $self->{'PSILC_ORTH1_DBHOST'} = $arg;
  }
  return $self->{'PSILC_ORTH1_DBHOST'};
}
PSILC_ORTH1_DBNAME
sub PSILC_ORTH1_DBNAME {
  my ($self, $arg) = @_;
  if($arg){
    $self->{'PSILC_ORTH1_DBNAME'} = $arg;
  }
  return $self->{'PSILC_ORTH1_DBNAME'};
}
PSILC_ORTH1_DBPORT
sub PSILC_ORTH1_DBPORT {
  my ($self, $arg) = @_;
  if($arg){
    $self->{'PSILC_ORTH1_DBPORT'} = $arg;
  }
  return $self->{'PSILC_ORTH1_DBPORT'};
}
PSILC_ORTH2_DBHOST
sub PSILC_ORTH2_DBHOST {
  my ($self, $arg) = @_;
  if($arg){
    $self->{'PSILC_ORTH2_DBHOST'} = $arg;
  }
  return $self->{'PSILC_ORTH2_DBHOST'};
}
PSILC_ORTH2_DBNAME
sub PSILC_ORTH2_DBNAME {
  my ($self, $arg) = @_;
  if($arg){
    $self->{'PSILC_ORTH2_DBNAME'} = $arg;
  }
  return $self->{'PSILC_ORTH2_DBNAME'};
}
PSILC_ORTH2_DBPORT
sub PSILC_ORTH2_DBPORT {
  my ($self, $arg) = @_;
  if($arg){
    $self->{'PSILC_ORTH2_DBPORT'} = $arg;
  }
  return $self->{'PSILC_ORTH2_DBPORT'};
}
PSILC_SUBJECT_DBHOST
sub PSILC_SUBJECT_DBHOST {
  my ($self, $arg) = @_;
  if($arg){
    $self->{'PSILC_SUBJECT_DBHOST'} = $arg;
  }
  return $self->{'PSILC_SUBJECT_DBHOST'};
}
PSILC_SUBJECT_DBNAME
sub PSILC_SUBJECT_DBNAME {
  my ($self, $arg) = @_;
  if($arg){
    $self->{'PSILC_SUBJECT_DBNAME'} = $arg;
  }
  return $self->{'PSILC_SUBJECT_DBNAME'};
}
PSILC_SUBJECT_DBPORT
sub PSILC_SUBJECT_DBPORT {
  my ($self, $arg) = @_;
  if($arg){
    $self->{'PSILC_SUBJECT_DBPORT'} = $arg;
  }
  return $self->{'PSILC_SUBJECT_DBPORT'};
}
PSILC_WORK_DIR
sub PSILC_WORK_DIR {
  my ($self, $arg) = @_;
  if($arg){
    $self->{'PSILC_WORK_DIR'} = $arg;
  }
  return $self->{'PSILC_WORK_DIR'};
}
PS_ALIGNED_GENOMIC
sub PS_ALIGNED_GENOMIC {
  my ($self, $arg) = @_;
  if($arg){
    $self->{'PS_ALIGNED_GENOMIC'} = $arg;
  }
  return $self->{'PS_ALIGNED_GENOMIC'};
}
PS_BIOTYPE
sub PS_BIOTYPE {
  my ($self, $arg) = @_;
  if($arg){
    $self->{'PS_BIOTYPE'} = $arg;
  }
  return $self->{'PS_BIOTYPE'};
}
PS_CHUNK
sub PS_CHUNK {
  my ($self, $arg) = @_;
  if($arg){
    $self->{'PS_CHUNK'} = $arg;
  }
  return $self->{'PS_CHUNK'};
}
PS_FRAMESHIFT_INTRON_LENGTH
sub PS_FRAMESHIFT_INTRON_LENGTH {
  my ($self, $arg) = @_;
  if($arg){
    $self->{'PS_FRAMESHIFT_INTRON_LENGTH'} = $arg;
  }
  return $self->{'PS_FRAMESHIFT_INTRON_LENGTH'};
}
PS_INPUT_DATABASE
sub PS_INPUT_DATABASE {
  my ($self, $arg) = @_;
  if($arg){
    $self->{'PS_INPUT_DATABASE'} = $arg;
  }
  return $self->{'PS_INPUT_DATABASE'};
}
PS_MAX_EXON_COVERAGE
sub PS_MAX_EXON_COVERAGE {
  my ($self, $arg) = @_;
  if($arg){
    $self->{'PS_MAX_EXON_COVERAGE'} = $arg;
  }
  return $self->{'PS_MAX_EXON_COVERAGE'};
}
PS_MAX_INTRON_COVERAGE
sub PS_MAX_INTRON_COVERAGE {
  my ($self, $arg) = @_;
  if($arg){
    $self->{'PS_MAX_INTRON_COVERAGE'} = $arg;
  }
  return $self->{'PS_MAX_INTRON_COVERAGE'};
}
PS_MAX_INTRON_LENGTH
sub PS_MAX_INTRON_LENGTH {
  my ($self, $arg) = @_;
  if($arg){
    $self->{'PS_MAX_INTRON_LENGTH'} = $arg;
  }
  return $self->{'PS_MAX_INTRON_LENGTH'};
}
PS_MIN_EXONS
sub PS_MIN_EXONS {
  my ($self, $arg) = @_;
  if($arg){
    $self->{'PS_MIN_EXONS'} = $arg;
  }
  return $self->{'PS_MIN_EXONS'};
}
PS_MULTI_EXON_DIR
sub PS_MULTI_EXON_DIR {
  my ($self, $arg) = @_;
  if($arg){
    $self->{'PS_MULTI_EXON_DIR'} = $arg;
  }
  return $self->{'PS_MULTI_EXON_DIR'};
}
PS_NUM_FRAMESHIFT_INTRONS
sub PS_NUM_FRAMESHIFT_INTRONS {
  my ($self, $arg) = @_;
  if($arg){
    $self->{'PS_NUM_FRAMESHIFT_INTRONS'} = $arg;
  }
  return $self->{'PS_NUM_FRAMESHIFT_INTRONS'};
}
PS_NUM_REAL_INTRONS
sub PS_NUM_REAL_INTRONS {
  my ($self, $arg) = @_;
  if($arg){
    $self->{'PS_NUM_REAL_INTRONS'} = $arg;
  }
  return $self->{'PS_NUM_REAL_INTRONS'};
}
PS_OUTPUT_DATABASE
sub PS_OUTPUT_DATABASE {
  my ($self, $arg) = @_;
  if($arg){
    $self->{'PS_OUTPUT_DATABASE'} = $arg;
  }
  return $self->{'PS_OUTPUT_DATABASE'};
}
PS_PERCENT_ID_CUTOFF
sub PS_PERCENT_ID_CUTOFF {
  my ($self, $arg) = @_;
  if($arg){
    $self->{'PS_PERCENT_ID_CUTOFF'} = $arg;
  }
  return $self->{'PS_PERCENT_ID_CUTOFF'};
}
PS_PSEUDO_TYPE
sub PS_PSEUDO_TYPE {
  my ($self, $arg) = @_;
  if($arg){
    $self->{'PS_PSEUDO_TYPE'} = $arg;
  }
  return $self->{'PS_PSEUDO_TYPE'};
}
PS_P_VALUE_CUTOFF
sub PS_P_VALUE_CUTOFF {
  my ($self, $arg) = @_;
  if($arg){
    $self->{'PS_P_VALUE_CUTOFF'} = $arg;
  }
  return $self->{'PS_P_VALUE_CUTOFF'};
}
PS_REPEAT_TYPE(1)
sub PS_REPEAT_TYPE {
  my ($self, $arg) = @_;
  if($arg){
    $self->{'PS_REPEAT_TYPE'} = $arg;
  }
  return $self->{'PS_REPEAT_TYPE'};
}
PS_REPEAT_TYPE(2)
sub PS_REPEAT_TYPE {
  my ($self, $arg) = @_;
  if($arg){
    $self->{'PS_REPEAT_TYPE'} = $arg;
  }
  return $self->{'PS_REPEAT_TYPE'};
}
PS_REPEAT_TYPES
sub PS_REPEAT_TYPES {
  my ($self, $arg) = @_;
  if($arg){
    $self->{'PS_REPEAT_TYPES'} = $arg;
  }
  return $self->{'PS_REPEAT_TYPES'};
}
PS_RETOTRANSPOSED_COVERAGE
sub PS_RETOTRANSPOSED_COVERAGE {
  my ($self, $arg) = @_;
  if($arg){
    $self->{'PS_RETOTRANSPOSED_COVERAGE'} = $arg;
  }
  return $self->{'PS_RETOTRANSPOSED_COVERAGE'};
}
PS_SPAN_RATIO
sub PS_SPAN_RATIO {
  my ($self, $arg) = @_;
  if($arg){
    $self->{'PS_SPAN_RATIO'} = $arg;
  }
  return $self->{'PS_SPAN_RATIO'};
}
PS_SPECIES_LIMIT
sub PS_SPECIES_LIMIT {
  my ($self, $arg) = @_;
  if($arg){
    $self->{'PS_SPECIES_LIMIT'} = $arg;
  }
  return $self->{'PS_SPECIES_LIMIT'};
}
PS_WRITE_IGNORED_GENES
sub PS_WRITE_IGNORED_GENES {
  my ($self, $arg) = @_;
  if($arg){
    $self->{'PS_WRITE_IGNORED_GENES'} = $arg;
  }
  return $self->{'PS_WRITE_IGNORED_GENES'};
}
REP_TRANSCRIPT
sub REP_TRANSCRIPT {
  my ($self, $arg) = @_;
  if($arg){
    $self->{'REP_TRANSCRIPT'} = $arg;
  }
  return $self->{'REP_TRANSCRIPT'};
}
RETROTRANSPOSED
sub RETROTRANSPOSED {
  my ($self, $arg) = @_;
  if($arg){
    $self->{'RETROTRANSPOSED'} = $arg;
  }
  return $self->{'RETROTRANSPOSED'};
}
RETRO_TYPE
sub RETRO_TYPE {
  my ($self, $arg) = @_;
  if($arg){
    $self->{'RETRO_TYPE'} = $arg;
  }
  return $self->{'RETRO_TYPE'};
}
SINGLE_EXON
sub SINGLE_EXON {
  my ($self, $arg) = @_;
  if($arg){
    $self->{'SINGLE_EXON'} = $arg;
  }
  return $self->{'SINGLE_EXON'};
}
SPLICED_ELSEWHERE_LOGIC_NAME
sub SPLICED_ELSEWHERE_LOGIC_NAME {
  my ($self, $arg) = @_;
  if($arg){
    $self->{'SPLICED_ELSEWHERE_LOGIC_NAME'} = $arg;
  }
  return $self->{'SPLICED_ELSEWHERE_LOGIC_NAME'};
}
SUBJECT
sub SUBJECT {
  my ($self, $arg) = @_;
  if($arg){
    $self->{'SUBJECT'} = $arg;
  }
  return $self->{'SUBJECT'};
}
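All of the upper-case configuration accessors above share one shape: the value is stored only when the argument is true. A minimal sketch of what that means in practice follows; `Sketch::Config` is a hypothetical package used purely for illustration, and only the accessor body is copied from the code above.

```perl
use strict;
use warnings;

package Sketch::Config;   # hypothetical package, not part of Ensembl

sub new { return bless {}, shift }

# Same shape as the accessors above: store only when the argument is true.
sub PS_WRITE_IGNORED_GENES {
    my ($self, $arg) = @_;
    if ($arg) {
        $self->{'PS_WRITE_IGNORED_GENES'} = $arg;
    }
    return $self->{'PS_WRITE_IGNORED_GENES'};
}

package main;

my $config = Sketch::Config->new;
$config->PS_WRITE_IGNORED_GENES(1);
$config->PS_WRITE_IGNORED_GENES(0);    # no effect: 0 is false, so nothing is stored
my $after_zero = $config->PS_WRITE_IGNORED_GENES;
print "after passing 0: $after_zero\n";
```

Once a true value has been set through one of these accessors, passing 0, an empty string or undef does not reset it. A value that has never been set simply reads back as undef.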
_remove_transcript_from_gene
sub _remove_transcript_from_gene {
  my ($self, $gene, $trans_to_del)  = @_;
  # check to see if it is a blessed transcript first
  return 'BLESSED' if $self->BLESSED_BIOTYPES->{$trans_to_del->biotype};
  my @newtrans;
  foreach my $trans (@{$gene->get_all_Transcripts}) {
    if ($trans != $trans_to_del) {
      push @newtrans,$trans;
    }
  }
  # The naughty bit!
  $gene->{_transcript_array} = [];
  foreach my $trans (@newtrans) {
    $gene->add_Transcript($trans);
  }
  return;
}
fetch_input
sub fetch_input {
  my( $self) = @_;
  throw("No input id") unless defined($self->input_id);

  my $results = [];   # array ref to store the output
  my %repeat_blocks;
  my %homolog_hash;
  my @transferred_genes;

  print "Loading reference database : REFERENCE_DB.\n";
  my $rep_db = $self->get_dbadaptor("REFERENCE_DB") ;
  #store repeat db internally
  $self->rep_db($rep_db);
  my $rsa = $rep_db->get_SliceAdaptor;

  print "Loading genes database : PS_INPUT_DATABASE => ". $self->PS_INPUT_DATABASE." ( defined in Databases.pm )\n ";
  my $genes_db = $self->get_dbadaptor($self->PS_INPUT_DATABASE) ;
  $self->gene_db($genes_db);

  #genes are written to the pseudogene database
  # genes_slice holds all genes on input id slice
  my $genedb_sa = $self->gene_db->get_SliceAdaptor;
  print "DB NAME: ".$self->db->dbc->dbname."\n";
  my $genes_slice = $genedb_sa->fetch_by_name($self->input_id);
  $self->query($genes_slice);
  my $genes = $genes_slice->get_all_Genes;
  print $genes_slice->name."\t". $genes_slice->start."\n";

 GENE: foreach my $gene (@{$genes}) {
    # Ignore all other biotypes of genes that are not protein_coding
    # these genes will still be written to PS_OUTPUT_DATABASE - unless you set
    # PS_DO_NOT_WRITE_IGNORED_GENES = 0
    #
    unless ( $gene->biotype eq $self->PS_BIOTYPE ) {
      $self->ignored_genes($gene);
      next GENE;
    }

    ############################################################################
    # transfer gene coordinates to entire chromosome to prevent problems arising
    # due to offset with repeat features
    my $chromosome_slice = $rsa->fetch_by_region(
      'toplevel',
      $genes_slice->chr_name,
    );
    my $transferred_gene = $gene->transfer($chromosome_slice);
    $self->lazy_load($transferred_gene);
    push @transferred_genes,$transferred_gene;

    # repeats come from core database
    # repeat slice only covers gene to avoid sorting repeats unnecessarily
    my $rep_gene_slice = $rsa->fetch_by_region(
      'toplevel',
      $genes_slice->chr_name,
      $transferred_gene->start,
      $transferred_gene->end,
    );

    # get repeat blocks
    my @feats = @{$rep_gene_slice->get_all_RepeatFeatures};
    @feats = map { $_->transfer($chromosome_slice) } @feats;
    my $blocks = $self->get_all_repeat_blocks(\@feats);

    # make hash of repeat blocks using the gene as the key
    $repeat_blocks{$transferred_gene} = $blocks;
  }

  $self->genes(\@transferred_genes);
  $self->repeat_blocks(\%repeat_blocks);
  $self->make_runnable;
  return 1;
}
gene_db
sub gene_db {
  my ($self, $gene_db) = @_;
  if ($gene_db){
    unless ($gene_db->isa("Bio::EnsEMBL::DBSQL::DBAdaptor")){
      throw("gene db is not a Bio::EnsEMBL::DBSQL::DBAdaptor, it is a $gene_db");
    }
    $self->{'_gene_db'} = $gene_db;
  }
  return $self->{'_gene_db'};
}
genes
sub genes {
  my ($self, $genes) = @_;
  if ($genes) {
    foreach my $gene (@{$genes}) {
      unless  ($gene->isa("Bio::EnsEMBL::Gene")){
	throw("Input isn't a Bio::EnsEMBL::Gene, it is a $gene\n$@");
      }
    }
    $self->{'_genes'} = $genes;
  }
  return $self->{'_genes'};
}
get_all_repeat_blocks
sub get_all_repeat_blocks {
  my ($self,$repeat_ref) = @_;
  my @repeat_blocks;
  my @repeats = @{$repeat_ref};
  @repeats = sort {$a->start <=> $b->start} @repeats;
  my $curblock = undef;

 REPLOOP: foreach my $repeat (@repeats) {
    my $rc = $repeat->repeat_consensus;
    my $use = 0;
    foreach my $type (@{$self->PS_REPEAT_TYPES}){
      if ($rc->repeat_class =~ /$type/) {
	$use = 1;
	last;
      }
    }
    next REPLOOP unless $use;
    if ($repeat->start <= 0) { 
      $repeat->start(1); 
    }
    if (defined($curblock) && $curblock->end >= $repeat->start) {
      if ($repeat->end > $curblock->end) { 
	$curblock->end($repeat->end); 
      }
    } else {
      $curblock = Bio::EnsEMBL::Feature->new(
						-START => $repeat->start,
						-END => $repeat->end, 
						-STRAND => $repeat->strand
					    );
      push (@repeat_blocks,$curblock);
    }
  }
  @repeat_blocks = sort {$a->start <=> $b->start} @repeat_blocks;
  return \@repeat_blocks;
}
ignored_genes
sub ignored_genes {
  my ($self, $ignored_gene) = @_;
  if ($ignored_gene) {
    unless ($ignored_gene->isa("Bio::EnsEMBL::Gene")){
      throw("ignored gene is not a Bio::EnsEMBL::Gene, it is a $ignored_gene");
    }
    push @{$self->{'_ignored_gene'}},$self->lazy_load($ignored_gene);
  }
  return $self->{'_ignored_gene'};
}
lazy_load
sub lazy_load {
  my ($self, $gene) = @_;
  if ($gene){
    unless ($gene->isa("Bio::EnsEMBL::Gene")){
      throw("gene is not a Bio::EnsEMBL::Gene, it is a $gene");
    }
    foreach my $trans(@{$gene->get_all_Transcripts}){
      my $transl = $trans->translation; 
       $trans->get_all_supporting_features() ; 
      if ($transl){
	$transl->get_all_ProteinFeatures;
      }
    }
  }
  return $gene;
}
make_runnable
sub make_runnable {
  my ($self) = @_;
      
  my $runnable = Bio::EnsEMBL::Analysis::Runnable::Pseudogene->new
    ( 
     -analysis                     => $self->analysis,
     -genes                        => $self->genes,
     -repeat_features              => $self->repeat_blocks,
     -PS_REPEAT_TYPES              => $self->PS_REPEAT_TYPES,
     -PS_FRAMESHIFT_INTRON_LENGTH  => $self->PS_FRAMESHIFT_INTRON_LENGTH,
     -PS_MAX_INTRON_LENGTH         => $self->PS_MAX_INTRON_LENGTH,
     -PS_MAX_INTRON_COVERAGE       => $self->PS_MAX_INTRON_COVERAGE,
     -PS_MAX_EXON_COVERAGE         => $self->PS_MAX_EXON_COVERAGE,
     -PS_NUM_FRAMESHIFT_INTRONS    => $self->PS_NUM_FRAMESHIFT_INTRONS,
     -PS_NUM_REAL_INTRONS          => $self->PS_NUM_REAL_INTRONS,
     -SINGLE_EXON                  => $self->SINGLE_EXON,
     -INDETERMINATE                => $self->INDETERMINATE,
     -PS_MIN_EXONS                 => $self->PS_MIN_EXONS,
     -PS_MULTI_EXON_DIR            => $self->PS_MULTI_EXON_DIR,
     -BLESSED_BIOTYPES             => $self->BLESSED_BIOTYPES,
     -PS_PSEUDO_TYPE               => $self->PS_PSEUDO_TYPE,
     -PS_BIOTYPE                   => $self->PS_BIOTYPE,
     -PS_REPEAT_TYPE              => $self->PS_REPEAT_TYPE,
     -DEBUG                        => $self->DEBUG,
    );
  $self->runnable($runnable);
}
new
sub new {
  my ($class,@args) = @_;
  my $self = $class->SUPER::new(@args);
  $self->read_and_check_config($PSEUDOGENE_CONFIG_BY_LOGIC);
  return $self;
}
pseudo_genes
sub pseudo_genes {
  my ($self, $pseudo_gene) = @_;
  if ($pseudo_gene) {
    unless ($pseudo_gene->isa("Bio::EnsEMBL::Gene")){
      throw("pseudo gene is not a Bio::EnsEMBL::Gene, it is a $pseudo_gene");
    }
    push @{$self->{'_pseudo_gene'}},$self->lazy_load($pseudo_gene);
  }
  return $self->{'_pseudo_gene'};
}
real_genes
sub real_genes {
  my ($self, $real_gene) = @_;
  if ($real_gene) {
    unless ($real_gene->isa("Bio::EnsEMBL::Gene")){
      throw("real gene is not a Bio::EnsEMBL::Gene, it is a $real_gene");
    }
    push @{$self->{'_real_gene'}},$self->lazy_load($real_gene);
  }
  return $self->{'_real_gene'};
}
rep_db
sub rep_db {
  my ($self, $rep_db) = @_;
  if ($rep_db){
    unless ($rep_db->isa("Bio::EnsEMBL::DBSQL::DBAdaptor")){
      throw("rep db is not a Bio::EnsEMBL::DBSQL::DBAdaptor, it is a $rep_db");
    }
    $self->{'_rep_db'} = $rep_db;
  }
  return $self->{'_rep_db'};
}
repeat_blocks
sub repeat_blocks {
  my ($self, $val) = @_;

  if (defined $val) {
    $self->{_repeat_blocks} = $val;
  }
  return $self->{_repeat_blocks};
}
run
sub run {
  my ($self) = @_;
  foreach my $runnable (@{$self->runnable}) {
    throw("Runnable module not set") unless ($runnable->isa("Bio::EnsEMBL::Analysis::Runnable"));
    $runnable->run();
    $self->output($runnable->output);
    if ($self->SINGLE_EXON){
      $self->store_ids($runnable->single_exon_genes,$self->SPLICED_ELSEWHERE_LOGIC_NAME);
    }    
    if ($self->INDETERMINATE){
      $self->store_ids($runnable->indeterminate_genes,$self->PSILC_LOGIC_NAME);
    }
  }
  return 0;
}
store_ids
sub store_ids {
  my ($self, $id_list,$analysis_name) = @_;

  my $flag_adaptor = Bio::EnsEMBL::Pipeline::DBSQL::FlagAdaptor->new($self->db);
  my $analysis_adaptor = $self->db->get_AnalysisAdaptor;
  my $analysis = $analysis_adaptor->fetch_by_logic_name($analysis_name);
  unless ($analysis) {  
     throw( "analysis " . $analysis_name  . " can't be found in " . $self->db->dbname ." @ " . $self->db->host.  "\n")  ; 
  } 
  # What do you do if the analysis thing isnt found?
  return 0 unless (scalar(@{$id_list}) > 0);
  foreach my $id(@{$id_list}){
    my $flag = Bio::EnsEMBL::Pipeline::Flag->new(
      '-type'         => 'gene',
      '-ensembl_id'   => $id,
      '-goalAnalysis' => $analysis,
    );
    $flag_adaptor->store($flag);
  }
  return;
}
transcript_to_keep
sub transcript_to_keep {
  my ($self, $trans_to_keep)  = @_;
  return if  $self->BLESSED_BIOTYPES->{$trans_to_keep->biotype};
  $trans_to_keep->translation(undef);
  return;
}
write_output
sub write_output {
  my($self) = @_;
  my $genes = $self->output;
  if ( $self->PS_WRITE_IGNORED_GENES == 1 ) {
    print "writing ignored genes\n" ;
    push @{$genes},@{$self->ignored_genes} if $self->ignored_genes;
  }

  my %feature_hash;
  #  empty_Analysis_cache();
  # write genes out to a different database from the one we read genes from.
  my $db = $self->get_dbadaptor($self->PS_OUTPUT_DATABASE) ;
  #now using Analysis:Databases
  print "Writing to database PS_OUTPUT_DATABASE : " . $self->PS_OUTPUT_DATABASE . "\n";
  # sort out analysis
  my $analysis = $self->analysis;
  unless ($analysis){
    throw("an analysis logic name must be defined in the command line");
  }
  my $gene_adaptor = $db->get_GeneAdaptor;
  foreach my $gene (@{$genes}) {
    # store gene
    eval {
      $gene_adaptor->store($self->lazy_load($gene));
    };
    if ( $@ ) {
      warning("UNABLE TO WRITE GENE:\n$@");
    }
  }
  return 1;
}
General documentation
CONTACT
Post questions to the Ensembl development list: ensembl-dev@ebi.ac.uk
APPENDIX
remove_transcript_from_gene
Args : Bio::EnsEMBL::Gene object, Bio::EnsEMBL::Transcript object
Description: Steve's method for removing unwanted transcripts from genes
Returntype : scalar
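The removal logic can be tried out on plain data. This is a sketch under stated assumptions: hashes stand in for gene and transcript objects, `remove_transcript` and the `ccds_gene` biotype are hypothetical, and Scalar::Util's `refaddr` is used for the identity test where the module compares the references directly with `!=`.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

# Hypothetical data: plain hashes stand in for gene and transcript objects.
my %blessed_biotypes = ( ccds_gene => 1 );

sub remove_transcript {
    my ($gene, $to_delete) = @_;
    # Protected biotypes are reported and left in place.
    return 'BLESSED' if $blessed_biotypes{ $to_delete->{biotype} };
    # Keep every transcript that is not the very same object.
    $gene->{transcripts} =
        [ grep { refaddr($_) != refaddr($to_delete) } @{ $gene->{transcripts} } ];
    return;
}

my $t1 = { id => 't1', biotype => 'protein_coding' };
my $t2 = { id => 't2', biotype => 'protein_coding' };
my $t3 = { id => 't3', biotype => 'ccds_gene' };
my $gene = { transcripts => [ $t1, $t2, $t3 ] };

my $first  = remove_transcript($gene, $t2);   # removed, returns nothing
my $second = remove_transcript($gene, $t3);   # refused, returns 'BLESSED'
my @left   = map { $_->{id} } @{ $gene->{transcripts} };
print "left: @left\n";
```

Callers can tell the two outcomes apart by the return value: nothing on a successful removal, the string 'BLESSED' when the transcript was protected.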