Bio::EnsEMBL::Funcgen
DataSet
Summary
Bio::EnsEMBL::Funcgen::DataSet - A module to represent a DataSet object.
Package variables
No package variables defined.
Included modules
Inherit
Synopsis
use Bio::EnsEMBL::Funcgen::DataSet;
my $data_set = Bio::EnsEMBL::Funcgen::DataSet->new(
  -DBID            => $dbID,
  -ADAPTOR         => $self,
  -SUPPORTING_SETS => [$rset],
  -FEATURE_SET     => $fset,
  -DISPLAYABLE     => 1,
  -NAME            => 'DATASET1',
);
Description
A DataSet object provides access to raw results, AnnotatedFeatures, or both, for a given
experiment within a Slice, together with set-wide experimental metadata.
It was designed primarily to ease access to data via the web API by providing
a wrapper class with convenience methods. The class holds raw data and the
associated processed/analysed data that are to be displayed as a set within the browser.
For example, an experiment may cover different cell lines, features or time points; each of these would require a separate DataSet.
Open questions: Should a DataSet be allowed to contain mixed data types, e.g. promoter and histone data, or should these be given separate sets?
May a DataSet have duplicate raw data tracks but only one predicted features track?
The data in this class is kept as lightweight as possible, with data being loaded dynamically.
SOME IMPORTANT ISSUES/DEFINITIONS
This class currently only accommodates the following relationships:
SIMPLE - feature_set to result_set(s) relationships. This is one feature_set/type with a supporting
result_set or sets from the same experiment.
COMPOUND - feature_set to result_sets relationship. Where we have one feature_set/type supported by
numerous result_sets which may have different analyses from different experiments.
Both SIMPLE and COMPOUND also assume all other variables are the same e.g. cell_type, time_point etc.
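The shared-type assumption can be illustrated with a minimal, self-contained sketch. It does not use the Ensembl API: the MockType package, the validate_and_set_types function and the feature/cell type names are hypothetical stand-ins that only mimic the name comparison performed when a set is added to a DataSet.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a FeatureType/CellType: it only carries a name.
package MockType;
sub new  { my ($class, $name) = @_; return bless { name => $name }, $class; }
sub name { return $_[0]->{name}; }

package main;

# Hypothetical helper mirroring the check: the first set seen fixes the
# feature_type and cell_type; later sets must carry the same names.
sub validate_and_set_types {
    my ($data_set, $set) = @_;

    for my $type ('feature_type', 'cell_type') {
        if (defined $data_set->{$type}) {
            my ($got, $want) = ($set->{$type}->name, $data_set->{$type}->name);
            die "$type($got) does not match DataSet $type($want)\n" if $got ne $want;
        }
        else {
            $data_set->{$type} = $set->{$type};
        }
    }
    return;
}

# Illustrative values only; plain hashes stand in for the sets and the DataSet.
my %data_set;
my $set_cd4 = { feature_type => MockType->new('H3K4me3'), cell_type => MockType->new('CD4') };
my $set_gm  = { feature_type => MockType->new('H3K4me3'), cell_type => MockType->new('GM06990') };

validate_and_set_types(\%data_set, $set_cd4);   # first set: defines both types
validate_and_set_types(\%data_set, $set_cd4);   # same types: accepted

# A different cell_type breaks the SIMPLE/COMPOUND assumption and is rejected.
my $ok    = eval { validate_and_set_types(\%data_set, $set_gm); 1 };
my $error = $ok ? '' : $@;
print $error;
```

In this sketch a COMPLEX design, such as the same modification across several cell_types, would need one DataSet per cell_type.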
This class does not accommodate the following:
COMPLEX - Multiple feature_types, feature classes, cell_types etc., where the only assumption
is that there is one constant variable which can be keyed on. This could potentially capture any experiment design.
e.g. A combined promoter and histone tiling experiment which has features and results for promoter and all modifications,
but using the same cell line and conditions.
e.g. Looking at the same histone modifications across multiple cell_types
e.g. Looking at time points within an experiment
The final goal of visualisation will be a track of regulons/functional features supported by a network of
feature_types/classes from different cell_types. Some of these relationships may be indirect.
Methods
Methods description
_validate_and_set_types
Arg [1] : Bio::EnsEMBL::Funcgen::FeatureSet or ResultSet object
Example : $dset->_validate_and_set_types($rset);
Description: Validates and sets the DataSet cell and feature types.
Returntype : none
Exceptions : Throws if types not valid
Caller : General
Status : At Risk
add_ResultSet
Arg [1] : Bio::EnsEMBL::Funcgen::ResultSet
Arg [2] : (optional) string - status e.g. 'DISPLAYABLE'
Example : $dset->add_ResultSet($rset);
Description: Adds a ResultSet to the DataSet. Deprecated, use add_supporting_sets instead.
Returntype : none
Exceptions : Throws if CellType or FeatureType do not match or if member_set_type is not 'result'
Caller : General
Status : At Risk - to be removed
add_supporting_sets
Arg [1] : Array ref of Bio::EnsEMBL::Funcgen::FeatureSet or ResultSet objects
Example : $dset->add_supporting_sets([$rset]);
Description: Adds Result/FeatureSets to the DataSet.
Returntype : none
Exceptions : Throws if a set is not valid for the supporting_set type of the DataSet. Throws if supporting_sets is not an array ref.
Caller : General
Status : At Risk
cell_type
Example : my $dset_ctype_name = $dset->cell_type->name();
Description: Getter for the cell_type for this DataSet.
Returntype : Bio::EnsEMBL::Funcgen::CellType
Exceptions : None
Caller : General
Status : At Risk
display_label
Example : print $dset->display_label();
Description: Getter for the display_label attribute for this DataSet. This is more appropriate for the predicted_features of the set. Use the individual display_labels for each raw result set.
Returntype : str
Exceptions : None
Caller : General
Status : At Risk
feature_type
Example : my $dset_ftype_name = $dset->feature_type->name();
Description: Getter for the feature_type for this DataSet.
Returntype : Bio::EnsEMBL::Funcgen::FeatureType
Exceptions : None
Caller : General
Status : At Risk
get_ResultSets
Arg [1] : (optional) status - e.g. 'DISPLAYABLE'
Example : my @status_sets = @{$dset->get_ResultSets($status)};
Description: Getter for the ResultSets for this DataSet. Deprecated, use get_supporting_sets instead.
Returntype : Arrayref
Exceptions : None
Caller : General
Status : At Risk
get_ResultSets_by_Analysis
Arg [1] : Bio::EnsEMBL::Analysis
Arg [2] : (optional) status - e.g. 'DISPLAYABLE'
Example : my @anal_sets = @{$dset->get_ResultSets_by_Analysis($analysis)};
Description: Getter for the ResultSets of a given Analysis for this DataSet. Deprecated, use get_supporting_sets_by_Analysis instead.
Returntype : Arrayref
Exceptions : Throws if arg is not a valid stored Bio::EnsEMBL::Analysis
Caller : General
Status : At Risk
get_displayable_ResultSets
Example : my @displayable_rsets = @{$dset->get_displayable_ResultSets()};
Description: Convenience method for web display. Deprecated, use get_displayable_supporting_sets instead.
Returntype : Arrayref
Exceptions : None
Caller : General
Status : At Risk - to be removed
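get_displayable_product_FeatureSet
Example : my $fset = $data_set->get_displayable_product_FeatureSet();
Description: Convenience method for web display. Returns the product FeatureSet only if it has the 'DISPLAYABLE' status.
Returntype : Bio::EnsEMBL::Funcgen::FeatureSet, or undef if the FeatureSet is not displayable
Exceptions : None
Caller : General
Status : At Risk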
get_displayable_supporting_sets
Example : my @displayable_sets = @{$dset->get_displayable_supporting_sets()};
Description: Convenience method for web display. Returns the supporting sets which have the 'DISPLAYABLE' status.
Returntype : Arrayref
Exceptions : None
Caller : General
Status : At Risk
get_supporting_sets
Arg [1] : (optional) status - e.g. 'DISPLAYABLE'
Example : my @status_sets = @{$data_set->get_supporting_sets($status)};
Description: Getter for the supporting sets for this DataSet.
Returntype : Arrayref
Exceptions : None
Caller : General
Status : At Risk
get_supporting_sets_by_Analysis
Arg [1] : Bio::EnsEMBL::Analysis
Arg [2] : (optional) status - e.g. 'DISPLAYABLE'
Example : my @anal_sets = @{$dset->get_supporting_sets_by_Analysis($analysis)};
Description: Getter for the supporting set objects of a given Analysis.
Returntype : Arrayref
Exceptions : Throws if arg is not a valid stored Bio::EnsEMBL::Analysis
Caller : General
Status : At Risk
name
Arg [1] : (optional) string - name
Example : $dset->name('DATASET1');
Description: Getter/Setter for the name of this DataSet.
Returntype : string
Exceptions : None
Caller : General
Status : At Risk
new
Example : my $dset = Bio::EnsEMBL::Funcgen::DataSet->new(
  -SUPPORTING_SETS => [$fset1, $fset2],
  -FEATURE_SET     => $fset,
  -DISPLAYABLE     => 1,
  -NAME            => 'DATASET1',
);
# For a COMPLEX DataSet could use this, where 1 and 2 are the positions they are to be returned in.
# Would also need to record what the display type would be for each set, so the webcode can do it dynamically.
# This would allow any config of display based on what is defined in the DB.
Description: Constructor for DataSet objects.
Returntype : Bio::EnsEMBL::Funcgen::DataSet
Exceptions : Throws if neither supporting sets nor a FeatureSet are specified. Throws if a dbID is set and the caller is not the DataSetAdaptor.
Caller : General
Status : At Risk
product_FeatureSet
Arg [1] : (optional) Bio::EnsEMBL::Funcgen::FeatureSet
Example : $data_set->product_FeatureSet($fset);
Description: Getter and setter for the main feature_set attribute for this DataSet.
Returntype : Bio::EnsEMBL::Funcgen::FeatureSet
Exceptions : Throws if not a valid FeatureSet or if the main feature_set has already been set.
Caller : General
Status : At Risk - change to get_product_FeatureSet
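supporting_set_type
Example : $dset->supporting_set_type('feature');
Description: Getter/Setter for the supporting_set type of this DataSet i.e. feature or result. Deprecated, as DataSets can have different supporting set types.
Returntype : string
Exceptions : Always throws, as the method is deprecated
Caller : General
Status : At Risk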
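The storage and optional status filtering behind the supporting set getters can be sketched without the Ensembl API. This is a minimal illustration under stated assumptions: the MockSet package, the add_sets and get_sets functions and the set names are hypothetical, and the analysis keys are sorted here only to make the output order predictable.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a supporting set: a name, an analysis dbID
# and a set of status flags.
package MockSet;
sub new {
    my ($class, %args) = @_;
    my %states = map { $_ => 1 } @{ $args{states} || [] };
    return bless {
        name        => $args{name},
        analysis_id => $args{analysis_id},
        states      => \%states,
    }, $class;
}
sub name        { return $_[0]->{name}; }
sub analysis_id { return $_[0]->{analysis_id}; }
sub has_status  { my ($self, $status) = @_; return exists $self->{states}{$status} ? 1 : 0; }

package main;

my %supporting_sets;   # analysis dbID => array ref of sets

# Store each set under the dbID of its analysis.
sub add_sets {
    my (@sets) = @_;
    push @{ $supporting_sets{ $_->analysis_id } }, $_ for @sets;
    return;
}

# Return all sets, or only those carrying the given status.
sub get_sets {
    my ($status) = @_;
    my @found;
    for my $analysis_id (sort { $a <=> $b } keys %supporting_sets) {
        push @found,
            grep { !defined $status || $_->has_status($status) }
            @{ $supporting_sets{$analysis_id} };
    }
    return \@found;
}

add_sets(
    MockSet->new(name => 'rset_a', analysis_id => 2, states => ['DISPLAYABLE']),
    MockSet->new(name => 'rset_b', analysis_id => 1),
    MockSet->new(name => 'rset_c', analysis_id => 1, states => ['DISPLAYABLE']),
);

my @all         = map { $_->name } @{ get_sets() };
my @displayable = map { $_->name } @{ get_sets('DISPLAYABLE') };

print "all: @all\n";                   # all: rset_b rset_c rset_a
print "displayable: @displayable\n";   # displayable: rset_c rset_a
```

Keying on the analysis dbID makes the per-analysis lookup a single hash access, while the unfiltered getter has to walk every key.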
Methods code
sub _validate_and_set_types {
  my ($self, $set) = @_;

  # The first set seen defines the DataSet feature_type and cell_type;
  # every subsequent set must match both by name.
  for my $type ('feature_type', 'cell_type') {
    if (defined $self->{$type}) {
      if ($set->{$type}->name() ne $self->{$type}->name()) {
        # Report the type which actually failed (feature_type or cell_type)
        throw(ref($set)." $type(".$set->{$type}->name().
              ") does not match DataSet $type(".$self->{$type}->name().")");
      }
    }
    else {
      $self->{$type} = $set->{$type};
    }
  }

  return;
}
sub add_ResultSet {
  my ($self, $rset, $displayable) = @_;

  # Deprecated wrapper; the $displayable argument is ignored
  deprecate('add_ResultSet is deprecated, Please use add_supporting_sets()');
  return $self->add_supporting_sets([$rset]);
}
sub add_supporting_sets {
  my ($self, $sets) = @_;

  throw("Supporting sets need to be a reference to an ARRAY:\t".$sets) if ref($sets) ne 'ARRAY';

  foreach my $set (@$sets) {
    if (! (ref($set) && $set->isa('Bio::EnsEMBL::Funcgen::Set') && $set->set_type ne 'data' && $set->dbID)) {
      throw("Need to pass a valid stored Bio::EnsEMBL::Funcgen::Set which is not a DataSet");
    }

    # Only non-feature sets are validated against the DataSet types
    $self->_validate_and_set_types($set) if $set->set_type() ne 'feature';

    # Supporting sets are stored per analysis dbID
    $self->{'supporting_sets'}->{$set->analysis->dbID()} ||= [];
    push @{$self->{'supporting_sets'}->{$set->analysis->dbID()}}, $set;
  }

  return;
}
sub cell_type {
  my $self = shift;
  return $self->{'cell_type'};
}
sub display_label {
  my $self = shift;

  # Build the label once and cache it
  if (! $self->{'display_label'}) {
    if ($self->product_FeatureSet->feature_type->class() eq 'Regulatory Feature') {
      $self->{'display_label'} = 'Regulatory Features';
    }
    else {
      # Use the first defined cell type field: display_label, description, then name
      $self->{'display_label'}  = $self->feature_type->name()." -";
      $self->{'display_label'} .= " ".($self->cell_type->display_label() ||
                                       $self->cell_type->description()   ||
                                       $self->cell_type()->name());
      $self->{'display_label'} .= " Enriched Sites";
    }
  }

  return $self->{'display_label'};
}
sub feature_type {
  my $self = shift;
  return $self->{'feature_type'};
}
sub get_ResultSets {
  my $self = shift;
  deprecate('Use get_supporting_sets instead');
  return $self->get_supporting_sets(@_);
}
sub get_ResultSets_by_Analysis {
  my $self = shift;
  deprecate('Use get_supporting_sets_by_Analysis instead');
  return $self->get_supporting_sets_by_Analysis(@_);
}
sub get_displayable_ResultSets {
  my $self = shift;
  deprecate('Use get_displayable_supporting_sets instead');
  return $self->get_supporting_sets('DISPLAYABLE');
}
sub get_displayable_product_FeatureSet {
  my $self = shift;
  # Return the product FeatureSet only if it is flagged as DISPLAYABLE
  return $self->product_FeatureSet->has_status('DISPLAYABLE') ? $self->product_FeatureSet() : undef;
}
sub get_displayable_supporting_sets {
  my $self = shift;
  return $self->get_supporting_sets('DISPLAYABLE');
}
sub get_supporting_sets {
  my ($self, $status) = @_;
  my @rsets;

  # Collect the sets for every analysis, optionally filtered by status
  foreach my $anal_id (keys %{$self->{'supporting_sets'}}) {
    foreach my $rset (@{$self->{'supporting_sets'}->{$anal_id}}) {
      push @rsets, $rset if ! defined $status || $rset->has_status($status);
    }
  }

  return \@rsets;
}
sub get_supporting_sets_by_Analysis {
  my ($self, $analysis, $status) = @_;
  my @rsets;

  if (! (ref($analysis) && $analysis->isa("Bio::EnsEMBL::Analysis") && $analysis->dbID())) {
    throw("Need to pass a valid stored Bio::EnsEMBL::Analysis");
  }

  # Collect the sets for the given analysis, optionally filtered by status
  foreach my $anal_rset (@{$self->{'supporting_sets'}->{$analysis->dbID()}}) {
    push @rsets, $anal_rset if ! defined $status || $anal_rset->has_status($status);
  }

  return \@rsets;
}
sub name {
  my $self = shift;
  $self->{'name'} = shift if @_;
  return $self->{'name'};
}
sub new {
  my $caller = shift;
  my $class  = ref($caller) || $caller;
  my $self   = $class->SUPER::new(@_);

  my ($fset, $sets, $name)
    = rearrange(['FEATURE_SET', 'SUPPORTING_SETS', 'NAME'], @_);

  # Only the DataSetAdaptor may create DataSets which already have a dbID
  my @caller = caller();
  if ($self->dbID() && $caller[0] ne "Bio::EnsEMBL::Funcgen::DBSQL::DataSetAdaptor") {
    throw("You must use the DataSetAdaptor to generate DataSets with dbID i.e. from the DB, as this module accommodates updating which may cause incorrect data if the object is not generated from the DB");
  }

  $self->{'supporting_sets'} ||= {};

  throw("Must specify at least one Result/FeatureSet") if ((! $sets) && (! $fset));

  $self->add_supporting_sets($sets) if $sets;
  $self->product_FeatureSet($fset)  if $fset;
  $self->name($name)                if $name;

  return $self;
}
sub product_FeatureSet {
  my ($self, $fset) = @_;

  if ($fset) {
    if (! (ref($fset) && $fset->isa("Bio::EnsEMBL::Funcgen::FeatureSet"))) {
      throw("Need to pass a valid Bio::EnsEMBL::Funcgen::FeatureSet");
    }

    # The product FeatureSet can only be set once
    if (defined $self->{'feature_set'}) {
      throw("The main feature_set has already been set for this DataSet, maybe you want add_supporting_sets?");
    }
    else {
      $self->_validate_and_set_types($fset);
      $self->{'feature_set'} = $fset;
    }
  }

  return $self->{'feature_set'};
}
sub supporting_set_type {
  my $self = shift;

  # Always throws, so the getter/setter code below is never reached
  throw('This method is deprecated as DataSets can have different supporting set types');

  $self->{'supporting_set_type'} = shift if @_;
  return $self->{'supporting_set_type'};
}
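The label-building logic of display_label can be tried in isolation with a small sketch. It is an illustration only: build_display_label and its named arguments are hypothetical, the cell and feature type values are made up, and the real method reads them from the CellType, FeatureType and product FeatureSet objects and caches the result.

```perl
use strict;
use warnings;

# Hypothetical helper: regulatory features get a fixed label; otherwise the
# label combines the feature type name with the first true cell type field,
# falling back from display label to description to name.
sub build_display_label {
    my (%args) = @_;

    return 'Regulatory Features'
        if ($args{feature_class} // '') eq 'Regulatory Feature';

    my $cell = $args{cell_display_label}
            || $args{cell_description}
            || $args{cell_name};

    return "$args{feature_name} - $cell Enriched Sites";
}

# Illustrative values only
print build_display_label(
    feature_name => 'H3K4me3',
    cell_name    => 'CD4',
), "\n";
```

Because the fallback uses ||, an empty string or 0 in an earlier field is skipped just like an undefined value.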
General documentation
This module was created by Nathan Johnson.
This module is part of the Ensembl project.
Description: Getter for the result_set_ids for this DataSet.
Returntype : LIST
Exceptions : None
Caller : General
Status : At Risk