Bio::Graphics
FeatureFile
Toolbar
Summary
Bio::Graphics::FeatureFile -- A set of Bio::Graphics features, stored in a file
Package variables
Privates (from "my" definitions)
@COLORS = qw(cyan blue red yellow green wheat turquoise orange)
Included modules
Carp
IO::File
Text::Shellwords
Synopsis
use Bio::Graphics::FeatureFile;
my $data = Bio::Graphics::FeatureFile->new(-file => 'features.txt');
# create a new panel and render contents of the file onto it
my $panel = $data->new_panel;
my $tracks_rendered = $data->render($panel);
# or do it all in one step
my ($tracks_rendered,$panel) = $data->render;
# for more control, render tracks individually
my @feature_types = $data->types;
for my $type (@feature_types) {
my $features = $data->features($type);
my %options = $data->style($type);
$panel->add_track($features,%options); # assuming we have a Bio::Graphics::Panel
}
# get individual settings
my $est_fg_color = $data->setting(EST => 'fgcolor');
# or create the FeatureFile by hand
# add a type
$data->add_type(EST => {fgcolor=>'blue',height=>12});
# add a feature
my $feature = Bio::Graphics::Feature->new(
# params
); # or some other SeqI
$data->add_feature($feature=>'EST');
Description
The
Bio::Graphics::FeatureFile module reads and parses files that
describe sequence features and their renderings. It accepts both GFF
format and a more human-friendly file format described below. Once a
FeatureFile object has been initialized, you can interrogate it for
its consistuent features and their settings, or render the entire file
onto a Bio::Graphics::Panel.
This moduel is a precursor of Jason Stajich's
Bio::Annotation::Collection class, and fulfills a similar function of
storing a collection of sequence features. However, it also stores
rendering information about the features, and does not currently
follow the CollectionI interface.
There are two types of entry in the file format: feature entries, and
formatting entries. They can occur in any order. See the Appendix
for a full example.
Feature entries can take several forms. At their simplest, they look
like this:
Gene B0511.1 516-11208
This means that a feature of type "Gene" and name "B0511.1" occupies
the range between bases 516 and 11208. A range can be specified
equally well using a hyphen, or two dots as in 516..11208. Negative
coordinates are allowed, such as -187..1000.
A discontinuous range ("split location") uses commas to separate the
ranges. For example:
Gene B0511.1 516-619,3185-3294,10946-11208
Alternatively, the locations can be split by repeating the features
type and name on multiple adjacent lines:
Gene B0511.1 516-619
Gene B0511.1 3185-3294
Gene B0511.1 10946-11208
A comment can be added to features by adding a fourth column. These
comments will be rendered as under-the-glyph descriptions by those
glyphs that honor descriptions:
Gene B0511.1 516-619,3185-3294,10946-11208 "Putative primase"
Columns are separated using whitespace, not (necessarily) tabs.
Embedded whitespace can be escaped using quote marks or backslashes in
the same way as in the shell:
'Putative Gene' my\ favorite\ gene 516-11208
Features can be grouped so that they are rendered by the "group" glyph
(so far this has only been used to relate 5' and 3' ESTs). To start a
group, create a two-column feature entry showing the group type and a
name for the group. Follow this with a list of feature entries with a
blank type. For example:
EST yk53c10
yk53c10.3 15000-15500,15700-15800
yk53c10.5 18892-19154
This example is declaring that the ESTs named yk53c10.3 and yk53c10.5
belong to the same group named yk53c10.
Methods
_setting | No description | Code |
_stat | No description | Code |
_unescape | No description | Code |
add_feature | No description | Code |
add_type | No description | Code |
base2package | No description | Code |
citation | No description | Code |
code_setting | No description | Code |
configured_types | No description | Code |
ctime | No description | Code |
destroy | No description | Code |
error | No description | Code |
evaluate_coderefs | No description | Code |
feature2label | No description | Code |
features | No description | Code |
finish_parse | No description | Code |
get_seq_stream | Description | Code |
glyph | No description | Code |
init_parse | No description | Code |
initialize_code | No description | Code |
invert_types | No description | Code |
link_pattern | No description | Code |
make_link | No description | Code |
make_strand | No description | Code |
max | No description | Code |
min | No description | Code |
mtime | No description | Code |
name | No description | Code |
new | No description | Code |
new_panel | No description | Code |
parse_argv | No description | Code |
parse_file | No description | Code |
parse_line | No description | Code |
parse_text | No description | Code |
refs | No description | Code |
render | No description | Code |
safe | No description | Code |
set | No description | Code |
setting | No description | Code |
size | No description | Code |
smart_features | No description | Code |
split_group | No description | Code |
style | No description | Code |
type2label | No description | Code |
types | No description | Code |
Methods description
Title : get_seq_stream Usage : $stream = $s->get_seq_stream(@args) Function: get a stream of features that overlap this segment Returns : a Bio::SeqIO::Stream-compliant stream Args : see below Status : Public
This is the same as feature_stream(), and is provided for Bioperl compatibility. Use like this:
$stream = $s->get_seq_stream('exon'); while (my $exon = $stream->next_seq) { print $exon->start,"\n"; } |
Methods code
sub _setting
{ my $self = shift;
my $config = $self->{config} or return;
return keys %{$config} unless @_;
return keys %{$config->{$_[0]}} if @_ == 1;
return $config->{$_[0]}{$_[1]} if @_ > 1; } |
sub _stat
{ my $self = shift;
my $fh = shift;
$self->{stat} = [stat($fh)]; } |
sub _unescape
{ foreach (@_) {
tr/+/ /; s/%([0-9a-fA-F]{2})/chr hex($1)/g;
}
@_; } |
sub add_feature
{ my $self = shift;
my ($feature,$type) = @_;
$type = $feature->primary_tag unless defined $type;
push @{$self->{features}{$type}},$feature; } |
sub add_type
{ my $self = shift;
my ($type,$type_configuration) = @_;
my $cc = $type =~ /^(general|default)$/i ? 'general' : $type; push @{$self->{types}},$cc unless $cc eq 'general' or $self->{config}{$cc};
if (defined $type_configuration) {
for my $tag (keys %$type_configuration) {
$self->{config}{$cc}{lc $tag} = $type_configuration->{$tag};
}
} } |
sub base2package
{ my $self = shift;
(my $package = overload::StrVal($self)) =~ s/[^a-z0-9A-Z_]/_/g;
$package =~ s/^[^a-zA-Z_]/_/g;
$package; } |
sub citation
{ my $self = shift;
my $feature = shift || 'general';
return $self->setting($feature=>'citation'); } |
sub code_setting
{ my $self = shift;
my $section = shift;
my $option = shift;
my $setting = $self->_setting($section=>$option);
return unless defined $setting;
return $setting if ref($setting) eq 'CODE';
if ($setting =~ /^\\&(\w+)/) { my $subroutine_name = $1;
my $package = $self->base2package;
my $codestring = "\\&${package}\:\:${subroutine_name}";
my $coderef = eval $codestring;
warn $@ if $@;
$self->set($section,$option,$coderef);
return $coderef;
}
elsif ($setting =~ /^sub\s*\{/) {
my $coderef = eval $setting;
warn $@ if $@;
$self->set($section,$option,$coderef);
return $coderef;
} else {
return $setting;
} } |
sub configured_types
{ my $self = shift;
my $types = $self->{types} or return;
return @{$types}; } |
sub ctime
{ shift->{stat}->[10]; } |
sub destroy
{ my $self = shift;
delete $self->{features}; } |
sub error
{ my $self = shift;
my $d = $self->{error};
$self->{error} = shift if @_;
$d; } |
sub evaluate_coderefs
{ my $self = shift;
$self->initialize_code();
for my $s ($self->_setting) {
for my $o ($self->_setting($s)) {
$self->code_setting($s,$o);
}
} } |
sub feature2label
{ my $self = shift;
my $feature = shift;
my $type = eval {$feature->type} || $feature->primary_tag or return;
(my $basetype = $type) =~ s/:.+$//;
my @labels = $self->type2label($type);
@labels = $self->type2label($basetype) unless @labels;
@labels = ($type) unless @labels;;
wantarray ? @labels : $labels[0]; } |
sub features
{ my $self = shift;
my ($types,$iterator,@rest) = $_[0]=~/^-/ ? rearrange([['TYPE','TYPES']],@_) : (\@_);
$types = [$types] if $types && !ref($types);
my @types = ($types && @$types) ? @$types : $self->types;
my @features = map {@{$self->{features}{$_}}} @types;
if ($iterator) {
require Bio::Graphics::FeatureFile::Iterator;
return Bio::Graphics::FeatureFile::Iterator->new(\@features);
}
return wantarray ? @features :\@ features; } |
sub finish_parse
{ my $s = shift;
$s->evaluate_coderefs if $s->safe;
$s->{seenit} = {}; } |
sub get_seq_stream
{ my $self = shift;
local $^W = 0;
my @args = $_[0] =~ /^-/ ? (@_,-iterator=>1) : (-types=>\@_,-iterator=>1);
$self->features(@args); } |
sub glyph
{ my $self = shift;
my $type = shift;
my $config = $self->{config} or return;
my $hashref = $config->{$type} or return;
return $hashref->{glyph}; } |
sub init_parse
{ my $s = shift;
$s->{seenit} = {};
$s->{max} = $s->{min} = undef;
$s->{types} = [];
$s->{features} = {};
$s->{config} = {} } |
sub initialize_code
{ my $self = shift;
my $package = $self->base2package;
my $init_code = $self->_setting(general => 'init_code') or return;
my $code = "package $package; $init_code; 1;";
eval $code;
warn $@ if $@; } |
sub invert_types
{ my $self = shift;
my $config = $self->{config} or return;
my %inverted;
for my $label (keys %{$config}) {
my $feature = $config->{$label}{feature} or next;
foreach (shellwords($feature||'')) {
$inverted{$_}{$label}++;
}
}\%
inverted; } |
sub link_pattern
{ my $self = shift;
my ($pattern,$feature,$panel) = @_;
require CGI unless defined &CGI::escape;
my $n;
$pattern =~ s/\$(\w+)/ CGI::escape( $1 eq 'ref' ? ($n = $feature->location->seq_id) && "$n" : $1 eq 'name' ? ($n = $feature->display_name) && "$n" # workaround broken CGI.pm : $1 eq 'class' ? eval {$feature->class} || '' : $1 eq 'type' ? eval {$feature->method} || $feature->primary_tag : $1 eq 'method' ? eval {$feature->method} || $feature->primary_tag : $1 eq 'source' ? eval {$feature->source} || $feature->source_tag : $1 eq 'start' ? $feature->start : $1 eq 'end' ? $feature->end : $1 eq 'stop' ? $feature->end : $1 eq 'segstart' ? $panel->start : $1 eq 'segend' ? $panel->end : $1 eq 'description' ? eval {join '',$feature->notes} || '' : $1 ) /exg;
return $pattern;
}
} |
sub make_link
{ my $self = shift;
my $feature = shift;
for my $label ($self->feature2label($feature)) {
my $link = $self->setting($label,'link');
$link = $self->setting(general=>'link') unless defined $link;
next unless $link;
return $self->link_pattern($link,$feature);
}
return; } |
sub make_strand
{ local $^W = 0;
return +1 if $_[0] =~ /^\+/ || $_[0] > 0;
return -1 if $_[0] =~ /^\-/ || $_[0] < 0;
return 0; } |
sub mtime
{ my $self = shift;
my $d = $self->{m_time} || $self->{stat}->[9];
$self->{m_time} = shift if @_;
$d; } |
sub name
{ my $self = shift;
my $d = $self->{name};
$self->{name} = shift if @_;
$d;
}
1;
__END__ } |
sub new
{ my $class = shift;
my %args = @_;
my $self = bless {
config => {},
features => {},
seenit => {},
types => [],
max => undef,
min => undef,
stat => [],
refs => {},
safe => undef,
},$class;
$self->{coordinate_mapper} = $args{-map_coords}
if exists $args{-map_coords} && ref($args{-map_coords}) eq 'CODE';
$self->{smart_features} = $args{-smart_features} if exists $args{-smart_features};
$self->{safe} = $args{-safe} if exists $args{-safe};
my $fh;
if (my $file = $args{-file}) {
no strict 'refs';
if (defined fileno($file)) {
$fh = $file;
} elsif ($file eq '-') {
$self->parse_argv();
} else {
$fh = IO::File->new($file) or croak("Can't open $file: $!\n");
}
$self->parse_file($fh);
} elsif (my $text = $args{-text}) {
$self->parse_text($text);
}
close($fh) or warn "Error closing file: $!" if $fh;
$self;
}
} |
sub new_panel
{ my $self = shift;
require Bio::Graphics::Panel;
my $width = $self->setting(general => 'pixels')
|| $self->setting(general => 'width')
|| WIDTH;
my ($start,$stop);
my $range_expr = '(-?\d+)(?:-|\.\.)(-?\d+)';
if (my $bases = $self->setting(general => 'bases')) {
($start,$stop) = $bases =~ /([\d-]+)(?:-|\.\.)([\d-]+)/;
}
if (!defined $start || !defined $stop) {
$start = $self->min unless defined $start;
$stop = $self->max unless defined $stop;
}
my $new_segment = Bio::Graphics::Feature->new(-start=>$start,-stop=>$stop);
my $panel = Bio::Graphics::Panel->new(-segment => $new_segment,
-width => $width,
-key_style => 'between');
$panel; } |
sub parse_argv
{ my $self = shift;
$self->init_parse;
while (<>) {
chomp;
$self->parse_line($_);
}
$self->finish_parse; } |
sub parse_file
{ my $self = shift;
my $fh = shift or return;
$self->_stat($fh);
$self->init_parse;
while (<$fh>) {
chomp;
$self->parse_line($_);
}
$self->finish_parse; } |
sub parse_line
{ my $self = shift;
local $_ = shift;
s/\015//g;
return if /^\s*[\#]/;
if (/^\s+(.+)/ && $self->{current_tag}) { my $value = $1;
my $cc = $self->{current_config} ||= 'general'; $self->{config}{$cc}{$self->{current_tag}} .= ' ' . $value;
$self->{config}{$cc}{$self->{current_tag}} .= "\n"
if $self->{config}{$cc}{$self->{current_tag}}=~ /^sub\s*\{/;
return;
}
if (/^\s*\[([^\]]+)\]/) { my $label = $1;
my $cc = $label =~ /^(general|default)$/i ? 'general' : $label; push @{$self->{types}},$cc unless $cc eq 'general';
$self->{current_config} = $cc;
return;
}
if (/^([\w: -]+?)\s*=\s*(.*)/) { my $tag = lc $1;
my $cc = $self->{current_config} ||= 'general'; my $value = defined $2 ? $2 : '';
$self->{config}{$cc}{$tag} = $value;
$self->{current_tag} = $tag;
return;
}
if (/^$/) { undef $self->{current_tag};
return;
}
my @tokens = eval { shellwords($_||'') };
unshift @tokens,'' if /^\s+/;
if (length $tokens[0] > 0 && $self->{group}) {
push @{$self->{features}{$self->{grouptype}}},$self->{group};
undef $self->{group};
undef $self->{grouptype};
}
if (@tokens < 3) { my $type = shift @tokens;
my $name = shift @tokens;
$self->{group} = Bio::Graphics::Feature->new(-name => $name,
-type => 'group');
$self->{grouptype} = $type;
return;
}
my($ref,$type,$name,$strand,$bounds,$description,$url);
if (@tokens >= 8) { my ($r,$source,$method,$start,$stop,$score,$s,$phase,@rest) = @tokens;
my $group = join ' ',@rest;
$type = join(':',$method,$source);
$bounds = join '..',$start,$stop;
$strand = $s;
if ($group) {
my ($notes,@notes);
(undef,$self->{groupname},undef,undef,$notes) = split_group($group);
foreach (@$notes) {
if (m!^(http|ftp)://!) { $url = $_ } else { push @notes,$_ } } $description = join '; ',@notes if @notes; }
$name ||= $self->{group}->display_id if $self->{group};
$ref = $r;
}
elsif ($tokens[2] =~ /^([+-.]|[+-]?[01])$/) { ($type,$name,$strand,$bounds,$description,$url) = @tokens;
} else { ($type,$name,$bounds,$description,$url) = @tokens;
}
$type ||= $self->{grouptype} || '';
$type =~ s/\s+$//;
{
local $^W = 0;
$ref ||= $self->{config}{$self->{current_config}}{'reference'}
|| $self->{config}{general}{reference};
}
$self->{refs}{$ref}++ if defined $ref;
my @parts = map { [/(-?\d+)(?:-|\.\.)(-?\d+)/]} split /(?:,| )\s*/,$bounds;
foreach (@parts) { $self->{min} = $_->[0] if !defined $self->{min} || $_->[0] < $self->{min};
$self->{max} = $_->[1] if !defined $self->{max} || $_->[1] > $self->{max};
}
if ($self->{coordinate_mapper} && $ref) {
($ref,@parts) = $self->{coordinate_mapper}->($ref,@parts);
return unless $ref;
}
$type = '' unless defined $type;
$name = '' unless defined $name;
my %attributes;
my $score;
if (defined $description && $description =~ /\w+=\w+/) { my @attributes = split /;\s*/,$description;
foreach (@attributes) {
my ($name,$value) = split /=/,$_,2;
Bio::Root::Root->throw(qq("$_" is not a valid attribute=value pair)) unless defined $value;
_unescape($name);
my @values = split /,/,$value;
_unescape(@values);
if ($name =~ /^(note|description)/) {
$description = "@values";
} elsif ($name eq 'url') {
$url = $value;
} elsif ($name eq 'score') {
$score = $value;
} else {
push @{$attributes{$name}},@values;
}
}
}
if (my $feature = $self->{seenit}{$type,$name}) {
if (!$feature->segments) {
$feature->add_segment(Bio::Graphics::Feature->new(-type => $feature->type,
-strand => $feature->strand,
-start => $feature->start,
-end => $feature->end));
}
$feature->add_segment(@parts);
}
else {
my @coordinates = @parts > 1 ? (-segments =>\@ parts) : (-start=>$parts[0][0],-end=>$parts[0][1]);
$feature = $self->{seenit}{$type,$name} =
Bio::Graphics::Feature->new(-name => $name,
-type => $type,
$strand ? (-strand => make_strand($strand)) : (),
defined $score ? (-score=>$score) : (),
-desc => $description,
-ref => $ref,
-attributes =>\% attributes,
defined($url) ? (-url => $url) : (),
@coordinates,
);
$feature->configurator($self) if $self->smart_features;
if ($self->{group}) {
$self->{group}->add_segment($feature);
} else {
push @{$self->{features}{$type}},$feature; }
} } |
sub parse_text
{ my $self = shift;
my $text = shift;
$self->init_parse;
foreach (split /\015?\012|\015\012?/,$text) {
$self->parse_line($_);
}
$self->finish_parse; } |
sub refs
{ my $self = shift;
my $refs = $self->{refs} or return;
keys %$refs; } |
sub render
{ my $self = shift;
my $panel = shift;
my ($position_to_insert,$options,$max_bump,$max_label) = @_;
$panel ||= $self->new_panel;
my $tracks = 0;
my $color;
my %types = map {$_=>1} $self->configured_types;
my @configured_types = grep {exists $self->{features}{$_}} $self->configured_types;
my @unconfigured_types = sort grep {!exists $types{$_}} $self->types;
my @base_config = $self->style('general');
my @override = ();
if ($options && ref $options eq 'HASH') {
@override = %$options;
} else {
$options ||= 0;
if ($options == 1) { push @override,(-bump => 0,-label=>0);
} elsif ($options == 2) { push @override,(-bump=>1);
} elsif ($options == 3) { push @override,(-bump=>1,-label=>1);
} elsif ($options == 4) { push @override,(-bump => 2);
} elsif ($options == 5) { push @override,(-bump => 2,-label=>1);
}
}
for my $type (@configured_types,@unconfigured_types) {
my $features = $self->features($type);
my @auto_bump;
push @auto_bump,(-bump => @$features < $max_bump) if defined $max_bump;
push @auto_bump,(-label => @$features < $max_label) if defined $max_label;
my @config = ( -glyph => 'segments', -bgcolor => $COLORS[$color++ % @COLORS],
-label => 1,
-key => $type,
@auto_bump,
@base_config, $self->style($type), @override,
);
if (defined($position_to_insert)) {
$panel->insert_track($position_to_insert++,$features,@config);
} else {
$panel->add_track($features,@config);
}
$tracks++;
}
return wantarray ? ($tracks,$panel) : $tracks; } |
sub safe
{ my $self = shift;
my $d = $self->{safe};
$self->{safe} = shift if @_;
$self->evaluate_coderefs if $self->{safe} && !$d;
$d; } |
sub set
{ my $self = shift;
croak("Usage:\$ featurefile->set(\$type,\$tag,\$value\n")
unless @_ == 3;
my ($type,$tag,$value) = @_;
unless ($self->{config}{$type}) {
return $self->add_type($type,{$tag=>$value});
} else {
$self->{config}{$type}{lc $tag} = $value;
}
}
} |
sub setting
{ my $self = shift;
if ($self->safe) {
$self->code_setting(@_);
} else {
$self->_setting(@_);
}
}
} |
sub size
{ shift->{stat}->[7]; } |
sub smart_features
{ my $self = shift;
my $d = $self->{smart_features};
$self->{smart_features} = shift if @_;
$d; } |
sub split_group
{ my $group = shift;
$group =~ s/\\;/$;/g; $group =~ s/( \"[^\"]*);([^\"]*\")/$1$;$2/g;
my @groups = split(/\s*;\s*/,$group);
foreach (@groups) { s/$;/;/g }
my ($gclass,$gname,$tstart,$tstop,@notes);
foreach (@groups) {
my ($tag,$value) = /^(\S+)\s*(.*)/;
$value =~ s/\\t/\t/g;
$value =~ s/\\r/\r/g;
$value =~ s/^"//;
$value =~ s/"$//;
if ($tag eq 'Note') { push @notes,$value;
}
elsif ($tag eq 'Target' && $value =~ /([^:\"]+):([^\"]+)/) {
($gclass,$gname) = ($1,$2);
($tstart,$tstop) = /(\d+) (\d+)/;
}
elsif (!$value) {
push @notes,$tag; }
else {
($gclass,$gname) = ($tag,$value);
}
}
return ($gclass,$gname,$tstart,$tstop,\@notes);
}
} |
sub style
{ my $self = shift;
my $type = shift;
my $config = $self->{config} or return;
my $hashref = $config->{$type} or return;
return map {("-$_" => $hashref->{$_})} keys %$hashref; } |
sub type2label
{ my $self = shift;
my $type = shift;
$self->{_type2label} ||= $self->invert_types;
my @labels = keys %{$self->{_type2label}{$type}};
wantarray ? @labels : $labels[0] } |
sub types
{ my $self = shift;
my $features = $self->{features} or return;
return keys %{$features}; } |
General documentation
get_feature_stream(), top_SeqFeatures(), all_SeqFeatures() | Top |
Provided for compatibility with older BioPerl and/or Bio::DB::GFF
APIs.
Appendix -- Sample Feature File | Top |
# file begins
[general]
pixels = 1024
bases = 1-20000
reference = Contig41
height = 12
[Cosmid]
glyph = segments
fgcolor = blue
key = C. elegans conserved regions
[EST]
glyph = segments
bgcolor= yellow
connector = dashed
height = 5;
[FGENESH]
glyph = transcript2
bgcolor = green
description = 1
Cosmid B0511 516-619
Cosmid B0511 3185-3294
Cosmid B0511 10946-11208
Cosmid B0511 13126-13511
Cosmid B0511 11394-11539
EST yk260e10.5 15569-15724
EST yk672a12.5 537-618,3187-3294
EST yk595e6.5 552-618
EST yk595e6.5 3187-3294
EST yk846e07.3 11015-11208
EST yk53c10
yk53c10.3 15000-15500,15700-15800
yk53c10.5 18892-19154
EST yk53c10.5 16032-16105
SwissProt PECANEX 13153-13656 Swedish fish
FGENESH Predicted gene 1 1-205,518-616,661-735,3187-3365,3436-3846 Pfam domain
FGENESH Predicted gene 2 5513-6497,7968-8136,8278-8383,8651-8839,9462-9515,10032-10705,10949-11340,11387-11524,11765-12067,12876-13577,13882-14121,14169-14535,15006-15209,15259-15462,15513-15753,15853-16219 Mysterious
FGENESH Predicted gene 3 16626-17396,17451-17597
FGENESH Predicted gene 4 18459-18722,18882-19176,19221-19513,19572-19835 Transmembrane protein
# file ends
Lincoln Stein <lstein@cshl.org>.
Copyright (c) 2001 Cold Spring Harbor Laboratory
This library is free software; you can redistribute it and/or modify
it under the same terms as Perl itself. See DISCLAIMER.txt for
disclaimers of warranty.