There are different ways to encode the quality scores in FASTQ files. It is important to know which encoding your data uses before working with it, and to convert between encodings if necessary.
SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS.....................................................
...............................IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII......................
..........................XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
|                         |    |        |                              |                     |
33                        59   64       73                             104                   126
S - Sanger        Phred+33,  41 values  (0, 40)
I - Illumina 1.3  Phred+64,  41 values  (0, 40)
X - Solexa        Solexa+64, 68 values  (-5, 62)
Source: Wikipedia
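The offsets in the legend translate directly into code. As a minimal sketch in Python (the names `OFFSETS`, `decode_quality` and `guess_offset` are illustrative and not part of any library), a quality string is decoded by subtracting the offset from each character's ASCII code. The guess is only a heuristic read off the diagram: characters below ';' (59) occur only in the Sanger encoding, and everything else is ambiguous.

```python
# Offsets taken from the legend above; the names are illustrative only.
OFFSETS = {"sanger": 33, "illumina1.3": 64, "solexa": 64}


def decode_quality(qual, encoding="sanger"):
    """Return the integer quality scores for one FASTQ quality line."""
    offset = OFFSETS[encoding]
    return [ord(ch) - offset for ch in qual]


def guess_offset(qual):
    """Heuristic guess of the ASCII offset (33 or 64) for a quality line."""
    lowest = min(ord(ch) for ch in qual)
    if lowest < 59:
        # Characters below ';' appear only in the Sanger (Phred+33) range.
        return 33
    # Assumption: otherwise treat it as a +64 encoding. A short Sanger read
    # with only high qualities would be misclassified here.
    return 64


if __name__ == "__main__":
    print(decode_quality("!I", "sanger"))
    print(decode_quality(";@h", "solexa"))
    print(guess_offset("!5I"))
```

In practice it is safer to run the guess over many reads rather than a single line, since one read rarely covers the full range of characters.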
You can convert the Solexa read quality to Sanger read quality with Maq:
maq sol2sanger s_1_sequence.txt s_1_sequence.fastq
where s_1_sequence.txt is the Solexa read sequence file. Skipping this step will lead to unreliable SNP calling when aligning reads with Maq.
Source: maq-manual
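For readers who want to see what such a conversion involves, here is a small sketch in Python. It is not Maq's code: it assumes the commonly used relation Q_phred = 10 * log10(10^(Q_solexa/10) + 1), rounds to the nearest integer, and re-encodes with the Sanger offset of 33. The function names are made up for illustration.

```python
import math


def solexa_to_phred(q_solexa):
    """Convert a Solexa-scaled quality to a Phred-scaled quality (float)."""
    return 10 * math.log10(10 ** (q_solexa / 10) + 1)


def sol2sanger_quality(qual):
    """Convert a Solexa+64 quality string into a Phred+33 (Sanger) string."""
    converted = []
    for ch in qual:
        q_solexa = ord(ch) - 64                    # undo the Solexa offset
        q_phred = round(solexa_to_phred(q_solexa))  # nearest integer Phred
        converted.append(chr(q_phred + 33))        # apply the Sanger offset
    return "".join(converted)


if __name__ == "__main__":
    # ';' is Solexa -5, '@' is Solexa 0, 'h' is Solexa 40
    print(sol2sanger_quality(";@h"))
```

The two scales differ most at the low end (a Solexa score of -5 maps to a Phred score of about 1) and are practically identical above 20, which is why mixing them up mainly distorts low-quality bases.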
These are some of the cell lines used in the various analyses of the ENCODE project. The first two are so-called tier-1 lines, which are covered by all the different types of experiments within ENCODE; the others are tier-2 lines. In addition, there are a number of tier-3 cell lines.
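To change the format of a cell based on its own content or that of another cell, conditional formatting can be used. The macro below colors the cells in A2:A1000 according to the tag (e.g. Vic_) found in their value.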
Code:
Sub Color_groups()
    ' Color each cell in A2:A1000 according to the tag found in its value.
    Set MyPlage = Range("A2:A1000")
    For Each Cell In MyPlage
        ' InStr returns the position of the match, or 0 if there is none.
        If InStr(1, Cell.Value, "Vic_") > 0 Then
            Cell.Interior.ColorIndex = 3
        ElseIf InStr(1, Cell.Value, "Tyl_") > 0 Then
            Cell.Interior.ColorIndex = 4
        ElseIf InStr(1, Cell.Value, "Wol_") > 0 Then
            Cell.Interior.ColorIndex = 6
        ElseIf InStr(1, Cell.Value, "Sim_") > 0 Then
            Cell.Interior.ColorIndex = 7
        ElseIf InStr(1, Cell.Value, "Sea_") > 0 Then
            Cell.Interior.ColorIndex = 8
        ElseIf InStr(1, Cell.Value, "Mar_") > 0 Then
            Cell.Interior.ColorIndex = 15
        ElseIf InStr(1, Cell.Value, "Lio_") > 0 Then
            Cell.Interior.ColorIndex = 17
        End If
    Next
End Sub
How to avoid falling in the cache...
Caching is a powerful way to speed up queries to the Ensembl database. It can become problematic, however, for example if you repeat a query multiple times but have updated the data set in between. It is important to know how to turn caching off if needed; this is not officially documented, though.
To turn caching off on the MySQL server:
Code:
my $sa  = $reg->get_adaptor($species, "core", "slice");
my $sth = $sa->dbc->db_handle->prepare("SET SESSION query_cache_type = OFF");
$sth->execute || die "set session failed\n";
Reset caches in Perl API
Code:
sub free_caches {
    my $species = shift;
    my $group   = shift;

    foreach my $adap (@{ $registry->get_all_adaptors(-species => $species, -group => $group) }) {
        $adap->{'_slice_feature_cache'} = undef;

        if (defined($adap->{'cache'})) {
            $adap->{'cache'} = undef;
        }

        if (defined($adap->{'seq_region_cache'})) {
            my $seq_region_cache = $adap->{'seq_region_cache'} =
                Bio::EnsEMBL::Utils::SeqRegionCache->new();

            $adap->{'sr_name_cache'} = $seq_region_cache->{'name_cache'};
            $adap->{'sr_id_cache'}   = $seq_region_cache->{'id_cache'};
        }
    }
}
Source: Ian Longden, EBI
Installing and Running ProServer to serve data via DAS
The Distributed Annotation System (DAS) is an elegant way of sharing data and using data from diverse sources. More information at http://www.biodas.org and on these blog pages. ProServer is a lightweight server that provides your data as a DAS source.
Download from http://proserver.svn.sf.net/
or
Code:
svn co https://proserver.svn.sf.net/svnroot/proserver/trunk Bio-Das-ProServer
Build:
Code:
cd Bio-Das-ProServer
perl Build.PL
./Build
./Build test
./Build install   # optional
Run:
Code:
eg/proserver -x -c eg/local.ini
Adjust the ini file with the source you want to serve, e.g.:
Code:
[otter_das]
state = on
adaptor = otter_das
title = Havana manual annotations
description = A DAS source that provides access to the Havana annotation.
coordinates = NCBI_36,Chromosome,Homo sapiens => 21:25673390,25733000
dsncreated = 2008-03-11
maintainer = felix@work.ac.uk
doc_href = http://www.dasregistry.org/showProjectDetails.jsp?project_id=80
host = otterlive
user = username
port = 3306
dbname = loutre
driver = mysql
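Before starting the server it can help to check that the section parses the way you expect. The following is only a sketch using Python's standard `configparser`, assuming the file sticks to plain `key = value` lines as in the example above; ProServer itself reads the file in Perl, and the helper name `load_source` is made up for illustration.

```python
import configparser

# A few lines copied from the example section above.
SAMPLE = """\
[otter_das]
state = on
adaptor = otter_das
coordinates = NCBI_36,Chromosome,Homo sapiens => 21:25673390,25733000
port = 3306
"""


def load_source(ini_text, name):
    """Return the settings of one [section] as a plain dict of strings."""
    # interpolation=None keeps any '%' characters in values literal
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    return dict(parser[name])  # raises KeyError if the section is missing


if __name__ == "__main__":
    for key, value in load_source(SAMPLE, "otter_das").items():
        print(f"{key:12s} {value}")
```

Each line is split at its first `=`, so the `=>` inside the coordinates value stays part of the value.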
Dependencies to re-install:
The compression libraries Bundle-Compress-Zlib, Compress::Zlib, and related modules (http://search.cpan.org/dist/Compress-Raw-Zlib/lib/Compress/Raw/Zlib.pm). Their versions must match each other to avoid errors like "... does not match bootstrap parameter".