September 2nd, 2009
awk is an extremely useful Unix tool for quick command-line tasks, particularly in combination with other commands such as grep or sort. "AWK is a data-driven programming language designed for processing text-based data, either in files or data streams. It is an example of a programming language that extensively uses the string datatype, associative arrays (that is, arrays indexed by key strings), and regular expressions." [Wikipedia] more »
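
As a rough illustration of the features named in the quote, the sketch below uses a regular expression to skip comment lines, an associative array to sum a value per key, and a pipe into sort. The function name `count_by_key` and the sample data are made up for this example and are not part of any tool.

```shell
# Hypothetical helper: reads "key value" lines on stdin and prints one total per key.
count_by_key() {
  # !/^#/         : regular expression, skip lines that start with '#'
  # total[$1]     : associative array indexed by the string in column 1
  # END { ... }   : after the last line, print each key with its sum
  awk '!/^#/ { total[$1] += $2 } END { for (k in total) print k, total[k] }' | sort
}

# Sample input (invented for illustration).
printf '# sample\ngeneA 3\ngeneB 5\ngeneA 4\n' | count_by_key
# prints:
# geneA 7
# geneB 5
```

The final sort is there because awk does not guarantee any particular order when looping over an associative array.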

The GTF Format

July 9th, 2009
GTF stands for Gene Transfer Format. It borrows from GFF but has additional structure that warrants a separate definition and format name. The current version is 2.2. The structure is the same as in GFF, so the fields are: Code … more »

The SRF Format

July 2nd, 2009
SRF (Sequence Read Format) is a generic and flexible container format for sequencing and next-generation sequencing files. Format working group: It's the preferred format for the submission of sequencing results to archives… more »

Next-Gen Sequence-Submissions to the ENA

June 3rd, 2009
New sequencing results are submitted to the European Read Archive (ERA), now called the European Nucleotide Archive (ENA), which collaborates with the NCBI Short Read Archive (SRA). Documentation (EBI) General guidelines (NCBI) Meta data hierarchy:… more »

HTML Codes

March 17th, 2009
