Category: "Sequencing"

SNP calling & the VCF format

May 16th, 2014
SNP calling refers to the process of identifying positions where the genome of a sequenced sample differs from the reference genome. This can help find disease-causing genomic alterations. In the following I wanted to re-align short NGS… more »

SAM format summary

August 30th, 2012
The SAM format is a text format that stores sequence alignment data in a series of tab-delimited ASCII columns; it is very common in next-generation sequencing data processing. It is the human-readable (non-binary) counterpart of the BAM format. more »

Data Processing with Biopieces

August 2nd, 2012
Biopieces is a fine set of scripts that form an orderly pipeline (or framework) for processing bioinformatics data on the Unix command line. more »

Sequence Mappability & Alignability

May 16th, 2012
Sequence uniqueness and the possibility to map sequence parts (e.g. NGS short reads) back to a genomic region... The CRG Alignability tracks display how uniquely k-mer sequences align to a region of the genome. For each window (of sizes 36, 40, 50, 75… more »

QSEQ File Format

January 6th, 2011
Each record is a single line of tab-separated fields in the following format: - Machine name: unique identifier of the sequencer. - Run number: unique number to identify the run on the sequencer. - Lane number: positive integer (currently 1-8). - Tile number:… more »
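
As a rough illustration of the record layout described in the excerpt above, here is a minimal Python sketch that splits one tab-separated line and names only the four leading fields the excerpt lists (machine name, run number, lane number, tile number). The names `QseqPrefix` and `parse_qseq_prefix` and the sample line are made up for this example; the remaining columns are not described in the excerpt, so they are kept as raw strings.

```python
from typing import List, NamedTuple


class QseqPrefix(NamedTuple):
    """Leading fields of one record; names are illustrative, not official."""
    machine: str      # unique identifier of the sequencer
    run: int          # run number on that sequencer
    lane: int         # lane number (positive integer)
    tile: int         # tile number
    rest: List[str]   # remaining columns, left unparsed on purpose


def parse_qseq_prefix(line: str) -> QseqPrefix:
    """Split one tab-separated record and convert the first four fields."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) < 4:
        raise ValueError("expected at least 4 tab-separated fields")
    machine, run, lane, tile = cols[:4]
    return QseqPrefix(machine, int(run), int(lane), int(tile), cols[4:])


# Hypothetical sample line, invented for this sketch (not real sequencer output).
example = "HWI-EXAMPLE\t7\t3\t42\tREMAINING\tCOLUMNS"
record = parse_qseq_prefix(example)
print(record.machine, record.run, record.lane, record.tile)
```

Keeping the unparsed columns in `rest` means the sketch does not have to guess at fields beyond those named in the excerpt.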