Category: "Bioinformatics"

GC content of human chromosomes

April 16th, 2013
GC content is the molar fraction of guanine and cytosine bases in DNA. The human genome is a mosaic of GC-rich and GC-poor regions, each around 300 kb long, called isochores. GC content is an important factor in many experiments and bioinformatic analysi… more »
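
As a rough illustration of the definition in this excerpt, the sketch below computes GC content as (G + C) / (A + C + G + T) for a sequence string, and per non-overlapping window. The function names `gc_content` and `windowed_gc` are hypothetical, and skipping ambiguous bases such as N is an assumption made here, not something the post specifies.

```python
def gc_content(sequence: str) -> float:
    """Fraction of G and C among the unambiguous bases A, C, G, T.

    Assumption: ambiguous symbols (e.g. N) are ignored; an input with
    no unambiguous bases yields 0.0.
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        return 0.0
    return gc / total


def windowed_gc(sequence: str, window: int) -> list:
    """GC content of consecutive, non-overlapping windows of the given size."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        gc_content(sequence[start:start + window])
        for start in range(0, len(sequence), window)
    ]
```

For isochore-scale analysis a window of about 300 kb would match the region size mentioned above; the last window may be shorter than the others.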

SAM format summary

August 30th, 2012
The SAM format is a text format for storing sequence alignment data in a series of tab-delimited ASCII columns, and it is very common in next-generation sequencing data processing. It is the human-readable (non-binary) version of the BAM format. more »
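
To make the tab-delimited layout concrete, here is a minimal sketch that splits one SAM alignment line into its eleven mandatory fields plus any optional tags. The `SamRecord` type and `parse_sam_line` function are hypothetical names for this example, and the sample read in the usage note is made up for illustration.

```python
from typing import NamedTuple, Tuple


class SamRecord(NamedTuple):
    qname: str   # query (read) name
    flag: int    # bitwise flag
    rname: str   # reference sequence name
    pos: int     # 1-based leftmost mapping position
    mapq: int    # mapping quality
    cigar: str   # CIGAR string
    rnext: str   # reference name of the mate/next read
    pnext: int   # position of the mate/next read
    tlen: int    # observed template length
    seq: str     # read sequence
    qual: str    # ASCII-encoded base qualities
    tags: Tuple[str, ...]  # optional TAG:TYPE:VALUE fields, left unparsed


def parse_sam_line(line: str) -> SamRecord:
    """Parse a single alignment line; header lines (starting with '@') are rejected."""
    if line.startswith("@"):
        raise ValueError("header line, not an alignment record")
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError("expected at least 11 tab-delimited fields")
    return SamRecord(
        fields[0], int(fields[1]), fields[2], int(fields[3]), int(fields[4]),
        fields[5], fields[6], int(fields[7]), int(fields[8]),
        fields[9], fields[10], tuple(fields[11:]),
    )
```

For example, `parse_sam_line("read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\tNM:i:0")` yields a record whose `rname` is `"chr1"` and whose `pos` is `100`.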

Data Processing with Biopieces

August 2nd, 2012
Biopieces is a fine set of scripts that form an orderly pipeline (or framework) for processing bioinformatics data on the Unix command line. more »

Building Config Files from a Skeleton

July 12th, 2012
To run programs or pipelines automatically, it is often necessary to create or adjust configuration files. Ideally this should be done dynamically by a script from a skeleton (layout) file, replacing placeholders with the adjusted values. more »
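
One possible way to fill a skeleton is Python's standard `string.Template`, sketched below. This is not necessarily the approach the post takes; the `${name}` placeholder syntax, the `render_config` helper and the example keys `genome` and `threads` are all assumptions for illustration.

```python
from string import Template


def render_config(skeleton_text: str, values: dict) -> str:
    """Replace ${name} placeholders in the skeleton with the given values.

    substitute() raises KeyError when a placeholder has no value, so an
    incompletely filled config is never produced silently.
    """
    return Template(skeleton_text).substitute(values)


# Hypothetical skeleton content; in practice it would be read from a file.
skeleton = "genome = ${genome}\nthreads = ${threads}\n"
config = render_config(skeleton, {"genome": "hg19", "threads": 4})
```

Using `safe_substitute` instead would leave unknown placeholders untouched, which can be useful when a skeleton is filled in several passes.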

Analysing Variation with Ensembl and PolyPhen

May 28th, 2012
The Ensembl variation resources provide information about structural variants and sequence variants (including single nucleotide polymorphisms (SNPs), insertions, deletions and somatic mutations) in the human genome. more »