Category: "Bioinformatics"

Genomic Start Coordinates

October 23rd, 2010
Adding to the confusion about the different notations for phases/frames, the start coordinates of genomic features are also recorded differently across genome browsers and file formats. 1. One-based: Counting bases starting with "1" at the firs… more »
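A minimal sketch of what the difference means in practice. The excerpt only names one-based counting; the zero-based, half-open counterpart used here (as in BED files) is an assumption about the other convention, and the function names are made up for illustration.

```python
def one_to_zero_based(start, end):
    """Convert a 1-based, inclusive interval to 0-based, half-open."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    # Only the start shifts; the inclusive end equals the exclusive end.
    return start - 1, end


def zero_to_one_based(start, end):
    """Convert a 0-based, half-open interval to 1-based, inclusive."""
    if start < 0 or end <= start:
        raise ValueError("expected 0 <= start < end")
    return start + 1, end


# The first ten bases of a sequence in both notations.
print(one_to_zero_based(1, 10))
print(zero_to_one_based(0, 10))
```

In both notations the feature covers the same ten bases; only the start coordinate differs by one.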

Repeat Finding and Masking

October 13th, 2010
What are genomic interspersed repeats? [from the RepeatMasker documentation] In the mid 1960s scientists discovered that many genomes contain stretches of highly repetitive DNA sequences (see Reassociation Kinetics Experiments and C-Value Paradox). Thes… more »

RNA-Seq data quality scores

February 26th, 2010
There are different ways to encode the quality scores in FASTQ files. It is important to know which encoding a file uses before working with the data, and to convert between encodings if necessary. Sanger format can encode a Phred quality score from 0 to 93 using ASCII… more »
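As a rough illustration, decoding and re-encoding a quality string could look like the sketch below. The offset of 33 for Sanger and 64 for the Phred+64 variant come from the general FASTQ conventions, not from the excerpt above, and the function names are hypothetical.

```python
SANGER_OFFSET = 33   # assumption: standard Sanger FASTQ offset
PHRED64_OFFSET = 64  # assumption: offset used by Phred+64 encoded files


def decode_quality(qual, offset=SANGER_OFFSET):
    """Turn a FASTQ quality string into a list of Phred scores."""
    scores = [ord(ch) - offset for ch in qual]
    if any(s < 0 for s in scores):
        raise ValueError("character below the given offset; wrong encoding?")
    return scores


def encode_quality(scores, offset=SANGER_OFFSET):
    """Turn a list of Phred scores back into a FASTQ quality string."""
    return "".join(chr(s + offset) for s in scores)


def phred64_to_sanger(qual):
    """Re-encode a Phred+64 quality string with the Sanger offset."""
    return encode_quality(decode_quality(qual, PHRED64_OFFSET))


print(decode_quality("!I~"))
print(phred64_to_sanger("hh"))
```

This only covers encodings that store Phred scores with a different offset; formats that use a different score scale would need an additional conversion step.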

Conditional Formatting in MS Excel

November 12th, 2009
To change the format of a cell based on the content of that cell or another cell, conditional formatting can be used. For simple cases with up to three options, the dialog "Format"-"Conditional Formatting" can be opened after selecting the target cell. You can… more »

Assessing Gene Predictions

November 2nd, 2009
Calculating prediction rate: TP = true positives (correctly identified); FP = false positives (overpredictions); TN = true negatives (correctly un-called); FN = false negatives (missed). Specificity = TP / (TP + FP); Sensitivity = TP / (TP + FN)… more »
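The two formulas above can be sketched in a few lines. Note that specificity as defined here, TP / (TP + FP), is the gene-prediction convention and corresponds to what is elsewhere called precision. The counts in the example are invented for illustration only.

```python
def sensitivity(tp, fn):
    """Share of real features that were found: TP / (TP + FN)."""
    total = tp + fn
    return tp / total if total else 0.0


def specificity(tp, fp):
    """Share of predictions that are correct: TP / (TP + FP)."""
    total = tp + fp
    return tp / total if total else 0.0


# Hypothetical counts, not taken from any real evaluation.
tp, fp, fn = 80, 20, 10
print(f"Sensitivity: {sensitivity(tp, fn):.3f}")
print(f"Specificity: {specificity(tp, fp):.3f}")
```

Returning 0.0 for an empty denominator is a design choice of this sketch; raising an error would be equally reasonable.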