Using dbVar

May 12th, 2011
dbVar is the NCBI database of genomic structural variation, designed to store data on variant DNA ≥ 1 bp in size. IDs are organised in the following manner: (n|e)std: the study ID, which identifies a submitted study; (n|e)sv: the stru… more »
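
As a rough illustration of the ID scheme in the excerpt above, here is a minimal Python sketch that splits an identifier into its parts. It covers only the two ID types visible in the excerpt, `(n|e)std` and `(n|e)sv`. The function name `parse_dbvar_id`, the field names, and the assumption that the prefix is followed by a plain run of digits are all illustrative and not taken from the post.

```python
import re

# Covers only the two id types named in the excerpt: (n|e)std and (n|e)sv.
# Assumption: the type prefix is followed by one or more digits.
DBVAR_ID = re.compile(r"^(?P<prefix>[ne])(?P<kind>std|sv)(?P<number>\d+)$")


def parse_dbvar_id(identifier):
    """Split a dbVar-style id into prefix, kind and number.

    Returns None when the string does not match the assumed pattern.
    """
    match = DBVAR_ID.match(identifier.strip())
    if match is None:
        return None
    return {
        "prefix": match.group("prefix"),       # 'n' or 'e'
        "kind": match.group("kind"),           # 'std' (study) or 'sv'
        "number": int(match.group("number")),  # numeric part of the id
    }
```

Calling `parse_dbvar_id("nstd37")` yields a dictionary with `kind` set to `"std"`, while a string that does not fit the assumed pattern returns `None`. Any further ID types described in the full post would need their own alternatives in the pattern.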

Gene Models (and the Central Dogma of Molecular Biology)

October 23rd, 2010
What is a Gene Model? I found the following text on the teaching pages of Prof. Ann Loraine and found it worth repeating (slightly modified) here: Gene models are hypotheses about the structure of transcripts produced by a gene. Like all models, they… more »

Repeat Finding and Masking

October 13th, 2010
What are genomic interspersed repeats? [from the RepeatMasker documentation] In the mid-1960s scientists discovered that many genomes contain stretches of highly repetitive DNA sequences (see Reassociation Kinetics Experiments, and C-Value Paradox). Thes… more »

ENCODE cell lines

February 24th, 2010
These are some of the cell lines used in the various analyses of the ENCODE project. The first two are so-called tier-1 lines and are covered by all the different types of experiments within ENCODE; the others are tier-2 lines. Additionally there… more »


April 21st, 2008
The Consensus CoDing Sequence (CCDS) project is a collaborative effort to identify a core set of human and mouse protein-coding regions that are consistently annotated and of high quality. The long-term goal is to support convergence towards a standard… more »