Categories: "Encode" or "Job Notes"

Submitting to EMBLdb

January 24th, 2011
To submit DNA sequences from capillary (Sanger) sequencing to the public EMBL database, these steps can be taken: Webin submission page: -> create 1 submission (Hx2000011028), send the rest in batch… more »

Sequence Contaminations

January 20th, 2011
A contaminated sequence is one that does not faithfully represent the genetic information from the biological source organism/organelle because it contains one or more sequence segments of foreign origin. [NCBI] The primary approach to screening nucle… more »

QSEQ File Format

January 6th, 2011
Each record is a single line of tab-separated fields, in the following order:
- Machine name: unique identifier of the sequencer.
- Run number: unique number to identify the run on the sequencer.
- Lane number: positive integer (currently 1-8).
- Tile number:… more »
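
As a rough illustration of the record layout described in this excerpt, here is a minimal Python sketch that splits one QSEQ line on tabs and names only the four leading fields listed here. The names `QseqRecord` and `parse_qseq_line` are hypothetical, parsing the tile number as an integer is an assumption, and every column after the tile number is kept as an uninterpreted tuple because the excerpt is cut off at that point.

```python
from typing import NamedTuple, Tuple


class QseqRecord(NamedTuple):
    """First four QSEQ fields named in the excerpt; the rest kept raw."""
    machine: str           # unique identifier of the sequencer
    run: int               # run number on that sequencer
    lane: int              # positive integer
    tile: int              # tile number (assumed to be an integer)
    rest: Tuple[str, ...]  # remaining columns, not interpreted here


def parse_qseq_line(line: str) -> QseqRecord:
    """Parse one tab-separated QSEQ record (hypothetical helper)."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) < 4:
        raise ValueError(f"expected at least 4 tab-separated fields, got {len(fields)}")
    machine, run, lane, tile = fields[:4]
    record = QseqRecord(machine, int(run), int(lane), int(tile), tuple(fields[4:]))
    if record.lane < 1:
        raise ValueError(f"lane number must be a positive integer, got {record.lane}")
    return record


if __name__ == "__main__":
    # Made-up example line; the trailing columns are placeholders.
    print(parse_qseq_line("M1\t7\t3\t42\tcol5\tcol6\n"))
```

The lane check only enforces "positive integer" rather than the 1-8 range, since the excerpt describes that range as the current value, not a fixed limit.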

GENCODE: Generating release files

January 4th, 2011
A. Input sources:
- Ensembl core database with gene models, stable IDs and xrefs
- Vega database of the same release for ID lookup
- 3way pseudogene file with gene IDs: from Yale, based on the pre-dump file from the same release
- selenocysteine file: mysql -… more »

Ensembl Core Database Schema Diagram

November 26th, 2010
To understand the concept of Ensembl and learn how to query the tables I find it extremely useful to have a schema diagram of the database in front of me. This can be generated by using the schema.sql and foreign_keys.sql files from the sql directory… more »