This directory contains full annotation files for the Broad GPP ORF inventory.

Files marked *-active-* contain clones that are in current use.
Files marked *-retracted-* list clones that we can no longer recommend for use, due to questions about the quality or identity of the construct.

Column Descriptions:

Clone ID - our public identifier
Vector - vector backbone
Entry or Expression? - is this a Gateway Entry or Expression clone?
DNA Barcode - if this clone has a unique DNA barcode (e.g. in vector pLX_317)
Target-Matching Region Length - the Target-Matching Region (TMR) is the part of the sample sequence used for matching with the transcriptome. This is basically everything that is not constant vector flank, epitope tag, stop codon, barcode, etc. (see the sequence columns to the right).
% of Insert Sequenced (Direct) - how much of the TMR sequence we have directly confirmed with sequence data
% of Insert Sequenced (Any) - how much of the TMR sequence we have confirmed with sequence data, either directly or indirectly (i.e. by sequencing an ancestor clone)
Clone Lineage - traces the parentage of this clone within our labs
Lab Processes - what lab processes were involved in the lineage
Intended Mutant? - was this ORF designed as a match to some known mutant version of a transcript for experimental purposes?
Best Match Taxon ID - the taxon used for the transcriptome matching in subsequent columns
Best Match Gene ID - the gene ID of the transcript found to have the best global alignment (via Needleman-Wunsch) with this TMR
Best Match Gene Symbol - the gene symbol of the transcript found to have the best global alignment (via Needleman-Wunsch) with this TMR
Best Match Transcript - the transcript found to have the best global alignment (via Needleman-Wunsch) with this TMR. If this transcript is coding, the match is to the transcript's ORF sequence, minus the stop codon. If this transcript is non-coding, the match is to the entire transcript sequence.
Best Match % (Nuc.) - the percentage match (nucleotide alignment) of this TMR to the best match transcript
Best Match Variant (Nuc.) - if the percentage nucleotide match of this TMR to the best match transcript is less than 100%, this is a description in HGVS notation of how this TMR differs from the best match transcript sequence
Best Match % (Prot.) - the percentage match (protein alignment) of this TMR to the best match transcript (coding transcripts only)
Best Match Variant (Prot.) - if the percentage protein match of this TMR to the best match transcript is less than 100%, this is a description in HGVS notation of how this TMR differs from the best match protein sequence
Best Match is Mutant? - is the best match transcript a known mutant variant transcript or a wildtype transcript?
Clone Status - if OK, the clone has no known issues
Seq. Notes - other notable characteristics of this ORF (e.g. presence of stop codon, epitope tag, etc.)
5' Flank Seq. - vector flanking sequence upstream of the TMR
Target-Matching Region Seq. - the TMR sequence
3' Flank Seq. - vector flanking sequence downstream of the TMR (may include stop codon, epitope, barcode)
Orig. Annotated Gene Symbol - the target gene annotated in the original source materials, or intended when this clone was created, if there was one
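
Example: selecting clones by status and match quality

The columns described above can be combined to pick out clones with no known issues whose TMR is fully sequenced and matches its best transcript exactly. The Python sketch below is illustrative only and rests on several assumptions that this README does not state: that the annotation files are tab-delimited text with a header row using the column names listed above, and that the percentage columns hold plain numbers (optionally followed by "%"). The helper names (parse_percent, read_rows, select_clones) and the two sample rows are hypothetical and stand in for a real annotation file.

```python
import csv
import io


def parse_percent(value):
    """Parse a cell such as '100', '99.5' or '99.5%'.

    Returns None for blank or non-numeric cells (assumed cell format).
    """
    text = (value or "").strip().rstrip("%").strip()
    if not text:
        return None
    try:
        return float(text)
    except ValueError:
        return None


def read_rows(handle, delimiter="\t"):
    """Read rows as dicts keyed by header name (tab delimiter is an assumption)."""
    return csv.DictReader(handle, delimiter=delimiter)


def select_clones(rows, min_nuc_match=100.0, min_sequenced=100.0):
    """Yield rows with Clone Status 'OK' that meet both percentage thresholds."""
    for row in rows:
        if (row.get("Clone Status") or "").strip() != "OK":
            continue
        nuc = parse_percent(row.get("Best Match % (Nuc.)"))
        seq = parse_percent(row.get("% of Insert Sequenced (Any)"))
        if nuc is None or seq is None:
            continue  # skip rows where either value is missing
        if nuc >= min_nuc_match and seq >= min_sequenced:
            yield row


# Hypothetical two-row sample; not real inventory data.
SAMPLE = (
    "Clone ID\tClone Status\t% of Insert Sequenced (Any)\tBest Match % (Nuc.)\n"
    "example-clone-1\tOK\t100\t100\n"
    "example-clone-2\tOK\t100\t99.7\n"
)

selected = [row["Clone ID"] for row in select_clones(read_rows(io.StringIO(SAMPLE)))]
print(selected)
```

To use this on a real file, open it with open(path, newline="") and pass the handle to read_rows in place of the in-memory sample. Lowering min_nuc_match admits clones that differ from the best match transcript; the Best Match Variant (Nuc.) column then describes those differences.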