| Type |
File |
Last Changed |
Target |
Match |
Description |
1 |
Reference
Reference File Columns:
- Barcode Sequence
- Construct ID
- (Optional) Target Sequence
|
CP0058_reference_20170301.csv |
2017-03-01 |
n/a |
n/a |
Lists all 20mer barcodes contained in the pool, with their associated construct ID(s), if any. Used during initial sequence deconvolution, e.g. with PoolQ. |
2 |
CHIP
"gene" CHIP File Columns:
- Barcode Sequence
- Gene Symbol
- Gene ID
"trans" CHIP File Columns:
- Barcode Sequence
- Transcript
- Gene Symbol
- Gene ID
|
CP0058_17mer_20181214.chip |
2018-12-14 |
gene |
lax
Lax CHIP files are generated using less stringent criteria
to determine whether a perturbagen sequence matches the reference target:
Lax shRNA CHIP files:
- Use only the initial 19mer of the shRNA target sequence to determine "perfect" transcriptome matches.
Lax sgRNA CHIP files:
- Use only the final 17mer (for S. pyogenes Cas9) or 18mer (for S. aureus Cas9) of the sgRNA target sequence to assess "perfect" matches to genomic PAM sites.
- Allow 10 bases of "slop" around target genomic feature loci (e.g. intronic sequence not ordinarily considered to be a match to a coding sequence).
- May consider multiple feature definitions where available (e.g. TSS annotations from FANTOM and NCBI or Ensembl may differ significantly; in lax mode, a match near either site for a gene is considered a hit).
|
(LEGACY) 20mer barcode to gene mapping via lax match to CDS regions only. |
3 |
CHIP
|
CP0058_GRCm38_NCBI_lax_gene_20220624.chip |
2022-06-24 |
gene |
lax
|
20mer barcode to gene mapping via lax sgRNA sequence match to NCBI-annotated genes in GRCm38 primary assembly. |
4 |
CHIP
|
CP0058_GRCm38_NCBI_strict_gene_20220624.chip |
2022-06-24 |
gene |
strict
Strict CHIP files are generated using the most stringent criteria
to determine whether a perturbagen sequence matches the reference target:
Strict shRNA CHIP files:
- Use the full 21mer shRNA target sequence to determine perfect transcriptome matches.
Strict sgRNA CHIP files:
- Use the full 20mer (for S. pyogenes Cas9) or 21mer (for S. aureus Cas9) sgRNA target sequence to assess perfect matches to genomic PAM sites.
- Only consider matches where the "cut site" is properly contained in, or touching (at the extreme edge of), a target genomic feature locus.
- Consider only the "best" feature definition where multiple are available (e.g. TSS annotations from FANTOM and NCBI or Ensembl may differ significantly; in strict mode, matches are restricted to the single highest-confidence annotated locus).
|
(PREFERRED) 20mer barcode to gene mapping via strict sgRNA sequence match to NCBI-annotated genes in GRCm38 primary assembly. |
5 |
CHIP
|
CP0058_origtarget_20191021.chip |
2019-10-21 |
gene |
n/a |
20mer barcode to gene mapping using the construct's originally intended target gene, if known. |
6 |
BED
BED File Columns:
- Sequence ID (chromosome)
- Zero-based starting position where the sgRNA is aligned to the target
- Zero-based ending position where the sgRNA is aligned to the target
- ID of the construct that contains the sgRNA
- A score computed as max(0, 1000 - 10(N - 1)), where N is the total number of matches to the genome for the construct. A construct with only one match to the genome gets the maximum score of 1000, a construct with 2 total matches receives a score of 990, and constructs with more than 100 matches receive the lowest score of 0.
- Strand of the match site at this location
- thickStart (n/a - required part of the BED file format)
- thickEnd (n/a - required part of the BED file format)
- itemRgb - an RGB triplet specifying the color to use when displaying this match in a BED file viewer: blue for distinct (single) matches, red for 2-100 matches, and gray for more than 100 matches.
|
CP0058_10090_GRCm38_20200617_ontarget.bed |
2020-06-17 |
n/a |
n/a |
Genome annotation track data in the UCSC BED format, with all perfect sgRNA target sequence (20mer) matches at PAM sites in target assembly GRCm38. |
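
The BED score and display color described for file 6 can be sketched in a few lines of Python. This is only an illustration of the stated formula, max(0, 1000 - 10(N - 1)), and of the stated color bands; the function names `bed_score` and `match_color` are hypothetical, and the colors are returned as names because the actual RGB triplets written to the file are not given here.

```python
def bed_score(n_matches: int) -> int:
    """Score for a construct with n_matches total perfect matches to the genome.

    Implements max(0, 1000 - 10 * (N - 1)) as described for the BED score column.
    """
    if n_matches < 1:
        # Assumption: a construct only appears in the BED file if it has >= 1 match.
        raise ValueError("n_matches must be at least 1")
    return max(0, 1000 - 10 * (n_matches - 1))


def match_color(n_matches: int) -> str:
    """Color band for a construct, returned as a name (RGB values are not specified)."""
    if n_matches < 1:
        raise ValueError("n_matches must be at least 1")
    if n_matches == 1:
        return "blue"   # single distinct match
    if n_matches <= 100:
        return "red"    # 2-100 matches
    return "gray"       # more than 100 matches


if __name__ == "__main__":
    for n in (1, 2, 100, 101):
        print(n, bed_score(n), match_color(n))
```

Note that a construct with exactly 100 matches still scores 10; the score only reaches 0 from 101 matches onward, which lines up with the boundary between the red and gray color bands.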