How to use the GPP sgRNA Designer (CRISPRa/i)



Inputs:

This tool supports inputs in the form of gene IDs or official gene symbols, or of specific genomic coordinates. You may provide sgRNA targets either by pasting them directly into the text box or by uploading a file. In either case, you are limited to 200 Gene IDs or symbols.

Candidate sequences for CRISPRa/i design are selected from specific regions of genomic sequence relative to a gene's transcription start site (TSS), oriented according to the direction of transcription (i.e. strand). The coordinates currently in use are:

CRISPRa
-300 to 0 bases from TSS
CRISPRi
-50 to +300 bases from TSS

These coordinate ranges are also used in defining the tiers in the off-target "threat matrix" columns of the output file. Apart from this difference in TSS-relative region, the CRISPRa and CRISPRi versions of the tool function identically.
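As an illustration of how the strand affects these windows, the following minimal Python sketch converts a TSS position and strand into genomic start and end coordinates. The function name and the `WINDOWS` table are hypothetical, and the sketch assumes that the offsets are inclusive and that offset 0 is the TSS position itself; the tool's exact boundary handling may differ.

```python
# TSS-relative windows quoted in the text above (offsets in bases).
WINDOWS = {
    "CRISPRa": (-300, 0),
    "CRISPRi": (-50, 300),
}


def target_window(tss, strand, mode):
    """Return (start, end) genomic coordinates, with start <= end.

    Assumption: offsets are inclusive and offset 0 is the TSS itself.
    """
    lo, hi = WINDOWS[mode]
    if strand == "+":
        # Plus strand: negative offsets lie at smaller coordinates.
        return tss + lo, tss + hi
    if strand == "-":
        # Minus strand: upstream of the TSS means larger coordinates,
        # so the window is mirrored around the TSS.
        return tss - hi, tss - lo
    raise ValueError(f"strand must be '+' or '-', got {strand!r}")
```

For a TSS at position 5000, the CRISPRa window would span 4700 to 5000 on the plus strand but 5000 to 5300 on the minus strand under these assumptions.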

Alternatively, if you want to design guides for precise genomic coordinates (rather than relying on an annotated transcript or gene site), you can input them directly into the web form if you follow a strict format for CRISPRa/i applications. For example:

NC_000001.11:+:5000

indicates a TSS at position 5000 in the plus ('+') strand orientation on human Chr1 (NC_000001.11). The reason we need to know the strand orientation is that the computed target regions for CRISPRa and CRISPRi are asymmetric around the TSS as described above.
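If you generate coordinate inputs programmatically, it may help to validate them before pasting them into the form. The sketch below checks the three colon-separated fields shown in the example. The names `TssInput` and `parse_tss_input` are hypothetical, and the requirement that the position be a positive integer is an assumption rather than a documented rule of the tool.

```python
from typing import NamedTuple


class TssInput(NamedTuple):
    reference: str  # reference sequence ID, e.g. "NC_000001.11"
    strand: str     # "+" or "-"
    position: int   # TSS position on the reference sequence


def parse_tss_input(text):
    """Split 'REFERENCE:STRAND:POSITION' into its three fields."""
    parts = text.strip().split(":")
    if len(parts) != 3:
        raise ValueError(f"expected REFERENCE:STRAND:POSITION, got {text!r}")
    reference, strand, position = parts
    if not reference:
        raise ValueError("reference sequence ID is empty")
    if strand not in ("+", "-"):
        raise ValueError(f"strand must be '+' or '-', got {strand!r}")
    # Assumption: the position is a positive integer.
    if not (position.isascii() and position.isdigit()) or int(position) < 1:
        raise ValueError(f"position must be a positive integer, got {position!r}")
    return TssInput(reference, strand, int(position))
```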


Output:

This tool produces two output files. The primary output is a downloadable .txt file containing sgRNAs designed against the appropriate genomic regions relative to the Transcription Start Sites (TSS) of the input genes. The tool also produces a summary file giving statistics for on- and off-target scores by pick order.

Output File Column Descriptions:

Input
Target gene or transcript ID as originally requested.
Quota
Desired number of candidate sgRNA sequences to pick for this target.
Target Taxon
Taxon of the target gene.
Target Gene ID
ID of the target gene (usually numeric, e.g. "Entrez-Gene" ID).
Target Gene Symbol
Official symbol of target gene.
Target Alias
Not used for CRISPRa/i.
PAM Policy
Currently limited to NGG only.
Off-Target Match Rule Set Version ("CFD score")
Method of calculating an off-target match, or "CFD" (Cutting Frequency Determination) score; currently there is only one off-target rule set ("1").
Off-Target Tier Policy
Method used to categorize off-target matches into "Tiers"; currently there is only one such policy ("1"), which classifies an off-target match position according to the CRISPRa/i TSS-relative regions described above: Tier I: positions inside the TSS-relative region of a protein-coding gene; Tier II: positions inside the TSS-relative region of a non-coding gene; Tier III: positions not contained in any gene's TSS-relative region.
Off-Target Match Bin Policy
Thresholds used to categorize off-target matches into "Match Bins" according to CFD score. There are four bins notated by three thresholds in increasing numerical order, separated by periods. Threshold values are in hundredths. For example, "5.20.100" represents the following 4 bins: Bin I: CFD = 1.0, Bin II: 1.0 > CFD ≥ 0.2, Bin III: 0.2 > CFD ≥ 0.05, Bin IV: CFD < 0.05.
Reference Sequence
ID of the reference sequence (e.g. chromosome) containing the TSS.
TSS Position
The annotated location of the TSS within the above reference sequence.
Strand of Target
The strand (+ or -) from which the target gene is transcribed. This defines the orientation of the TSS-relative target regions.
sgRNA 'Cut' Position (1-based)
1-based position, within the reference sequence, of the "cut site" of this candidate sgRNA. For consistency of annotation with corresponding CRISPRko designs, we report the position that would be the cut site if DNA cutting were involved.
Strand of sgRNA
The absolute strand (+ or -) of the sgRNA sequence within the chromosome.
Orientation
The orientation (sense or antisense) of the sgRNA sequence with respect to the target gene's strand.
sgRNA Sequence
The sequence of the candidate sgRNA.
sgRNA Context Sequence
The longer context sequence used for on-target efficacy scoring.
PAM Sequence
Sequence of the PAM for this match.
sgRNA 'Cut' Site TSS Offset
The position of the "cut site" in relation to the TSS, given as a positive or negative offset.
Other Target Matches
Perfect on-target matches to genes other than the current target gene, if any.
# Off-Target Tier I Match Bin I Matches
Count of all off-target matches for Tier I and Match Bin I.
# Off-Target Tier II Match Bin I Matches
Count of all off-target matches for Tier II and Match Bin I.
# Off-Target Tier III Match Bin I Matches
Count of all off-target matches for Tier III and Match Bin I.
# Off-Target Tier I Match Bin II Matches
Count of all off-target matches for Tier I and Match Bin II.
# Off-Target Tier II Match Bin II Matches
Count of all off-target matches for Tier II and Match Bin II.
# Off-Target Tier III Match Bin II Matches
Count of all off-target matches for Tier III and Match Bin II.
# Off-Target Tier I Match Bin III Matches
Count of all off-target matches for Tier I and Match Bin III.
# Off-Target Tier II Match Bin III Matches
Count of all off-target matches for Tier II and Match Bin III.
# Off-Target Tier III Match Bin III Matches
Count of all off-target matches for Tier III and Match Bin III.
# Off-Target Tier I Match Bin IV Matches
Count of all off-target matches for Tier I and Match Bin IV.
# Off-Target Tier II Match Bin IV Matches
Count of all off-target matches for Tier II and Match Bin IV.
# Off-Target Tier III Match Bin IV Matches
Count of all off-target matches for Tier III and Match Bin IV.
On-Target Rule Set
Model used for calculating "On-Target" efficacy score (currently the only supported version is "Azimuth_2.0", which is the updated version of "Rule Set 2" from Doench, Fusi et al., Nature Biotechnology 2016). See Azimuth 2.0.
On-Target Efficacy Score
Actual on-target score of the context sequence for this candidate sgRNA as calculated using the on-target rule set.
DHS Score
Score ranging from 0 to 1 (1 is highest) indicating whether the target sequence is within a known ENCODE-annotated DNase I Hypersensitive Site.
On-Target Rank
Numerical rank (1 is highest) of this candidate sgRNA's On-Target score in relation to all other candidates for this target.
Off-Target Rank
Numerical rank (1 is highest, i.e. most-specific) of this candidate sgRNA's Off-Target evaluation in relation to all other candidates for this target.
On-Target Rank Weight
When combining On-Target and Off-Target rankings into one Combined Rank, use this weight for the On-Target Rank.
Off-Target Rank Weight
When combining On-Target and Off-Target rankings into one Combined Rank, use this weight for the Off-Target Rank.
Combined Rank
Numerical rank (1 is highest) of this candidate sgRNA based on the weighted sum of On-Target and Off-Target ranks.
Pick Order
If this candidate was picked to fulfil the target's quota, the order in which it was picked.
Picking Round
Candidate picking is complex and may go through multiple rounds of relaxing constraints. This column indicates the round in which the pick occurred.
Picking Notes
This column indicates reasons why the construct was skipped during picking. It is empty if the candidate was picked during round 1.
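The "Off-Target Match Bin Policy" notation described above can be decoded with a few lines of Python. This is only a sketch of the binning as documented for "5.20.100"; the function names are hypothetical, and treating any CFD score at or above the highest threshold as Bin I is an assumption that generalizes the documented "CFD = 1.0" case.

```python
def parse_bin_policy(policy):
    """Convert a policy string such as '5.20.100' into CFD thresholds.

    Threshold values are given in hundredths, in increasing order.
    """
    thresholds = [int(part) / 100 for part in policy.split(".")]
    if len(thresholds) != 3 or thresholds != sorted(thresholds):
        raise ValueError(f"expected three increasing thresholds, got {policy!r}")
    return thresholds


def match_bin(cfd, policy="5.20.100"):
    """Return the Match Bin ('I' to 'IV') for a CFD score under a policy."""
    low, mid, high = parse_bin_policy(policy)
    if cfd >= high:
        return "I"    # for '5.20.100': CFD = 1.0
    if cfd >= mid:
        return "II"   # 1.0 > CFD >= 0.2
    if cfd >= low:
        return "III"  # 0.2 > CFD >= 0.05
    return "IV"       # CFD < 0.05
```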

Back to the sgRNA Designer for CRISPRa/i

Frequently Asked Questions:

Q: I would like to relax one of your pick restrictions. How can I do that?

In the future we plan to give users more flexibility in adjusting some of the picking criteria. In the meantime, the default report shows all possible guides for a target, and users should be able to sort and filter results based on their own set of criteria using the information contained in the various annotation columns.
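As one possible way to apply your own criteria, the sketch below filters and re-sorts rows of the downloaded report using column names from the Output section. It assumes the .txt file is tab-delimited with a header row, which you should confirm against your own download. The function name and the default thresholds are arbitrary illustrations, not recommendations.

```python
import csv

ON_TARGET = "On-Target Efficacy Score"
TIER1_BIN1 = "# Off-Target Tier I Match Bin I Matches"


def filter_guides(handle, min_on_target=0.5, max_tier1_bin1=0):
    """Keep rows that pass both thresholds, best on-target score first.

    Assumption: `handle` is an open, tab-delimited report with a header row.
    """
    kept = []
    for row in csv.DictReader(handle, delimiter="\t"):
        count = row[TIER1_BIN1]
        if count == "MAX":
            # Off-target search was aborted, so the count is unknown; skip.
            continue
        if float(row[ON_TARGET]) < min_on_target:
            continue
        if int(count) > max_tier1_bin1:
            continue
        kept.append(row)
    kept.sort(key=lambda r: float(r[ON_TARGET]), reverse=True)
    return kept
```

A call such as `filter_guides(open("report.txt", newline=""))` would then return the surviving rows as dictionaries keyed by column name, where `report.txt` stands in for whatever name your downloaded file has.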

Q: Your report contains 50+ columns! Which are the most important?

The "Target Gene" column will presumably be necessary for all applications of the tool. If you are using this tool to fill a per-target quota (the current default is 5), then the "Pick Order" column reflects the final decision of the tool and incorporates all other rank, score and positional columns. Advanced users may want to ignore the "Pick Order" but make use of e.g. the "CRISPRa/i Effect Pos TSS Offset", "On-Target Efficacy Score" and the 12 "Off-Target Match" columns directly.

Q: What are my options if I still want to design sgRNAs against an arbitrary genomic position?

If you want to design guides for precise genomic coordinates (rather than relying on an annotated transcript or gene site), you can input them directly into the web form if you follow a strict format for CRISPRa/i applications. See the Inputs section above for more information including examples.

Q: My gene is not found, or there are problems finding a transcript for my gene. What can I do?

You can still use the tool if you can determine the exact genomic coordinates for your region of choice. See above.

Q: I have heard the term "Threat Matrix" used describing some of the columns in the results. What is the "Threat Matrix"?

This informal term is used internally to describe the 12 columns in the CRISPRa/i result report that summarize the number of off-target hits, arranged by Tier and by CFD-score Match Bin, starting with the column headed "# Off-Target Tier I Match Bin I Matches". See the Output section above and our sgRNA Scoring Help Page for more information about these columns and what they mean.
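To make the layout concrete, the following sketch generates the 12 column headers (3 Tiers by 4 Match Bins) in the order listed in the Output section and gathers one report row into a nested dictionary. The function names are hypothetical, and the row is assumed to be a dictionary keyed by column header, for example one produced by `csv.DictReader`.

```python
TIERS = ("I", "II", "III")
BINS = ("I", "II", "III", "IV")


def threat_matrix_columns():
    """Return the 12 headers in report order: bins outermost, tiers innermost."""
    return [
        f"# Off-Target Tier {tier} Match Bin {match_bin} Matches"
        for match_bin in BINS
        for tier in TIERS
    ]


def threat_matrix(row):
    """Arrange one row's off-target cells as {bin: {tier: raw cell value}}."""
    return {
        match_bin: {
            tier: row[f"# Off-Target Tier {tier} Match Bin {match_bin} Matches"]
            for tier in TIERS
        }
        for match_bin in BINS
    }
```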

Q: The columns describing the number of off-target hits contain the value "MAX". What does this mean?

If the total number of discovered off-target hits for a particular sgRNA sequence exceeds 10,000, we abort the search and report the value "MAX" in all off-target count columns instead. NOTE: This does not mean that any given column has a count exceeding 10,000, merely that the total of all 12 columns exceeds 10,000 by an unknown amount.
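When processing the report in a script, the "MAX" marker needs separate handling because it is not a number. A minimal sketch, with hypothetical function names, that represents an aborted search as `None` rather than guessing a count:

```python
def parse_off_target_count(cell):
    """Return the count as an int, or None if the cell holds 'MAX'."""
    cell = cell.strip()
    if cell == "MAX":
        return None
    return int(cell)


def total_off_targets(cells):
    """Sum the off-target count cells of one sgRNA.

    Returns None when the search was aborted, because the true total is
    only known to exceed the cap described above.
    """
    counts = [parse_off_target_count(cell) for cell in cells]
    if any(count is None for count in counts):
        return None
    return sum(counts)
```

Returning `None` keeps aborted searches distinguishable from genuine zero counts, so downstream sorting or filtering code has to decide explicitly how to treat them.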

Q: How is the TSS determined for gene inputs?

TBA
