The TRC shRNA Design Process

Overview

We design shRNA constructs ("clones") algorithmically. The algorithm uses several criteria to rank potential 21mer targets within each human and mouse RefSeq transcript. It applies a set of rules, including rules derived from the siRNA literature, from analysis of TRC library performance datasets, from constraints on the synthesis and cloning of the oligonucleotides, and from other sources. In applying the algorithm, we aim to balance two competing goals: designing hairpins that effectively knock down the target transcript and, as far as possible, designing hairpins that knock down only one gene and do not directly alter other genes (so-called 'off-target' effects).

Each goal presents distinct challenges. The criteria for predicting effective knockdown with either siRNA or shRNA are not well understood and are still being developed and refined. Specificity is constrained by genome evolution: because many genes belong to extensive gene families, targeting a specific family member can be difficult. Furthermore, functionally distinct genes share many motifs with underlying nucleic acid sequence similarity. Our knowledge of transcript structure and variants is also still very incomplete. For all these reasons and more, we construct several shRNAs for each transcript, with the expectation of getting a range of knockdown efficiencies across the set and at least a few that knock down effectively.
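As a rough illustration of what "potential 21mer targets within a transcript" means, the sketch below enumerates every 21-base window of a sequence. The function name `candidate_targets` and the 1-based position convention are assumptions made for this example; they are not taken from the TRC software.

```python
from typing import Iterator, Tuple

TARGET_LENGTH = 21  # the 21mer target length named in the text


def candidate_targets(transcript: str,
                      length: int = TARGET_LENGTH) -> Iterator[Tuple[int, str]]:
    """Yield (start, subsequence) for every window of `length` bases.

    `start` is 1-based (an assumption made for readability).
    A transcript shorter than `length` yields nothing.
    """
    seq = transcript.upper()
    for offset in range(len(seq) - length + 1):
        yield offset + 1, seq[offset:offset + length]


if __name__ == "__main__":
    toy_transcript = "ATGGCTAGCTAGGATCCGATCGATTACG"  # made-up sequence
    for start, target in candidate_targets(toy_transcript):
        print(start, target)
```

Each window produced this way would then be scored by a rule set such as those listed further down the page.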

Users of this database should be aware that, to keep annotation consistent and reliable, the RNAi Consortium decided early on to use NCBI's RefSeq collection of transcripts as the definitive source of the primary target sequence for shRNA design.

As a general rule in the construction of the library, we construct shRNAs targeting just one RefSeq transcript for each NCBI gene. Because of the high sequence identity among different transcripts from the same gene, the majority of the shRNAs target all known transcript variants.
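One simple way to check which transcript variants a given target covers is an exact substring search, sketched below. This assumes that "targets" means a perfect 21-base match on the transcript's sense strand; the function names and the variant identifiers in the usage example are hypothetical.

```python
from typing import Dict, List


def variants_targeted(target: str, variants: Dict[str, str]) -> List[str]:
    """Return the names of the variants that contain `target` exactly."""
    target = target.upper()
    return [name for name, seq in variants.items() if target in seq.upper()]


def targets_all_variants(target: str, variants: Dict[str, str]) -> bool:
    """True when at least one variant is given and every variant matches."""
    return bool(variants) and len(variants_targeted(target, variants)) == len(variants)


if __name__ == "__main__":
    # Made-up sequences and hypothetical variant names, for illustration only.
    variants = {
        "variant_1": "TTTGATCGATCGATCGATCGATCGTTT",
        "variant_2": "CCCGATCGATCGATCGATCGATCGCCC",
        "variant_3": "CCCCCCCCCCCCCCCCCCCCCCCCCCC",
    }
    target = "GATCGATCGATCGATCGATCG"
    print(variants_targeted(target, variants))
    print(targets_all_variants(target, variants))
```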

A brief narrative of the candidate selection process

Current Rule Set

Rule Set 9

Rule Description
1 aaStart9 Exclude any candidate beginning with AA (score = 0)
2 fourRow9 Exclude any candidate containing a run of four of the same base in a row (score = 0)
3 gcScore9 Exclude candidates with extreme GC percentage (GC <= 25% or > 60%); promote candidates with GC between 25-55% (score = 3); if GC > 55% and <= 60% then score = 1 (neutral)
4 nonGATC9 Exclude any candidate containing ambiguous bases (e.g. N) (score = 0)
5 restrictionSite9 Exclude any candidate containing certain restriction sites: ...GGTACC..., ...GAATTC..., ...CTCGAG..., ...CATATG..., ...ACTAGT..., ...GGTAC, ...GAATT, GTACC..., TACC..., CTAGT...
6 sevenGC9 Exclude any candidate with a run of 7 C/G bases (score = 0)
7 stemLoopStem Penalize candidates that can form an internal stem-loop (score = 0.1) (minimum stem length = 5, minimum loop size = 4)
8 threePrimeClamp6 Give precedence to candidates with weaker base-pairing at positions 15-20 (priority on pos. 17-19); score = 5 if all 6 positions are A or T, decreasing to 0.1 if all 6 are G/C. Score drops off steeply as the number of A/T bases decreases.
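
The sketch below shows how a few of the Rule Set 9 rules (aaStart9, fourRow9, gcScore9, nonGATC9 and sevenGC9) might be expressed in code. Two things are assumptions rather than statements from this page: a rule that does not fire returns a neutral score of 1, and the per-rule scores are combined by multiplication, so that any score of 0 excludes the candidate. The restriction-site, stem-loop and 3' clamp rules are left out because the table does not give enough detail to reproduce them exactly.

```python
import re


def aa_start9(candidate: str) -> float:
    """Exclude candidates beginning with AA."""
    return 0.0 if candidate.startswith("AA") else 1.0


def four_row9(candidate: str) -> float:
    """Exclude candidates with four of the same base in a row."""
    return 0.0 if re.search(r"([GATC])\1{3}", candidate) else 1.0


def gc_score9(candidate: str) -> float:
    """Score GC percentage: 0 if <= 25% or > 60%, 3 up to 55%, else 1."""
    gc_percent = 100.0 * sum(base in "GC" for base in candidate) / len(candidate)
    if gc_percent <= 25 or gc_percent > 60:
        return 0.0
    if gc_percent <= 55:
        return 3.0
    return 1.0


def non_gatc9(candidate: str) -> float:
    """Exclude candidates containing any base other than G, A, T or C."""
    return 0.0 if re.search(r"[^GATC]", candidate) else 1.0


def seven_gc9(candidate: str) -> float:
    """Exclude candidates with a run of seven G/C bases."""
    return 0.0 if re.search(r"[GC]{7}", candidate) else 1.0


PARTIAL_RULE_SET_9 = (aa_start9, four_row9, gc_score9, non_gatc9, seven_gc9)


def partial_score(candidate: str) -> float:
    """Multiply the rule scores together (the combination is an assumption)."""
    candidate = candidate.upper()
    score = 1.0
    for rule in PARTIAL_RULE_SET_9:
        score *= rule(candidate)
    return score


if __name__ == "__main__":
    for seq in ("GATCGATCGATCGATCGATCG", "AAGCGATCGATCGATCGATCG"):
        print(seq, partial_score(seq))
```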

Previous Rule Sets

Rule Set 8

Rule Description
1 aaStart Penalize candidates beginning with AA (score = 0.000000000000001)
2 fourRow Penalize candidates containing four of the same base in a row (score = 0.01)
3 gcScore8 Penalize candidates with extreme GC percentage (GC <= 25% or > 60%; score = 0.01); promote candidates with GC between 25-55% (score = 3); if GC > 55% and <= 60% then score = 1 (neutral)
4 nonGATC Penalize candidates containing an ambiguous base (e.g. N) (score = 0.000000000000001)
5 restrictionSite8 Penalize any candidate containing certain restriction sites: ...GGTACC..., ...GAATTC..., ...CTCGAG..., ...CATATG..., ...ACTAGT..., ...GGTAC, ...GAATT (score = 0.0001)
6 sevenGC Penalize candidates containing a run of 7 C or G (score = 0.01)
7 stemLoopStem Penalize candidates that can form an internal stem-loop (score = 0.1) (minimum stem length = 5, minimum loop size = 4)
8 threePrimeClamp6 Give precedence to candidates with weaker base-pairing at positions 15-20 (priority on pos. 17-19); score = 5 if all 6 positions are A or T, decreasing to 0.1 if all 6 are G/C. Score drops off steeply as the number of A/T bases decreases.
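
The stemLoopStem rule can be illustrated with a brute-force search for an internal hairpin. The sketch below assumes that the stem is formed within the candidate itself by Watson-Crick pairs only (no G-U wobble), that the stem must be perfectly paired over at least 5 bases, and that a candidate without such a structure receives a neutral score of 1. None of these details are spelled out on this page, and the function names are illustrative.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def can_form_stem_loop(candidate: str, min_stem: int = 5, min_loop: int = 4) -> bool:
    """True if some `min_stem`-base arm has its reverse complement
    at least `min_loop` bases downstream of the arm's end."""
    seq = candidate.upper()
    shortest_hairpin = 2 * min_stem + min_loop
    for i in range(len(seq) - shortest_hairpin + 1):
        arm = seq[i:i + min_stem]
        partner = reverse_complement(arm)
        # Search only past the arm plus the minimum loop.
        if seq.find(partner, i + min_stem + min_loop) != -1:
            return True
    return False


def stem_loop_stem(candidate: str) -> float:
    """Score 0.1 when an internal stem-loop is possible, else 1 (assumed neutral)."""
    return 0.1 if can_form_stem_loop(candidate) else 1.0


if __name__ == "__main__":
    hairpin = "GGACCTTTTGGTCCAAAAAAA"  # GGACC + 4-base loop + GGTCC
    print(stem_loop_stem(hairpin))
```

Checking only arms of the minimum length is sufficient here, because any longer perfect stem contains a minimum-length stem whose loop is at least as large.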

Rule Set 7

Rule Description
1 aaStart Penalize candidates beginning with AA (score = 0.000000000000001)
2 fivePrimeClamp Give precedence to candidates with stronger base-pairing at the 5' end of the candidate (five_prime_clamp); penalty/reward: 0.01 if the first two positions are GG; 0.0001 if the first two are TT; 2.5 if the first four are (G|C){4}; 2.4 if the first three are (G|C){3}; 2.2 if it begins (CC|CG|GC)(A|T)(G|C); 2 if it begins (CC|CG|GC); 2 if it begins (GC); 1.25 if it begins (G|C); 1 if it begins (A|T)(G|C); 0.5 if it begins (A|T){2}
3 fourRow Penalize candidates containing four of the same base in a row (score = 0.01)
4 gcScore Penalize extremes of GC percentage: candidates with GC < 30% are penalized 0.01; with GC > 70% the penalty is 0.01; with GC between 30-50% the candidate gets a reward of 3; with GC > 60% and < 70% the reward/penalty is 1
5 internalAT Reward moderately AT-rich regions at positions 7 through 10: if all four are A|T, the reward is 2.2; if 3 of 4 are A|T, the reward is 2; if 2 of 4 are A|T, the reward is 1.5; if 1 of 4 is A|T, the penalty is 0.7; if none of the four is A|T, the penalty is 0.5
6 internalATFlanking Reward moderately AT-rich sequence at positions 6 and 11: if both are A|T, the reward is 1.2; if exactly one is A|T, the reward is 1; if neither is A|T, the penalty is 0.85
7 internalLoop Penalize candidates that can form an AAABBB loop (penalty = 0.7)
8 nonGATC Penalize candidates containing an ambiguous base (e.g. N) (score = 0.000000000000001)
9 restrictionSite Restriction sites considered: GCCGGC, CCCGGG, CTCGAG, ...GCCGG
10 sevenGC Penalize candidates containing a run of 7 C or G (score = 0.01)
11 threePrimeClamp6 Give precedence to candidates with weaker base-pairing at positions 15-20 (priority on pos. 17-19); score = 5 if all 6 positions are A or T, decreasing to 0.1 if all 6 are G/C. Score drops off steeply as the number of A/T bases decreases.
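
The internalAT and internalATFlanking rules reduce to counting A/T bases at fixed positions, as the following sketch shows. It assumes that positions are numbered from 1 at the 5' end of the 21mer and that candidates contain only G, A, T and C; the function names are illustrative.

```python
def count_at(bases: str) -> int:
    """Count the A and T bases in a string."""
    return sum(base in "AT" for base in bases)


def internal_at(candidate: str) -> float:
    """Score the A/T content of positions 7-10 (1-based, assumed)."""
    window = candidate.upper()[6:10]
    return {4: 2.2, 3: 2.0, 2: 1.5, 1: 0.7, 0: 0.5}[count_at(window)]


def internal_at_flanking(candidate: str) -> float:
    """Score the A/T content of positions 6 and 11 (1-based, assumed)."""
    seq = candidate.upper()
    flanks = seq[5] + seq[10]
    return {2: 1.2, 1: 1.0, 0: 0.85}[count_at(flanks)]


if __name__ == "__main__":
    seq = "GGGGGGATATGGGGGGGGGGG"  # made-up candidate with ATAT at positions 7-10
    print(internal_at(seq), internal_at_flanking(seq))
```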

Rule Set 4

Rule Description
1 aaStart Penalize candidates beginning with AA (score = 0.000000000000001)
2 fivePrimeClamp Give precedence to candidates with stronger base-pairing at the 5' end of the candidate (five_prime_clamp); penalty/reward: 0.01 if the first two positions are GG; 0.0001 if the first two are TT; 2.5 if the first four are (G|C){4}; 2.4 if the first three are (G|C){3}; 2.2 if it begins (CC|CG|GC)(A|T)(G|C); 2 if it begins (CC|CG|GC); 2 if it begins (GC); 1.25 if it begins (G|C); 1 if it begins (A|T)(G|C); 0.5 if it begins (A|T){2}
3 fourRow Penalize candidates containing four of the same base in a row (score = 0.01)
4 gcScore Penalize extremes of GC percentage: candidates with GC < 30% are penalized 0.01; with GC > 70% the penalty is 0.01; with GC between 30-50% the candidate gets a reward of 3; with GC > 60% and < 70% the reward/penalty is 1
5 internalAT Reward moderately AT-rich regions at positions 7 through 10: if all four are A|T, the reward is 2.2; if 3 of 4 are A|T, the reward is 2; if 2 of 4 are A|T, the reward is 1.5; if 1 of 4 is A|T, the penalty is 0.7; if none of the four is A|T, the penalty is 0.5
6 internalATFlanking Reward moderately AT-rich sequence at positions 6 and 11: if both are A|T, the reward is 1.2; if exactly one is A|T, the reward is 1; if neither is A|T, the penalty is 0.85
7 internalLoop Penalize candidates that can form an AAABBB loop (penalty = 0.7)
8 nonGATC Penalize candidates containing an ambiguous base (e.g. N) (score = 0.000000000000001)
9 sevenGC Penalize candidates containing a run of 7 C or G (score = 0.01)
10 threePrimeClamp Give precedence to candidates with weaker base-pairing at the 3' end of the candidate; penalty/reward: 5 if the last three positions are A or T; 4.5 if the last two are A|T, the third from last is G|C, and the fourth from last is A|T; 4 if the last two are A|T; 2 if the last base is A|T; 0.2 if the last two positions are G|C; 0.5 if the last base is G|C; 0.8 if the last base is G|C and the previous two are A|T
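
The fivePrimeClamp rule in Rule Sets 7 and 4 reads like an ordered list of patterns on the first few bases. The sketch below treats it as a first-match cascade in the order the patterns are listed in the table. That ordering is an assumption, as is the neutral score of 1 returned when no pattern applies (for example, when the candidate starts with an ambiguous base).

```python
import re

# (pattern anchored at the 5' end, score), in the order given in the table.
FIVE_PRIME_CLAMP_PATTERNS = [
    (r"GG", 0.01),
    (r"TT", 0.0001),
    (r"[GC]{4}", 2.5),
    (r"[GC]{3}", 2.4),
    (r"(CC|CG|GC)[AT][GC]", 2.2),
    (r"(CC|CG|GC)", 2.0),
    (r"GC", 2.0),
    (r"[GC]", 1.25),
    (r"[AT][GC]", 1.0),
    (r"[AT]{2}", 0.5),
]


def five_prime_clamp(candidate: str) -> float:
    """Return the score of the first listed pattern matching the 5' end."""
    seq = candidate.upper()
    for pattern, score in FIVE_PRIME_CLAMP_PATTERNS:
        if re.match(pattern, seq):  # re.match anchors at the start
            return score
    return 1.0  # assumed neutral fallback, not stated in the table


if __name__ == "__main__":
    for seq in ("CCAGTTACGATCGATCGATCA", "TTAGTTACGATCGATCGATCA"):
        print(seq, five_prime_clamp(seq))
```

Under first-match ordering the GG and TT penalties take priority over the later reward patterns, so a candidate beginning GGGG scores 0.01 rather than 2.5. A different evaluation order would change that result.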

GPP Web Portal Terms of Service

Effective Date: December 8, 2025
By using this site, you agree to our terms and conditions below.

Overview of Terms

The data made available on this website were generated for research purposes and are not intended for clinical or commercial uses. Commercial use (or other use for profit-making purposes) of the GPP Web Portal and its tools is not permitted under these terms and may require a separate license agreement from Broad or its contributors. For more information, please contact partnering@broadinstitute.org.

The original data may be subject to rights claimed by third parties, including but not limited to, patent, copyright, other intellectual property rights, biodiversity-related access and benefit-sharing rights. It is the responsibility of users of Broad Institute services to ensure that their use of the data does not infringe any of the rights of such third parties.

Any questions or comments concerning these Terms of Use can be addressed to: legal@broadinstitute.org.

By accessing and viewing this GPP Web Portal, you agree to the following terms and conditions:

Attribution

You agree to acknowledge the Broad Institute (e.g., in publications, services or products) for any of your use of its online services, databases or software in accordance with good scientific practice. You agree to use the acknowledgment wording provided for the relevant tools as indicated on the FAQ for each tool.

Updating the Terms of Use

We reserve the right to update these Terms of Use at any time. When alterations are inevitable, we will attempt to give reasonable notice of any changes by placing a notice on our website, but you may wish to check each time you use the website. The date of the most recent revision will appear on this, the "GPP Web Portal Terms of Use" page. If you do not agree to these changes, please do not continue to use our online services. We will also make available an archived copy of the previous Terms of Use for comparison.

Indemnification and Disclaimer of Warranties

You are using this GPP Web Portal at your own risk, and you hereby agree to hold Broad and its contributors and their trustees, directors, officers, employees, and affiliated investigators harmless for any third party claims which may arise from your use of the GPP Web Portal, the tools available therein, or any portion thereof. Further, you agree to indemnify Broad, its contributors, and its and their trustees, directors, officers, employees, affiliated investigators, students, and affiliates for any loss, costs, claims, damages, or other liabilities arising from any unpermitted commercial or profit-making use you make of the GPP Web Portal. The GPP Web Portal is a research tool and is provided "as is". Broad does not represent that the GPP Web Portal is free of errors or bugs or suitable for any particular tasks.

ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, NONINFRINGEMENT, OR THE ABSENCE OF LATENT OR OTHER DEFECTS ARE DISCLAIMED. IN NO EVENT SHALL BROAD OR ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THE GPP WEB PORTAL, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Governing Law

The terms and conditions herein shall be construed, governed, interpreted, and applied in accordance with the internal laws of the Commonwealth of Massachusetts, U.S.A. Furthermore, by accessing, downloading, or using the Database, You consent to the personal jurisdiction of, and venue in, the state and federal courts within Massachusetts with respect to Your download or use of the Database.