Another Group Finds Similar Keys to Optimal Pooled shRNA Library Screens

Our group recently ran across an article describing an independent RNAi screen with a non-Cellecta pooled shRNA expression library that piqued our interest. In the October 2011 online edition of Genome Biology, Sims et al. described how to run a rigorous genome-wide pooled RNA interference screen using next-generation sequencing. The article walks through the procedural steps involved in screening a heterogeneous pooled library of thousands of lentiviral shRNA expression constructs. Although they used a library somewhat different from our design (the lack of unique sequenceable barcodes being one notable difference), the study nicely demonstrates many of the requirements to ensure meaningful screening results and emphasizes the need to use high-throughput next-generation sequencing (as opposed to microarray hybridization) for reproducible measurements of shRNA depletion or enrichment following selection.

Viability or "drop-out" screens that look for depletion of shRNA sequences in selected populations to identify essential genes are one of the most common applications of pooled shRNA screening. The Sims et al. study focuses primarily on the key factors to ensure reproducible results for these screens. Among the most important ones, they note the following:

  1. The shRNA expression library itself must be generated systematically to minimize variation in hairpin representation. This should be assessed by HT sequencing of the plasmid form of the library. Interestingly, Sims et al. also found that the plasmid library is a better reference for starting hairpin representation than the pseudoviral packaged library, which is consistent with our experience at Cellecta, too.
  2. It is essential to manage cell numbers to maintain hairpin representation through the whole screen. Specifically, Sims et al. recommend maintaining at least 1,000 cells per shRNA, which is also the ratio we find optimal as described in an earlier blog post. They also caution against letting cells grow past 70% confluency before replating.
  3. Following selection, it is important to amplify sufficient genomic DNA to ensure a representative population from each cell sample. For their library of 10,000 shRNAs, they used at least 60 µg of genomic DNA for pre-sequencing PCR amplification. We find similar amounts necessary (e.g., for 27,000 shRNAs, we use 200 µg/sample).
  4. Biological replicates are a requirement to overcome stochastic noise inherent in the screen. In addition, replicates should have a high level of reproducibility, with R-squared values of 0.9 or better.
  5. The pooled shRNA library must be a reasonable size to enable practical handling of the cell populations, genomic DNA amplification, and biological replicates required for an effective screen. Sims et al. used a library with 10,000 shRNAs.
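
The cell and genomic DNA figures in points 2 and 3 can be sanity-checked with some quick arithmetic. The sketch below is only an illustration, not a calculation taken from Sims et al.: the function names are hypothetical, and the figure of roughly 6.6 pg of genomic DNA per diploid human cell is an assumption that is not stated in the article.

```python
# Assumption (not from the article): a diploid human cell carries
# roughly 6.6 pg of genomic DNA.
PG_GENOMIC_DNA_PER_CELL = 6.6


def cells_required(num_shrnas, cells_per_shrna=1000):
    """Cells to maintain so each shRNA is carried by `cells_per_shrna` cells."""
    return num_shrnas * cells_per_shrna


def genomic_dna_ug(num_cells, pg_per_cell=PG_GENOMIC_DNA_PER_CELL):
    """Approximate genomic DNA mass (micrograms) contained in `num_cells` cells."""
    return num_cells * pg_per_cell / 1e6  # 1 ug = 1,000,000 pg


# The two library sizes mentioned in the list above.
for library_size in (10_000, 27_000):
    cells = cells_required(library_size)
    dna = genomic_dna_ug(cells)
    print(f"{library_size:,} shRNAs: {cells:,} cells, about {dna:.0f} ug genomic DNA")
```

Under these assumptions, 1,000 cells per shRNA for a 10,000-shRNA library corresponds to about 66 µg of genomic DNA, and a 27,000-shRNA library to about 178 µg. Both are in line with the amounts quoted in point 3.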

As a result of this thorough technique, Sims et al. estimated they were able to identify more than 98% of the hairpins in all replicates. One distinct difference in the Sims et al. library compared with Cellecta's is the absence of a barcode, that is, a unique, readily identifiable sequence separate from the hairpin sequence that can be used to identify the particular shRNA in the expression cassette. Somewhat confusingly, though, Sims et al. used the term "barcode screen" although no barcode is present in their library. Instead, they detected shRNA levels in selected populations by sequencing a portion of the shRNA-encoding region. From our experience, use of a separate unique barcode optimized for sequence analysis increases sequencing calls and helps improve replicate correlations. Sims et al. did find that the pre-sequencing PCR step introduced a certain amount of noise in the data, which is consistent with amplification variability of shRNA sequences as opposed to short standardized barcodes.
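
For readers who want to check their own replicates against the two metrics mentioned here (fraction of hairpins detected and replicate R-squared), a minimal sketch follows. It is not the analysis used by Sims et al.: the read counts are invented for illustration, the function names are hypothetical, and the log2 counts-per-million transform with a pseudocount of 1 is just one common normalization choice assumed here.

```python
import math


def log2_cpm(counts, pseudocount=1.0):
    """Scale raw read counts to counts per million, then log2-transform.

    The pseudocount (an assumption of this sketch) avoids taking log2 of zero.
    """
    total = sum(counts)
    return [math.log2(c / total * 1e6 + pseudocount) for c in counts]


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


def fraction_detected(counts, min_reads=1):
    """Fraction of hairpins with at least `min_reads` sequencing reads."""
    return sum(c >= min_reads for c in counts) / len(counts)


# Invented read counts for eight hairpins in two biological replicates.
replicate_a = [520, 130, 0, 980, 45, 310, 760, 12]
replicate_b = [495, 150, 2, 1010, 38, 290, 800, 9]

r2 = r_squared(log2_cpm(replicate_a), log2_cpm(replicate_b))
print(f"Replicate R-squared: {r2:.3f}")
print(f"Hairpins detected in replicate A: {fraction_detected(replicate_a):.1%}")
```

In a real screen the count lists would hold one entry per hairpin in the library, in the same order for every replicate, and the R-squared value would be compared against the 0.9 threshold noted above.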

The consistency of the general findings of this independent study with our experience, however, is very encouraging. Using a similar but distinct library, Sims et al. have uncovered many of the same critical requirements for optimal screening with complex shRNA pools as we have. This alignment emphasizes the importance of these procedural details to obtaining meaningful screening results, and it provides additional support for the RNAi screening standards of practice that were the topic of the previous post.

Sims et al. also mentioned the development of two open-source programs for computational analysis of pooled shRNA screening results: shALIGN and shRNAseq.

Please email with any comments.
