How to Make a Crappy CRISPR Library

In addition to the pooled CRISPR libraries we offer, there are a few other libraries that researchers can choose for gene knockout screens, such as the Broad Institute’s GeCKO and Brunello libraries available through Addgene. To conserve the limited stock of these libraries, many labs distribute only a small fraction of the amplified library and expect the receiving lab to re-amplify it in bacteria to make their own stock. However, for this process to produce a usable CRISPR library, three factors are key:

  1. It is essential to start with a complete and representative aliquot of the library for re-amplification.
  2. Enough E. coli must be transformed to ensure that all the plasmids in the aliquot of the library are picked up by cells and amplified so that the final re-amplified library faithfully represents the original.
  3. The transformed bacteria must be cultured in such a way as to prevent competition so that each bacterial cell generates a large population of progeny that have its construct.

Theoretically, a re-amplified library should be a reasonably accurate reproduction of the original library if the above requirements are met. In practice, however, re-amplification of large, complex libraries produces libraries with a poorer distribution than the original primary amplified library, and a significant number of sequences are often lost in the process.
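To get a feel for the second requirement, here is a minimal Python sketch of how many constructs you would expect to lose for a given number of transformants. It assumes an idealized library in which every construct is equally abundant and each transformed cell takes up one plasmid at random (a Poisson approximation). The function names and the library size in the example are hypothetical and are not taken from any GeCKO protocol.

```python
import math


def expected_dropouts(library_size: int, transformants: int) -> float:
    """Expected number of constructs that no cell picks up.

    Assumption: constructs are equally abundant and each transformant
    carries one construct drawn at random, so the chance that a given
    construct is missed is about exp(-transformants / library_size).
    """
    coverage = transformants / library_size
    return library_size * math.exp(-coverage)


def transformants_needed(library_size: int, max_dropouts: float = 1.0) -> int:
    """Smallest number of transformants that keeps the expected number
    of missed constructs at or below max_dropouts (same assumptions)."""
    coverage = math.log(library_size / max_dropouts)
    return math.ceil(library_size * coverage)


if __name__ == "__main__":
    size = 100_000  # hypothetical library size, for illustration only
    for fold in (1, 5, 10, 20):
        lost = expected_dropouts(size, fold * size)
        print(f"{fold:>2}x coverage: about {lost:,.1f} constructs expected to be lost")
    print("Transformants for <= 1 expected loss:", transformants_needed(size))
```

Because a real library is never perfectly uniform, and a re-amplified aliquot even less so, this estimate is a best case. Rare constructs drop out much sooner than the uniform model suggests.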

We have recently seen this re-amplification problem with GeCKO libraries from two different labs that used our Sequencing Service. The GeCKO library is distributed in 1 µg aliquots (20 µL at 50 ng/µL) that need to be re-amplified to generate a library suitable for lentiviral packaging and screening. In both cases, the customers provided us with genomic DNA samples from their CRISPR screens and aliquots of the re-amplified plasmid library that had been used to generate the packaged lentiviral particles for transducing the cells in their screens. These plasmid library samples provide a baseline measurement of the representation of each sgRNA in the pre-transduced library. One of the labs sent us a GeCKO library that they had re-amplified themselves from the original 1 µg aliquot they purchased. The second lab, however, purchased a library pre-packaged as lentiviral particles from the distributor, and was provided with an aliquot of the re-amplified library used for packaging by that supplier.

After sequencing both of these orders, it became clear that the distribution of sgRNAs in the re-amplified libraries used for the screens was substandard (see Figure 1 below). In both cases, the libraries were missing ca. 1,200 sgRNAs from the original library. Since the missing sgRNA sequences are the same in both libraries, it seems likely that these sgRNAs were already missing from the starting 1 µg aliquot of the GeCKO library rather than lost during re-amplification. However, another 800 sgRNAs in one library and 3,000 sgRNAs in the other were very poorly represented, with fewer than 100 counts each in the normalized results. These differences were clearly introduced during re-amplification and, in light of them, it is fair to conclude that the re-amplification procedure for one of the libraries was performed somewhat better than the other.

The read count distributions of the bulk of the sgRNA constructs in both re-amplified GeCKO libraries, however, were significantly broader than in the original published GeCKO library. Further, the distribution in the sample from the lab that purchased the pre-packaged GeCKO library (the one with the 3,000 poorly represented sgRNAs) was so skewed that it was not usable as a baseline for analyzing the results from the screened samples, so the screen was compromised.

Cellecta does not recommend ever re-amplifying our pooled libraries, especially the larger ones. It is not a simple procedure, and it can compromise a screen from the start. Considering the time and effort required to run a pooled screen with a large, complex library, using a re-amplified, poorly characterized library is a significant risk. We never re-amplify any of our custom or pre-made libraries. All CRISPR and shRNA libraries that we sell have been amplified only once, just after ligation, and we provide enough material to our customers to ensure they do not have to re-amplify. The several hundred micrograms of plasmid library that we provide from the primary library amplification enable researchers to go directly to the packaging step.

In summary, here are some tips to keep in mind to prevent ending up with a crappy CRISPR library:

  1. Avoid re-amplifying the library by starting with enough material that you do not need to.
  2. If you do have to re-amplify, check the distribution of the sgRNAs to ensure that you still have a reasonable baseline distribution that will give you quality data in a knockout screen.
  3. Enlist the help of a partner with expertise in CRISPR libraries to reduce your risk of wasting valuable time and samples.

Figure 1. Distribution of re-amplified GeCKO libraries from two different laboratories.

In Figure 1 above, the x-axis shows the number of reads per sgRNA, normalized to 40M total reads for each run, and the y-axis shows the number of sgRNAs. Each bar indicates how many different sgRNAs have a given read count.
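The normalization and tallying behind a plot like this can be sketched in a few lines of Python. This is only an illustration, not the pipeline used by our Sequencing Service: it assumes you already have a raw read count per sgRNA, the function names are hypothetical, and the 10-count and 100-count cutoffs are simply the ones discussed in this post.

```python
def normalize_counts(raw_counts: dict, target_total: int = 40_000_000) -> dict:
    """Scale raw per-sgRNA read counts so they sum to target_total."""
    total = sum(raw_counts.values())
    if total == 0:
        raise ValueError("no reads to normalize")
    scale = target_total / total
    return {sgrna: count * scale for sgrna, count in raw_counts.items()}


def summarize(normalized: dict, missing_max: float = 10, low_max: float = 100) -> dict:
    """Count sgRNAs that are effectively missing or poorly represented.

    missing: normalized count at or below missing_max
    low:     normalized count above missing_max, up to and including low_max
    """
    values = list(normalized.values())
    missing = sum(1 for c in values if c <= missing_max)
    low = sum(1 for c in values if missing_max < c <= low_max)
    return {"missing": missing, "low": low, "total": len(values)}


if __name__ == "__main__":
    # Made-up counts for illustration only; real input would come from
    # sequencing the plasmid library.
    raw = {"sgRNA_1": 0, "sgRNA_2": 1, "sgRNA_3": 3, "sgRNA_4": 4}
    print(summarize(normalize_counts(raw, target_total=800)))
```

Running the same summary on the plasmid library before packaging gives a quick check on whether the baseline is good enough to screen with.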

You can see the 1,200 sgRNA sequences missing from both library distributions, as indicated by the bar at read counts of 10 and under. Although it is not so obvious in the figure, the number of low-count sgRNAs (100 counts or less) is significantly higher in the upper sample (3,000 sgRNAs) than in the lower-panel sample (800 sgRNAs). The overall distribution in the lower panel is clearly somewhat better than in the upper-panel sample, where there are ca. 30,000 sgRNAs with counts below 1,000 and over 15,000 sgRNAs with ca. 2,000+ counts. This bimodal distribution, with roughly a fifth of the library highly over-represented, can ruin a screen (as it did in this case) since the over-represented sequences dominate the data. Neither of these libraries would pass Cellecta's QC standards.
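To see why an over-represented subset dominates the data, compare the fraction of sgRNAs above a cutoff with the fraction of reads they consume. The Python sketch below uses made-up counts and a hypothetical function name. The 2,000-count cutoff echoes the figure discussion above and is not a formal QC threshold.

```python
def overrepresented_share(counts: list, threshold: float = 2000) -> tuple:
    """Return (fraction of sgRNAs at or above threshold,
               fraction of all reads that those sgRNAs account for)."""
    total_reads = sum(counts)
    if not counts or total_reads == 0:
        raise ValueError("need at least one sgRNA with reads")
    over = [c for c in counts if c >= threshold]
    return len(over) / len(counts), sum(over) / total_reads


if __name__ == "__main__":
    # Illustrative toy library: 80% of sgRNAs at 500 reads, 20% at 3,000.
    toy_counts = [500] * 8 + [3000] * 2
    frac_sgrnas, frac_reads = overrepresented_share(toy_counts)
    print(f"{frac_sgrnas:.0%} of sgRNAs take {frac_reads:.0%} of the reads")
```

In this toy case a fifth of the constructs consume well over half of the sequencing reads, which leaves the remaining constructs with too little depth to measure reliably.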
