In addition to the pooled CRISPR libraries we offer, there are a few other libraries that researchers can choose to use for gene knockout screens, such as the Broad Institute’s GeCKO and Brunello libraries available through Addgene. To conserve the limited stock of these libraries, many labs distribute only a small fraction of the amplified library and expect the receiving lab to re-amplify it in bacteria to make their own stock. However, for this process to produce a usable CRISPR library, three factors are key:
Theoretically, a re-amplified library should be a reasonably accurate reproduction of the original library if the above requirements are met. In practice, however, re-amplification of large, complex libraries produces libraries with a poorer distribution than the original primary amplified library, and a significant number of sequences are often lost in the process.
We have recently seen this re-amplification problem with GeCKO libraries from two different labs that utilized our Sequencing Service. The GeCKO library is distributed in 1 µg aliquots (20 µl of 50 ng/µl) that need to be re-amplified to generate a library suitable for lentiviral packaging and screening. In both cases, the customers provided us with genomic DNA samples from their CRISPR screens and aliquots of the re-amplified plasmid library that they used to generate packaged lentiviral particles to transduce the cells for their screens. These plasmid library samples provide a baseline measurement of the representation of each sgRNA in the pre-transduced library. One of the labs sent us a GeCKO library that they had re-amplified themselves from the original 1 µg aliquot they purchased. The second lab, however, purchased a library pre-packaged as lentiviral particles from the distributor, and was provided with an aliquot of the re-amplified library used for packaging by that supplier.
After sequencing both of these orders, it became clear that the distribution of sgRNAs in the re-amplified libraries used for the screens was substandard (see Figures below). In both cases, the libraries were missing ca. 1,200 sgRNAs from the original library. Since the missing sgRNA sequences are the same in both libraries, it seems likely that these sgRNAs were missing from the starting 1 µg aliquot of the GeCKO library rather than lost during re-amplification. However, another 800 sgRNAs in one library and 3,000 sgRNAs in the other were very poorly represented, with fewer than 100 counts each in the normalized results. These differences were clearly introduced during re-amplification and, in light of them, it is fair to conclude that the re-amplification procedure for one of the libraries was performed somewhat better than the other.
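The kind of tally described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the analysis pipeline used for these orders. The function names, the `target_total` scaling value, and the toy data are hypothetical, and scaling every sample to a fixed total read count is an assumed normalization. The thresholds follow the text: 10 normalized reads or fewer is treated as missing, and fewer than 100 as poorly represented.

```python
from typing import Dict, List, Set


def normalize_counts(raw_counts: Dict[str, int], target_total: int) -> Dict[str, float]:
    """Scale raw read counts so the sample sums to target_total (assumed scheme)."""
    total = sum(raw_counts.values())
    if total == 0:
        raise ValueError("sample contains no reads")
    scale = target_total / total
    return {sgrna: count * scale for sgrna, count in raw_counts.items()}


def classify_representation(
    normalized: Dict[str, float],
    missing_max: float = 10,   # "10 and under" bin in the figure
    poor_max: float = 100,     # "less than 100 counts" in the text
) -> Dict[str, List[str]]:
    """Split sgRNAs into 'missing' and 'poor' groups by normalized read count."""
    missing = sorted(s for s, c in normalized.items() if c <= missing_max)
    poor = sorted(s for s, c in normalized.items() if missing_max < c < poor_max)
    return {"missing": missing, "poor": poor}


def shared_missing(missing_a: List[str], missing_b: List[str]) -> Set[str]:
    """sgRNAs missing from both libraries, i.e. likely absent from the starting aliquot."""
    return set(missing_a) & set(missing_b)


# Toy data only, not real GeCKO counts.
lib_a = normalize_counts({"sgA": 0, "sgB": 50, "sgC": 950}, target_total=1000)
lib_b = normalize_counts({"sgA": 0, "sgB": 400, "sgC": 600}, target_total=1000)
groups_a = classify_representation(lib_a)
groups_b = classify_representation(lib_b)
print(groups_a, groups_b, shared_missing(groups_a["missing"], groups_b["missing"]))
```

sgRNAs that turn up as missing in both libraries point to the starting aliquot, while those that are poorly represented in only one library point to that lab's re-amplification.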
The read count distributions of the bulk of the sgRNA constructs in both re-amplified GeCKO libraries, however, were significantly broader than in the original published GeCKO library. Further, the distribution in the sample from the lab that purchased the pre-packaged GeCKO library (the one with the 3,000 poorly represented sgRNAs) was so skewed that it could not be used as a baseline to analyze the results from the screened samples, so the screen was compromised.
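One simple way to put a number on how broad a read count distribution is, offered here only as an illustrative sketch, is the ratio of the 90th to the 10th percentile of the normalized counts. This metric is an assumption chosen for the example and is not necessarily the measure behind the figures in this post. The function name `skew_ratio` and the toy data are hypothetical.

```python
import statistics
from typing import Iterable


def skew_ratio(normalized_counts: Iterable[float]) -> float:
    """Return the 90th/10th percentile ratio of the counts (1.0 means perfectly even)."""
    values = list(normalized_counts)
    # quantiles(n=10) returns the nine decile cut points: 10th, 20th, ..., 90th.
    deciles = statistics.quantiles(values, n=10)
    p10, p90 = deciles[0], deciles[-1]
    if p10 <= 0:
        # The lowest tenth of the library has essentially no reads.
        return float("inf")
    return p90 / p10


# Toy data only: a tight library versus a broad one.
tight = [90, 95, 100, 105, 110] * 4
broad = [1, 10, 100, 1000, 10000] * 4
print(skew_ratio(tight), skew_ratio(broad))
```

A larger ratio means a broader distribution, so more reads per sample are needed to keep the least abundant sgRNAs measurable.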
Cellecta does not recommend ever re-amplifying our pooled libraries, especially the larger ones. It is not a simple procedure and can compromise a screen from the start. Considering the time and effort required to run a pooled screen with a large, complex pooled library, using a re-amplified, poorly characterized library is a significant risk. We never re-amplify any of our custom or pre-made libraries. All CRISPR and shRNA libraries that we sell have been amplified only once, just after ligation, and we provide enough material to our customers to ensure they do not have to re-amplify. The several hundred micrograms of plasmid library that we provide from the primary library amplification enable researchers to go directly to the packaging step.
In summary, here are some tips to keep in mind to prevent ending up with a crappy CRISPR library:
Figure 1. Distribution of re-amplified GeCKO libraries from two different laboratories.
In Figure 1 above, the number of reads, normalized to
You can see the 1,200 sgRNA sequences missing from both library distributions as indicated by the bar at the 10 and under