Long-Range Genomics is Reviving a Lost Art

Next-generation sequencing (NGS) tools helped bring sequencing into the high-throughput realm, but suffer from the limitations of short reads. With read lengths restricted to just a few hundred bases, it is nearly impossible for users to resolve structural variants, repetitive DNA, and other large or complex genomic elements. Single-molecule sequencing, however, has enabled long reads.

Long-read sequencing can be performed with instruments that routinely generate multi-kilobase reads, with some scientists producing reads that are hundreds of kilobases long. Genome-wide mapping systems provide chromosome-scale information that can be used to order and orient contigs for higher-quality genome assemblies. Other long-range technologies make synthetic long reads from short-read data, while proximity ligation methods allow users to map distant elements throughout chromosomes.

Never before has it been so straightforward to analyze extremely long stretches of DNA. The discoveries emerging from these methods include new views of clinically relevant structural variation, the first real understanding of the importance of DNA folding patterns, and even patterns of distant variants working in tandem.

A significant technical limitation still exists, however. These long-range genomic methods require the input of high-molecular-weight (HMW) DNA. From sequencing to mapping to ligation, systems function best with extremely large DNA fragments. Unfortunately, after a decade of needing only to prepare short DNA fragments, the ability to handle HMW DNA has become a lost art. Years ago, this was a common task in labs that worked with fosmids, recombinant BACs, and Southern blots. Today, these sample-preparation techniques are unfamiliar to many scientists, and those who do remember them know how laborious and time-consuming they were. The opportunities afforded by long-range technologies place enormous pressure on NGS sample prep pipelines that were never meant to handle DNA in the kilobase- or megabase-size range.

The SageHLS, short for HMW Library System (Sage Science, Beverly, MA), was designed to treat DNA gently enough that users can prepare libraries of extremely large fragments directly from blood samples, bacterial and tissue cultures, or other sources of cells. It uses gel electrophoresis to perform rapid, automated extraction of large DNA fragments or purification of specific targets. Users wash cells by low-speed centrifugation and resuspend them in an isotonic gel-loading buffer. Once samples are loaded on the gel, the system automatically performs cell lysis, enzyme processing, and contaminant removal. Because intact genomic DNA molecules are so large (many megabases in length), they cannot be electrophoresed out of the gel. After the purification and processing steps are completed, the DNA is therefore treated with a nonspecific nuclease for light, random cleavage. Finally, the fragments, which are now between 50 kb and 2 Mb, are retrieved from the gel by automated electroelution, after which users can collect them.

Applications and results

The SageHLS platform can extract DNA fragments as long as 2 Mb. For most samples, it recovers at least a microgram of DNA from input cell loads containing about 10 μg of genomic DNA (about 1.5 million human cells), meeting the requirements of most long-range genomics technologies.
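The input figure quoted above can be checked with simple arithmetic, assuming the 10 μg refers to the genomic DNA content of the loaded cells. The sketch below is illustrative only: the value of roughly 6.6 pg of DNA per diploid human cell is a commonly cited approximation that does not come from this article, and the function names are hypothetical.

```python
# Assumption (not from the article): a diploid human cell contains
# roughly 6.6 picograms of genomic DNA.
PG_DNA_PER_DIPLOID_HUMAN_CELL = 6.6


def dna_mass_ug(cell_count, pg_per_cell=PG_DNA_PER_DIPLOID_HUMAN_CELL):
    """Return the total genomic DNA mass, in micrograms, held by cell_count cells."""
    return cell_count * pg_per_cell / 1e6  # 1 microgram = 1e6 picograms


def recovery_fraction(recovered_ug, input_ug):
    """Return the fraction of the input DNA mass that was recovered."""
    return recovered_ug / input_ug


input_ug = dna_mass_ug(1_500_000)            # the article's 1.5 million cells
fraction = recovery_fraction(1.0, input_ug)  # "at least a microgram" recovered

print(f"Input DNA: {input_ug:.1f} ug")
print(f"Minimum recovery: {fraction:.0%}")
```

Under that assumption, 1.5 million cells hold close to 10 μg of DNA, so recovering one microgram corresponds to a yield of roughly one-tenth of the input.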

Figure 1 – Goat white blood cells were prepared from whole blood, resuspended, and loaded onto a SageHLS gel cassette. Elutions include DNA fragments of 100 kb and longer.

In one study, DNA from goat white blood cells was extracted and fragmented on the instrument using a nonspecific nuclease (Fragmentase, New England Biolabs, Ipswich, MA) (see Figure 1) [1]. Pulsed-field gel analysis of the results indicated that the eluted fragments were at least 100 kb long.

In addition to extracting very long pieces of DNA from a sample, the system can be used with CRISPR/Cas9 techniques to target and purify specific genomic regions that are too long, repetitive, or variable for traditional enrichment methods. This application relies on CATCH (Cas9-assisted targeting of chromosome segments), a protocol published by Yuval Ebenstein’s lab at Tel Aviv University [2]. With CATCH, users only need to know the sequences flanking their gene or region of interest. The technique uses RNA-guided Cas9 to make two cuts outside the region of interest, followed by a DNA sizing step to remove smaller or larger fragments that may have been produced by off-target cuts (see Figure 2). It was developed to target very large regions, usually at least 50 kb.

Figure 2 – By using the CATCH protocol with the SageHLS instrument, scientists targeted the BRCA1 gene and avoided other DNA elements.

CATCH relies on gel electrophoresis, making it a good fit for the SageHLS instrument. The automated approach allows CATCH users to avoid more laborious options such as pulsed-field gel electrophoresis or manual gels. A collaborative study from Tel Aviv University and the Icahn School of Medicine at Mount Sinai used CATCH with the SageHLS system for E. coli and human samples [3]. Custom Cas9 nucleases were used to excise the desired genomic regions. For E. coli, a 200-kb fragment was targeted and verified through sequencing on a MinION device (Oxford Nanopore Technologies, Oxford, U.K.). For the human samples, the technique was used to purify a 200-kb fragment including the BRCA1 gene, and results were validated with qPCR.
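The logic of CATCH, two flanking cuts followed by a size selection, can be sketched in a few lines of Python. This is a toy model, not the published protocol or the instrument's behavior: it treats the chromosome as a single linear molecule, and every coordinate, the off-target cut, and the 10% size tolerance are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    start: int  # position of the left end, in base pairs
    end: int    # position of the right end, in base pairs

    @property
    def length(self):
        return self.end - self.start


def fragments_from_cuts(cut_positions, chrom_length):
    """Split a linear chromosome of chrom_length bp at each cut position."""
    bounds = [0] + sorted(set(cut_positions)) + [chrom_length]
    return [Fragment(a, b) for a, b in zip(bounds, bounds[1:]) if b > a]


def size_select(fragments, expected_bp, tolerance=0.10):
    """Keep fragments whose length is within +/- tolerance of expected_bp."""
    low = expected_bp * (1 - tolerance)
    high = expected_bp * (1 + tolerance)
    return [f for f in fragments if low <= f.length <= high]


# Hypothetical 5-Mb chromosome: two on-target cuts flank a 200-kb region,
# and one off-target cut falls elsewhere.
on_target = [1_000_000, 1_200_000]
off_target = [3_500_000]

all_fragments = fragments_from_cuts(on_target + off_target, 5_000_000)
selected = size_select(all_fragments, expected_bp=200_000)

for f in selected:
    print(f.start, f.end, f.length)
```

In this toy case, the off-target cut yields only megabase-scale fragments, so the sizing step keeps just the 200-kb target. It also shows why users need only the flanking sequences: the expected fragment length follows from the two cut positions.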

Looking ahead

Advancements in long-range genomics go beyond sequencing, offering information about enormous pieces of a genome. As new approaches unfold, sample preparation, bioinformatics, and downstream interpretation systems will need to be optimized to generate and interpret long-range data.

References

  1. http://www.sagescience.com/wp-content/uploads/2017/05/HLS-flyer-2016-1.pdf
  2. https://www.nature.com/articles/ncomms9101
  3. http://www.sagescience.com/wp-content/uploads/2017/02/AGBT-Boles-2017.pdf

Chris Boles is Chief Scientific Officer, Sage Science, 500 Cummings Center, Suite #2400, Beverly, MA 01915, U.S.A.; tel.: 978-922-1832; e-mail: [email protected]; www.sagescience.com
