We are excited to announce that our PacBio Sequel System has arrived! The newest instrument from PacBio joins the RS II system which has been with us since 2011. The Sequel uses redesigned SMRT Cells containing more than one million ZMWs (zero mode waveguides), a significant increase over the 150,000 ZMWs on RS II cells. The increased capacity of the cells means a potential for up to 6x as many reads per run compared to the RS II system. Over the past several weeks, we have begun evaluating and optimizing the Sequel. While initial supplies of the new SMRT Cells are limited, we expect improved cells and the next chemistry release in the coming months. Our goal is to offer sequencing on the Sequel as part of our services portfolio this summer.
We are pleased to announce that the GRC is the first PacBio-certified service provider on the East Coast. This recently announced program is a partnership between PacBio and select sequencing providers that have completed the certification process and offer the highest quality sequencing and analysis services using PacBio technology. We offer a full range of PacBio services, including whole genome sequencing, transcriptome sequencing via Iso-Seq, targeted amplicon sequencing, and other customized applications. Our analysis team has expertise in genome assembly and annotation, variant analysis, transcriptome analysis, and base modification detection. We look forward to continuing our strong partnership with PacBio and offering the highest quality sequencing and analysis to our customers and collaborators.
As part of this new partnership, the GRC is proud to co-sponsor the SMRTest Microbe Grant Program. One lucky winner will receive sequencing and analysis services from the GRC. To enter, submit a short grant application detailing your project and how it would benefit from the long reads and high consensus accuracy of SMRT Sequencing. The deadline for submissions is June 27, 2015.
This year, we are presenting two posters that highlight some of the work we have done over the past year.
The first poster provides an overview of how changes to our PacBio pipeline have increased our sequencing yields and read lengths, resulting in finished, high-quality microbial genomes, assembled using only PacBio data.
The second poster demonstrates how next-generation sequencing can be used to investigate host and pathogen associations in cases of pulmonary non-tuberculous mycobacterial (PNTM) infections.
The GRC, which offers services from sequencing library prep through genome assembly and downstream analysis, is generating complete bacterial genome sequences and methylation profiles using PacBio SMRT sequencing on the RS II. Advances in library prep, the sequencer, sequencing protocols, and data analysis software have all contributed to this capability.
To learn more about these breakthroughs and other emerging applications of SMRT sequencing, please read the PacBio Core Lab Profile showcasing the research performed at GRC and IGS here.
GRC and IGS offer not only cutting-edge sequencing, but a complete menu of services including assembly, annotation, and custom analyses. For more information about services offered, visit our Laboratory Services and Analysis Services pages. Please contact us if you have any questions.
The latest SMRTcell has been designed to shift the loading bias towards longer read lengths. Even so, when working with long insert libraries (10-20 kb), the preferential loading of smaller fragments often limits the potential of these libraries.
A solution to this is to remove small fragments from the libraries. For this purpose, we have evaluated the Blue Pippin (Sage Science, Inc., Beverly, MA), an automated electrophoresis system that separates and collects DNA fragments based on their size.
In order to measure the increase in subread length, long insert libraries were prepared with fragments larger than 4 kb or 7 kb isolated using the Blue Pippin and a 0.75% Agarose Gel Cassette (BLF7510) and compared to a library without Blue Pippin size selection. As shown below, the removal of smaller library fragments prior to sequencing increases the average length of the library fragments loaded into ZMWs on the SMRTcell.
In addition to longer subreads, there is also a boost to the amount of data generated per ZMW. As the fragment length increases, the percentage of SMRTbell adapter sequence decreases and the percentage of library insert increases. The graph below shows the average number of passed-filter bases per active ZMW versus the average fragment length of each library. Using Blue Pippin size selection, we have achieved yields of >500 M passed filter bases from individual SMRTcells.
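The adapter-versus-insert point can be made concrete with a little arithmetic. The sketch below is illustrative only: it assumes each SMRTbell hairpin adapter contributes about 45 nt per pass (an assumed value, not a number taken from our runs), and the function name `insert_fraction` is made up for this example.

```python
ASSUMED_ADAPTER_NT = 45  # assumption: approximate length of one hairpin adapter


def insert_fraction(insert_len_bp: int, adapter_len_nt: int = ASSUMED_ADAPTER_NT) -> float:
    """Fraction of bases in one full lap of a SMRTbell template that are insert.

    One lap around the circular template reads both strands of the insert
    and both hairpin adapters.
    """
    if insert_len_bp <= 0 or adapter_len_nt < 0:
        raise ValueError("insert length must be positive and adapter length non-negative")
    insert_bases = 2 * insert_len_bp
    adapter_bases = 2 * adapter_len_nt
    return insert_bases / (insert_bases + adapter_bases)


# Compare a short insert with the 4 kb and 7 kb size-selection cutoffs.
for length in (500, 4000, 7000):
    print(f"{length:>5} bp insert: {insert_fraction(length):.2%} of bases are insert")
```

Under this assumption the insert fraction rises with fragment length, which is the direction of the effect described above. The size of the gain on a real SMRTcell depends on more than the adapter overhead.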
Below are the sequencing and assembly results of four genomes sequenced from long-insert, Blue Pippin size-selected libraries. Using only PacBio long subread data, we were able to assemble complete microbial genomes for three of the four isolates. Even with only a single under-loaded and low-yield SMRTcell, the remaining isolate still resulted in a nearly complete genome assembly with 10 total contigs and >60% of the genome assembled in the largest contig.
Our PacBio throughput and read lengths have been improving steadily over the past year and may have just taken yet another big step forward. We upgraded our PacBio sequencer to RSII in mid-May and we are seeing significant increases in per-cell yield and improved read lengths with our longer libraries. The most notable change in the upgrade from RSI to RSII is the doubling of the number of simultaneously observable sequencing reactions on the SMRTcell, allowing throughput to be effectively doubled as well. Let’s take a look at some examples:
In this comparison of an 8kb Mycobacterium library that was run both before and after the upgrade, we see an almost 3x increase in total yield per-SMRTcell, while read lengths remain about the same.
Below is a comparison of per-SMRTcell stats from multiple libraries across multiple organisms, including both 8kb and 14kb libraries from Mycobacterium sp., Plasmodium falciparum, Saccharomyces cerevisiae and Candida albicans. Driven by the longer libraries, we see both dramatically higher yield and longer read lengths. On one recent 8 SMRTcell run of a 14kb library, we saw an average per-SMRTcell yield of 417 Mbp!
Here is a read length plot comparing the runs from the table above:
Although we are early in our use and optimization of the new PacBio RSII, we are encouraged by the increase in both yield and read length, and expect continued improvement in our PacBio data, subsequently improving data analysis and genome assembly.
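For a sense of scale, the short sketch below multiplies the average per-SMRTcell yield reported above for the 14kb library by the number of cells in that run. The function name `run_yield_mbp` is hypothetical, and the result is an estimate derived from the reported average, not a separately measured total.

```python
def run_yield_mbp(cells: int, avg_yield_per_cell_mbp: float) -> float:
    """Estimate total run yield (Mbp) from the average per-SMRTcell yield."""
    if cells <= 0 or avg_yield_per_cell_mbp < 0:
        raise ValueError("cells must be positive and yield non-negative")
    return cells * avg_yield_per_cell_mbp


# 8 SMRTcells at an average of 417 Mbp each, as reported for the 14kb library.
total_mbp = run_yield_mbp(8, 417)
print(f"Estimated run yield: {total_mbp:,.0f} Mbp (~{total_mbp / 1000:.1f} Gbp)")
```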
16S amplicon sequencing has proven to be an important tool for identifying and quantifying microbes present in metagenomic samples. Several researchers here at IGS have used this approach for several years to analyze organismal and environmental communities.
Together with these researchers, the GRC has been working over the past year to transition high-throughput sequencing of 16S rRNA regions amplified from metagenomic samples from the 454 platform to the Illumina platform. With the increased read length (2x250bp) on the MiSeq, it is now well suited to generate 16S data for a fraction of the cost of generating data on the 454 FLX.
A typical 16S amplicon run on the 454 produces ~1M reads with an average read length of ~500 bp, which enables deep profiling of 100-200 samples. A paired-end MiSeq run generates 500 bp of sequence per amplicon (2x250bp) and produces an average of 12M read pairs per run. We are now routinely profiling a minimum of 400 samples per run, with even greater depth than is possible on the 454, for less than half the per-sample cost.
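The depth comparison follows directly from the run figures quoted above. The sketch below works through it; the helper name `reads_per_sample` is made up for this example, and it assumes reads are spread evenly across samples, which real multiplexed runs only approximate.

```python
def reads_per_sample(reads_per_run: int, samples_per_run: int) -> float:
    """Average reads (or read pairs) per sample, assuming even multiplexing."""
    if samples_per_run <= 0:
        raise ValueError("samples_per_run must be positive")
    return reads_per_run / samples_per_run


FLX_READS_PER_RUN = 1_000_000     # ~1M reads per 454 run
MISEQ_PAIRS_PER_RUN = 12_000_000  # ~12M read pairs per MiSeq run

flx_at_200 = reads_per_sample(FLX_READS_PER_RUN, 200)    # 454, 200 samples
flx_at_100 = reads_per_sample(FLX_READS_PER_RUN, 100)    # 454, 100 samples
miseq_at_400 = reads_per_sample(MISEQ_PAIRS_PER_RUN, 400)  # MiSeq, 400 samples

print(f"454:   {flx_at_200:,.0f} to {flx_at_100:,.0f} reads per sample")
print(f"MiSeq: {miseq_at_400:,.0f} read pairs per sample at 400 samples")
```

With 400 samples on a MiSeq run, the average depth per sample is still higher than a 454 run split across only 100 samples. Adding more samples to the MiSeq run lowers the per-sample depth proportionally.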
Please contact us for more information about our 16S profiling service using the Illumina MiSeq.
At AGBT a couple of weeks ago, I presented a poster with an overview of methods developed by GRC members to sequence and assemble viral genomes from clinical samples. To view the poster, follow the link below:
IGS also presented a poster about custom capture at this year’s AGBT meeting. The poster below presents data demonstrating that custom capture can be an effective way to sequence entire genomes of obligate intracellular parasites that cannot be grown independently, including such organisms isolated from field samples.
A new feature added with the recent PacBio upgrade is ‘Stage Start’, which allows data collection to start earlier than it did previously. When this option is used, data collection begins immediately after the polymerase is activated, resulting in longer reads.
Below are the results from a quick test we performed. We sequenced two libraries with and without the ‘Stage Start’ feature turned on.
The libraries sequenced were about 8kb in length, and were sequenced using the Magbead Standard Seq v1 protocol. One 90-minute movie was taken of each SMRTcell. Standard Polymerase Binding and Sequencing kits were used (not the newer ‘XL’ version of the kits).