HiSeq 4000 Update

We are excited to announce that our HiSeq 4000 is here and running. We are off to the races! The run time for two flow cells is down to 3.5 days, which will allow us to complete projects faster than ever. Each dual flow cell run can sequence up to 12 human genomes, 180 exomes, or 100 transcriptomes.

Our first flow cell, a PE150 run, produced 2.9 billion clusters and more than 850 Gb of data.

Here are some basic stats from our first flow cell:

Lane  Total Reads      Total Bases    %PF   %Q20   %Q30
1     727,970,622  109,923,563,922  75.41  94.42  87.53
2     777,363,006  117,381,813,906  80.53  96.27  91.53
3     754,973,540  114,001,004,540  78.21  96.24  91.49
4     753,317,088  113,750,880,288  78.03  95.86  90.36
5     749,170,142  113,124,691,442  77.61  96.21  91.07
6     684,821,518  103,408,049,218  70.94  96.04  90.89
7     665,485,116  100,488,252,516  68.94  95.65  90.25
8     602,875,822   91,034,249,122  62.45  95.17  89.38
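For readers who want to check the headline numbers against the per-lane table, here is a minimal Python sketch that sums the lanes. It assumes that "Total Reads" counts both reads of each pair, so clusters are taken as reads divided by two; that interpretation is ours, inferred from the 2.9 billion cluster figure, and the function and variable names are purely illustrative.

```python
# Per-lane values copied from the table: lane -> (total reads, total bases).
LANES = {
    1: (727_970_622, 109_923_563_922),
    2: (777_363_006, 117_381_813_906),
    3: (754_973_540, 114_001_004_540),
    4: (753_317_088, 113_750_880_288),
    5: (749_170_142, 113_124_691_442),
    6: (684_821_518, 103_408_049_218),
    7: (665_485_116, 100_488_252_516),
    8: (602_875_822, 91_034_249_122),
}


def flow_cell_summary(lanes):
    """Aggregate per-lane read and base counts into flow cell totals."""
    total_reads = sum(reads for reads, _ in lanes.values())
    total_bases = sum(bases for _, bases in lanes.values())
    return {
        "total_reads": total_reads,
        # Assumption: paired-end run, two reads per cluster.
        "clusters": total_reads // 2,
        "gigabases": total_bases / 1e9,
        "mean_read_length": total_bases / total_reads,
    }


summary = flow_cell_summary(LANES)
for name, value in summary.items():
    print(f"{name}: {value:,}")
```

Under that assumption the totals come to roughly 2.86 billion clusters and about 863 Gb, in line with the figures quoted above.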

 

Do you have a project that could benefit from the combination of increased output and read length the HiSeq 4000 offers? If so, please contact us. We will discuss the details of your project with you and guide you through the process of planning your sequencing experiments.

GRC Spring 2015 Update

The Genomics Resource Center (GRC) continues to expand its capabilities and project portfolio. As part of our contract with the U.S. Food and Drug Administration (FDA) to sequence, assemble, and annotate pathogens in support of the development and expansion of a comprehensive, curated public reference database, we are developing a new pipeline for Ebola virus sequencing and analysis. We have also initiated several new projects to sequence large animal and plant genomes using the Pacific Biosciences platform. These larger projects were made possible by our recent upgrade to the new P6-C4 chemistry. This new chemistry, combined with improved software, has increased read lengths by more than 30% and doubled overall throughput. In June, we will host the Pacific Biosciences East Coast User Group Meeting for the third consecutive year. Please join us to hear about this exciting technology and its expanding applications.

Our Illumina platform continues to improve as well. In April, we will take delivery of our first HiSeq 4000. This sequencer, the newest announced by Illumina, will increase throughput by 50% while reducing run time by an additional 50%. Each HiSeq 4000 will be capable of sequencing 24 human genomes per week. We have also expanded our MiSeq repertoire with the installation of a MiSeq Dx in our CLIA facility for clinical sequencing applications.

The GRC will be hosting a booth at the annual American Society for Microbiology (ASM) general meeting in New Orleans from May 30 – June 2, 2015. If you’re there, please stop by to visit and learn more about our services and capabilities!

Q&A with the Co-Directors of the GRC

How do I initiate a project with GRC?

It’s easy! Contact us via our website (www.igs.umaryland.edu/grc) or email (grc-info@som.umaryland.edu) and we will set up an initial consultation with you. During this consultation, we will discuss your project goals and expectations and advise on experimental design. From there, we develop a project plan that includes sample requirements, timelines, cost estimates, and deliverables. For large, long-term projects, we schedule additional discussions to finalize the project plan and monitor progress.

How long does it take? How much will it cost?

These are the most common questions we hear, but they are often difficult to answer. Depending on the scope and scale of the project, the timeline can vary from a few weeks to months or even years. Similarly, costs can fall within a wide range. We treat each project separately and develop the best estimates of cost and timelines as part of our consultation with each investigator.

Do you offer analysis, or only sequencing?

We do it all – from project design through sequencing and analysis. We have bioinformatics teams specialized in genome assembly, variant analysis, metagenomics, transcriptomics, and epigenomic analysis. If you are interested in analysis, we include that as part of the project consultation and project plan.

Click here to find the full IGS Spring 2015 newsletter as well as previous editions.

Single Molecule Sequencing and Genome Assembly of a Clinical Specimen of Loa loa

Scientists Apply Successful Single Molecule Sequencing and de novo Genome Assembly to a Parasitic Worm that Infects Human Eyes and Skin

Investigators at the Institute for Genome Sciences (IGS) at the University of Maryland School of Medicine and the Laboratory of Parasitic Diseases at the National Institute of Allergy and Infectious Diseases (NIAID) at the National Institutes of Health (NIH) used the long-read, single-molecule Pacific Biosciences platform for the successful genome sequencing and de novo assembly of Loa loa roundworms from a clinical sample. Their research, which generated the most complete genome sequence of a filarial nematode produced to date, provides a more comprehensive reference genome for this parasite in the hope of developing better molecular diagnostics to decrease morbidity from filarial nematodes. Their findings appear in today’s issue of BMC Genomics.

Click here to access the abstract and complete article.

GRC Awarded Contract to Expand FDA Microbial Genome Database

IGS and the GRC have been awarded a contract to assist the U.S. Food and Drug Administration (FDA) in the expansion and curation of a public database of microbial genome sequences and associated metadata. This database will serve as a valuable reference for evaluating and assessing diagnostic devices based on high-throughput sequencing. In addition to all publicly available microbial genome sequences, the database will include more than 550 newly sequenced, assembled, and annotated genomes from under-represented branches of the phylogenetic tree. For more information on the project, please click here or contact the GRC.

GRC Posters Presented at AGBT 2014

At this year’s AGBT meeting, we are highlighting some of the work we have done over the past year.

The first poster provides an overview of how changes to our PacBio pipeline have increased our sequencing yields and read lengths, resulting in finished, high-quality microbial genomes, assembled using only PacBio data.

The second poster demonstrates how next-generation sequencing can be used to investigate host and pathogen associations in cases of pulmonary non-tuberculous mycobacterial (PNTM) infections.

For more information on our full range of sequencing and analysis services, visit our Laboratory Services and Analysis Services pages. Please contact us if you have any questions.

Highly Multiplexed 16S Sequencing on MiSeq

16S amplicon sequencing has proven to be an important tool for identifying and quantifying microbes present in metagenomic samples. Several researchers here at IGS have used this approach for several years to analyze organismal and environmental communities.

Together with these researchers, the GRC has been working over the past year to transition high-throughput sequencing of 16S rRNA regions amplified from metagenomic samples from the 454 platform to the Illumina platform. With its increased read length (2x250 bp), the MiSeq is now well suited to generating 16S data for a fraction of the cost of generating data on the 454 FLX.

A typical 16S amplicon run on the 454 produces ~1M reads with an average read length of ~500 bp, which enables deep profiling of 100-200 samples. A paired-end MiSeq run generates 500 bp of sequence per amplicon and produces an average of 12M read pairs per run. We are now routinely profiling a minimum of 400 samples per run with even greater depth than possible on 454 for less than half the per-sample cost.
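To make the depth comparison concrete, here is a small Python sketch of the per-sample arithmetic using the run sizes quoted above. It assumes reads are split evenly across samples, which real multiplexed runs only approximate, and the function name is just for illustration.

```python
def reads_per_sample(reads_per_run, samples_per_run):
    """Average depth per sample, assuming an even split across barcodes."""
    return reads_per_run / samples_per_run


FLX_READS_PER_RUN = 1_000_000         # ~1M reads per 454 run (from the text)
MISEQ_PAIRS_PER_RUN = 12_000_000      # ~12M read pairs per MiSeq run (from the text)

# 454: 100-200 samples per run.
flx_depth_low = reads_per_sample(FLX_READS_PER_RUN, 200)
flx_depth_high = reads_per_sample(FLX_READS_PER_RUN, 100)

# MiSeq: a minimum of 400 samples per run.
miseq_depth = reads_per_sample(MISEQ_PAIRS_PER_RUN, 400)

print(f"454 FLX: {flx_depth_low:,.0f} to {flx_depth_high:,.0f} reads per sample")
print(f"MiSeq:   {miseq_depth:,.0f} read pairs per sample")
```

With these round numbers, the MiSeq comes out at about 30,000 read pairs per sample against 5,000 to 10,000 reads per sample on the 454, before accounting for reads lost to quality filtering or uneven barcode representation.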

Please contact us for more information about our 16S profiling service using the Illumina MiSeq.

Options When Starting Material is Limiting

Sometimes it is not possible to come up with the amount of DNA or RNA required for a standard Illumina library prep. We are frequently asked what the options are when there is just not enough sample available.

There are several kits on the market now that allow Illumina libraries to be prepared from minimal amounts of starting material. We have processed clinical samples, metagenomic samples, and samples from FFPE tissues that yielded extremely low amounts of RNA or DNA.

For RNA samples, we have generated linearly-amplified cDNA with the Nugen Ovation v2 kit. An advantage of this kit is that the amplification of rRNA is somewhat suppressed, increasing the percentage of usable data. Starting with sub-nanogram amounts of RNA, we are able to generate micrograms of cDNA. We’ve tested various library preparation methods with the amplified cDNA, and we have found that the Illumina TruSeq prep works best for us.

The Illumina Nextera system is an option available when DNA amounts are limiting. The Nextera XT DNA Sample Prep Kit requires 1 ng of input material (best for plasmids or small genomes), and the Nextera DNA Sample Prep Kit requires 50 ng of DNA. The library fragmentation is accomplished via transposon insertion events. We skip the normalization/denaturation portion of the protocol, and determine the quality and quantity of the libraries following our standard procedures. We have found that the library size distributions tend to vary and can be much wider than those of our traditional Illumina DNA libraries, but this is still a great option when there is very little material available.

Contact us if you have questions or would like additional information.

AGBT Whole Genome Capture Poster

IGS also presented a poster about custom capture at this year’s AGBT meeting. The poster below presents data demonstrating that custom capture can be an effective way to sequence entire genomes of obligate intracellular parasites that cannot be grown independently, including such organisms isolated from field samples.

MiSeq: More reads and longer read lengths are now available

Over the past couple of months we have been evaluating the MiSeq upgrade. This upgrade includes the ability to sequence longer reads (250 nt from each end, so 500 nt per library fragment) and to collect data from more clusters (both the top and bottom of each channel are imaged). We just had a 250 PE run that exceeded 30 million reads – that is just over 8 Gbases of data! This is a nice jump up from the ~13M reads (~2 Gbases) per run we were getting before.
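As a rough check on these yield figures, here is a short Python sketch of the reads-times-read-length arithmetic. It assumes "reads" counts each end of a pair separately and that every read is full length. The helper names are illustrative, and the read length before the upgrade is not stated above, so the sketch derives an implied value rather than assuming one.

```python
import math


def yield_gbases(reads, read_length_nt):
    """Run yield in gigabases, assuming every read is full length."""
    return reads * read_length_nt / 1e9


def reads_for_yield(gbases, read_length_nt):
    """Smallest number of full-length reads that reaches the given yield."""
    return math.ceil(gbases * 1e9 / read_length_nt)


def implied_read_length(reads, gbases):
    """Average read length implied by a read count and a total yield."""
    return gbases * 1e9 / reads


# Upgraded MiSeq, 250 nt reads.
yield_at_30m = yield_gbases(30_000_000, 250)
reads_for_8gb = reads_for_yield(8.0, 250)

# Before the upgrade: ~13M reads and ~2 Gbases per run (from the text).
old_read_length = implied_read_length(13_000_000, 2.0)

print(f"30M reads x 250 nt = {yield_at_30m} Gbases")
print(f"8 Gbases at 250 nt needs about {reads_for_8gb:,} reads")
print(f"Implied earlier read length: about {old_read_length:.0f} nt")
```

On these assumptions, 30 million reads correspond to 7.5 Gbases, so a run of just over 8 Gbases implies a read count of around 32 million.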

Disclaimer: We are still in data-gathering mode to determine what the average expectation should be for each run. It would be nice to get 30 million reads from every run, but that may not happen.

Here are quality plots of a 250 PE run with genomic PE libraries. We are working to maximize quality as the read lengths increase.

Read 1:

Read 2:

We can now combine the benefits of Illumina’s high read counts with the benefits of longer reads.

Next, we will see how MiSeq/PacBio hybrid assemblies measure up to HiSeq/454 hybrid assemblies, and compare assemblies built from HiSeq 100 bp PE reads with those built from MiSeq 250 bp PE reads.