Assembler Comparisons

Over the past few months, several members of the GRC bioinformatics team have been testing a variety of assemblers and analyzing the results.  The testing is intended to critically evaluate the output and performance of some of the more popular de novo assemblers.  Similar studies have been done before (such as GAGE: http://gage.cbcb.umd.edu/), but we aim to expand on them by testing different organisms and data types.  To that end, WGS data generated at IGS from many samples across multiple species (such as E. coli, V. cholerae, S. aureus and M. massiliense) have been assembled at multiple coverage levels using assemblers such as Celera Assembler, MSRCA, Velvet, SOAPdenovo and ABySS.  In addition, the samples have been sequenced on several NGS platforms, including Illumina HiSeq, Illumina MiSeq and PacBio.  These data types will be assembled both on their own and in different combinations to gauge the effects of hybrid assembly.  We hope to have detailed statistics compiled in the very near future.