Type: Journal article
Title: Scalable and cost-effective NGS genotyping in the cloud
Author: Souilmi, Y.
Lancaster, A.
Jung, J.
Rizzo, E.
Hawkins, J.
Powles, R.
Amzazi, S.
Ghazal, H.
Tonellato, P.
Wall, D.
Citation: BMC Medical Genomics, 2015; 8(1):64-1-64-9
Publisher: BioMed Central
Issue Date: 2015
ISSN: 1755-8794
Statement of Responsibility: Yassine Souilmi, Alex K. Lancaster, Jae-Yoon Jung, Ettore Rizzo, Jared B. Hawkins, Ryan Powles, Saaïd Amzazi, Hassan Ghazal, Peter J. Tonellato and Dennis P. Wall
Abstract: Background: While next-generation sequencing (NGS) costs have plummeted in recent years, cost and complexity of computation remain substantial barriers to the use of NGS in routine clinical care. The clinical potential of NGS will not be realized until robust and routine whole genome sequencing data can be accurately rendered to medically actionable reports within a time window of hours and at scales of economy in the 10’s of dollars. Results: We take a step towards addressing this challenge, by using COSMOS, a cloud-enabled workflow management system, to develop GenomeKey, an NGS whole genome analysis workflow. COSMOS implements complex workflows making optimal use of high-performance compute clusters. Here we show that the Amazon Web Service (AWS) implementation of GenomeKey via COSMOS provides a fast, scalable, and cost-effective analysis of both public benchmarking and large-scale heterogeneous clinical NGS datasets. Conclusions: Our systematic benchmarking reveals important new insights and considerations to produce clinical turn-around of whole genome analysis optimization and workflow management including strategic batching of individual genomes and efficient cluster resource configuration.
Keywords: Next-generation sequencing; clinical sequencing; cloud computing; medical genomics; software; bioinformatics; parallel computing
Rights: © 2015 Souilmi et al. Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated.
RMID: 0030055603
DOI: 10.1186/s12920-015-0134-9
Appears in Collections: Genetics publications

Files in This Item:
File: hdl_107995.pdf
Description: Published Version
Size: 1.14 MB
Format: Adobe PDF
