Guoguang Zhao, Dechao Bu, Changning Liu, Jing Li, Jian Yang, Zhiyong Liu, Yi Zhao, Runsheng Chen. CloudLCA: finding the lowest common ancestor in metagenome analysis using cloud computing[J]. Protein&Cell, 2012, 3(2): 148-152. doi: 10.1007/s13238-012-2015-8

CloudLCA: finding the lowest common ancestor in metagenome analysis using cloud computing

  • Estimating taxonomic content is a key problem in the analysis of metagenomic sequencing data. However, extracting such content from high-throughput next-generation sequencing data is very time-consuming with currently available software. Here, we present CloudLCA, a parallel LCA algorithm that significantly improves the efficiency of determining taxonomic composition in metagenomic data analysis. Results show that CloudLCA (1) has a running time that grows nearly linearly with dataset size, (2) displays linear speedup as the number of processors grows, especially for large datasets, and (3) reaches a speed of nearly 215 million reads per minute on a cluster with ten thin nodes. Compared with MEGAN, a well-known metagenome analyzer, CloudLCA is up to 5 times faster, and its peak memory usage is approximately 18.5% of MEGAN's when run on a fat node. CloudLCA can be run on a single multiprocessor node or on a cluster. It is expected to become part of MEGAN to accelerate read analysis: it generates the same output as MEGAN, which can be imported directly into MEGAN to complete the subsequent analysis. Moreover, CloudLCA is a general solution for finding the lowest common ancestor and can be applied in other fields that require an LCA algorithm.
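To make the core operation concrete, the following is a minimal serial sketch of LCA-based read assignment. It is not CloudLCA's implementation, and the abstract does not describe how the parallel version is organized. The toy taxonomy `PARENT`, the taxon names, and the helpers `lineage`, `lca_pair`, `lca`, and `assign_reads` are all hypothetical names introduced for illustration. The sketch assumes the taxonomy is stored as child-to-parent links with the root pointing to itself, and that each read comes with the list of taxa it matched.

```python
from functools import reduce

# Hypothetical toy taxonomy (assumption): child -> parent, root maps to itself.
PARENT = {
    "root": "root",
    "Bacteria": "root",
    "Proteobacteria": "Bacteria",
    "Firmicutes": "Bacteria",
    "Escherichia": "Proteobacteria",
    "Salmonella": "Proteobacteria",
    "Bacillus": "Firmicutes",
}


def lineage(taxon, parent=PARENT):
    """Return the path from `taxon` up to the root, starting with `taxon`."""
    path = [taxon]
    while parent[taxon] != taxon:
        taxon = parent[taxon]
        path.append(taxon)
    return path


def lca_pair(a, b, parent=PARENT):
    """Lowest common ancestor of two taxa: the first node on b's path to
    the root that also lies on a's path to the root."""
    ancestors_of_a = set(lineage(a, parent))
    for node in lineage(b, parent):
        if node in ancestors_of_a:
            return node
    raise ValueError("taxa do not share a root")


def lca(taxa, parent=PARENT):
    """Fold the pairwise LCA over a non-empty list of taxa."""
    return reduce(lambda a, b: lca_pair(a, b, parent), taxa)


def assign_reads(hits, parent=PARENT):
    """Map each read to the LCA of the taxa it matched.
    `hits` is a dict: read id -> non-empty list of matched taxa."""
    return {read: lca(taxa, parent) for read, taxa in hits.items()}


if __name__ == "__main__":
    hits = {
        "read1": ["Escherichia", "Salmonella"],
        "read2": ["Escherichia", "Bacillus"],
        "read3": ["Bacillus"],
    }
    print(assign_reads(hits))
```

In this sketch each read is assigned independently of every other read, which is the kind of workload that can in principle be split across processors or nodes. How CloudLCA actually partitions the work is not stated in the abstract.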
