Announcement: elPrep open source project
We have just released elPrep, a high-performance tool for preparing SAM/BAM/CRAM files for variant calling in DNA sequencing pipelines. It can replace standard tools such as SAMtools and Picard for preparation steps such as sorting, marking duplicates, and reordering contigs, while producing identical results. elPrep is designed as a multi-threaded application that runs entirely in memory, avoids repeated file I/O, and merges the computations of several preparation steps, which speeds up execution by an order of magnitude. For example, on a 16-core server, we see a speedup of 10.5x with elPrep compared to a combination of SAMtools and Picard. elPrep is also a modular, extensible framework: users can easily add more preparation steps, which automatically take advantage of elPrep’s inherent parallelism and performance.
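To illustrate what merging several preparation steps means, here is a minimal sketch in Python. It is not elPrep's API or implementation (elPrep is written in Common Lisp); the names `Record`, `Step`, `drop_unmapped`, `strip_chr_prefix`, and `run_merged` are hypothetical and exist only for this example. It shows the general idea of composing per-record steps so that the input is traversed once, rather than once per tool with an intermediate file in between.

```python
from typing import Callable, Dict, Iterable, List, Optional

# A record is modeled as a plain dict; real SAM alignments carry more fields.
Record = Dict[str, object]
# A step returns the (possibly modified) record, or None to drop it.
Step = Callable[[Record], Optional[Record]]


def drop_unmapped(record: Record) -> Optional[Record]:
    # In the SAM format, FLAG bit 0x4 marks an unmapped read.
    return None if int(record["flag"]) & 0x4 else record


def strip_chr_prefix(record: Record) -> Optional[Record]:
    # Hypothetical normalisation step: "chr1" becomes "1".
    name = str(record["rname"])
    if name.startswith("chr"):
        return {**record, "rname": name[3:]}
    return record


def run_merged(records: Iterable[Record], steps: List[Step]) -> List[Record]:
    """Apply every step to each record in a single pass over the input."""
    kept: List[Record] = []
    for record in records:
        current: Optional[Record] = record
        for step in steps:
            current = step(current)
            if current is None:
                break  # a step dropped the record; skip the remaining steps
        if current is not None:
            kept.append(current)
    return kept


reads = [
    {"qname": "r1", "flag": 0, "rname": "chr1", "pos": 100},
    {"qname": "r2", "flag": 4, "rname": "*", "pos": 0},
    {"qname": "r3", "flag": 16, "rname": "chr2", "pos": 50},
]
result = run_merged(reads, [drop_unmapped, strip_chr_prefix])
```

In this sketch, adding another preparation step means appending one more function to the list; the single traversal stays the same. Steps that need a global view of the data, such as sorting or duplicate marking, do not fit this per-record shape and would need separate handling.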
elPrep has been developed at Imec Belgium at the ExaScience Life Lab (http://www.exascience.com), in collaboration with Intel and Janssen Pharmaceutica (Johnson & Johnson). The application is implemented in Common Lisp, but relies on the symmetric multiprocessing features of LispWorks.
The open source release is available at http://github.com/exascience/elprep with full end-user documentation. A demo, including test data, is available at http://github.com/exascience/elprep-demo. The API documentation at http://exascience.github.io/elprep/elprep-package/ provides details about the elPrep framework.
---
Charlotte Herzeel, PhD
Researcher at ExaScience Life Lab (via IMEC)
Address: Kapeldreef 75, 3001 Leuven, Belgium
_______________________________________________
Lisp Hug - the mailing list for LispWorks users
lisp-hug@lispworks.com
http://www.lispworks.com/support/lisp-hug.html