khmer protocols

version: 0.8.5 (unreleased)

This is a set of protocols for doing genomic data analysis – specifically, de novo mRNAseq assembly and de novo metagenome assembly – in the cloud.

The latest released version of these protocols can always be found at:

If you need to reference these protocols, please cite:

Brown, C. Titus; Sheneman, Leigh; Scott, Camille; Crusoe, Michael;
Rosenthal, Josh; Howe, Adina Chuang (2013): khmer-protocols

Helpful instructions:


mRNAseq assembly: the Eel Pond Protocol

The Escambron mRNAseq Protocol

This is a lightweight protocol for assembling up to a few hundred million mRNAseq reads, annotating the resulting assembly, and performing differential expression analysis with RSEM.

Metagenome assembly: the Kalamazoo Protocol

The Kalamazoo Metagenome Assembly protocol

This is a protocol for assembling low- and medium-diversity metagenomes. Marine sediment and soil data sets may not be assemblable in the cloud just yet.

Additional information

Need help? Either post a comment at the bottom of the relevant page or sign up for the mailing list.

Have you used these protocols in a scientific publication? We’ll have citation instructions up soon.


khmer-protocols development has largely been supported by AFRI Competitive Grant no. 2010-65205-20361 from the USDA NIFA, and Award Number R25HG006243 from the National Institutes of Health, both to C. Titus Brown. We now have continuing support from the National Human Genome Research Institute of the National Institutes of Health under Award Number R01HG007513, also to C. Titus Brown.

CTB’s work on the Eel Pond mRNAseq tutorial was enabled by his 2013 summer research work at the Marine Biological Laboratory, funded by the Burr and Susie Steinbach Award and the Laura and Arthur Colwin Endowed Summer Research Fellowship Fund.


LICENSE: This documentation and all textual/graphic site content is licensed under the Creative Commons Zero (CC0) license -- fork @ github.