=========================================
4. Evaluating your transcriptome assembly
=========================================

We will be using Transrate and BUSCO!

.. shell start

.. ::

   set -x
   set -e

Be sure you have loaded the right Python packages::

   source ~/venv/bin/activate

Transrate
---------

Transrate serves two main purposes. It can compare two assemblies to see
how similar they are, or it can give you a score that represents the
proportion of input reads that provide positive support for the assembly.
We will use Transrate to get a score for the assembly, using the trimmed
reads. For a further explanation of the metrics and how to run the
reference-based Transrate, see the documentation:
http://hibberdlab.com/transrate/metrics.html and the paper by
Smith-Unna et al. 2016.

Make a new directory and get the reads together::

   cd ${PROJECT}
   mkdir -p evaluation
   cd evaluation

   cat ${PROJECT}/quality/*R1*.qc.fq.gz > left.fq.gz
   cat ${PROJECT}/quality/*R2*.qc.fq.gz > right.fq.gz

Transrate doesn't like pipes in sequence names. This version of Trinity
doesn't put pipes in the sequence names, but others do. Let's replace them
anyway, just to be sure::

   sed 's_|_-_g' ${PROJECT}/assembly/trinity_out_dir/Trinity.fasta > Trinity.fixed.fasta

Now, run the actual command::

   module load transrate
   transrate --assembly=Trinity.fixed.fasta --threads=2 \
       --left=left.fq.gz \
       --right=right.fq.gz \
       --output=${PROJECT}/evaluation/nema
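
If Transrate complains about its inputs, two quick checks may help narrow
things down: whether any pipes are left in the sequence names, and whether
the left and right read files contain the same number of reads. The sketch
below is only one way to do this. The helper names ``count_pipe_headers``
and ``count_fastq_reads`` are hypothetical (they are not part of Transrate),
and the read count assumes standard four-line FASTQ records::

   # Hypothetical helpers; the names are illustrative, not part of Transrate.

   # Count FASTA header lines that still contain a pipe character.
   count_pipe_headers() {
       # grep -c prints 0 but exits with status 1 when nothing matches;
       # "|| true" keeps that from aborting a script run under "set -e".
       grep -c '^>.*|' "$1" || true
   }

   # Count reads in a gzipped FASTQ file, assuming four lines per record.
   count_fastq_reads() {
       local lines
       lines=$(gzip -dc "$1" | wc -l)
       echo $(( lines / 4 ))
   }

   # Example usage, assuming you are in ${PROJECT}/evaluation:
   #   count_pipe_headers Trinity.fixed.fasta
   #   count_fastq_reads left.fq.gz
   #   count_fastq_reads right.fq.gz

After the ``sed`` step, the first call should print 0, and the two read
counts should be equal for properly paired files.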