A Few Simple Trinity Tips
After running Trinity for over 24 hours on a huge RNAseq library, we were greeted with this message -
It is not what you are thinking. We were initially fooled by the message and thought the program had run out of RAM. In fact, the program crashed because Trinity's shell commands use absolute paths instead of relative paths. While Trinity was in its 15th hour of running Meryl, we moved the topmost folder so that the large files would not crowd out our backup system. That little effort to speed things up ended up delaying the run by hours. When Meryl completed, Inchworm started and immediately crashed, because it could not find any of its files.
What is the best way to rerun Trinity while skipping Meryl or Inchworm (because they already completed), or to run it step by step? The Trinity command line does not give us any such option, and we are too intellectually challenged to go through the manuals or code. We found an easy way: run Trinity on a short library and capture the screen output in a file. The screen output generated by Trinity is very detailed, and it prints the complete commands, with all parameters, for the Inchworm, Chrysalis and Butterfly steps. Those commands can then be pointed at the large library and run one at a time.
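A minimal sketch of how the captured screen output could be sifted for the per-step commands. It assumes only that each command line mentions its step name somewhere; the exact log layout, the function names and the path arguments are illustrative and do not come from Trinity itself.

```python
import re

# Step names are taken from the post; how they appear in the log is an assumption.
STEP_NAMES = ("meryl", "inchworm", "chrysalis", "butterfly")


def extract_step_commands(log_lines, steps=STEP_NAMES):
    """Group log lines by the first step name they mention (case-insensitive)."""
    pattern = re.compile("|".join(re.escape(s) for s in steps), re.IGNORECASE)
    found = {s: [] for s in steps}
    for line in log_lines:
        match = pattern.search(line)
        if match:
            found[match.group(0).lower()].append(line.strip())
    return found


def relocate(command, old_root, new_root):
    """Swap an absolute path prefix, e.g. after the run folder has been moved."""
    return command.replace(old_root, new_root)
```

Passing an open file handle of the captured output to `extract_step_commands` returns a dictionary of candidate command lines per step. `relocate` is a plain string substitution, so it is worth reading each rewritten command before running it.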
The above suggestion may also come in handy if you would like to run your own k-mer counting code before using Trinity for the remaining steps. You will have to write the output in meryl format, and then tell Inchworm and the other steps that you used meryl.
meryl output format:
377
AAATGTGCTGGTAAA
44
AAATTTGCTGGTAAA
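A rough sketch of a home-grown k-mer counter that writes its counts in the layout shown above, a count line followed by the k-mer line. The function names and the `min_count` and `header_prefix` parameters are made up for illustration. The counter does not merge reverse complements, and it skips any k-mer containing a character other than A, C, G or T. If a real meryl dump on your system puts a `>` before each count (FASTA style), pass `header_prefix=">"`. That detail is an assumption and worth checking against an actual dump.

```python
from collections import Counter


def count_kmers(sequences, k):
    """Count every k-mer of length k in the given sequences (no canonicalization)."""
    counts = Counter()
    valid = set("ACGT")
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Skip k-mers containing N or any other ambiguous base.
            if set(kmer) <= valid:
                counts[kmer] += 1
    return counts


def format_counts(counts, min_count=1, header_prefix=""):
    """Render counts as alternating count / k-mer lines, sorted by k-mer."""
    parts = []
    for kmer, n in sorted(counts.items()):
        if n >= min_count:
            parts.append(f"{header_prefix}{n}\n{kmer}\n")
    return "".join(parts)
```

Writing the result of `format_counts(count_kmers(reads, k))` to a file gives something shaped like the listing above. Whether Inchworm accepts it unchanged should be confirmed on a small library first.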