The SPAdes assembler is often used for microbial and organelle genome assembly, as well as for assembling data from target enrichment methods such as UCE or exon capture.
Specifying memory
The -m <int> option for spades.py should be used to specify the amount of memory (in GB) you have reserved in your qsub submission (mres). If you do not specify a value, -m defaults to 250. When you specify a value for -m that matches what you reserved, SPAdes adjusts its buffer sizes to reduce RAM usage.
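One way to keep -m in step with the reservation is to compute it inside the job script. The following is a minimal sketch, assuming the total reservation equals the slot count multiplied by the per-slot memory (as 8 x 12G = 96G does in the job script on this page). PER_SLOT_GB and SPADES_MEM_GB are hypothetical names chosen for illustration, not scheduler or SPAdes variables.

```shell
# NSLOTS is set by the scheduler inside a job; 8 is only a stand-in
# so the snippet can also be tried outside a job.
NSLOTS=${NSLOTS:-8}

# Hypothetical variable: must match the per-slot value (h_data) in your qsub options.
PER_SLOT_GB=12

# Assumption: total reserved memory = slots x per-slot memory.
SPADES_MEM_GB=$(( NSLOTS * PER_SLOT_GB ))

echo "pass to spades.py: -m $SPADES_MEM_GB"
```

The computed value can then be used as -m $SPADES_MEM_GB in place of a hard-coded number, so that changing the slot count does not leave -m out of date.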
SPAdes temporary files
spades.py can create many temporary files, more than one million in some cases. By default, the temporary files are placed in a tmp directory inside the output directory you specify with -o. The large number of files can cause you to exceed your inode quota or adversely affect the operation of the GPFS (/scratch et al.) and NetApp (/pool et al.) filesystems.
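To get a sense of how many files a run is creating, the tmp directory can be counted while the job is running. This is a rough sketch using standard find and wc; count_files and OUT_DIR are hypothetical names, and OUT_DIR is assumed to be the directory passed to spades.py with -o.

```shell
# Print the number of regular files under the directory given as $1.
count_files() {
  find "$1" -type f | wc -l | tr -d ' '
}

# Hypothetical name: the directory passed to spades.py with -o.
OUT_DIR=${OUT_DIR:-sample}

# Only report when the default tmp location exists.
if [ -d "$OUT_DIR/tmp" ]; then
  echo "files under $OUT_DIR/tmp: $(count_files "$OUT_DIR/tmp")"
fi
```

Walking a directory that holds a very large number of files is itself a load on the filesystem, so a check like this is best run sparingly.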
To avoid this, it is recommended to use the local SSD space for the SPAdes temporary files.
This example requests 200 GB of local SSD from the scheduler (-l ssd_res=200G) and then uses it as the spades.py tmp directory by adding --tmp-dir $SSD_DIR. The addition of -v SSD_SAVE_MAX=0 directs the job scheduler not to save any of the files from the SSD storage space, since the temporary files spades.py creates do not need to be retained.
# /bin/sh
# ----------------Parameters---------------------- #
#$ -S /bin/sh
#$ -pe mthread 8
#$ -q lThM.q
#$ -l mres=96G,h_data=12G,h_vmem=12G,himem
#$ -l ssd_res=200G -v SSD_SAVE_MAX=0
#$ -cwd
#$ -j y
#$ -N spades
#$ -o spades.log
#
# ----------------Modules------------------------- #
module load bio/spades
module load tools/ssd
# ----------------Your Commands------------------- #
#
echo + `date` job $JOB_NAME started in $QUEUE with jobID=$JOB_ID on $HOSTNAME
echo + NSLOTS = $NSLOTS
spades.py \
  -o sample \
  --pe1-1 sample_R1_PE_trimmed.fastq.gz \
  --pe1-2 sample_R2_PE_trimmed.fastq.gz \
  --tmp-dir $SSD_DIR \
  -m 96 \
  -t $NSLOTS
#
echo = `date` job $JOB_NAME done
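If the same script may also be run where no SSD space was reserved, a small guard can choose the tmp directory before spades.py is called. This is a sketch only: pick_tmp_dir and TMP_DIR are hypothetical names, and falling back to the default tmp location inside the output directory is an assumption about what you would want, not a site policy.

```shell
# Print the directory to use for SPAdes temporary files.
#   $1: candidate SSD directory (may be empty)
#   $2: output directory passed to spades.py with -o
pick_tmp_dir() {
  if [ -n "$1" ] && [ -d "$1" ]; then
    # SSD space is available: use it.
    printf '%s\n' "$1"
  else
    # Assumed fallback: the default tmp directory inside the output directory.
    printf '%s\n' "$2/tmp"
  fi
}

# ${SSD_DIR:-} expands to an empty string when SSD_DIR is unset.
TMP_DIR=$(pick_tmp_dir "${SSD_DIR:-}" sample)
echo "tmp dir: $TMP_DIR"
```

The result would then be passed as --tmp-dir "$TMP_DIR". Given the file-count concerns described above, stopping the job when SSD_DIR is missing may be the safer choice for large assemblies.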
Last updated MPK