You can limit how many jobs you submit with the following trick:
How to limit the number of jobs submitted, using C-shell syntax:

    # define how many jobs to queue
    @ NMAX = 250
    #
    loop:
      @ N = `qstat -u $USER | tail --lines=+3 | wc -l`
      if ($N >= $NMAX) then
        sleep 180
        goto loop
      endif
    #
This example counts how many jobs you have in the queue (running and waiting) using the command qstat (and wc -l) and pauses for 3 minutes (180 seconds) if that count is 250 or higher.
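For submission scripts written in Bourne shell rather than C-shell, the same throttling logic might be sketched as below. This is only an illustration under stated assumptions: the function names count_jobs and wait_for_room, and the PAUSE variable, are hypothetical and not part of any local tool; the qstat pipeline mirrors the C-shell example above.

```shell
#!/bin/sh
# Sketch: Bourne-shell version of the C-shell throttling loop.
# count_jobs, wait_for_room and PAUSE are illustrative names (assumptions).

NMAX=${NMAX:-250}     # maximum number of jobs to keep in the queue
PAUSE=${PAUSE:-180}   # seconds to sleep between checks

# Count your jobs (running and waiting), skipping the two header
# lines that qstat prints before the job list.
count_jobs() {
  qstat -u "$USER" | tail -n +3 | wc -l | tr -d ' '
}

# Return only once fewer than NMAX jobs are in the queue.
wait_for_room() {
  while [ "$(count_jobs)" -ge "$NMAX" ]; do
    sleep "$PAUSE"
  done
}
```

A submission loop would call wait_for_room before each qsub, so that the queue never holds more than NMAX of your jobs at once.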
You would include these lines in a script that submits a slew of jobs, but should not queue more than a given number at any time (to count only the queued jobs, add
- Or you can use the tool q-wait (needs the module tools/local), which takes an argument and two options:
% q-wait blah
will pause until you have no job whose name contains the string 'blah' left queued or running.
- The options allow you to specify the number of jobs, and how often to check, i.e.:
% q-wait -N 125 -wait 3600 crunch
will pause until there are 125 or fewer jobs whose name contains the string 'crunch' left queued or running, checking once an hour.
- Avoid using the -V flag: it passes all the active environment variables to the script.
- While it may be convenient in some instances, it creates a dependency on the precise environment configuration at the time the job is submitted, so the same job script may fail when it is submitted later (or from a different login) under a different configuration.
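One way to avoid that dependency is to have the job script define what it needs itself. The sketch below is a minimal illustration, not a validated Hydra example: the job name envdemo, the function setup_env, and the variable MYAPP_DATA are hypothetical, and it assumes the scheduler sets NSLOTS for parallel jobs.

```shell
#!/bin/sh
#$ -cwd -j y -N envdemo -o envdemo.log
#
# Sketch: set the environment inside the job script instead of
# inheriting the submitting shell's environment with -V.
setup_env() {
  # NSLOTS is assumed to be set by the scheduler for parallel jobs;
  # fall back to a single thread when it is not defined.
  OMP_NUM_THREADS="${NSLOTS:-1}"
  # MYAPP_DATA is a hypothetical variable, set unconditionally so the
  # job does not depend on whatever the submitting shell had defined.
  MYAPP_DATA="$HOME/myapp/data"
  export OMP_NUM_THREADS MYAPP_DATA
}

setup_env
echo "threads=$OMP_NUM_THREADS data=$MYAPP_DATA"
```

Because every value is set in the script, submitting it later or from a different login should produce the same environment for the job.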
[all examples need to be validated for Hydra-5]
You can find examples of simple/trivial test cases with source code,
Makefile, job script files and resulting log files under
- QSubGen: the job script generator. [QSubGen has yet to be adjusted to reflect the new memory reservation rule]
What Queue Should I Use?
To choose a queue, you need to know
- You may need to combine PE, memory and CPU resource requests.
- Remember that the more resources your job requests, the fewer similar jobs can run concurrently.
- Similar jobs will need similar resources, so when in doubt and before queuing a slew of similar jobs:
  - run one job and monitor its resource usage, then
  - queue the other jobs after trimming the requested resources (CPU and memory).
The local tool check-jobs.pl allows you to check the resources consumed by jobs that have completed.
Use the command qconf -srqs or qquota; see how to check under resource limits.
The local tool check-qwait allows you to visualize the queue quota resources and which jobs are waiting.
Last Updated SGK