
A job array is specified by adding a task range to qsub via the -t flag:

% qsub -t 1-100 model.job

 

Your job-array NNNNNN.1-100:1 ("model.job") has been submitted

The scheduler will start 100 tasks, each running the job script file model.job, and pass to each task a task identifier (a number between 1 and 100) via an environment variable.


The syntax for the -t flag is -t n[-m[:s]], namely:

-t 1-20         run 20 tasks, with task IDs ranging from 1 to 20
-t 10-30        run 21 tasks, with task IDs ranging from 10 to 30
-t 50-140:10    run 10 tasks, with task IDs ranging from 50 to 140 in steps of 10 (50, 60, ..., 140)
-t 20           run 1 task, with task ID 20
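
The range syntax can be checked by hand before submitting. The following is a minimal sketch, written in POSIX sh rather than csh so it can be run on its own; the function name expand_range is hypothetical and is not a Grid Engine command. It only mimics how an argument of the form n[-m[:s]] maps to task IDs.

```shell
#!/bin/sh
# expand_range: hypothetical helper, not part of Grid Engine.
# Prints, one per line, the task IDs implied by a -t argument n[-m[:s]].
expand_range() {
  spec=$1
  first=${spec%%-*}            # text before the first "-" (the whole spec if none)
  last=$first                  # a bare "n" means a single task
  step=1                       # default step size
  case $spec in
    *-*)
      rest=${spec#*-}          # text after the first "-": "m" or "m:s"
      last=${rest%%:*}
      case $rest in
        *:*) step=${rest#*:} ;;
      esac
      ;;
  esac
  id=$first
  while [ "$id" -le "$last" ]; do
    echo "$id"
    id=$((id + step))
  done
}

# List the IDs for the third example above, joined on one line for display.
expand_range 50-140:10 | tr '\n' ' '
echo
```

Counting the lines of output (for instance with wc -l) gives the number of tasks the scheduler would be asked to start.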


Each instantiation of the job will have access to the following four environment variables:

SGE_TASK_ID          unique ID for the specific task
SGE_TASK_FIRST       ID of the first task, as specified with qsub
SGE_TASK_LAST        ID of the last task, as specified with qsub
SGE_TASK_STEPSIZE    task ID step size, as specified with qsub
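
As an illustration of how these four variables relate, here is a minimal sketch in POSIX sh (an assumption; the job scripts on this page use csh) that works out where a task sits in its array. The fallback values are hypothetical and only let the sketch run outside the scheduler.

```shell
#!/bin/sh
# Fallback values are assumptions so the sketch also runs outside the scheduler;
# inside a job array the scheduler sets all four variables.
: "${SGE_TASK_ID:=70}"
: "${SGE_TASK_FIRST:=50}"
: "${SGE_TASK_LAST:=140}"
: "${SGE_TASK_STEPSIZE:=10}"

# 1-based position of this task in the array, and total number of tasks.
index=$(( (SGE_TASK_ID - SGE_TASK_FIRST) / SGE_TASK_STEPSIZE + 1 ))
total=$(( (SGE_TASK_LAST - SGE_TASK_FIRST) / SGE_TASK_STEPSIZE + 1 ))
echo "task $SGE_TASK_ID is number $index of $total"
```

With the fallback values, which correspond to -t 50-140:10, task 70 is the third of ten tasks.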


You can also limit the number of concurrent tasks with the -tc flag, for example:

 % qsub -t 1-1000 -tc 100 model.job

will request 1,000 tasks, with no more than 100 running at the same time.

Example of a Job Script

The following example shows how to submit a job array, using embedded directives:

Example of job script to submit a job array
# /bin/csh
#
#$ -N model-1k -cwd -j y -o model-$TASK_ID.log
#$ -t 1-1000 -tc 100
#
echo + `date` $JOB_NAME started on $HOSTNAME in $QUEUE with jobID=$JOB_ID and taskID=$SGE_TASK_ID
#
set TID = $SGE_TASK_ID
./model -id $TID
#
echo = `date` $JOB_NAME for taskID=$SGE_TASK_ID done.
  • This example requests 1,000 model runs, using a task ID ranging from 1 to 1000, with no more than 100 running at the same time.
  • It assumes that the model computation is run with the command ./model -id N, where N is a number between 1 and 1,000.
  • This example also shows how to use the pseudo variable TASK_ID (not SGE_TASK_ID; yes, I agree this is confusing) to give the log file of each task a different name: the output of task 123 will be model-123.log in the current working directory.

 

Examples of How to Convert a Task ID to a More Useful Set of Parameters
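
Two common ways to turn a task ID into parameters are sketched below. This is an illustration under stated assumptions, not a prescribed method: it is written in POSIX sh rather than csh, and the grid width NCOLS, the parameter values, and the temporary parameter file are all hypothetical.

```shell
#!/bin/sh
# Assumption: outside the scheduler SGE_TASK_ID is unset, so default to 5.
: "${SGE_TASK_ID:=5}"

# (a) Task ID -> row and column on a hypothetical grid with NCOLS columns.
NCOLS=4
row=$(( (SGE_TASK_ID - 1) / NCOLS + 1 ))
col=$(( (SGE_TASK_ID - 1) % NCOLS + 1 ))
echo "task $SGE_TASK_ID -> row $row, column $col"

# (b) Task ID -> N-th line of a hypothetical parameter file,
#     one set of parameters per line.
params=$(mktemp)
printf '%s\n' "0.1 10" "0.1 20" "0.2 10" "0.2 20" "0.3 10" > "$params"
line=$(sed -n "${SGE_TASK_ID}p" "$params")
rm -f "$params"
echo "task $SGE_TASK_ID -> parameters: $line"
```

In a real job the parameter file would already exist, and the selected line would be passed to the computation in place of the raw task ID.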


Example of How to Consolidate Small Jobs into Fewer Larger Jobs When Using Job Arrays

There is some overhead each time the GE starts a job (or task). So if you need to compute, say, 5,000 similar tasks, each taking 3 minutes, it may be convenient to submit a 5,000-task job array, but it will be inefficient: the system will spend 25 to 50% of its time starting and keeping track of a slew of small jobs. The following script illustrates a simple trick to consolidate such computations, using the task ID step size so that each task runs several computations in sequence:

 

Example of job array consolidation wrapper script
# /bin/csh
#
[ add embedded directives here ]
#
# simple wrapper to consolidate using the step size
#
@ i = $SGE_TASK_ID
@ last = $i + $SGE_TASK_STEPSIZE - 1
if ($last > $SGE_TASK_LAST) @ last = $SGE_TASK_LAST
#
echo processing execute.sh from $i to $last
while ($i <= $last)
  ./execute.sh $i
  @ i++
end
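
The wrapper only consolidates work if the job array is submitted with a step size larger than 1. As an illustration, -t 1-5000:20 would start 250 tasks, each processing 20 consecutive IDs; the step of 20 is an assumed value, not a recommendation. The sketch below, in POSIX sh rather than csh, mirrors the wrapper's arithmetic to show which IDs a given task would process. The function name task_range is hypothetical.

```shell
#!/bin/sh
# task_range: hypothetical helper mirroring the wrapper's arithmetic.
# Arguments: task ID, step size, last ID of the array.
# Prints the first and last IDs that the task would process.
task_range() {
  first=$1
  last=$(( $1 + $2 - 1 ))
  if [ "$last" -gt "$3" ]; then
    last=$3                 # clip at the end of the array
  fi
  echo "$first $last"
}

# Illustrative values: 5,000 computations in chunks of 20 (as with -t 1-5000:20).
task_range 1 20 5000      # first task of the array
task_range 4981 20 5000   # last task of the array
task_range 4981 30 5000   # a step that does not divide evenly is clipped at 5000
```

The clipping step is what lets the last task stop at the end of the array when the step size does not divide the range evenly.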

 

 

 



Last updated SGK.

 
