Submitting an array of jobs
A common situation is having a large number of jobs to run that are largely identical in the command they execute. For example, you may have 1000 data sets and want to run a single program on each of them using the cluster. A quick solution is to generate 1000 separate jobscripts and submit them all to the queue. This is inefficient both for you and for the scheduler master node.
Grid-scheduler allows users to submit a single job made up of a number of separate tasks; these are scheduled under a single job ID, making it simple to track, prioritize or cancel all the tasks together. When the example jobscript below is submitted, an array of 10,000 tasks will be generated and executed on the queue selected.
The alces template tool contains a simple base array job script, which echoes the following:
echo "Executing job commands, current working directory is $(pwd)"
echo "This is an example array job, I was task number $SGE_TASK_ID and I ran on `hostname -s` as `whoami`" > $OUTPUT_PATH/test.output.$SGE_TASK_ID
echo "Output file has been generated, please check $OUTPUT_PATH/test.output"
Use the alces template copy tool previously shown to copy and edit the array job script.
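As an illustration of the kind of edit you might make, the sketch below maps each task number to its own input file, which suits the 1000 data sets example. It is a minimal sketch and not the contents of the alces template: the file naming scheme (dataset.N.dat) and the helper function input_for_task are hypothetical, and the task range of 1-1000 is an assumption chosen to match that example.

```shell
#!/bin/bash
# Hypothetical array jobscript; directive values and file names are assumptions.
#$ -t 1-1000    # request tasks numbered 1 to 1000 (one per data set)
#$ -cwd         # run each task from the submission directory

# Hypothetical helper: turn a task number into an input file name.
input_for_task() {
    printf 'dataset.%s.dat\n' "$1"
}

# The scheduler sets SGE_TASK_ID for each task; default to 1 so the
# script can also be tried by hand outside the queue.
TASK_ID="${SGE_TASK_ID:-1}"
INPUT_FILE="$(input_for_task "$TASK_ID")"

echo "Task $TASK_ID would process $INPUT_FILE"
```

Each task runs the same script but sees a different SGE_TASK_ID, so a single submission covers every data set without generating separate jobscripts.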
The script can be submitted as normal with the qsub command and is displayed by grid-scheduler as a single job with multiple parts:
[alces-cluster@login1(awscluster) ~]$ qsub array_job.sh
Your job-array 7.1-10000:1 ("array_job.sh") has been submitted
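The submission message has the form JOBID.FIRST-LAST:STEP, so here job 7 contains tasks 1 to 10000 in steps of 1. If you need these values in a wrapper script, one possible approach is to split the message with shell parameter expansion. The sketch below works on a copy of the message shown above and assumes that the third word of the message always has this form.

```shell
#!/bin/bash
# Sample message copied from the qsub output shown above.
msg='Your job-array 7.1-10000:1 ("array_job.sh") has been submitted'

# Assumption: the third whitespace-separated word is JOBID.FIRST-LAST:STEP
spec=$(echo "$msg" | awk '{print $3}')

job_id=${spec%%.*}   # text before the first dot
range=${spec#*.}     # text after the first dot, FIRST-LAST:STEP
first=${range%%-*}   # first task number
rest=${range#*-}     # LAST:STEP
last=${rest%%:*}     # last task number
step=${rest#*:}      # step between task numbers

echo "job=$job_id first=$first last=$last step=$step"
```

Because all tasks share the one job ID, that ID is what you would pass to the usual scheduler commands, for example qstat -j 7 to inspect the job or qdel 7 to cancel every task in the array.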