
Sometimes you want to run many almost identical jobs simultaneously, for example running the same program many times while changing the input data or some argument or parameter. One possible solution is to write a script that creates all the qsub files and then a BASH script that submits them. This is very time consuming and might end up submitting many more jobs to the queue than you actually need. This is a typical problem suited to an SGE task array.

Advantages of Array Jobs: 
  • You only need to submit one job to run a series of very similar tasks;

  • These tasks are independent and do not all need to run at once so the job scheduler can efficiently run one or more queued tasks as the requested computational resources become available;

  • They are particularly useful for embarrassingly parallel problems such as:

    • Monte Carlo simulations (where $SGE_TASK_ID might correspond to a random number seed);

    • Parameter sensitivity analysis;

    • Batch file processing (where $SGE_TASK_ID might refer to a file in a list of files to be processed). 
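As an illustration of the Monte Carlo case, here is a minimal sketch of deriving a distinct seed for each task from $SGE_TASK_ID. The task range 1-100, the base seed 1000, the helper name seed_for_task and the program ./simulate are all assumptions made for this example, not part of any real setup.

```shell
#!/bin/bash
#$ -t 1-100

# Hypothetical helper: add the task index ($2) to a base seed ($1)
# so that every task in the array gets a different seed.
seed_for_task() {
    echo $(( $1 + $2 ))
}

# Fall back to task 1 when the script is run outside the scheduler.
task_id="${SGE_TASK_ID:-1}"
seed=$(seed_for_task 1000 "$task_id")
echo "task ${task_id} uses seed ${seed}"

# A hypothetical simulation would then be started with this seed, e.g.:
# ./simulate --seed "$seed"
```

Deriving the seed from the task index keeps runs reproducible, because resubmitting the same array gives each task the same seed again.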

Example:

Create a job submission script for the array job. In this example, myprog runs 72 times, once for each entry in your subjects list, after you submit the job with:

$ qsub -v subjects=subjects.txt
For more qsub options, please refer to the SGE documentation.
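A job script for the example above might look like the following sketch. It assumes that subjects.txt contains 72 lines with one subject per line, that myprog sits in the submission directory and takes the subject as its only argument, and that the helper pick_subject is a name invented here for illustration. The variable $subjects is the one passed in with -v on the qsub command line.

```shell
#!/bin/bash
#$ -S /bin/bash
#$ -cwd
#$ -t 1-72

# Hypothetical helper: print line number $1 of the file $2.
pick_subject() {
    sed -n "${1}p" "$2"
}

# SGE sets SGE_TASK_ID to the task index (1..72 here) for array jobs
# and to the string "undefined" for ordinary jobs, so the program is
# only started when the script runs as an array task.
if [ -n "${SGE_TASK_ID:-}" ] && [ "${SGE_TASK_ID}" != "undefined" ]; then
    # $subjects comes from: qsub -v subjects=subjects.txt
    subject=$(pick_subject "$SGE_TASK_ID" "$subjects")
    # Assumption: myprog accepts the subject as its single argument.
    ./myprog "$subject"
fi
```

The -t 1-72 line is what turns the submission into an array job: the scheduler runs the script once for each index, and each task reads a different line of the subjects list.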
