6-dSQ batch submission

With dSQ, which builds on Slurm job arrays, you can quickly batch-submit a group of jobs that use similar resources and run very similar tasks with different parameters. The instructions for using dSQ are as follows:

Write a task list file

Create a new file, joblist.txt, and enter the tasks to run in it, one task per line, for example:

gatk GenomicsDBImport --genomicsdb-workspace-path ./AKCR1;
gatk GenomicsDBImport --genomicsdb-workspace-path ./AKCR2;
gatk GenomicsDBImport --genomicsdb-workspace-path ./AKCR3;
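
When the tasks follow a regular pattern, the list can be generated instead of typed by hand. The following is a minimal sketch, assuming the three workspace names from the example above; the loop variable `sample` is only illustrative.

```shell
# Write one GenomicsDBImport command per sample into joblist.txt.
# Assumption: the workspace names are AKCR1..AKCR3, as in the example above.
for sample in AKCR1 AKCR2 AKCR3; do
    echo "gatk GenomicsDBImport --genomicsdb-workspace-path ./${sample};"
done > joblist.txt

# Show the resulting task list, one task per line.
cat joblist.txt
```

Each line of the resulting file is one task, so adding a sample to the loop adds one task to the batch.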

Generate the Slurm job submission script with dSQ

First run module load dSQ to load the dSQ installed on the platform into the current terminal session, and then run the following command to generate the Slurm job submission script:

dsq --job-file joblist.txt -p q_cn -n 1 --mem-per-cpu 40g

joblist.txt is the task list file written in the previous step; -p q_cn submits the jobs to the q_cn queue; -n 1 requests one core for each task; --mem-per-cpu 40g gives each task 40 GB of memory.

After the command runs successfully, a `dsq-joblist-yyyy-mm-dd.sh` file is generated in the current directory, where `yyyy-mm-dd` is the creation date.

dsq-joblist-2019-08-01.sh:

#!/bin/bash
#SBATCH --array 0-9999
#SBATCH --output dsq-joblist-%A_%4a-%N.out
#SBATCH --job-name dsq-joblist
#SBATCH -p q_cn -n 1 --mem-per-cpu 40g
# DO NOT EDIT LINE BELOW
/usr/nzx-cluster/apps/dSQ/dSQBatch.py /GPFS/zhangli/DATA/vcf.call.dsq/joblist.txt /GPFS/zhangli/DATA/vcf.call.dsq

Submit the job

Run the following command to submit the job. One job is submitted for each task listed in `joblist.txt`.

sbatch dsq-joblist-yyyy-mm-dd.sh
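
Because the script name embeds its creation date, the name can be built rather than typed. The following is a small sketch, assuming the script was generated today in the current directory; the variable name `script` is illustrative.

```shell
# Assumption: the submission script was generated today, so its name
# ends with today's date in yyyy-mm-dd form.
script="dsq-joblist-$(date +%Y-%m-%d).sh"

if [ -f "$script" ]; then
    echo "Found $script; submit it with: sbatch $script"
else
    echo "No $script in this directory; check the date in the file name."
fi
```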

Job management

When a job ends, a job_jobid_status.tsv file appears in the current directory (jobid is the job ID, for example job_2629186_status.tsv). It records the following information about each job:

Job_ID: job ID
Exit_Code: program exit code
Hostname: name of the node the job ran on
Time_Started: start time
Time_Ended: end time
Time_Elapsed: total time elapsed
Job: the command that was run

In addition, you can check and kill jobs with Slurm's squeue and scancel commands.
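
The status file can also be inspected directly with standard tools. The following is a minimal sketch that assumes the file is tab-separated, has no header row, and stores the fields in the order listed above. The file names sample_status.tsv and sample_failed.txt and every value in the sample rows are made up for illustration.

```shell
# Build a small made-up status file: Job_ID, Exit_Code, Hostname,
# Time_Started, Time_Ended, Time_Elapsed, Job (tab-separated, no header).
{
    printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\n' 0 0 node01 '2019-08-01 10:00:00' '2019-08-01 10:05:00' 300 'gatk GenomicsDBImport --genomicsdb-workspace-path ./AKCR1;'
    printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\n' 1 1 node02 '2019-08-01 10:00:00' '2019-08-01 10:01:00' 60 'gatk GenomicsDBImport --genomicsdb-workspace-path ./AKCR2;'
    printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\n' 2 0 node01 '2019-08-01 10:05:00' '2019-08-01 10:09:00' 240 'gatk GenomicsDBImport --genomicsdb-workspace-path ./AKCR3;'
} > sample_status.tsv

# Save the command (column 7) of every task whose exit code (column 2) is non-zero.
awk -F'\t' '$2 != 0 { print $7 }' sample_status.tsv > sample_failed.txt

# Count the failed tasks.
failed_count=$(awk -F'\t' '$2 != 0 { n++ } END { print n + 0 }' sample_status.tsv)
echo "failed tasks: ${failed_count}"
```

If the real status file turns out to have a header row or a different column order, the awk conditions need to be adjusted accordingly.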

Job check

Run the following command, substituting your own task list file and status file:

dsqa jobsfile.txt job_2629186_status.tsv > failedjobs.txt 2> report.txt

Before using dSQ, run module load dSQ to load the software into the current terminal environment.
The command generates failedjobs.txt and report.txt, which record how many jobs succeeded, how many failed, and which jobs failed.
