[[!meta title="Batch Queue Job Control"]]
You can submit jobs to the batch queue for later processing with
`qsub`. Batch queueing can get pretty fancy, so `qsub` comes with
lots of options (see `man qsub`). For the most part, you can trust
your sysadmin to have set up some good defaults, and not worry about
setting any options explicitly. As you get used to the batch queue
system, you'll want tighter control of how your jobs execute by
invoking more sophisticated options yourself, but don't let that scare
you off at the beginning. They are, after all, only *options*. This
paper will give you a good start on the options I find myself using
most often.
Submitting jobs
===============

The simplest example of a job submission is:

    $ echo "sleep 30 && echo 'Running a job...'" | qsub
    2705.n0.physics.drexel.edu

This submits a job executing `sleep 30 && echo 'Running a job...'`
to the queue. The job gets an identifying ID in the queue, which
`qsub` prints to `stdout`.
You can check the status of your job in the queue with `qstat`.

    $ qstat
    Job id            Name             User            Time Use S Queue
    ----------------- ---------------- --------------- -------- - -----
    2705.n0           STDIN            sysadmin               0 Q batch

There is more information on `qstat` in the [qstat section][sec.qstat].
If your job is too complicated to fit on a single line, you can save
it in a script:

    #!/bin/bash
    # file: echo_script.sh
    sleep 30
    echo 'Running a job...'

and submit the script:

    $ qsub echo_script.sh
    2706.n0.physics.drexel.edu
All the arguments discussed in later sections for the command line
should have comment-style analogs that you can enter in your script if
you use the script-submission approach with `qsub`.
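For example, a job script can carry its own submission options as `#PBS`
comment lines, which `qsub` parses as if they had been given on the
command line. A minimal sketch (the particular directives shown are
illustrative; check `man qsub` for the ones your installation supports):

```shell
#!/bin/bash
# Write a job script that carries its own qsub options as "#PBS"
# directive comments (qsub parses them; bash treats them as comments).
cat > echo_script.sh <<'EOF'
#!/bin/bash
#PBS -N myjob
#PBS -l walltime=15:00
#PBS -j oe
echo 'Running a job...'
EOF

# On the cluster you would submit it with:  qsub echo_script.sh
# Outside PBS the directives are ordinary comments, so it still runs:
bash echo_script.sh
```

Submitting this script should then behave like running
`qsub -N myjob -l walltime=15:00 -j oe` on the same job.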
Note that you *cannot* run executables directly with `qsub`. For
example,

    $ qsub python script.py

will fail because `python` is an executable. You must either pipe
the command to `qsub`,

    $ echo python script.py | qsub

wrap your [[Python]] script in a [[Bash]] script,

    $ cat wrapper.sh
    #!/bin/bash
    python script.py
    $ qsub wrapper.sh

or run your Python script directly (relying on the sha-bang):

    $ qsub script.py
IO: Job names and working directories
-------------------------------------
You will often be interested in the `stdout` and `stderr` output from
your jobs. The batch queue system saves this information for you (in
the directory from which you called `qsub`) in two files,
`<jobname>.o<jobID-number>` and `<jobname>.e<jobID-number>`. We have
seen job IDs before; they're just the numeric part of the `qsub`
output (or the first field in the `qstat` output). Job IDs are
assigned by the batch queue server and are unique to each job. Job
names are assigned by the job submitter (that's you) and need not be
unique. They give you a method for keeping track of which job is
doing which task, since you have no control over the job ID. The
combined `<jobname>.<jobID-number>` pair is both unique (for the
server) and recognizable (for the user), which is why it's used to
label the output data from a given job. You control the job name by
passing the `-N <jobname>` argument to `qsub`.
    $ echo "sleep 30 && echo 'Running a job...'" | qsub -N myjob
    2707.n0.physics.drexel.edu
    $ qstat
    Job id            Name             User            Time Use S Queue
    ----------------- ---------------- --------------- -------- - -----
    2707.n0           myjob            sysadmin               0 Q batch
Perhaps you are fine with the `stdout` and `stderr` output, but the
default naming scheme, even with the job name flexibility, is too
restrictive. No worries, `qsub` lets you specify exactly which files
you'd like to use with the unsurprisingly named `-o` and `-e` options.

    $ echo "echo 'ABC' && echo 'DEF' > /dev/stderr" | qsub -o my_out -e my_err
    2708.n0.physics.drexel.edu
A time will come when you are no longer satisfied with `stdout` and
`stderr` and you want to open your own files or, worse, run a program!
Because no sane person uses absolute paths all the time, we need to
know what directory we're in so we can construct our relative paths.
You might expect that your job will execute from the same directory
that you called `qsub` from, but that is not the case. I think the
reason is that the directory is not guaranteed to exist on the host
that eventually runs your program. In any case, your job will begin
executing in your home directory. Writing relative paths from your
home directory is about as annoying as writing absolute paths, so
`qsub` gives your script a nifty environment variable, `PBS_O_WORKDIR`,
which is set to the directory you called `qsub` from. Since *you*
know that this directory exists on the hosts (since the home
directories are NFS mounted on all of our cluster nodes), you can move
to that directory yourself, using something like
    $ echo 'pwd && cd $PBS_O_WORKDIR && pwd' | qsub
    2709.n0.physics.drexel.edu
    $ cat STDIN.o2709
    /home/sysadmin
    /home/sysadmin/howto/cluster/pbs_queues
Note that if we had enclosed the `echo` argument in double quotes (`"`),
we would have to escape the `$` symbol in our `echo` argument so that
it survives the shell expansion and makes it safely into `qsub`'s
`stdin`.
Walltimes
---------

If you have jobs that may take longer than the default wall time
(currently 1 hour), you will need to tell the job manager. Walltimes
may seem annoying, since you don't really know how long a job will run
for, but they protect the cluster from people running broken programs
that waste nodes looping around forever without accomplishing
anything. Therefore, your walltime doesn't have to be exactly, or even
close to, your actual job execution time. Before submitting millions
of long jobs, it is a good idea to submit a timing job to see how long
your jobs should run for. Then set the walltime a factor of 10 or so
above the measured time.
    $ echo "time (sleep 30 && echo 'Running a job...')" | qsub -j oe
    2710.n0.physics.drexel.edu

    $ echo "sleep 30 && echo 'Running a job...'" | qsub -l walltime=15:00
    2711.n0.physics.drexel.edu
    $ qstat -f | grep '[.]walltime'
        Resource_List.walltime = 00:15:00
You can set walltimes in `[[H:]M:]S` format, where the numbers of
hours, minutes, and seconds are positive integers. I passed the
`-j oe` option, which combines the `stdout` and `stderr` streams into
`stdout`, because `time` prints to `stderr`. Walltimes are only
accurate on the order of minutes and above, but you probably shouldn't
be batch queueing jobs that take less time anyway.
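Padding the measured time by hand gets tedious if you do it often.
Here is a small sketch (the `pad_walltime` helper is hypothetical, not
part of PBS) that multiplies a measured runtime in seconds by 10 and
prints it in `H:M:S` form, ready for `-l walltime=...`:

```shell
#!/bin/bash
# Hypothetical helper: pad a measured runtime (in seconds) by a
# factor of 10 and format it as H:M:S for `qsub -l walltime=...`.
pad_walltime() {
    local secs=$(( $1 * 10 ))
    printf '%d:%02d:%02d\n' $(( secs / 3600 )) $(( secs % 3600 / 60 )) $(( secs % 60 ))
}

pad_walltime 42    # a 42-second timing run -> request 0:07:00
```

You could then submit with something like
`echo "..." | qsub -l walltime=$(pad_walltime 42)`.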
Dependencies
------------

You will often find yourself in a situation where the execution of one
job depends on the output of another job. For example, `jobA` and
`jobB` generate some data, and `jobC` performs some analysis on that
data. It wouldn't do for `jobC` to go firing off as soon as there was
a free node, if there was no data available yet to analyze. We can
deal with *dependencies* like these by passing a
`-W depend=<dependency-list>` option to `qsub`. The dependency list
can get pretty fancy (see `man qsub`), but for the case outlined
above, we'll only need `afterany` dependencies (because `jobC` should
execute after jobs `A` and `B` terminate, regardless of their exit
status).
Looking at the `man` page, the proper format for our dependency list
is `afterany:jobid[:jobid...]`, so we need to catch the job IDs output
by `qsub`. We'll use [[Bash's|Bash]] command substitution
(`$(command)`) for this.
    $ AID=$(echo "cd \$PBS_O_WORKDIR && sleep 30 && echo \"we're in\" > A_out" | qsub)
    $ BID=$(echo "cd \$PBS_O_WORKDIR && sleep 30 && pwd > B_out" | qsub)
    $ CID=$(echo "cd \$PBS_O_WORKDIR && cat A_out B_out" | qsub -W depend=afterany:$AID:$BID -o C_out)
    $ echo -e "A: $AID\nB: $BID\nC: $CID"
    A: 2712.n0.physics.drexel.edu
    B: 2713.n0.physics.drexel.edu
    C: 2714.n0.physics.drexel.edu
    $ qstat
    Job id                    Name             User            Time Use S Queue
    ------------------------- ---------------- --------------- -------- - -----
    2712.n0                   STDIN            sysadmin               0 R batch
    2713.n0                   STDIN            sysadmin               0 R batch
    2714.n0                   STDIN            sysadmin               0 H batch
    $ cat C_out
    we're in
    /home/sysadmin/howto/cluster/pbs_queues
Note that we have to escape the `PBS_O_WORKDIR` expansion so that the
variable substitution occurs when the job runs, and not when the
`echo` command executes.
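You can see the quoting difference locally with plain `echo`, no queue
involved (assuming `PBS_O_WORKDIR` is unset in your interactive shell,
as it normally is outside a job):

```shell
#!/bin/bash
unset PBS_O_WORKDIR   # normally unset outside a PBS job anyway

# Double quotes: the interactive shell expands $PBS_O_WORKDIR *now*,
# so qsub would receive "cd  && pwd" -- the variable is already gone.
echo "cd $PBS_O_WORKDIR && pwd"

# Single quotes (or an escaped \$): the text survives untouched, so
# the expansion happens later, inside the job's own environment.
echo 'cd $PBS_O_WORKDIR && pwd'
```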
Job arrays
----------

If you have *lots* of jobs you'd like to submit at once, it is
tempting to try

    $ for i in $(seq 1 5); do JOBID=$(echo "echo 'Running a job...'" | qsub); done
This does work, but it puts quite a load on the server as the number
of jobs gets large. To allow the execution of such repeated commands,
the batch server provides *job arrays*. You simply pass `qsub` the
`-t array_request` option, listing the range or list of IDs for which
you'd like to run your command.
    $ echo "sleep 30 && echo 'Running job \$PBS_ARRAYID...'" | qsub -t 1-5
    2721.n0.physics.drexel.edu
    $ qstat
    Job id            Name             User            Time Use S Queue
    ----------------- ---------------- --------------- -------- - -----
    2721-1.n0         STDIN-1          sysadmin               0 R batch
    2721-2.n0         STDIN-2          sysadmin               0 R batch
    2721-3.n0         STDIN-3          sysadmin               0 R batch
    2721-4.n0         STDIN-4          sysadmin               0 R batch
    2721-5.n0         STDIN-5          sysadmin               0 R batch
One possibly tricky issue is depending on a job array. If you have an
analysis job that you need to run to compile the results of your whole
array, depend on each member of the array explicitly:
255 $ sleep 2 # give the job a second to load in...
256 $ JOBNUM=$(echo $JOBID | cut -d. -f1)
257 $ COND="depend=afterany"
258 $ for i in $(seq 1 5); do COND="$COND:$JOBNUM-$i"; done
259 $ echo "cd \$PBS_O_WORKDIR && awk 'START{s=0}{s+=\$0}END{print s}' val*_out" | \
260 qsub -o sum_out -W $COND
261 2723.n0.physics.drexel.edu
    $ qstat
    Job id            Name             User            Time Use S Queue
    ----------------- ---------------- --------------- -------- - -----
    2722-1.n0         STDIN-1          sysadmin               0 R batch
    2722-2.n0         STDIN-2          sysadmin               0 R batch
    2722-3.n0         STDIN-3          sysadmin               0 R batch
    2722-4.n0         STDIN-4          sysadmin               0 R batch
    2722-5.n0         STDIN-5          sysadmin               0 R batch
    2723.n0           STDIN            sysadmin               0 H batch
Note that you must create any files needed by the dependent jobs
*during* the early jobs. The dependent job may start as soon as the
early jobs finish, *before* the `stdout` and `stderr` files for some
early jobs have been written. Sadly, depending on either the returned
job ID or just its numeric portion doesn't seem to work.
It is important that the jobs on which you depend are loaded into the
server *before your depending job is submitted*. To ensure this, you
may need to add a reasonable sleep time between submitting your job
array and submitting your dependency. However, your depending job
will also hang if some early jobs have *already finished* by the time
you get around to submitting it. In practice, this is not much of a
problem, because your jobs will likely be running for at least a few
minutes, giving you a large window during which you can submit your
dependent job.

See the examples sections and `man qsub` for more details.
qstat
=====

You can get information about currently running and queued jobs with
`qstat`. In the examples in the other sections, we've been using bare
`qstat`s to get information about the status of jobs in the queue.
You can get information about a particular job with

    $ JOBID=$(echo "sleep 30 && echo 'Running a job...'" | qsub)
    $ sleep 2 && qstat $JOBID
    Job id            Name             User            Time Use S Queue
    ----------------- ---------------- --------------- -------- - -----
    2724.n0           STDIN            sysadmin               0 R batch
and you can get detailed information on every job (or a particular
one, see previous example) with the `-f` (full) option.

    $ JOBID=$(echo "sleep 30 && echo 'Running a job...'" | qsub)
    $ sleep 2 && qstat -f $JOBID
    Job Id: 2725.n0.physics.drexel.edu
        Job_Owner = sysadmin@n0.physics.drexel.edu
        server = n0.physics.drexel.edu
        ctime = Thu Jun 26 13:58:54 2008
        Error_Path = n0.physics.drexel.edu:/home/sysadmin/STDIN.e2725
        mtime = Thu Jun 26 13:58:55 2008
        Output_Path = n0.physics.drexel.edu:/home/sysadmin/STDIN.o2725
        qtime = Thu Jun 26 13:58:54 2008
        Resource_List.nodect = 1
        Resource_List.nodes = 1
        Resource_List.walltime = 01:00:00
        Variable_List = PBS_O_HOME=/home/sysadmin,PBS_O_LANG=en_US.UTF-8,
            PBS_O_LOGNAME=sysadmin,
            PBS_O_PATH=/home/sysadmin/bin:/usr/local/bin:/usr/local/sbin:/usr/bin
            :/usr/sbin:/bin:/sbin:/usr/X11R6/bin:/usr/local/maui/bin:/home/sysadmi
            n/script:/home/sysadmin/bin:.,PBS_O_MAIL=/var/mail/sysadmin,
            PBS_O_SHELL=/bin/bash,PBS_SERVER=n0.physics.drexel.edu,
            PBS_O_HOST=n0.physics.drexel.edu,
            PBS_O_WORKDIR=/home/sysadmin/,
        etime = Thu Jun 26 13:58:54 2008
        start_time = Thu Jun 26 13:58:55 2008
        ...
The `qstat` command gives you lots of information about the current
state of a job, but to get a history you should use the `tracejob`
command.

    $ JOBID=$(echo "sleep 30 && echo 'Running a job...'" | qsub)
    $ sleep 2 && tracejob $JOBID

    Job: 2726.n0.physics.drexel.edu

    06/26/2008 13:58:57  S    enqueuing into batch, state 1 hop 1
    06/26/2008 13:58:57  S    Job Queued at request of sysadmin@n0.physics.drexel.edu, owner = sysadmin@n0.physics.drexel.edu, job name = STDIN, queue = batch
    06/26/2008 13:58:58  S    Job Modified at request of root@n0.physics.drexel.edu
    06/26/2008 13:58:58  S    Job Run at request of root@n0.physics.drexel.edu
    06/26/2008 13:58:58  S    Job Modified at request of root@n0.physics.drexel.edu
You can also get the status of the queue itself by passing the `-q`
option to `qstat`,

    $ qstat -q
    Queue            Memory CPU Time Walltime Node Run Que Lm State
    ---------------- ------ -------- -------- ---- --- --- -- -----
    batch              --      --       --     --    2   0 -- E R
or the status of the server with the `-B` option.

    $ qstat -B
    Server           Max Tot Que Run Hld Wat Trn Ext Status
    ---------------- --- --- --- --- --- --- --- --- ----------
    n0.physics.drexe   0   2   0   2   0   0   0   0 Active
You can get information on the status of the various nodes with
`qnodes` (a symlink to `pbsnodes`). The output of `qnodes` is bulky
and not of public interest, so we will not reproduce it here. For
more details on flags you can pass to `qnodes`/`pbsnodes`, see `man
pbsnodes`, but I haven't had any need for fanciness yet.
Altering and deleting jobs
==========================

Minor glitches in submitted jobs can be fixed by altering the job with
`qalter`. For example, incorrect dependencies may be causing a job to
hold in the queue forever. We can remove these invalid holds with
    $ JOBID=$(echo "sleep 30 && echo 'Running a job...'" | qsub -W depend=afterok:3)
    $ qstat
    Job id            Name             User            Time Use S Queue
    ----------------- ---------------- --------------- -------- - -----
    2725.n0           STDIN            sysadmin               0 R batch
    2726.n0           STDIN            sysadmin               0 R batch
    2727.n0           STDIN            sysadmin               0 H batch
    $ qalter -h n $JOBID
    $ qstat
    Job id            Name             User            Time Use S Queue
    ----------------- ---------------- --------------- -------- - -----
    2725.n0           STDIN            sysadmin               0 R batch
    2726.n0           STDIN            sysadmin               0 R batch
    2727.n0           STDIN            sysadmin               0 Q batch
`qalter` is a Swiss-army-knife command, since it can change many
aspects of a job. The specific hold-release case above could also
have been handled with the `qrls` command. There are a number of
other `q*` commands which provide detailed control over jobs and
queues, but I haven't had to use them yet.
If you decide a job is beyond repair, you can kill it with `qdel`.
For obvious reasons, you can only kill your own jobs, unless you're an
administrator.

    $ JOBID=$(echo "sleep 30 && echo 'Running a job...'" | qsub)
    $ qdel $JOBID
    $ echo "deleted $JOBID"
    deleted 2728.n0.physics.drexel.edu
    $ qstat
    Job id            Name             User            Time Use S Queue
    ----------------- ---------------- --------------- -------- - -----
    2725.n0           STDIN            sysadmin               0 R batch
    2726.n0           STDIN            sysadmin               0 R batch
    2727.n0           STDIN            sysadmin               0 R batch
I used to have a number of scripts and hacks put together to make it
easy to run my [[sawsim]] Monte Carlo simulations and set up dependent
jobs to process the results. This system was never particularly
elegant. Over time, I gained access to a number of SMP machines, as
well as my multi-host cluster. In order to support more general
parallelization and post-processing, I put together a general manager
for embarrassingly parallel jobs. There are implementations using a
range of parallelizing tools, from multi-threading through PBS and
MPI. See the [sawsim source][sawsim-manager] for details.
[sec.qstat]: #qstat
[sec.qalter]: #qalter
[sawsim-manager]: http://git.tremily.us/?p=sawsim.git;a=tree;f=pysawsim/manager;hb=HEAD