--- /dev/null
+[[!meta title="Batch Queue Job Control"]]
+
+<a name="qsub" />
+
+Submitting jobs
+===============
+
+You can submit jobs to the batch queue for later processing with
+`qsub`. Batch queueing can get pretty fancy, so `qsub` comes with
+lots of options (see `man qsub`). For the most part, you can trust
+your sysadmin to have set up some good defaults, and not worry about
+setting any options explicitly. As you get used to the batch queue
+system, you'll want tighter control of how your jobs execute by
+invoking more sophisticated options yourself, but don't let that scare
+you off at the beginning. They are, after all, only *options*. This
+page will give you a good start on the options I find myself using
+most often.
+
+Simple submission
+-----------------
+
+The simplest example of a job submission is:
+
+ $ echo "sleep 30 && echo 'Running a job...'" | qsub
+ 2705.n0.physics.drexel.edu
+
+This submits a job executing `sleep 30 && echo 'Running a job...'`
+to the queue. The job gets an identifying ID in the queue, which
+`qsub` prints to `stdout`.
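If you want just the numeric part of the job ID (useful later for log
file names and dependencies), a little shell cutting does it. This is
a minimal sketch using the ID from the example above, not a live
submission:

```shell
# Extract the numeric portion of a qsub job ID (example value from
# above; a real script would capture this from `qsub` itself).
JOBID="2705.n0.physics.drexel.edu"
JOBNUM=$(echo "$JOBID" | cut -d. -f1)   # keep everything before the first dot
echo "$JOBNUM"
```

In a live session you would set `JOBID=$(… | qsub)` instead of
hard-coding the value.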
+
+You can check the status of your job in the queue with `qstat`.
+
+ $ qstat
+ Job id Name User Time Use S Queue
+ ----------------- ---------------- --------------- -------- - -----
+ 2705.n0 STDIN sysadmin 0 Q batch
+
+There is more information on `qstat` in the [qstat section][sec.qstat].
+
+If your job is too complicated to fit on a single line, you can save
+it in a script:
+
+ #!/bin/bash
+ # file: echo_script.sh
+ sleep 30
+ echo "a really,"
+ echo "really,"
+ echo "complicated"
+ echo "script"
+
+and submit the script:
+
+ $ qsub echo_script.sh
+ 2706.n0.physics.drexel.edu
+
+All the command-line options discussed in later sections should have
+comment-style analogs that you can enter in your script if you use
+the script-submission approach with `qsub`.
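For example, under PBS/Torque those analogs are `#PBS` comment lines
near the top of the script. This is a sketch with assumed option
values (the job name and walltime are made up); check `man qsub` for
the options your server accepts:

```shell
# Create a job script with qsub options embedded as #PBS directives
# (hypothetical name and walltime; any command-line option works here).
cat > echo_job.sh <<'EOF'
#!/bin/bash
#PBS -N echo_job
#PBS -l walltime=0:15:00
#PBS -j oe
sleep 30
echo "a really complicated script"
EOF
grep -c '^#PBS' echo_job.sh   # three embedded directives
```

Submitting with `qsub echo_job.sh` picks up the `#PBS` lines as if
you had typed the options on the command line.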
+
+Note that you *cannot* run executables directly with `qsub`. For
+example
+
+ $ cat script.py
+ #!/usr/bin/python
+ print("hello world!")
+ $ qsub python script.py
+
+will fail because `python` is an executable. Either pipe the
+command to `qsub`,
+
+    $ echo python script.py | qsub
+
+wrap your [[Python]] script in a [[Bash]] script,
+
+ $ cat wrapper.sh
+ #!/bin/bash
+ python script.py
+ $ qsub wrapper.sh
+
+or run your Python script directly (relying on the shebang line)
+
+ $ qsub script.py
+
+IO: Job names and working directories
+-------------------------------------
+
+You will often be interested in the `stdout` and `stderr` output from
+your jobs. The batch queue system saves this information for you (to
+the directory from which you called `qsub`) in two files
+`<jobname>.o<jobID-number>` and `<jobname>.e<jobID-number>`. We have
+seen job IDs before; they're just the numeric part of the `qsub` output
+(or the first field in the `qstat` output). Job IDs are assigned by
+the batch queue server, and are unique to each job. Job names are
+assigned by the job submitter (that's you) and need not be unique.
+They give you a method for keeping track of what job is doing what
+task, since you have no control over the job ID. The combined
+`<jobname>.<jobID-number>` pair is both unique (for the server) and
+recognizable (for the user), which is why it's used to label the
+output data from a given job. You control the job name by passing the
+`-N <jobname>` argument to `qsub`.
+
+ $ echo "sleep 30 && echo 'Running a job...'" | qsub -N myjob
+ 2707.n0.physics.drexel.edu
+ $ qstat
+ Job id Name User Time Use S Queue
+ ----------------- ---------------- --------------- -------- - -----
+ 2707.n0 myjob sysadmin 0 Q batch
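Given a job name and the ID `qsub` returns, you can predict the
output filenames before the job finishes. A sketch reusing the values
from the example above (not a live submission):

```shell
# Reconstruct the stdout/stderr filenames the queue will write:
# <jobname>.o<jobID-number> and <jobname>.e<jobID-number>.
JOBNAME="myjob"
JOBID="2707.n0.physics.drexel.edu"
JOBNUM=${JOBID%%.*}                 # numeric part of the job ID
OUTFILE="${JOBNAME}.o${JOBNUM}"
ERRFILE="${JOBNAME}.e${JOBNUM}"
echo "$OUTFILE $ERRFILE"
```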
+
+Perhaps you are fine with `stdout` and `stderr`, but the default naming
+scheme, even with the job name flexibility, is too restrictive. No
+worries, `qsub` lets you specify exactly which files you'd like to use
+with the unsurprisingly named `-o` and `-e` options.
+
+ $ echo "echo 'ABC' && echo 'DEF' > /dev/stderr" | qsub -o my_out -e my_err
+ 2708.n0.physics.drexel.edu
+ … time passes …
+ $ cat my_out
+ ABC
+ $ cat my_err
+ DEF
+
+A time will come when you are no longer satisfied with `stdout` and
+`stderr` and you want to open your own files or, worse, run a program!
+Because no sane person uses absolute paths all the time, we need to
+know what directory we're in so we can construct our relative paths.
+You might expect that your job will execute from the same directory
+that you called `qsub` from, but that is not the case, presumably
+because that directory is not guaranteed to exist on the host
+that eventually runs your program. In any case, your job will begin
+executing in your home directory. Writing relative paths from your
+home directory is about as annoying as writing absolute paths, so
+`qsub` gives your script a nifty environment variable `PBS_O_WORKDIR`,
+which is set to the directory you called `qsub` from. Since *you*
+know that this directory exists on the hosts (since the home
+directories are NFS mounted on all of our cluster nodes), you can move
+to that directory yourself, using something like
+
+ $ echo 'pwd && cd $PBS_O_WORKDIR && pwd' | qsub
+ 2709.n0.physics.drexel.edu
+ … time passes …
+ $ cat STDIN.o2709
+ /home/sysadmin
+ /home/sysadmin/howto/cluster/pbs_queues
+
+Note that if we had enclosed the echo argument in double quotes (`"`),
+we would have to escape the `$` symbol in our `echo` argument so that
+it survives the shell expansion and makes it safely into `qsub`'s
+input.
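You can see the quoting difference locally, without submitting
anything: both of the following produce the same literal string for
`qsub` to read, one via single quotes and one via an escaped `$`.

```shell
# Single quotes suppress expansion; inside double quotes the $ must
# be escaped to survive our shell and reach qsub's input intact.
SINGLE='cd $PBS_O_WORKDIR && pwd'
DOUBLE="cd \$PBS_O_WORKDIR && pwd"
echo "$SINGLE"
echo "$DOUBLE"
```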
+
+Long jobs
+---------
+
+If you have jobs that may take longer than the default wall time
+(currently 1 hour), you will need to tell the job manager. Walltimes
+may seem annoying, since you don't really know how long a job will run
+for, but they protect the cluster from people running broken programs
+that waste nodes looping around forever without accomplishing
+anything. Therefore, your walltime doesn't have to match your actual
+job execution time exactly, or even closely. Before submitting millions
+of long jobs, it is a good idea to submit a timing job to see how long
+your jobs should run for. Then set the walltime a factor of 10 or so
+higher. For example
+
+ $ echo "time (sleep 30 && echo 'Running a job...')" | qsub -j oe
+ 2710.n0.physics.drexel.edu
+ … time passes …
+ $ cat STDIN.o2710
+ Running a job...
+
+ real 0m30.013s
+ user 0m0.000s
+ sys 0m0.000s
+ $ echo "sleep 30 && echo 'Running a job...'" | qsub -l walltime=15:00
+ 2711.n0.physics.drexel.edu
+ $ qstat -f | grep '[.]walltime'
+
+You can set walltimes in `[[H:]M:]S` format, where the numbers of
+hours, minutes, and seconds are positive integers. I passed the
+`-j oe` option to combine the `stdout` and `stderr` streams into
+`stdout`, because `time` prints to `stderr`. Walltimes are only
+accurate on the order of minutes and above, but you probably
+shouldn't be batch queueing jobs that take less time anyway.
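Turning a measured runtime into a padded walltime string is a few
lines of shell arithmetic. A sketch using the factor of 10 suggested
above (the 33-second runtime is an assumed example value):

```shell
# Convert a timing-job runtime (seconds) into an [[H:]M:]S walltime,
# padded by a safety factor of 10.
RUNTIME=33                       # assumed: measured with `time`
PADDED=$((RUNTIME * 10))
WALLTIME=$(printf '%d:%02d:%02d' \
    $((PADDED / 3600)) $((PADDED % 3600 / 60)) $((PADDED % 60)))
echo "$WALLTIME"                 # pass as: qsub -l walltime=$WALLTIME
```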
+
+Job dependencies
+----------------
+
+You will often find yourself in a situation where the execution of one
+job depends on the output of another job. For example, `jobA` and
+`jobB` generate some data, and `jobC` performs some analysis on that
+data. It wouldn't do for `jobC` to go firing off as soon as there was
+a free node, if there was no data available yet to analyze. We can
+deal with *dependencies* like these by passing a `-W
+depend=<dependency-list>` option to `qsub`. The dependency list can
+get pretty fancy (see `man qsub`), but for the case outlined above,
+we'll only need `afterany` dependencies (because `jobC` should execute
+after jobs `A` and `B`).
+
+Looking at the `man` page, the proper format for our dependency list
+is `afterany:jobid[:jobid...]`, so we need to catch the job IDs output
+by `qsub`. We'll use [[Bash's|Bash]] command substitution
+(`$(command)`) for this.
+
+ $ AID=$(echo "cd \$PBS_O_WORKDIR && sleep 30 && echo \"we're in\" > A_out" | qsub)
+ $ BID=$(echo "cd \$PBS_O_WORKDIR && sleep 30 && pwd > B_out" | qsub)
+ $ CID=$(echo "cd \$PBS_O_WORKDIR && cat A_out B_out" | qsub -W depend=afterany:$AID:$BID -o C_out)
+ $ echo -e "A: $AID\nB: $BID\nC: $CID"
+ A: 2712.n0.physics.drexel.edu
+ B: 2713.n0.physics.drexel.edu
+ C: 2714.n0.physics.drexel.edu
+ $ qstat
+ Job id Name User Time Use S Queue
+ ------------------------- ---------------- --------------- -------- - -----
+ 2712.n0 STDIN sysadmin 0 R batch
+ 2713.n0 STDIN sysadmin 0 R batch
+ 2714.n0 STDIN sysadmin 0 H batch
+ … time passes …
+ $ cat C_out
+ we're in
+ /home/sysadmin/howto/cluster/pbs_queues
+
+Note that we have to escape the `PBS_O_WORKDIR` expansion so that the
+variable substitution occurs when the job runs, and not when the echo
+command runs.
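If the number of prerequisite jobs isn't fixed, you can build the
dependency list in a loop instead of writing it out by hand. A sketch
reusing the job IDs from the example output above (not live
submissions):

```shell
# Assemble an afterany dependency list from captured job IDs
# in the afterany:jobid[:jobid...] format qsub expects.
AID="2712.n0.physics.drexel.edu"
BID="2713.n0.physics.drexel.edu"
DEP="afterany"
for ID in "$AID" "$BID"; do
    DEP="$DEP:$ID"
done
echo "$DEP"   # pass as: qsub -W depend=$DEP
```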
+
+Job arrays
+----------
+
+If you have *lots* of jobs you'd like to submit at once, it is
+tempting to try
+ $ for i in $(seq 1 5); do JOBID=`echo "echo 'Running a job...'" | qsub`; done
+
+This does work, but it puts quite a load on the server as the number
+of jobs gets large. To allow the execution of such repeated
+commands, the batch server provides *job arrays*. You simply pass
+`qsub` the `-t array_request` option, listing the range or list of IDs
+for which you'd like to run your command.
+
+ $ echo "sleep 30 && echo 'Running job \$PBS_ARRAYID...'" | qsub -t 1-5
+ 2721.n0.physics.drexel.edu
+ $ qstat
+ Job id Name User Time Use S Queue
+ ----------------- ---------------- --------------- -------- - -----
+ 2721-1.n0 STDIN-1 sysadmin 0 R batch
+ 2721-2.n0 STDIN-2 sysadmin 0 R batch
+ 2721-3.n0 STDIN-3 sysadmin 0 R batch
+ 2721-4.n0 STDIN-4 sysadmin 0 R batch
+ 2721-5.n0 STDIN-5 sysadmin 0 R batch
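Each array task runs the same command with a different value of
`PBS_ARRAYID`; you can simulate what the five tasks see by setting
the variable yourself (a local simulation only; in a real array job
the server sets it for you):

```shell
# Simulate the per-task PBS_ARRAYID values of a `-t 1-5` array job.
RESULTS=""
for PBS_ARRAYID in 1 2 3 4 5; do
    RESULTS="$RESULTS job-$PBS_ARRAYID"
done
echo "Ran:$RESULTS"
```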
+
+One possibly tricky issue is depending on a job array. If you have an
+analysis job that you need to run to compile the results of your whole
+array, try
+
+ $ JOBID=$(echo "cd \$PBS_O_WORKDIR && sleep 30 && pwd && echo 1 > val\${PBS_ARRAYID}_out" | qsub -t 1-5)
+ $ sleep 2 # give the job a second to load in...
+ $ JOBNUM=$(echo $JOBID | cut -d. -f1)
+ $ COND="depend=afterany"
+ $ for i in $(seq 1 5); do COND="$COND:$JOBNUM-$i"; done
+    $ echo "cd \$PBS_O_WORKDIR && awk 'BEGIN{s=0}{s+=\$0}END{print s}' val*_out" | \
+ qsub -o sum_out -W $COND
+ 2723.n0.physics.drexel.edu
+
+ $ qstat
+ Job id Name User Time Use S Queue
+ ----------------- ---------------- --------------- -------- - -----
+ 2722-1.n0 STDIN-1 sysadmin 0 R batch
+ 2722-2.n0 STDIN-2 sysadmin 0 R batch
+ 2722-3.n0 STDIN-3 sysadmin 0 R batch
+ 2722-4.n0 STDIN-4 sysadmin 0 R batch
+ 2722-5.n0 STDIN-5 sysadmin 0 R batch
+ 2723.n0 STDIN sysadmin 0 H batch
+ $ cat sum_out
+ 5
+
+Note that you must create any files needed by the dependent jobs
+*during* the early jobs. The dependent job may start as soon as the
+early jobs finish, *before* the `stdin` and `stdout` files for some
+early jobs have been written. Sadly, depending on either the returned
+job ID or just its numeric portion doesn't seem to work.
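You can check the summing step locally before wiring it into a
dependency. This sketch recreates the five `val*_out` files and runs
an equivalent awk sum outside the queue:

```shell
# Recreate the array job's output files and sum them with awk,
# just as the dependent analysis job does.
for i in 1 2 3 4 5; do
    echo 1 > "val${i}_out"
done
SUM=$(awk 'BEGIN{s=0}{s+=$0}END{print s}' val*_out)
echo "$SUM"
rm -f val1_out val2_out val3_out val4_out val5_out   # clean up demo files
```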
+
+It is important that the jobs on which you depend are loaded into the
+server *before your depending job is submitted*. To ensure this, you
+may need to add a reasonable sleep time between submitting your job
+array and submitting your dependency. However, your depending job
+will also hang if some early jobs have *already finished* by the time
+you get around to submitting it. In practice, this is not much of a
+problem, because your jobs will likely be running for at least a few
+minutes, giving you a large window during which you can submit your
+dependent job.
+
+See the examples sections and `man qsub` for more details.
+
+<a name="qstat" />
+
+Querying
+========
+
+You can get information about currently running and queued jobs with
+`qstat`. In the examples in the other sections, we've been using bare
+`qstat`s to get information about the status of jobs in the queue.
+You can get information about a particular job with
+
+ $ JOBID=`echo "sleep 30 && echo 'Running a job...'" | qsub`
+ $ sleep 2 && qstat $JOBID
+ Job id Name User Time Use S Queue
+ ----------------- ---------------- --------------- -------- - -----
+ 2724.n0 STDIN sysadmin 0 R batch
+
+and you can get detailed information on every job (or a
+particular one, see the previous example) with the `-f` (full) option.
+
+ $ JOBID=$(echo "sleep 30 && echo 'Running a job...'" | qsub)
+ $ sleep 2
+ $ qstat -f
+ Job Id: 2725.n0.physics.drexel.edu
+ Job_Name = STDIN
+ Job_Owner = sysadmin@n0.physics.drexel.edu
+ job_state = R
+ queue = batch
+ server = n0.physics.drexel.edu
+ Checkpoint = u
+ ctime = Thu Jun 26 13:58:54 2008
+ Error_Path = n0.physics.drexel.edu:/home/sysadmin/STDIN.e2725
+ exec_host = n8/0
+ Hold_Types = n
+ Join_Path = n
+ Keep_Files = n
+ Mail_Points = a
+ mtime = Thu Jun 26 13:58:55 2008
+ Output_Path = n0.physics.drexel.edu:/home/sysadmin/STDIN.o2725
+ Priority = 0
+ qtime = Thu Jun 26 13:58:54 2008
+ Rerunable = True
+ Resource_List.nodect = 1
+ Resource_List.nodes = 1
+ Resource_List.walltime = 01:00:00
+ session_id = 18020
+ Variable_List = PBS_O_HOME=/home/sysadmin,PBS_O_LANG=en_US.UTF-8,
+ PBS_O_LOGNAME=sysadmin,
+ PBS_O_PATH=/home/sysadmin/bin:/usr/local/bin:/usr/local/sbin:/usr/bin
+ :/usr/sbin:/bin:/sbin:/usr/X11R6/bin:/usr/local/maui/bin:/home/sysadmi
+ n/script:/home/sysadmin/bin:.,PBS_O_MAIL=/var/mail/sysadmin,
+ PBS_O_SHELL=/bin/bash,PBS_SERVER=n0.physics.drexel.edu,
+ PBS_O_HOST=n0.physics.drexel.edu,
+ PBS_O_WORKDIR=/home/sysadmin/,
+ PBS_O_QUEUE=batch
+ etime = Thu Jun 26 13:58:54 2008
+ start_time = Thu Jun 26 13:58:55 2008
+ start_count = 1
+
+The `qstat` command gives you lots of information about the current
+state of a job, but to get a history you should use the `tracejob`
+command.
+
+ $ JOBID=$(echo "sleep 30 && echo 'Running a job...'" | qsub)
+ $ sleep 2 && tracejob $JOBID
+
+ Job: 2726.n0.physics.drexel.edu
+
+ 06/26/2008 13:58:57 S enqueuing into batch, state 1 hop 1
+ 06/26/2008 13:58:57 S Job Queued at request of sysadmin@n0.physics.drexel.edu, owner = sysadmin@n0.physics.drexel.edu, job name = STDIN, queue = batch
+ 06/26/2008 13:58:58 S Job Modified at request of root@n0.physics.drexel.edu
+ 06/26/2008 13:58:58 S Job Run at request of root@n0.physics.drexel.edu
+ 06/26/2008 13:58:58 S Job Modified at request of root@n0.physics.drexel.edu
+
+You can also get the status of the queue itself by passing the `-q` option to `qstat`
+
+ $ qstat -q
+
+ server: n0
+
+ Queue Memory CPU Time Walltime Node Run Que Lm State
+ ---------------- ------ -------- -------- ---- --- --- -- -----
+ batch -- -- -- -- 2 0 -- E R
+ ----- -----
+ 2 0
+
+or the status of the server with the `-B` option.
+
+ $ qstat -B
+ Server Max Tot Que Run Hld Wat Trn Ext Status
+ ---------------- --- --- --- --- --- --- --- --- ----------
+ n0.physics.drexe 0 2 0 2 0 0 0 0 Active
+
+You can get information on the status of the various nodes with
+`qnodes` (a symlink to `pbsnodes`). The output of `qnodes` is bulky
+and not of public interest, so we will not reproduce it here. For
+more details on the flags you can pass to `qnodes`/`pbsnodes`, see `man
+pbsnodes`, but I haven't had any need for fanciness yet.
+
+<a name="qalter" />
+
+Altering and deleting jobs
+==========================
+
+Minor glitches in submitted jobs can be fixed by altering the job with `qalter`.
+For example, incorrect dependencies may be causing a job to be held in
+the queue forever. We can remove these invalid holds with
+
+ $ JOBID=$(echo "sleep 30 && echo 'Running a job...'" | qsub -W depend=afterok:3)
+ $ qstat
+ Job id Name User Time Use S Queue
+ ----------------- ---------------- --------------- -------- - -----
+ 2725.n0 STDIN sysadmin 0 R batch
+ 2726.n0 STDIN sysadmin 0 R batch
+ 2727.n0 STDIN sysadmin 0 H batch
+ $ qalter -h n $JOBID
+ $ qstat
+ Job id Name User Time Use S Queue
+ ----------------- ---------------- --------------- -------- - -----
+ 2725.n0 STDIN sysadmin 0 R batch
+ 2726.n0 STDIN sysadmin 0 R batch
+ 2727.n0 STDIN sysadmin 0 Q batch
+
+`qalter` is a Swiss-army-knife command, since it can change many
+aspects of a job. The specific hold-release case above could also
+have been handled with the `qrls` command. There are a number of
+other `q*` commands which provide detailed control over jobs and
+queues, but I haven't had to use them yet.
+
+If you decide a job is beyond repair, you can kill it with `qdel`.
+For obvious reasons, you can only kill your own jobs, unless you're an
+administrator.
+
+ $ JOBID=$(echo "sleep 30 && echo 'Running a job...'" | qsub)
+ $ qdel $JOBID
+ $ echo "deleted $JOBID"
+ deleted 2728.n0.physics.drexel.edu
+ $ qstat
+ Job id Name User Time Use S Queue
+ ----------------- ---------------- --------------- -------- - -----
+ 2725.n0 STDIN sysadmin 0 R batch
+ 2726.n0 STDIN sysadmin 0 R batch
+ 2727.n0 STDIN sysadmin 0 R batch
+
+Further reading
+===============
+
+I used to have a number of scripts and hacks put together to make it
+easy to run my [[sawsim]] Monte Carlo simulations and set up dependent
+jobs to process the results. This system was never particularly
+elegant. Over time, I gained access to a number of SMP machines, as
+well as my multi-host cluster. In order to support more general
+parallelization and post-processing, I put together a general manager
+for embarrassingly parallel jobs. There are implementations using a
+range of parallelizing tools, from multi-threading through PBS and
+MPI. See the [sawsim source][sawsim-manager] for details.
+
+
+[sec.qsub]: #qsub
+[sec.qstat]: #qstat
+[sec.qalter]: #qalter
+[sawsim-manager]: http://git.tremily.us/?p=sawsim.git;a=tree;f=pysawsim/manager;hb=HEAD