/<job-id>.stat (Optional) status of the job
/<job-id>.tmp Used temporarily by brun to store the script actually
sent to the host (can be inspected if something goes wrong)
+ /<job-id>.log (Optional) transcript of output produced by the job (including
+ previous failed attempts)
-<queue>/jobs/<job-id>.job All jobs issued on this queue, including those which
+<queue>/jobs/ All jobs issued on this queue, including those which
are no longer queued for any machine
+ /<job-id>.job Description of the job (see below)
+ /<job-id>.attach/ A directory containing attachments (if any)
+
+<queue>/history/<hostname>/ Successfully completed jobs (their .job, .stat and .log files)
+ are moved here if the keep_history config switch is set.
<queue>/log Log of actions on this queue. Lines look this way:
YYYY-MM-DD HH:MM:SS <host> <job-id> <status> [<msg>]
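
The log line format above can be parsed along the following lines. This is only a sketch in Python, not part of BEX itself: the function name `parse_log_line` and the returned dictionary keys are hypothetical, and it assumes the fields are separated by whitespace and that everything after the status belongs to the optional message.

```python
from datetime import datetime


def parse_log_line(line):
    """Split one queue log line into its fields (hypothetical helper)."""
    # Assumption: fields are whitespace-separated; the optional message
    # is whatever remains after the first five fields.
    parts = line.strip().split(None, 5)
    if len(parts) < 5:
        raise ValueError("malformed log line: %r" % line)
    date, time, host, job_id, status = parts[:5]
    return {
        "time": datetime.strptime(date + " " + time, "%Y-%m-%d %H:%M:%S"),
        "host": host,
        "job_id": job_id,
        "status": status,
        # None when the line carries no message.
        "msg": parts[5] if len(parts) > 5 else None,
    }
```

The host name and job ID used to try this out would be made up; the status field holds one of the codes listed under "Status codes".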
ID: <job-id> Identifier of the job, unique in the scope of a queue
Subject: <subject> Subject to be displayed to the user
+Prep: <command> Run <command> in a shell before the job body is executed;
+ $HOST contains the name of the host. This is useful, for
+ example, if you want to transfer data to the host by rsync.
+
+Whenever a user command wants a job ID, it accepts any substring starting
+at a component boundary (start of the ID or a "-"), as long as the substring
+is unique.
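
The matching rule for abbreviated job IDs can be illustrated with a short Python sketch. It is not the code BEX uses: the name `match_job_id` is hypothetical, and it assumes that "starting at a component boundary" means the substring begins either at the start of the ID or immediately after a "-". How BEX treats an exact ID that is also a prefix of another ID is not covered here.

```python
def match_job_id(query, job_ids):
    """Return the one job ID matched by query (hypothetical helper)."""

    def matches(job_id):
        # Check every occurrence of the query inside the ID and accept it
        # only if it starts at position 0 or right after a "-".
        start = job_id.find(query)
        while start != -1:
            if start == 0 or job_id[start - 1] == "-":
                return True
            start = job_id.find(query, start + 1)
        return False

    hits = [job_id for job_id in job_ids if matches(job_id)]
    if not hits:
        raise ValueError("no job matches %r" % query)
    if len(hits) > 1:
        raise ValueError("%r is ambiguous: %s" % (query, ", ".join(hits)))
    return hits[0]
```

With two made-up IDs such as `20240101-120000-ab12` and `20240101-130000-cd34`, the query `cd34` or `130000` selects the second job, `20240101` is rejected as ambiguous, and `0000` is rejected because it does not start at a component boundary.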
### Status files ###
Known fields:
Time: <timestamp> UNIX timestamp of the last status change
-Status: <code> Machine-readable status of the job:
- NOPING - host does not respond to ping
- NOXFER - transfer of the job body to a temporary file
- on the host has failed
- OK - job executed successfully (however, the job will
- be removed from the queue immediately, so you are
- not likely to see this code)
- FAILED - job failed to execute (i.e., it returned
- a non-zero exit code)
- INTERR - internal error of BEX
+Status: <code> Machine-readable status of the job (see below)
Message: <msg> (Optional) human-readable message explaining the status
+
+### Status codes ###
+
+FAILED Job failed to execute (i.e., it returned a non-zero exit code)
+INTERR Internal error of BEX (e.g., failed to read the job prolog file)
+NEW Newly inserted job that has not run yet
+NOPING Host does not respond to ping
+NOXFER Transfer of the job body to a temporary file on the host has failed
+OK Job finished successfully (this is usually not seen in the queue, since
+ finished jobs are immediately deleted or moved to the history)
+PREP Running preparatory commands (i.e., those present in the Prep header field)
+PREPFAIL Preparatory commands failed (i.e., those present in the Prep header field)
+REMOVED Job removed from the queue (behavior similar to OK)
+RUN Job is running
+
+Additional status codes recorded in the log files:
+
+REQUEUE Attempted to put the job on a queue, but it was already there
+
+Additional status codes sent only over status FIFO:
+
+DONE Done with the host (job equals "-")
+INIT Host or job ready, preparing to execute jobs
+LOCKED Host or job not available, because it is locked by another brun
+PING Trying to ping the host (job equals "-")
+SEND Sending job to the host