### Structure of queue directories ###

<queue>/hosts/<hostname>/       Jobs queued for the given host
                                (they are executed in the lexicographic order of <job-id>s)
        /<job-id>.job           Symlink to <queue>/jobs/<job-id>.job
        /<job-id>.stat          (Optional) status of the job
        /<job-id>.tmp           Used temporarily by brun to store the script actually
                                sent to the host (can be inspected if something goes wrong)
        /<job-id>.log           (Optional) transcript of output produced by the job (including
                                previous failed attempts)

<queue>/jobs/<job-id>.job       All jobs issued on this queue, including those which
                                are no longer queued for any machine

<queue>/log                     Log of actions on this queue. Each line has the form:
                                YYYY-MM-DD HH:MM:SS <host> <job-id> <status> [<msg>]
                                <status> and <msg> correspond to "Status" and "Message"
                                in status files.

<queue>/status-fifo             FIFO used for reporting status of subprocesses by `bprun'
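
The layout above can be read by a small script. The following Python sketch is
not part of BEX; the names queued_job_ids, LogEntry and parse_log_line are
hypothetical. It assumes that hosts, job IDs and status codes contain no
whitespace, so that a log line can be split on whitespace and the remainder
taken as the optional message.

```python
from datetime import datetime
from pathlib import Path
from typing import NamedTuple, Optional


def queued_job_ids(queue: Path, host: str) -> list[str]:
    """Return IDs of jobs queued for `host`, in the order they would run."""
    host_dir = queue / "hosts" / host
    if not host_dir.is_dir():
        return []  # no directory means nothing is queued for this host
    # Sort the IDs themselves (not the file names), since the notes specify
    # lexicographic order of <job-id>s.
    return sorted(p.stem for p in host_dir.glob("*.job"))


class LogEntry(NamedTuple):
    time: datetime
    host: str
    job_id: str
    status: str
    message: Optional[str]  # None when the optional <msg> is absent


def parse_log_line(line: str) -> LogEntry:
    """Parse one line of <queue>/log."""
    # Date, time, host, job ID and status are mandatory; the rest is <msg>.
    parts = line.strip().split(None, 5)
    if len(parts) < 5:
        raise ValueError(f"malformed log line: {line!r}")
    when = datetime.strptime(f"{parts[0]} {parts[1]}", "%Y-%m-%d %H:%M:%S")
    message = parts[5] if len(parts) == 6 else None
    return LogEntry(when, parts[2], parts[3], parts[4], message)
```

For example, queued_job_ids(Path("/path/to/queue"), "somehost") would list the
pending jobs of one host, where both the queue path and the host name are
placeholders.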

### Job files ###

Job files have a mail-like structure: a block of headers, an empty line, and then
the body (i.e., commands to be executed on the remote host). Each header has the
form <keyword>:<spaces><value>. Keywords are case-sensitive and multi-line fields
are not allowed.

Known header fields:

ID: <job-id>                    Identifier of the job, unique in the scope of a queue
Subject: <subject>              Subject to be displayed to the user
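
As an illustration of the format, here is a minimal Python sketch of a job file
parser. It is an assumption about how such a file could be read, not the code
BEX itself uses; the function name parse_job and the sample job in the usage
note are hypothetical.

```python
def parse_job(text: str) -> tuple[dict[str, str], str]:
    """Split a job file into its header fields and its body."""
    # The first empty line separates the headers from the body.
    head, _, body = text.partition("\n\n")
    headers: dict[str, str] = {}
    for line in head.splitlines():
        keyword, colon, value = line.partition(":")
        if not colon:
            raise ValueError(f"not a header line: {line!r}")
        # Keywords are case-sensitive, so they are stored unchanged.
        headers[keyword] = value.strip()
    return headers, body
```

Given the text "ID: job1\nSubject: Upgrade\n\napt-get update\n", this returns
the two header fields as a dictionary and "apt-get update\n" as the body.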

### Status files ###

Status files use the same header syntax as job files, but they do not contain a body.

Known fields:

Time: <timestamp>               UNIX timestamp of the last status change
Status: <code>                  Machine-readable status of the job:
                                NOPING - host does not respond to ping
                                NOXFER - transfer of the job body to a temporary file
                                         on the host has failed
                                RUN - job is running (present only in log files)
                                OK - job executed successfully (however, the job will
                                     be removed from the queue immediately, so you are
                                     not likely to see this code)
                                FAILED - job failed to execute (i.e., it returned
                                         a non-zero exit code)
                                INTERR - internal error of BEX
Message: <msg>                  (Optional) human-readable message explaining the status
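
A status file could be turned into a one-line summary roughly as follows. This
Python sketch is illustrative only: parse_status, describe_status and
STATUS_CODES are hypothetical names, and it assumes that <timestamp> is an
integer number of seconds since the epoch.

```python
from datetime import datetime, timezone

# Status codes and their meanings, as listed in the notes above.
STATUS_CODES = {
    "NOPING": "host does not respond to ping",
    "NOXFER": "transfer of the job body to the host has failed",
    "RUN": "job is running",
    "OK": "job executed successfully",
    "FAILED": "job returned a non-zero exit code",
    "INTERR": "internal error of BEX",
}


def parse_status(text: str) -> dict[str, str]:
    """Parse the header-only contents of a status file."""
    fields: dict[str, str] = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # tolerate a trailing empty line
        keyword, colon, value = line.partition(":")
        if not colon:
            raise ValueError(f"not a header line: {line!r}")
        fields[keyword] = value.strip()
    return fields


def describe_status(fields: dict[str, str]) -> str:
    """Format the fields as a single human-readable line."""
    code = fields["Status"]
    # Assumption: <timestamp> is an integer count of seconds since the epoch.
    when = datetime.fromtimestamp(int(fields["Time"]), tz=timezone.utc)
    # Prefer the optional Message; otherwise fall back to the code's meaning.
    detail = fields.get("Message") or STATUS_CODES.get(code, "unknown status code")
    return f"{when.strftime('%Y-%m-%d %H:%M:%S')} UTC: {code} ({detail})"
```

The summary uses the Message field when the file provides one, since that field
exists to explain the status to the user.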