Task execution, arguments and environment
Command-line arguments
Bistro invokes your task binary with three arguments:

    command node_name status_pipe config_JSON

- command is the path to your binary, set as part of the job settings, or passed via --worker_command to the scheduler for LocalRunner, otherwise to the worker.
- node_name is described in Nodes and Resources. It tells your binary which data shard to process.
- status_pipe is a temporary file for communicating the task status to Bistro. You should write exactly one line to it, after the task is done or has hit an error. This line can be a plain Task Status string like done, incomplete, or error_backoff. It can also be a JSON object of the form:

      {"result": "done", "data": {"your data up to a few KB"}}

  Instead of result, you can also use result_bits; see bits.thrift. For example, Bistro always marks the task done if command is the following script:

      echo "done" > "$2"

- config_JSON includes extra arguments for your binary. By default it has:
  - id: your job name
  - path_to_node: an array of node names from root to node_name
  - prev_status: status result in JSON of the last run of this task
  - config: all key-value pairs in “config” of your job configuration. The “config” field allows you to change command line arguments for future task invocations while your job runs, without restarting Bistro.
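
The argument order described above can be sketched as a small shell task. This is only an illustration under stated assumptions: process_shard is a hypothetical placeholder for your real work, the "data" payload is made up, and node_name is assumed to need no JSON escaping. Only the argument positions and the status strings come from the description above.

```shell
#!/bin/sh
# Sketch of a Bistro task, assuming the argument order described above:
#   $1 = node_name, $2 = status_pipe, $3 = config_JSON

# Hypothetical placeholder for the real work on one data shard.
# It succeeds whenever it is given a non-empty node name.
process_shard() {
  # $1 = node_name, $2 = config_JSON (left unparsed in this sketch)
  [ -n "$1" ]
}

# Run the work, then report the outcome as exactly one line.
run_task() {
  node_name="$1"
  status_pipe="$2"
  config_json="$3"

  if process_shard "$node_name" "$config_json"; then
    # JSON form of the status; the "data" payload is illustrative and
    # assumes node_name contains no characters that need JSON escaping.
    printf '{"result": "done", "data": {"node": "%s"}}\n' "$node_name" > "$status_pipe"
  else
    # Plain status string form.
    echo "error_backoff" > "$status_pipe"
  fi
}

# Only act when the script receives exactly three arguments.
if [ "$#" -eq 3 ]; then
  run_task "$@"
fi
```

Writing the status only at the very end, and with a single redirect, keeps the file down to the one line Bistro expects.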
Logging
Bistro sets up your task with stdout and stderr file descriptors, which it reads line-by-line (lines have an optional maximum length), timestamps, rate-limits (see “max_log_lines_per_poll_interval” in “Managing task processes”), and writes to a SQLite database on the local disk.
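
Because the output is captured line by line, timestamped by Bistro, and possibly truncated or rate-limited, a task is probably best served by short, self-contained log lines without its own timestamps. A minimal sketch follows; log_info and log_error are hypothetical helper names, not part of Bistro.

```shell
#!/bin/sh
# Hypothetical helpers: one short line per event, since stdout and
# stderr are read line by line.

log_info() {
  # Progress messages go to stdout.
  printf 'INFO %s\n' "$*"
}

log_error() {
  # Error messages go to stderr.
  printf 'ERROR %s\n' "$*" >&2
}
```

A task would call, for example, log_info "processed 100 rows" at coarse intervals rather than once per record, to stay under any line-rate limit.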
To retrieve the logs, send the scheduler a task_logs REST request; see handleTaskLogs() for the details.
Working directory
A given Bistro worker (or a LocalRunner scheduler) starts all tasks of the same job in the same directory: --data_dir/jobs/<NAME OF JOB>.
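
One consequence of the shared directory is that tasks of the same job may overwrite each other's files if they use fixed relative names. A minimal sketch of one way to avoid that is to key scratch file names on node_name. The scratch_path helper and the .tmp suffix are hypothetical, and the sanitizing rule is an assumption about what is safe in a file name.

```shell
#!/bin/sh
# Hypothetical helper: build a per-node scratch file path inside the
# current (shared) working directory.
scratch_path() {
  # $1 = node_name; characters outside [A-Za-z0-9._-] become "_".
  safe_name=$(printf '%s' "$1" | tr -c 'A-Za-z0-9._-' '_')
  printf '%s/%s.tmp\n' "$PWD" "$safe_name"
}
```

Note that two node names differing only in replaced characters would map to the same path, so this sketch suits node names that are already mostly file-name safe.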