airflow.operators.bash

Module Contents

Classes

BashOperator

Execute a Bash script, command or set of commands.

class airflow.operators.bash.BashOperator(*, bash_command, env=None, append_env=False, output_encoding='utf-8', skip_exit_code=99, cwd=None, **kwargs)

Bases: airflow.models.baseoperator.BaseOperator

Execute a Bash script, command or set of commands.

See also

For more information on how to use this operator, take a look at the guide: BashOperator

If BaseOperator.do_xcom_push is True, the last line written to stdout will also be pushed to an XCom when the bash command completes.
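To make "the last line written to stdout" concrete, here is a minimal sketch in plain Python. The helper name last_stdout_line is hypothetical and not part of Airflow; it assumes the captured output is decoded with output_encoding and split on line boundaries, which may differ in detail from what the operator does internally.

```python
def last_stdout_line(raw_output: bytes, output_encoding: str = "utf-8") -> str:
    """Hypothetical helper: decode captured stdout and return its last line."""
    # Decode the raw bytes and split them into lines without line terminators.
    lines = raw_output.decode(output_encoding).splitlines()
    # With no output at all there is no last line; return an empty string.
    return lines[-1] if lines else ""
```

Under these assumptions, a command that prints several lines would contribute only the final one as the XCom value.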

Parameters
  • bash_command (str) -- The command, set of commands or reference to a bash script (must be '.sh') to be executed. (templated)

  • env (Optional[Dict[str, str]]) -- If env is not None, it must be a dict that defines the environment variables for the new process; these are used instead of inheriting the current process environment, which is the default behavior. (templated)

  • append_env (bool) -- If False (default), uses only the environment variables passed in env and does not inherit the current process environment. If True, inherits the environment variables from the current process; the variables passed by the user then either override the inherited values or are added to them.

  • output_encoding (str) -- Output encoding of bash command

  • skip_exit_code (int) -- If task exits with this exit code, leave the task in skipped state (default: 99). If set to None, any non-zero exit code will be treated as a failure.

  • cwd (Optional[str]) -- Working directory to execute the command in. If None (default), the command is run in a temporary directory.
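The interaction between env and append_env can be sketched in plain Python. The helper name resolve_env and its inherited argument are hypothetical and not part of Airflow; the function only mirrors the rules stated in the parameter descriptions above and leaves out anything else the operator may add to the environment.

```python
import os
from typing import Dict, Optional


def resolve_env(
    env: Optional[Dict[str, str]],
    append_env: bool = False,
    inherited: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Hypothetical helper mirroring the documented env/append_env rules."""
    # Start from a copy of the current process environment unless one is given.
    base = dict(os.environ) if inherited is None else dict(inherited)
    if env is None:
        # No env supplied: the command inherits the current environment.
        return base
    if append_env:
        # User-supplied values override inherited ones; new keys are added.
        base.update(env)
        return base
    # append_env=False: only the user-supplied variables are used.
    return dict(env)
```

For example, with an inherited environment of {"PATH": "/usr/bin"} and env={"X": "1"}, the sketch yields {"X": "1"} when append_env is False and both variables when it is True.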

Airflow will evaluate the exit code of the bash command. In general, a non-zero exit code will result in task failure and zero will result in task success. Exit code 99 (or another set in skip_exit_code) will throw an airflow.exceptions.AirflowSkipException, which will leave the task in skipped state. You can have all non-zero exit codes be treated as a failure by setting skip_exit_code=None.

Exit code                       Behavior
0                               success
skip_exit_code (default: 99)    raise airflow.exceptions.AirflowSkipException
otherwise                       raise airflow.exceptions.AirflowException
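The exit code rules above can be expressed as a small decision function. This is a sketch only: classify_exit_code is a hypothetical name, it returns a label instead of raising the Airflow exceptions, and it checks the cases in the order they are listed above.

```python
from typing import Optional


def classify_exit_code(exit_code: int, skip_exit_code: Optional[int] = 99) -> str:
    """Hypothetical helper: map a shell exit code to the documented task outcome."""
    if exit_code == 0:
        return "success"
    if skip_exit_code is not None and exit_code == skip_exit_code:
        # The operator raises airflow.exceptions.AirflowSkipException in this case.
        return "skipped"
    # The operator raises airflow.exceptions.AirflowException in this case.
    return "failed"
```

With skip_exit_code=None, the second branch can never match, so every non-zero exit code is classified as a failure.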

Note

Airflow will not recognize a non-zero exit code unless the whole shell exits with a non-zero exit code. This can be an issue if the non-zero exit code arises from a sub-command. The easiest way to address this is to prefix the command with set -e;

Example:

bash_command = "set -e; python3 script.py '{{ next_execution_date }}'"
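The effect of set -e can be observed outside Airflow with the standard library. The sketch below assumes a bash executable is available on PATH; shell_exit_code is a hypothetical helper, not an Airflow function.

```python
import subprocess


def shell_exit_code(command: str) -> int:
    """Run a command string with bash and return the shell's exit code."""
    return subprocess.run(["bash", "-c", command]).returncode


# The failing sub-command is followed by a succeeding one, so the shell exits 0.
without_set_e = shell_exit_code("false; true")

# With set -e the shell stops at the first failing command and exits non-zero.
with_set_e = shell_exit_code("set -e; false; true")
```

In the first case the failure of the sub-command is hidden from the caller, which is the situation the note above warns about.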

Note

Add a space after the script name when directly calling a .sh script with the bash_command argument -- for example bash_command="my_script.sh ". This is because Airflow tries to load this file and process it as a Jinja template when the value ends with .sh, which is likely not what most users want.
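Why the trailing space matters can be illustrated with a simple suffix check. This is a sketch under the assumption that a templated string is treated as a template file path when it ends with one of the extensions in template_ext; looks_like_template_file is a hypothetical name, not an Airflow function.

```python
from typing import Sequence

# Extensions taken from the template_ext attribute of this class.
TEMPLATE_EXT: Sequence[str] = (".sh", ".bash")


def looks_like_template_file(
    bash_command: str, template_ext: Sequence[str] = TEMPLATE_EXT
) -> bool:
    """Hypothetical check: does the string end with a template file extension?"""
    return bash_command.endswith(tuple(template_ext))
```

Under this assumption, "my_script.sh" matches the check while "my_script.sh " does not, so the version with the trailing space is passed to the shell as an ordinary command.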

Warning

Care should be taken with "user" input or when using Jinja templates in the bash_command, as this bash operator does not perform any escaping or sanitization of the command.

This applies mostly to using "dag_run" conf, as that can be submitted by users in the Web UI. Most of the default template variables are not at risk.

For example, do not do this:

bash_task = BashOperator(
    task_id="bash_task",
    bash_command='echo "Here is the message: \'{{ dag_run.conf["message"] if dag_run else "" }}\'"',
)

Instead, you should pass this via the env kwarg and use double-quotes inside the bash_command, as below:

bash_task = BashOperator(
    task_id="bash_task",
    bash_command="echo \"here is the message: '$message'\"",
    env={"message": '{{ dag_run.conf["message"] if dag_run else "" }}'},
)

template_fields :Sequence[str] = ['bash_command', 'env']

template_fields_renderers

template_ext :Sequence[str] = ['.sh', '.bash']

ui_color = #f0ede4

subprocess_hook(self)

Returns the hook used for running the bash command.

get_env(self, context)

Builds the set of environment variables to be exposed to the bash command.

execute(self, context)

This is the main method to derive when creating an operator. Context is the same dictionary used when rendering Jinja templates.

Refer to get_template_context for more context.

on_kill(self)

Override this method to clean up subprocesses when a task instance gets killed. Any use of the threading, subprocess or multiprocessing module within an operator needs to be cleaned up, or it will leave ghost processes behind.
