airflow.providers.apache.pig.operators.pig

Module Contents

Classes

PigOperator

Executes a Pig script.

class airflow.providers.apache.pig.operators.pig.PigOperator(*, pig, pig_cli_conn_id='pig_cli_default', pigparams_jinja_translate=False, pig_opts=None, pig_properties=None, **kwargs)[source]

Bases: airflow.models.BaseOperator

Executes a Pig script.

Parameters
  • pig (str) – the Pig Latin script to be executed. (templated)

  • pig_cli_conn_id (str) – reference to the Pig CLI connection id.

  • pigparams_jinja_translate (bool) – when True, Pig params-type templating ${var} gets translated into Jinja-type templating {{ var }}. Note that you may want to use this along with the DAG(user_defined_macros=myargs) parameter. View the DAG object documentation for more details.

  • pig_opts (str | None) – Pig options such as -x tez or -useHCatalog, passed as a space separated string.

  • pig_properties (list[str] | None) – additional Pig properties, passed as a list of strings.
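
A minimal usage sketch is shown below. The DAG id, schedule, and inline Pig Latin are illustrative placeholders, and pig_cli_conn_id assumes a Pig CLI connection has already been configured.

    import datetime

    from airflow import DAG
    from airflow.providers.apache.pig.operators.pig import PigOperator

    # Illustrative DAG; the dag_id, schedule and script body are placeholders.
    with DAG(
        dag_id="example_pig",
        start_date=datetime.datetime(2024, 1, 1),
        schedule=None,
        catchup=False,
    ) as dag:
        run_pig = PigOperator(
            task_id="run_pig_script",
            pig="fs -ls /;",                    # inline Pig Latin; a *.pig file path also works (templated)
            pig_opts="-x local",                # space separated pig CLI options
            pig_cli_conn_id="pig_cli_default",
        )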

template_fields: Sequence[str] = ('pig', 'pig_opts', 'pig_properties')[source]
template_ext: Sequence[str] = ('.pig', '.piglatin')[source]
ui_color = '#f0e4ec'[source]
prepare_template()[source]

Executed after the templated fields are replaced by their content.

If you need your object to alter the content of the file before the template is rendered, it should override this method to do so.
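
As an illustration of this hook point, the hypothetical subclass below rewrites Pig-style $var / ${var} references into Jinja syntax before rendering, which is similar in spirit to what pigparams_jinja_translate=True does; the provider's own implementation and regular expression may differ.

    import re

    from airflow.providers.apache.pig.operators.pig import PigOperator

    class MyPigOperator(PigOperator):
        """Hypothetical subclass; shows where template preparation fits."""

        def prepare_template(self):
            # Rewrite Pig parameter references such as $name or ${name} into
            # Jinja expressions like {{ name }} before the template is rendered.
            self.pig = re.sub(r"\$\{?([a-zA-Z_][a-zA-Z0-9_]*)\}?", r"{{ \1 }}", self.pig)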

execute(context)[source]

This is the main method to derive when creating an operator.

Context is the same dictionary used when rendering Jinja templates.

Refer to get_template_context for more context.
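
For orientation, execution here ultimately runs the rendered script through the provider's PigCliHook. A standalone sketch of that mechanism, with a placeholder connection id and script, might look like the following; check the hook's documentation for the exact argument names in your provider version.

    from airflow.providers.apache.pig.hooks.pig import PigCliHook

    # Standalone sketch: run a Pig Latin snippet through the CLI hook, the same
    # mechanism the operator relies on. Connection id and script are placeholders.
    hook = PigCliHook(pig_cli_conn_id="pig_cli_default")
    hook.run_cli(pig="fs -ls /;", pig_opts="-x local")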

on_kill()[source]

Override this method to clean up subprocesses when a task instance gets killed.

Any use of the threading, subprocess or multiprocessing module within an operator needs to be cleaned up, or it will leave ghost processes behind.
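
The general pattern is sketched below with a hypothetical operator that keeps a handle to its child process so on_kill can terminate it cleanly; all names are illustrative.

    import signal
    import subprocess

    from airflow.models import BaseOperator

    class SubprocessCleanupOperator(BaseOperator):
        """Hypothetical operator; illustrates the clean-up pattern described above."""

        def execute(self, context):
            # Keep a handle to the child process so on_kill can reach it later.
            self._proc = subprocess.Popen(["sleep", "3600"])
            self._proc.wait()

        def on_kill(self):
            # Terminate the child process when the task instance is killed so
            # that no ghost process is left behind.
            proc = getattr(self, "_proc", None)
            if proc is not None and proc.poll() is None:
                proc.send_signal(signal.SIGTERM)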
