airflow.providers.amazon.aws.operators.glue_databrew
Classes
| GlueDataBrewStartJobOperator | Start an AWS Glue DataBrew job. |
Module Contents
- class airflow.providers.amazon.aws.operators.glue_databrew.GlueDataBrewStartJobOperator(job_name, wait_for_completion=True, delay=None, waiter_delay=30, waiter_max_attempts=60, deferrable=conf.getboolean('operators', 'default_deferrable', fallback=False), **kwargs)
- Bases: airflow.providers.amazon.aws.operators.base_aws.AwsBaseOperator[airflow.providers.amazon.aws.hooks.glue_databrew.GlueDataBrewHook]
- Start an AWS Glue DataBrew job.
- AWS Glue DataBrew is a visual data preparation tool that makes it easier for data analysts and data scientists to clean and normalize data to prepare it for analytics and machine learning (ML).
- See also: For more information on how to use this operator, take a look at the guide: Start an AWS Glue DataBrew job
- Parameters:
- job_name (str) – Name of the job to start; job names are unique per AWS account.
- wait_for_completion (bool) – Whether to wait for job run completion. (default: True) 
- deferrable (bool) – If True, the operator waits asynchronously for the job to complete. This implies waiting for completion. This mode requires the aiobotocore module to be installed. (default: False)
- waiter_delay (int) – Time in seconds to wait between status checks. (default: 30)
- waiter_max_attempts (int) – Maximum number of attempts to check for job completion. (default: 60) 
- aws_conn_id – The Airflow connection used for AWS credentials. If this is None or empty, then the default boto3 behaviour is used. If running Airflow in a distributed manner and aws_conn_id is None or empty, then the default boto3 configuration is used (and must be maintained on each worker node).
- region_name – AWS region_name. If not specified then the default boto3 behaviour is used. 
- verify – Whether or not to verify SSL certificates. See: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/core/session.html 
- botocore_config – Configuration dictionary (key-values) for botocore client. See: https://botocore.amazonaws.com/v1/documentation/api/latest/reference/config.html 
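
As a rough illustration of how waiter_delay and waiter_max_attempts interact, the sketch below estimates the longest time the operator would spend polling before giving up. It assumes the waiter simply pauses waiter_delay seconds between attempts and ignores the duration of each status call; the helper name max_wait_seconds is hypothetical and not part of the provider.

```python
def max_wait_seconds(waiter_delay: int = 30, waiter_max_attempts: int = 60) -> int:
    """Approximate upper bound, in seconds, on time spent polling for completion.

    Assumption: one pause of `waiter_delay` seconds per attempt; the time
    taken by each status request is not counted.
    """
    if waiter_delay < 0 or waiter_max_attempts < 0:
        raise ValueError("waiter_delay and waiter_max_attempts must be non-negative")
    return waiter_delay * waiter_max_attempts


if __name__ == "__main__":
    # With the documented defaults (30 s delay, 60 attempts).
    print(max_wait_seconds())
```

Under these assumptions the documented defaults allow roughly 30 minutes of polling; a job expected to run longer would need a larger waiter_delay or waiter_max_attempts.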
 
- Returns:
- Dictionary with the key run_id, whose value is the run ID of the resulting job run.
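
The following is a minimal sketch of how the operator might be instantiated and how its return value could be read, based only on the signature and return description above. The task id, the job name, and the helper functions databrew_operator_kwargs, build_operator, and extract_run_id are hypothetical names chosen for illustration; the operator import is deferred so the helpers can be exercised without Airflow installed.

```python
from __future__ import annotations

from typing import Any


def databrew_operator_kwargs(job_name: str, wait: bool = True) -> dict[str, Any]:
    """Collect keyword arguments for GlueDataBrewStartJobOperator."""
    return {
        "task_id": "start_databrew_job",  # hypothetical task id
        "job_name": job_name,
        "wait_for_completion": wait,
        "waiter_delay": 30,          # documented default
        "waiter_max_attempts": 60,   # documented default
    }


def build_operator(job_name: str):
    """Create the operator; requires the Amazon provider package to be installed."""
    # Imported lazily so the helpers in this sketch work without Airflow.
    from airflow.providers.amazon.aws.operators.glue_databrew import (
        GlueDataBrewStartJobOperator,
    )

    return GlueDataBrewStartJobOperator(**databrew_operator_kwargs(job_name))


def extract_run_id(result: dict[str, str]) -> str:
    """Read the run id from the dictionary the operator is documented to return."""
    return result["run_id"]
```

In a DAG, build_operator("my-databrew-job") would typically be called inside the DAG context. Because the operator returns a dictionary, a downstream task can usually retrieve it through XCom and pass it to a helper such as extract_run_id.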
- template_fields: collections.abc.Sequence[str]