airflow.providers.amazon.aws.transfers.sql_to_s3

Module Contents

Classes

SqlToS3Operator

Saves data from a specific SQL query into a file in S3.

Attributes

FILE_FORMAT

FileOptions

FILE_OPTIONS_MAP

airflow.providers.amazon.aws.transfers.sql_to_s3.FILE_FORMAT[source]
airflow.providers.amazon.aws.transfers.sql_to_s3.FileOptions[source]
airflow.providers.amazon.aws.transfers.sql_to_s3.FILE_OPTIONS_MAP[source]
class airflow.providers.amazon.aws.transfers.sql_to_s3.SqlToS3Operator(*, query, s3_bucket, s3_key, sql_conn_id, parameters=None, replace=False, aws_conn_id='aws_default', verify=None, file_format='csv', pd_kwargs=None, **kwargs)[source]

Bases: airflow.models.BaseOperator

Saves data from a specific SQL query into a file in S3.

Parameters
  • query (str) -- the SQL query to be executed. To execute a file, pass its absolute path, ending with the .sql extension. (templated)

  • s3_bucket (str) -- bucket where the data will be stored. (templated)

  • s3_key (str) -- desired key for the file. It includes the name of the file. (templated)

  • replace (bool) -- whether to replace the file in S3 if it already exists.

  • sql_conn_id (str) -- reference to a specific database.

  • parameters (Union[None, Mapping, Iterable]) -- (optional) the parameters to render the SQL query with.

  • aws_conn_id (str) -- reference to a specific S3 connection

  • verify (Optional[Union[bool, str]]) --

    Whether to verify SSL certificates for the S3 connection. By default, SSL certificates are verified. You can provide the following values:

    • False: do not validate SSL certificates. SSL will still be used (unless use_ssl is False), but SSL certificates will not be verified.

    • path/to/cert/bundle.pem: the filename of the CA cert bundle to use. You can specify this argument if you want to use a different CA cert bundle than the one used by botocore.

  • file_format (typing_extensions.Literal[csv, json, parquet]) -- the destination file format; only the strings 'csv', 'json', or 'parquet' are accepted.

  • pd_kwargs (Optional[dict]) -- arguments to include in DataFrame .to_parquet(), .to_json() or .to_csv().
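
The parameters above can be combined as in the following minimal sketch. The task id, connection id, bucket, key, table name, and query are hypothetical placeholders, and the `%(ds)s` placeholder style is an assumption that depends on the database driver behind `sql_conn_id`. The operator import is deferred into a function so the argument dictionary can be inspected without Airflow installed.

```python
# Keyword arguments matching the SqlToS3Operator signature documented above.
# All concrete values are hypothetical and only illustrate the shape of a call.
SQL_TO_S3_KWARGS = {
    "task_id": "export_orders_to_s3",   # passed through **kwargs to BaseOperator
    "query": "SELECT * FROM orders WHERE order_date = %(ds)s",
    "parameters": {"ds": "2022-01-01"}, # placeholder style depends on the DB driver
    "sql_conn_id": "my_database",       # assumed Airflow connection id
    "aws_conn_id": "aws_default",       # the documented default
    "s3_bucket": "my-bucket",
    "s3_key": "exports/orders.csv",     # the key includes the file name
    "replace": True,                    # overwrite the key if it already exists
    "file_format": "csv",               # one of 'csv', 'json', 'parquet'
    "pd_kwargs": {"index": False},      # forwarded to DataFrame.to_csv()
}


def make_task():
    """Build the operator; requires the Amazon provider package to be installed."""
    from airflow.providers.amazon.aws.transfers.sql_to_s3 import SqlToS3Operator

    return SqlToS3Operator(**SQL_TO_S3_KWARGS)
```

In a DAG file, `make_task()` would typically be called inside a `with DAG(...)` block so the task is attached to that DAG.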

template_fields :Sequence[str] = ['s3_bucket', 's3_key', 'query'][source]
template_ext :Sequence[str] = ['.sql'][source]
template_fields_renderers[source]
execute(self, context)[source]

This is the main method to override when creating an operator. Context is the same dictionary used when rendering Jinja templates.

Refer to get_template_context for more context.
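
This page does not describe what execute does for this operator specifically. As a rough illustration of how file_format and pd_kwargs might interact, the sketch below picks a writer method by format name and forwards the extra keyword arguments to it. It is not the provider's code: `FakeFrame`, `WRITER_BY_FORMAT`, and `serialize` are hypothetical names, a small stand-in class replaces the pandas DataFrame so the example runs with the standard library only, and the binary parquet format is left out.

```python
import csv
import io
import json


class FakeFrame:
    """Hypothetical stand-in for a DataFrame holding a non-empty list of row dicts."""

    def __init__(self, rows):
        self.rows = rows

    def to_csv(self, buf, **kwargs):
        # Extra keyword arguments are forwarded to the CSV writer.
        writer = csv.DictWriter(buf, fieldnames=list(self.rows[0]), **kwargs)
        writer.writeheader()
        writer.writerows(self.rows)

    def to_json(self, buf, **kwargs):
        # Extra keyword arguments are forwarded to json.dump.
        json.dump(self.rows, buf, **kwargs)


# Assumed mapping from the file_format string to the writer method name.
WRITER_BY_FORMAT = {"csv": "to_csv", "json": "to_json"}


def serialize(frame, file_format, pd_kwargs=None):
    """Serialize `frame` in the requested format and return the text."""
    if file_format not in WRITER_BY_FORMAT:
        raise ValueError(f"unsupported file_format: {file_format!r}")
    buf = io.StringIO()
    # Look up the writer method by name and pass the user-supplied kwargs along.
    getattr(frame, WRITER_BY_FORMAT[file_format])(buf, **(pd_kwargs or {}))
    return buf.getvalue()
```

With a real DataFrame, the same pattern would call `.to_csv()`, `.to_json()`, or `.to_parquet()` with `pd_kwargs`, and the resulting file would then be uploaded to the configured bucket and key.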
