airflow.providers.amazon.aws.transfers.mysql_to_s3
Module Contents
Classes
MySQLToS3Operator | Saves data from a specific MySQL query into a file in S3.
Attributes
- class airflow.providers.amazon.aws.transfers.mysql_to_s3.MySQLToS3Operator(*, query: str, s3_bucket: str, s3_key: str, replace: bool = False, mysql_conn_id: str = 'mysql_default', aws_conn_id: str = 'aws_default', verify: Optional[Union[bool, str]] = None, pd_csv_kwargs: Optional[dict] = None, index: bool = False, header: bool = False, file_format: typing_extensions.Literal['csv', 'parquet'] = 'csv', pd_kwargs: Optional[dict] = None, **kwargs)
Bases: airflow.models.BaseOperator
Saves data from a specific MySQL query into a file in S3.
- Parameters
query (str) -- the SQL query to be executed. To execute a file, provide its absolute path, ending with the .sql extension. (templated)
s3_bucket (str) -- bucket where the data will be stored. (templated)
s3_key (str) -- desired key for the file. It includes the name of the file. (templated)
replace (bool) -- whether to replace the file in S3 if it already exists
mysql_conn_id (str) -- reference to a specific MySQL connection id
aws_conn_id (str) -- reference to a specific S3 connection
verify (bool or str) -- whether or not to verify SSL certificates for the S3 connection. By default SSL certificates are verified. You can provide the following values:
False: do not validate SSL certificates. SSL will still be used (unless use_ssl is False), but SSL certificates will not be verified.
path/to/cert/bundle.pem: a filename of the CA cert bundle to use. You can specify this argument if you want to use a different CA cert bundle than the one used by botocore.
pd_csv_kwargs (dict) -- arguments to pass to DataFrame.to_csv() (header, index, columns...)
index (bool) -- whether to include the DataFrame index in the output file
header (bool) -- whether to include the header in the S3 file
file_format (str) -- the destination file format; only 'csv' or 'parquet' is accepted.
pd_kwargs (dict) -- arguments to pass to DataFrame.to_parquet() or DataFrame.to_csv(). This is preferred over pd_csv_kwargs.
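The parameters above can be combined as in the following minimal sketch. It is illustrative only: the task id, the `orders` table, the bucket name, and the key layout are hypothetical placeholders, and `make_export_task` is a helper invented for this example, not part of the provider. Because `query`, `s3_bucket`, and `s3_key` are templated, the sketch uses the `{{ ds }}` macro in two of them. The operator is imported inside the helper so the configuration can be inspected without Airflow installed.

```python
from typing import Any, Dict

# Keyword arguments for MySQLToS3Operator. Every value here is a hypothetical
# placeholder chosen for illustration, not a recommended setting.
EXPORT_KWARGS: Dict[str, Any] = {
    "task_id": "export_orders_to_s3",  # required by BaseOperator
    "query": "SELECT id, total FROM orders WHERE order_date = '{{ ds }}'",
    "s3_bucket": "my-data-bucket",
    "s3_key": "exports/orders/{{ ds }}.csv",  # templated, includes the file name
    "replace": True,                   # overwrite the key if it already exists
    "mysql_conn_id": "mysql_default",
    "aws_conn_id": "aws_default",
    "verify": None,                    # or False, or "path/to/cert/bundle.pem"
    "file_format": "csv",              # only "csv" or "parquet" is accepted
    "header": True,
    "index": False,
    "pd_kwargs": {"sep": ";"},         # forwarded to DataFrame.to_csv()
}


def make_export_task(dag=None):
    """Build the operator from EXPORT_KWARGS (hypothetical helper)."""
    # Imported lazily so this module loads even where Airflow is absent.
    from airflow.providers.amazon.aws.transfers.mysql_to_s3 import (
        MySQLToS3Operator,
    )

    return MySQLToS3Operator(dag=dag, **EXPORT_KWARGS)
```

Inside a DAG definition, `make_export_task(dag=dag)` would return a task ready to schedule. To write Parquet instead, one would set `file_format` to `"parquet"` and replace the `pd_kwargs` entries with options accepted by `DataFrame.to_parquet()`.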