airflow.providers.amazon.aws.hooks.s3
Interact with AWS S3, using the boto3 library.
Module Contents
Functions
provide_bucket_name(func) | Function decorator that provides a bucket name taken from the connection
unify_bucket_name_and_key(func) | Function decorator that unifies bucket name and key taken from the key
Attributes¶
- airflow.providers.amazon.aws.hooks.s3.provide_bucket_name(func: T) T [source]¶
Function decorator that provides a bucket name taken from the connection in case no bucket name has been passed to the function.
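The fallback behaviour can be illustrated with a simplified stand-in. This is a sketch, not the provider's implementation: `with_default_bucket` and `describe` are hypothetical names, and the default bucket is passed in as a literal instead of being read from an Airflow connection.

```python
import functools
import inspect


def with_default_bucket(default_bucket):
    """Hypothetical stand-in: fill in bucket_name when the caller omitted it."""

    def decorator(func):
        signature = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            bound = signature.bind(*args, **kwargs)
            # Only fall back to the default if no bucket name was passed.
            if bound.arguments.get("bucket_name") is None:
                bound.arguments["bucket_name"] = default_bucket
            return func(*bound.args, **bound.kwargs)

        return wrapper

    return decorator


@with_default_bucket("conn-bucket")
def describe(key, bucket_name=None):
    return f"{bucket_name}/{key}"
```

An explicit `bucket_name` always wins; the default only applies when the argument is missing or `None`.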
- airflow.providers.amazon.aws.hooks.s3.unify_bucket_name_and_key(func: T) T [source]¶
Function decorator that unifies bucket name and key taken from the key in case no bucket name and at least a key has been passed to the function.
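As a rough illustration of the unifying behaviour, the sketch below splits a full `s3://` URL into bucket and key when no bucket name is given. The names `split_key_when_bucket_missing` and `locate` are hypothetical, and the wrapper assumes a fixed `(key, bucket_name)` signature, which is simpler than what the real decorator has to handle.

```python
import functools
from urllib.parse import urlparse


def split_key_when_bucket_missing(func):
    """Hypothetical, simplified stand-in for the unifying decorator."""

    @functools.wraps(func)
    def wrapper(key, bucket_name=None):
        if bucket_name is None:
            # Assumption: without a bucket name, the key is a full s3:// URL.
            parsed = urlparse(key)
            bucket_name, key = parsed.netloc, parsed.path.lstrip("/")
        return func(key, bucket_name)

    return wrapper


@split_key_when_bucket_missing
def locate(key, bucket_name=None):
    return bucket_name, key
```

When a bucket name is supplied, the key is passed through unchanged.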
- class airflow.providers.amazon.aws.hooks.s3.S3Hook(*args, **kwargs)[source]¶
Bases:
airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook
Interact with AWS S3, using the boto3 library.
Additional arguments (such as aws_conn_id) may be specified and are passed down to the underlying AwsBaseHook.
See also
airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook
- static parse_s3_url(s3url: str) Tuple[str, str] [source]¶
Parses the S3 URL into a bucket name and key.
- Parameters
s3url (str) -- The S3 URL to parse.
- Returns
the parsed bucket name and key
- Return type
tuple of str
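The parsing step can be approximated with the standard library. This is a minimal sketch under stated assumptions: `split_s3_url` is a hypothetical name, and raising `ValueError` for a missing bucket is an illustrative choice, not necessarily what the hook does.

```python
from typing import Tuple
from urllib.parse import urlparse


def split_s3_url(s3url: str) -> Tuple[str, str]:
    """Split an s3://bucket/key URL into (bucket_name, key)."""
    parsed = urlparse(s3url)
    if not parsed.netloc:
        # Assumption: a URL without a bucket part is treated as an error.
        raise ValueError(f"No bucket name found in {s3url!r}")
    # The path starts with "/", which is not part of the S3 key.
    return parsed.netloc, parsed.path.lstrip("/")
```

For example, `split_s3_url("s3://my-bucket/path/to/file.csv")` yields the bucket `my-bucket` and the key `path/to/file.csv`.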
- check_for_bucket(self, bucket_name: Optional[str] = None) bool [source]¶
Check if bucket_name exists.
- get_bucket(self, bucket_name: Optional[str] = None) str [source]¶
Returns a boto3.S3.Bucket object
- Parameters
bucket_name (str) -- the name of the bucket
- Returns
the bucket object for the given bucket name.
- Return type
boto3.S3.Bucket
- create_bucket(self, bucket_name: Optional[str] = None, region_name: Optional[str] = None) None [source]¶
Creates an Amazon S3 bucket.
- check_for_prefix(self, prefix: str, delimiter: str, bucket_name: Optional[str] = None) bool [source]¶
Checks that a prefix exists in a bucket
- list_prefixes(self, bucket_name: Optional[str] = None, prefix: Optional[str] = None, delimiter: Optional[str] = None, page_size: Optional[int] = None, max_items: Optional[int] = None) list [source]¶
Lists prefixes in a bucket under prefix
- list_keys(self, bucket_name: Optional[str] = None, prefix: Optional[str] = None, delimiter: Optional[str] = None, page_size: Optional[int] = None, max_items: Optional[int] = None) list [source]¶
Lists keys in a bucket under prefix and not containing delimiter
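How prefix and delimiter interact in the two listing methods is easier to see on an in-memory list. The sketch below models the general S3 listing semantics without calling AWS; `list_keys_and_prefixes` is a hypothetical helper, not part of the hook.

```python
def list_keys_and_prefixes(all_keys, prefix="", delimiter=""):
    """Model S3 listing: return (keys, common_prefixes) under a prefix."""
    keys, prefixes = [], []
    for key in sorted(all_keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter and delimiter in rest:
            # Keys with the delimiter after the prefix are rolled up
            # into a single common prefix instead of being listed.
            common = prefix + rest.split(delimiter, 1)[0] + delimiter
            if common not in prefixes:
                prefixes.append(common)
        else:
            keys.append(key)
    return keys, prefixes
```

In this model, the first element corresponds to what list_keys describes and the second to what list_prefixes describes.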
- check_for_key(self, key: str, bucket_name: Optional[str] = None) bool [source]¶
Checks if a key exists in a bucket
- get_key(self, key: str, bucket_name: Optional[str] = None) boto3.s3.transfer.S3Transfer [source]¶
Returns a boto3.s3.Object
- select_key(self, key: str, bucket_name: Optional[str] = None, expression: Optional[str] = None, expression_type: Optional[str] = None, input_serialization: Optional[Dict[str, Any]] = None, output_serialization: Optional[Dict[str, Any]] = None) str [source]¶
Reads a key with S3 Select.
- Parameters
key (str) -- S3 key that will point to the file
bucket_name (str) -- Name of the bucket in which the file is stored
expression (str) -- S3 Select expression
expression_type (str) -- S3 Select expression type
input_serialization (dict) -- S3 Select input data serialization format
output_serialization (dict) -- S3 Select output data serialization format
- Returns
retrieved subset of original data by S3 Select
- Return type
str
See also
For more details about S3 Select parameters: http://boto3.readthedocs.io/en/latest/reference/services/s3.html#S3.Client.select_object_content
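A call to select_key might look like the sketch below. `select_rows` and `where_clause` are hypothetical, `hook` is assumed to be an S3Hook instance, and the serialization settings assume a CSV object with a header row.

```python
def select_rows(hook, key, bucket_name, where_clause):
    """Return the rows of a CSV object that satisfy a SQL WHERE clause."""
    return hook.select_key(
        key=key,
        bucket_name=bucket_name,
        # where_clause is interpolated as-is, so it must come from a trusted source.
        expression=f"SELECT * FROM S3Object s WHERE {where_clause}",
        expression_type="SQL",
        # Assumption: the object is CSV and its first line holds column names.
        input_serialization={"CSV": {"FileHeaderInfo": "USE"}},
        output_serialization={"CSV": {}},
    )
```

The result is the selected subset as a single string, so the caller still has to split it into records.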
- check_for_wildcard_key(self, wildcard_key: str, bucket_name: Optional[str] = None, delimiter: str = '') bool [source]¶
Checks that a key matching a wildcard expression exists in a bucket
- get_wildcard_key(self, wildcard_key: str, bucket_name: Optional[str] = None, delimiter: str = '') boto3.s3.transfer.S3Transfer [source]¶
Returns a boto3.s3.Object object matching the wildcard expression
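Wildcard matching of keys can be sketched with `fnmatch`. This is an illustration under assumptions, not the hook's code: `first_wildcard_match` is a hypothetical helper, it works on an in-memory list of keys, and it assumes shell-style patterns in which `*` also matches `/`.

```python
import fnmatch
import re


def first_wildcard_match(wildcard_key, keys):
    """Return the first key matching a shell-style wildcard, or None."""
    # The literal part before the first wildcard character narrows the search,
    # in the same way a listing prefix would.
    prefix = re.split(r"[\[\*\?]", wildcard_key, maxsplit=1)[0]
    candidates = [key for key in keys if key.startswith(prefix)]
    matches = [key for key in candidates if fnmatch.fnmatchcase(key, wildcard_key)]
    return matches[0] if matches else None
```

A truthy result corresponds to check_for_wildcard_key returning True in this model.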
- load_file(self, filename: Union[pathlib.Path, str], key: str, bucket_name: Optional[str] = None, replace: bool = False, encrypt: bool = False, gzip: bool = False, acl_policy: Optional[str] = None) None [source]¶
Loads a local file to S3
- Parameters
filename (Union[Path, str]) -- path to the file to load.
key (str) -- S3 key that will point to the file
bucket_name (str) -- Name of the bucket in which to store the file
replace (bool) -- A flag to decide whether or not to overwrite the key if it already exists. If replace is False and the key exists, an error will be raised.
encrypt (bool) -- If True, the file will be encrypted on the server-side by S3 and will be stored in an encrypted form while at rest in S3.
gzip (bool) -- If True, the file will be compressed locally
acl_policy (str) -- String specifying the canned ACL policy for the file being uploaded to the S3 bucket.
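Because load_file raises an error when replace is False and the key already exists, a caller may prefer to check first. The sketch below shows one way to do that; `upload_if_absent` is a hypothetical helper and `hook` is assumed to be an S3Hook instance.

```python
def upload_if_absent(hook, filename, key, bucket_name):
    """Upload a local file unless the key already exists. Return True if uploaded."""
    if hook.check_for_key(key, bucket_name=bucket_name):
        # Skip instead of letting load_file raise for the existing key.
        return False
    hook.load_file(filename=filename, key=key, bucket_name=bucket_name, replace=False)
    return True
```

The check and the upload are two separate requests, so another writer could still create the key in between.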
- load_string(self, string_data: str, key: str, bucket_name: Optional[str] = None, replace: bool = False, encrypt: bool = False, encoding: Optional[str] = None, acl_policy: Optional[str] = None, compression: Optional[str] = None) None [source]¶
Loads a string to S3
This is provided as a convenience to drop a string in S3. It uses the boto infrastructure to ship a file to S3.
- Parameters
string_data (str) -- str to set as content for the key.
key (str) -- S3 key that will point to the file
bucket_name (str) -- Name of the bucket in which to store the file
replace (bool) -- A flag to decide whether or not to overwrite the key if it already exists
encrypt (bool) -- If True, the file will be encrypted on the server-side by S3 and will be stored in an encrypted form while at rest in S3.
encoding (str) -- The string to byte encoding
acl_policy (str) -- The string to specify the canned ACL policy for the object to be uploaded
compression (str) -- Type of compression to use, currently only gzip is supported.
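What the encoding and compression parameters imply for the uploaded bytes can be sketched locally. `prepare_string_payload` is a hypothetical helper; the UTF-8 default and the error for unsupported compression types are assumptions made for this illustration.

```python
import gzip


def prepare_string_payload(string_data, encoding=None, compression=None):
    """Turn a string into the bytes that would be uploaded."""
    # Assumption: UTF-8 is used when no encoding is given.
    payload = string_data.encode(encoding or "utf-8")
    if compression == "gzip":
        payload = gzip.compress(payload)
    elif compression is not None:
        # Only gzip is documented as supported.
        raise NotImplementedError(f"Unsupported compression: {compression}")
    return payload
```

Decompressing the gzip payload and decoding it with the same encoding gives back the original string.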
- load_bytes(self, bytes_data: bytes, key: str, bucket_name: Optional[str] = None, replace: bool = False, encrypt: bool = False, acl_policy: Optional[str] = None) None [source]¶
Loads bytes to S3
This is provided as a convenience to drop bytes data into S3. It uses the boto infrastructure to ship a file to S3.
- Parameters
bytes_data (bytes) -- bytes to set as content for the key.
key (str) -- S3 key that will point to the file
bucket_name (str) -- Name of the bucket in which to store the file
replace (bool) -- A flag to decide whether or not to overwrite the key if it already exists
encrypt (bool) -- If True, the file will be encrypted on the server-side by S3 and will be stored in an encrypted form while at rest in S3.
acl_policy (str) -- The string to specify the canned ACL policy for the object to be uploaded
- load_file_obj(self, file_obj: io.BytesIO, key: str, bucket_name: Optional[str] = None, replace: bool = False, encrypt: bool = False, acl_policy: Optional[str] = None) None [source]¶
Loads a file object to S3
- Parameters
file_obj (file-like object) -- The file-like object to set as the content for the S3 key.
key (str) -- S3 key that will point to the file
bucket_name (str) -- Name of the bucket in which to store the file
replace (bool) -- A flag that indicates whether to overwrite the key if it already exists.
encrypt (bool) -- If True, S3 encrypts the file on the server, and the file is stored in encrypted form at rest in S3.
acl_policy (str) -- The string to specify the canned ACL policy for the object to be uploaded
- copy_object(self, source_bucket_key: str, dest_bucket_key: str, source_bucket_name: Optional[str] = None, dest_bucket_name: Optional[str] = None, source_version_id: Optional[str] = None, acl_policy: Optional[str] = None) None [source]¶
Creates a copy of an object that is already stored in S3.
Note: the S3 connection used here needs to have access to both source and destination bucket/key.
- Parameters
source_bucket_key (str) --
The key of the source object.
It can be either full s3:// style url or relative path from root level.
When it's specified as a full s3:// url, please omit source_bucket_name.
dest_bucket_key (str) --
The key of the object to copy to.
The convention to specify dest_bucket_key is the same as source_bucket_key.
source_bucket_name (str) --
Name of the S3 bucket that contains the source object.
It should be omitted when source_bucket_key is provided as a full s3:// url.
dest_bucket_name (str) --
Name of the S3 bucket to which the object is copied.
It should be omitted when dest_bucket_key is provided as a full s3:// url.
source_version_id (str) -- Version ID of the source object (OPTIONAL)
acl_policy (str) -- The string to specify the canned ACL policy for the object to be copied, which is private by default.
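Combining copy_object with delete_objects (documented in this section) gives a simple "move". The sketch below passes relative keys together with explicit bucket names, following the convention described above; `move_object` is a hypothetical helper and `hook` is assumed to be an S3Hook instance.

```python
def move_object(hook, source_key, dest_key, source_bucket, dest_bucket):
    """Copy an object to a new location, then delete the original."""
    hook.copy_object(
        source_bucket_key=source_key,
        dest_bucket_key=dest_key,
        source_bucket_name=source_bucket,
        dest_bucket_name=dest_bucket,
    )
    # Reached only if the copy did not raise.
    hook.delete_objects(bucket=source_bucket, keys=source_key)
```

The connection needs access to both buckets, as noted above, plus delete permission on the source.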
- delete_bucket(self, bucket_name: str, force_delete: bool = False) None [source]¶
Deletes an S3 bucket. All objects in the bucket must be deleted before the bucket itself can be deleted.
- delete_objects(self, bucket: str, keys: Union[str, list]) None [source]¶
Delete keys from the bucket.
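- Parameters
bucket (str) -- Name of the bucket from which to delete the keys
keys (str or list) -- The key or list of keys to delete from the bucket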
- download_file(self, key: str, bucket_name: Optional[str] = None, local_path: Optional[str] = None) str [source]¶
Downloads a file from the S3 location to the local file system.
- generate_presigned_url(self, client_method: str, params: Optional[dict] = None, expires_in: int = 3600, http_method: Optional[str] = None) Optional[str] [source]¶
Generate a presigned url given a client, its method, and arguments
- Parameters
client_method (str) -- The client method to presign for.
params (dict) -- The parameters normally passed to ClientMethod.
expires_in (int) -- The number of seconds the presigned url is valid for. By default it expires in an hour (3600 seconds).
http_method (str) -- The http method to use on the generated url. By default, the http method is whatever is used in the method's model.
- Returns
The presigned url.
- Return type
str or None
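A typical use is a temporary download link. In the sketch below, `presigned_download_url` is a hypothetical helper, `hook` is assumed to be an S3Hook instance, and `get_object` with its `Bucket` and `Key` parameters comes from the boto3 S3 client.

```python
def presigned_download_url(hook, bucket_name, key, expires_in=900):
    """Return a time-limited URL for downloading one object."""
    return hook.generate_presigned_url(
        client_method="get_object",
        params={"Bucket": bucket_name, "Key": key},
        # Shorter than the documented default of 3600 seconds.
        expires_in=expires_in,
    )
```

The signature allows `None` as a result, so callers should check the returned value before using it.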
- get_bucket_tagging(self, bucket_name: Optional[str] = None) Optional[List[Dict[str, str]]] [source]¶
Gets a list of tags from a bucket.