airflow.providers.snowflake.operators.snowflake
¶
Module Contents¶
Classes¶
Executes SQL code in a Snowflake database. |
|
Performs a check against Snowflake. |
|
Performs a simple check using sql code against a specified value, within a certain level of tolerance. |
|
Checks that the metrics given as SQL expressions are within tolerance of the ones from days_back before. |
|
Implemented Snowflake SQL API Operator to support multiple SQL statements sequentially, |
- class airflow.providers.snowflake.operators.snowflake.SnowflakeOperator(*, snowflake_conn_id='snowflake_default', warehouse=None, database=None, role=None, schema=None, authenticator=None, session_parameters=None, **kwargs)[source]¶
Bases:
airflow.providers.common.sql.operators.sql.SQLExecuteQueryOperator
Executes SQL code in a Snowflake database.
This class is deprecated.
Please use
airflow.providers.common.sql.operators.sql.SQLExecuteQueryOperator
.See also
For more information on how to use this operator, take a look at the guide: SnowflakeOperator
- Parameters
snowflake_conn_id (str) – Reference to Snowflake connection id
sql – the SQL code to be executed as a single string, or a list of str (sql statements), or a reference to a template file. Template references are recognized by str ending in ‘.sql’
parameters – (optional) the parameters to render the SQL query with.
warehouse (str | None) – name of warehouse (will overwrite any warehouse defined in the connection’s extra JSON)
database (str | None) – name of database (will overwrite database defined in connection)
schema (str | None) – name of schema (will overwrite schema defined in connection)
role (str | None) – name of role (will overwrite any role defined in connection’s extra JSON)
authenticator (str | None) – authenticator for Snowflake. ‘snowflake’ (default) to use the internal Snowflake authenticator ‘externalbrowser’ to authenticate using your web browser and Okta, ADFS or any other SAML 2.0-compliant identify provider (IdP) that has been defined for your account ‘https://<your_okta_account_name>.okta.com’ to authenticate through native Okta.
session_parameters (dict | None) – You can set session-level parameters at the time you connect to Snowflake
- Return Returns list of dictionaries in { ‘column’
‘value’, ‘column2’: ‘value2’ } form.
- class airflow.providers.snowflake.operators.snowflake.SnowflakeCheckOperator(*, sql, snowflake_conn_id='snowflake_default', parameters=None, autocommit=True, do_xcom_push=True, warehouse=None, database=None, role=None, schema=None, authenticator=None, session_parameters=None, **kwargs)[source]¶
Bases:
airflow.providers.common.sql.operators.sql.SQLCheckOperator
Performs a check against Snowflake.
The
SnowflakeCheckOperator
expects a sql query that will return a single row. Each value on that first row is evaluated using pythonbool
casting. If any of the values returnFalse
the check is failed and errors out.Note that Python bool casting evals the following as
False
:False
0
Empty string (
""
)Empty list (
[]
)Empty dictionary or set (
{}
)
Given a query like
SELECT COUNT(*) FROM foo
, it will fail only if the count== 0
. You can craft much more complex query that could, for instance, check that the table has the same number of rows as the source table upstream, or that the count of today’s partition is greater than yesterday’s partition, or that a set of metrics are less than 3 standard deviation for the 7 day average.This operator can be used as a data quality check in your pipeline, and depending on where you put it in your DAG, you have the choice to stop the critical path, preventing from publishing dubious data, or on the side and receive email alerts without stopping the progress of the DAG.
- Parameters
sql (str) – the SQL code to be executed as a single string, or a list of str (sql statements), or a reference to a template file. Template references are recognized by str ending in ‘.sql’
snowflake_conn_id (str) – Reference to Snowflake connection id
autocommit (bool) – if True, each command is automatically committed. (default value: True)
parameters (Iterable | Mapping[str, Any] | None) – (optional) the parameters to render the SQL query with.
warehouse (str | None) – name of warehouse (will overwrite any warehouse defined in the connection’s extra JSON)
database (str | None) – name of database (will overwrite database defined in connection)
schema (str | None) – name of schema (will overwrite schema defined in connection)
role (str | None) – name of role (will overwrite any role defined in connection’s extra JSON)
authenticator (str | None) – authenticator for Snowflake. ‘snowflake’ (default) to use the internal Snowflake authenticator ‘externalbrowser’ to authenticate using your web browser and Okta, ADFS or any other SAML 2.0-compliant identify provider (IdP) that has been defined for your account ‘https://<your_okta_account_name>.okta.com’ to authenticate through native Okta.
session_parameters (dict | None) – You can set session-level parameters at the time you connect to Snowflake
- class airflow.providers.snowflake.operators.snowflake.SnowflakeValueCheckOperator(*, sql, pass_value, tolerance=None, snowflake_conn_id='snowflake_default', parameters=None, autocommit=True, do_xcom_push=True, warehouse=None, database=None, role=None, schema=None, authenticator=None, session_parameters=None, **kwargs)[source]¶
Bases:
airflow.providers.common.sql.operators.sql.SQLValueCheckOperator
Performs a simple check using sql code against a specified value, within a certain level of tolerance.
- Parameters
sql (str) – the sql to be executed
pass_value (Any) – the value to check against
tolerance (Any) – (optional) the tolerance allowed to accept the query as passing
snowflake_conn_id (str) – Reference to Snowflake connection id
autocommit (bool) – if True, each command is automatically committed. (default value: True)
parameters (Iterable | Mapping[str, Any] | None) – (optional) the parameters to render the SQL query with.
warehouse (str | None) – name of warehouse (will overwrite any warehouse defined in the connection’s extra JSON)
database (str | None) – name of database (will overwrite database defined in connection)
schema (str | None) – name of schema (will overwrite schema defined in connection)
role (str | None) – name of role (will overwrite any role defined in connection’s extra JSON)
authenticator (str | None) – authenticator for Snowflake. ‘snowflake’ (default) to use the internal Snowflake authenticator ‘externalbrowser’ to authenticate using your web browser and Okta, ADFS or any other SAML 2.0-compliant identify provider (IdP) that has been defined for your account ‘https://<your_okta_account_name>.okta.com’ to authenticate through native Okta.
session_parameters (dict | None) – You can set session-level parameters at the time you connect to Snowflake
- class airflow.providers.snowflake.operators.snowflake.SnowflakeIntervalCheckOperator(*, table, metrics_thresholds, date_filter_column='ds', days_back=-7, snowflake_conn_id='snowflake_default', parameters=None, autocommit=True, do_xcom_push=True, warehouse=None, database=None, role=None, schema=None, authenticator=None, session_parameters=None, **kwargs)[source]¶
Bases:
airflow.providers.common.sql.operators.sql.SQLIntervalCheckOperator
Checks that the metrics given as SQL expressions are within tolerance of the ones from days_back before.
This method constructs a query like so
SELECT {metrics_threshold_dict_key} FROM {table} WHERE {date_filter_column}=<date>
- Parameters
table (str) – the table name
days_back (SupportsAbs[int]) – number of days between ds and the ds we want to check against. Defaults to 7 days
metrics_thresholds (dict) – a dictionary of ratios indexed by metrics, for example ‘COUNT(*)’: 1.5 would require a 50 percent or less difference between the current day, and the prior days_back.
snowflake_conn_id (str) – Reference to Snowflake connection id
autocommit (bool) – if True, each command is automatically committed. (default value: True)
parameters (Iterable | Mapping[str, Any] | None) – (optional) the parameters to render the SQL query with.
warehouse (str | None) – name of warehouse (will overwrite any warehouse defined in the connection’s extra JSON)
database (str | None) – name of database (will overwrite database defined in connection)
schema (str | None) – name of schema (will overwrite schema defined in connection)
role (str | None) – name of role (will overwrite any role defined in connection’s extra JSON)
authenticator (str | None) – authenticator for Snowflake. ‘snowflake’ (default) to use the internal Snowflake authenticator ‘externalbrowser’ to authenticate using your web browser and Okta, ADFS or any other SAML 2.0-compliant identify provider (IdP) that has been defined for your account ‘https://<your_okta_account_name>.okta.com’ to authenticate through native Okta.
session_parameters (dict | None) – You can set session-level parameters at the time you connect to Snowflake
- class airflow.providers.snowflake.operators.snowflake.SnowflakeSqlApiOperator(*, snowflake_conn_id='snowflake_default', warehouse=None, database=None, role=None, schema=None, authenticator=None, session_parameters=None, poll_interval=5, statement_count=0, token_life_time=LIFETIME, token_renewal_delta=RENEWAL_DELTA, bindings=None, deferrable=conf.getboolean('operators', 'default_deferrable', fallback=False), **kwargs)[source]¶
Bases:
airflow.providers.common.sql.operators.sql.SQLExecuteQueryOperator
Implemented Snowflake SQL API Operator to support multiple SQL statements sequentially, which is the behavior of the SnowflakeOperator, the Snowflake SQL API allows submitting multiple SQL statements in a single request. It make post request to submit SQL statements for execution, poll to check the status of the execution of a statement. Fetch query results concurrently. This Operator currently uses key pair authentication, so you need to provide private key raw content or private key file path in the snowflake connection along with other details
- Where can this operator fit in?
To execute multiple SQL statements in a single request
To execute the SQL statement asynchronously and to execute standard queries and most DDL and DML statements
To develop custom applications and integrations that perform queries
To create provision users and roles, create table, etc.
- The following commands are not supported:
The PUT command (in Snowflake SQL)
The GET command (in Snowflake SQL)
The CALL command with stored procedures that return a table(stored procedures with the RETURNS TABLE clause).
- Parameters
snowflake_conn_id (str) – Reference to Snowflake connection id
sql – the sql code to be executed. (templated)
autocommit – if True, each command is automatically committed. (default value: True)
parameters – (optional) the parameters to render the SQL query with.
warehouse (str | None) – name of warehouse (will overwrite any warehouse defined in the connection’s extra JSON)
database (str | None) – name of database (will overwrite database defined in connection)
schema (str | None) – name of schema (will overwrite schema defined in connection)
role (str | None) – name of role (will overwrite any role defined in connection’s extra JSON)
authenticator (str | None) – authenticator for Snowflake. ‘snowflake’ (default) to use the internal Snowflake authenticator ‘externalbrowser’ to authenticate using your web browser and Okta, ADFS or any other SAML 2.0-compliant identify provider (IdP) that has been defined for your account ‘https://<your_okta_account_name>.okta.com’ to authenticate through native Okta.
session_parameters (dict[str, Any] | None) – You can set session-level parameters at the time you connect to Snowflake
poll_interval (int) – the interval in seconds to poll the query
statement_count (int) – Number of SQL statement to be executed
token_life_time (datetime.timedelta) – lifetime of the JWT Token
token_renewal_delta (datetime.timedelta) – Renewal time of the JWT Token
bindings (dict[str, Any] | None) – (Optional) Values of bind variables in the SQL statement. When executing the statement, Snowflake replaces placeholders (? and :name) in the statement with these specified values.
deferrable (bool) – Run operator in the deferrable mode.