airflow.providers.apache.spark.operators.spark_sql
¶
Module Contents¶
Classes¶
Execute Spark SQL query |
- class airflow.providers.apache.spark.operators.spark_sql.SparkSqlOperator(*, sql, conf=None, conn_id='spark_sql_default', total_executor_cores=None, executor_cores=None, executor_memory=None, keytab=None, principal=None, master=None, name='default-name', num_executors=None, verbose=True, yarn_queue=None, **kwargs)[source]¶
Bases:
airflow.models.BaseOperator
Execute Spark SQL query
See also
For more information on how to use this operator, take a look at the guide: SparkSqlOperator
- Parameters
sql (str) -- The SQL query to execute. (templated)
conf (Optional[str]) -- arbitrary Spark configuration property
conn_id (str) -- connection_id string
total_executor_cores (Optional[int]) -- (Standalone & Mesos only) Total cores for all executors (Default: all the available cores on the worker)
executor_cores (Optional[int]) -- (Standalone & YARN only) Number of cores per executor (Default: 2)
executor_memory (Optional[str]) -- Memory per executor (e.g. 1000M, 2G) (Default: 1G)
keytab (Optional[str]) -- Full path to the file that contains the keytab
master (Optional[str]) -- spark://host:port, mesos://host:port, yarn, or local (Default: The
host
andport
set in the Connection, or"yarn"
)name (str) -- Name of the job
num_executors (Optional[int]) -- Number of executors to launch
verbose (bool) -- Whether to pass the verbose flag to spark-sql
yarn_queue (Optional[str]) -- The YARN queue to submit to (Default: The
queue
value set in the Connection, or"default"
)