airflow.providers.apache.spark.hooks.spark_sql
¶
Module Contents¶
-
class
airflow.providers.apache.spark.hooks.spark_sql.
SparkSqlHook
(sql: str, conf: Optional[str] = None, conn_id: str = default_conn_name, total_executor_cores: Optional[int] = None, executor_cores: Optional[int] = None, executor_memory: Optional[str] = None, keytab: Optional[str] = None, principal: Optional[str] = None, master: Optional[str] = None, name: str = 'default-name', num_executors: Optional[int] = None, verbose: bool = True, yarn_queue: Optional[str] = None)[source]¶ Bases:
airflow.hooks.base.BaseHook
This hook is a wrapper around the spark-sql binary. It requires that the "spark-sql" binary is in the PATH.
- Parameters
sql (str) -- The SQL query to execute
conf (str (format: PROP=VALUE)) -- arbitrary Spark configuration property
conn_id (str) -- connection_id string
total_executor_cores (int) -- (Standalone & Mesos only) Total cores for all executors (Default: all the available cores on the worker)
executor_cores (int) -- (Standalone & YARN only) Number of cores per executor (Default: 2)
executor_memory (str) -- Memory per executor (e.g. 1000M, 2G) (Default: 1G)
keytab (str) -- Full path to the file that contains the keytab
master (str) -- spark://host:port, mesos://host:port, yarn, or local (Default: The
host
andport
set in the Connection, or"yarn"
)name (str) -- Name of the job.
num_executors (int) -- Number of executors to launch
verbose (bool) -- Whether to pass the verbose flag to spark-sql
yarn_queue (str) -- The YARN queue to submit to (Default: The
queue
value set in the Connection, or"default"
)