:mod:`airflow.contrib.hooks.spark_sql_hook` =========================================== .. py:module:: airflow.contrib.hooks.spark_sql_hook Module Contents --------------- .. py:class:: SparkSqlHook(sql, conf=None, conn_id='spark_sql_default', total_executor_cores=None, executor_cores=None, executor_memory=None, keytab=None, principal=None, master='yarn', name='default-name', num_executors=None, verbose=True, yarn_queue='default') Bases::class:`airflow.hooks.base_hook.BaseHook` This hook is a wrapper around the spark-sql binary. It requires that the "spark-sql" binary is in the PATH. :param sql: The SQL query to execute :type sql: str :param conf: arbitrary Spark configuration property :type conf: str (format: PROP=VALUE) :param conn_id: connection_id string :type conn_id: str :param total_executor_cores: (Standalone & Mesos only) Total cores for all executors (Default: all the available cores on the worker) :type total_executor_cores: int :param executor_cores: (Standalone & YARN only) Number of cores per executor (Default: 2) :type executor_cores: int :param executor_memory: Memory per executor (e.g. 1000M, 2G) (Default: 1G) :type executor_memory: str :param keytab: Full path to the file that contains the keytab :type keytab: str :param master: spark://host:port, mesos://host:port, yarn, or local :type master: str :param name: Name of the job. :type name: str :param num_executors: Number of executors to launch :type num_executors: int :param verbose: Whether to pass the verbose flag to spark-sql :type verbose: bool :param yarn_queue: The YARN queue to submit to (Default: "default") :type yarn_queue: str .. method:: get_conn(self) .. method:: _prepare_command(self, cmd) Construct the spark-sql command to execute. Verbose output is enabled as default. :param cmd: command to append to the spark-sql command :type cmd: str :return: full command to be executed .. method:: run_query(self, cmd='', **kwargs) Remote Popen (actually execute the Spark-sql query) :param cmd: command to remotely execute :param kwargs: extra arguments to Popen (see subprocess.Popen) .. method:: kill(self)