:mod:`airflow.contrib.hooks.gcp_dataproc_hook` ============================================== .. py:module:: airflow.contrib.hooks.gcp_dataproc_hook Module Contents --------------- .. py:class:: _DataProcJob(dataproc_api, project_id, job, region='global', job_error_states=None, num_retries=None) Bases: :class:`airflow.utils.log.logging_mixin.LoggingMixin` .. method:: wait_for_done(self) .. method:: raise_error(self, message=None) .. method:: get(self) .. py:class:: _DataProcJobBuilder(project_id, task_id, cluster_name, job_type, properties) .. method:: add_variables(self, variables) .. method:: add_args(self, args) .. method:: add_query(self, query) .. method:: add_query_uri(self, query_uri) .. method:: add_jar_file_uris(self, jars) .. method:: add_archive_uris(self, archives) .. method:: add_file_uris(self, files) .. method:: add_python_file_uris(self, pyfiles) .. method:: set_main(self, main_jar, main_class) .. method:: set_python_main(self, main) .. method:: set_job_name(self, name) .. method:: build(self) .. py:class:: _DataProcOperation(dataproc_api, operation, num_retries) Bases: :class:`airflow.utils.log.logging_mixin.LoggingMixin` Continuously polls Dataproc Operation until it completes. .. method:: wait_for_done(self) .. method:: get(self) .. method:: _check_done(self) .. method:: _raise_error(self) .. py:class:: DataProcHook(gcp_conn_id='google_cloud_default', delegate_to=None, api_version='v1beta2') Bases: :class:`airflow.contrib.hooks.gcp_api_base_hook.GoogleCloudBaseHook` Hook for Google Cloud Dataproc APIs. .. method:: get_conn(self) Returns a Google Cloud Dataproc service object. .. method:: get_cluster(self, project_id, region, cluster_name) .. method:: submit(self, project_id, job, region='global', job_error_states=None) .. method:: create_job_template(self, task_id, cluster_name, job_type, properties) .. method:: wait(self, operation) Awaits for Google Cloud Dataproc Operation to complete. .. method:: cancel(self, project_id, job_id, region='global') Cancel a Google Cloud DataProc job. :param project_id: Name of the project the job belongs to :type project_id: str :param job_id: Identifier of the job to cancel :type job_id: int :param region: Region used for the job :type region: str :returns A Job json dictionary representing the canceled job