airflow.providers.google.cloud.hooks.mlengine

This module contains a Google ML Engine Hook.

Module Contents

Classes

MLEngineHook

Hook for Google ML Engine APIs.

MLEngineAsyncHook

Uses gcloud-aio library to retrieve Job details

Attributes

log

airflow.providers.google.cloud.hooks.mlengine.log[source]
class airflow.providers.google.cloud.hooks.mlengine.MLEngineHook(**kwargs)[source]

Bases: airflow.providers.google.common.hooks.base_google.GoogleBaseHook

Hook for Google ML Engine APIs.

All the methods in the hook where project_id is used must be called with keyword arguments rather than positional.

get_conn()[source]

Retrieves the connection to MLEngine.

Returns

Google MLEngine services object.

Return type

googleapiclient.discovery.Resource

create_job(job, project_id, use_existing_job_fn=None)[source]

Launches a MLEngine job and wait for it to reach a terminal state.

Parameters
  • project_id (str) – The Google Cloud project id within which MLEngine job will be launched. If set to None or missing, the default project_id from the Google Cloud connection is used.

  • job (dict) –

    MLEngine Job object that should be provided to the MLEngine API, such as:

    {
      'jobId': 'my_job_id',
      'trainingInput': {
        'scaleTier': 'STANDARD_1',
        ...
      }
    }
    

  • use_existing_job_fn (Callable | None) – In case that a MLEngine job with the same job_id already exist, this method (if provided) will decide whether we should use this existing job, continue waiting for it to finish and returning the job object. It should accepts a MLEngine job object, and returns a boolean value indicating whether it is OK to reuse the existing job. If ‘use_existing_job_fn’ is not provided, we by default reuse the existing MLEngine job.

Returns

The MLEngine job object if the job successfully reach a terminal state (which might be FAILED or CANCELLED state).

Return type

dict

create_job_without_waiting_result(body, project_id)[source]

Launches a MLEngine job and wait for it to reach a terminal state.

Parameters
  • project_id (str) – The Google Cloud project id within which MLEngine job will be launched. If set to None or missing, the default project_id from the Google Cloud connection is used.

  • body (dict) –

    MLEngine Job object that should be provided to the MLEngine API, such as:

    {
      'jobId': 'my_job_id',
      'trainingInput': {
        'scaleTier': 'STANDARD_1',
        ...
      }
    }
    

Returns

The MLEngine job_id of the object if the job successfully reach a terminal state (which might be FAILED or CANCELLED state).

cancel_job(job_id, project_id)[source]

Cancels a MLEngine job.

Parameters
  • project_id (str) – The Google Cloud project id within which MLEngine job will be cancelled. If set to None or missing, the default project_id from the Google Cloud connection is used.

  • job_id (str) – A unique id for the want-to-be cancelled Google MLEngine training job.

Returns

Empty dict if cancelled successfully

Raises

googleapiclient.errors.HttpError

Return type

dict

get_job(project_id, job_id)[source]

Gets a MLEngine job based on the job id.

Parameters
  • project_id (str) – The project in which the Job is located. If set to None or missing, the default project_id from the Google Cloud connection is used. (templated)

  • job_id (str) – A unique id for the Google MLEngine job. (templated)

Returns

MLEngine job object if succeed.

Raises

googleapiclient.errors.HttpError

Return type

dict

create_version(model_name, version_spec, project_id)[source]

Creates the Version on Google Cloud ML Engine.

Parameters
  • version_spec (dict) – A dictionary containing the information about the version. (templated)

  • model_name (str) – The name of the Google Cloud ML Engine model that the version belongs to. (templated)

  • project_id (str) – The Google Cloud project name to which MLEngine model belongs. If set to None or missing, the default project_id from the Google Cloud connection is used. (templated)

Returns

If the version was created successfully, returns the operation. Otherwise raises an error .

Return type

dict

set_default_version(model_name, version_name, project_id)[source]

Sets a version to be the default. Blocks until finished.

Parameters
  • model_name (str) – The name of the Google Cloud ML Engine model that the version belongs to. (templated)

  • version_name (str) – A name to use for the version being operated upon. (templated)

  • project_id (str) – The Google Cloud project name to which MLEngine model belongs. If set to None or missing, the default project_id from the Google Cloud connection is used. (templated)

Returns

If successful, return an instance of Version. Otherwise raises an error.

Raises

googleapiclient.errors.HttpError

Return type

dict

list_versions(model_name, project_id)[source]

Lists all available versions of a model. Blocks until finished.

Parameters
  • model_name (str) – The name of the Google Cloud ML Engine model that the version belongs to. (templated)

  • project_id (str) – The Google Cloud project name to which MLEngine model belongs. If set to None or missing, the default project_id from the Google Cloud connection is used. (templated)

Returns

return an list of instance of Version.

Raises

googleapiclient.errors.HttpError

Return type

list[dict]

delete_version(model_name, version_name, project_id)[source]

Deletes the given version of a model. Blocks until finished.

Parameters
  • model_name (str) – The name of the Google Cloud ML Engine model that the version belongs to. (templated)

  • project_id (str) – The Google Cloud project name to which MLEngine model belongs.

Returns

If the version was deleted successfully, returns the operation. Otherwise raises an error.

Return type

dict

create_model(model, project_id)[source]

Create a Model. Blocks until finished.

Parameters
  • model (dict) – A dictionary containing the information about the model.

  • project_id (str) – The Google Cloud project name to which MLEngine model belongs. If set to None or missing, the default project_id from the Google Cloud connection is used. (templated)

Returns

If the version was created successfully, returns the instance of Model. Otherwise raises an error.

Raises

googleapiclient.errors.HttpError

Return type

dict

get_model(model_name, project_id)[source]

Gets a Model. Blocks until finished.

Parameters
  • model_name (str) – The name of the model.

  • project_id (str) – The Google Cloud project name to which MLEngine model belongs. If set to None or missing, the default project_id from the Google Cloud connection is used. (templated)

Returns

If the model exists, returns the instance of Model. Otherwise return None.

Raises

googleapiclient.errors.HttpError

Return type

dict | None

delete_model(model_name, project_id, delete_contents=False)[source]

Delete a Model. Blocks until finished.

Parameters
  • model_name (str) – The name of the model.

  • delete_contents (bool) – Whether to force the deletion even if the models is not empty. Will delete all version (if any) in the dataset if set to True. The default value is False.

  • project_id (str) – The Google Cloud project name to which MLEngine model belongs. If set to None or missing, the default project_id from the Google Cloud connection is used. (templated)

Raises

googleapiclient.errors.HttpError

class airflow.providers.google.cloud.hooks.mlengine.MLEngineAsyncHook(**kwargs)[source]

Bases: airflow.providers.google.common.hooks.base_google.GoogleBaseAsyncHook

Uses gcloud-aio library to retrieve Job details

sync_hook_class[source]
async get_job(job_id, session, project_id=None)[source]

Get the specified job resource by job ID and project ID.

async get_job_status(job_id, project_id=None)[source]

Polls for job status asynchronously using gcloud-aio.

Note that an OSError is raised when Job results are still pending. Exception means that Job finished with errors

Was this entry helpful?