airflow.providers.http.sensors.http

Module Contents

Classes

HttpSensor

Execute an HTTP GET request; return False on a 404 Not Found response or when response_check returns False.

class airflow.providers.http.sensors.http.HttpSensor(*, endpoint, http_conn_id='http_default', method='GET', request_params=None, request_kwargs=None, headers=None, response_error_codes_allowlist=None, response_check=None, extra_options=None, tcp_keep_alive=True, tcp_keep_alive_idle=120, tcp_keep_alive_count=20, tcp_keep_alive_interval=30, deferrable=conf.getboolean('operators', 'default_deferrable', fallback=False), **kwargs)[source]

Bases: airflow.sensors.base.BaseSensorOperator

Execute an HTTP GET request; return False on a 404 Not Found response or when response_check returns False.

HTTP error codes other than 404 (such as 403), or a Connection Refused error, raise an exception and fail the sensor immediately, with no further poking. To avoid failing the task on codes other than 404, pass response_error_codes_allowlist with the list of allowed error status codes, for example ["404", "503"]. To skip the error status code check entirely, pass extra_options with the value {'check_response': False}; response_check is then executed for any HTTP status code.
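The rules above can be summarised as a small decision function. This is only a simplified model of a single poke, not the provider's implementation: poke_outcome and its parameters are hypothetical names, RuntimeError stands in for whatever exception the sensor actually raises, and response_check receives the bare status code here rather than a response object.

```python
def poke_outcome(status_code, allowlist=None, check_response=True, response_check=None):
    """Hypothetical model of one poke; not the provider's code."""
    # None falls back to ["404"]; an explicit empty list allows nothing.
    allowed = ["404"] if allowlist is None else allowlist

    if check_response and status_code >= 400:
        if str(status_code) in allowed:
            return False  # allowlisted error: keep poking
        # Any other error code fails the sensor outright.
        raise RuntimeError(f"HTTP {status_code} fails the sensor")

    # Reached on success, or on any status when check_response is False.
    if response_check is not None:
        return bool(response_check(status_code))
    return True
```

Under this model, poke_outcome(404) returns False, poke_outcome(403) raises, and poke_outcome(404, allowlist=[]) raises as well.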

The response check can access the operator's template context:

def response_check(response, task_instance):
    # The task_instance is injected, so you can pull data from XCom.
    # Other context variables such as dag, ds, execution_date are also available.
    xcom_data = task_instance.xcom_pull(task_ids="pushing_task")
    # In practice you would do something more sensible with this data.
    print(xcom_data)
    return True


HttpSensor(task_id="my_http_sensor", ..., response_check=response_check)
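A response check will often inspect the response body rather than XCom data. The sketch below assumes a JSON payload with hypothetical "status" and "date" keys; payload_is_ready is an illustrative name. It asks for the ds context variable by naming it as a keyword argument, and uses a small stand-in object in place of a real requests response so it can be run without a live endpoint.

```python
from types import SimpleNamespace


def payload_is_ready(response, ds=None):
    # `response` is expected to behave like a requests.Response object.
    body = response.json()
    # Hypothetical payload shape: {"status": ..., "date": ...}.
    if body.get("status") != "ready":
        return False
    # When a logical date is supplied, require the payload to match it.
    return ds is None or body.get("date") == ds


# Stand-in for a real response, for local experimentation only.
fake_response = SimpleNamespace(json=lambda: {"status": "ready", "date": "2024-01-01"})
print(payload_is_ready(fake_response, ds="2024-01-01"))
```

A function like this would be passed as response_check=payload_is_ready when constructing the sensor.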

See also

For more information on how to use this operator, take a look at the guide: HttpSensor

Parameters
  • http_conn_id (str) – The http connection to run the sensor against

  • method (str) – The HTTP request method to use

  • endpoint (str) – The relative part of the full url

  • request_params (dict[str, Any] | None) – The parameters to be added to the GET url

  • headers (dict[str, Any] | None) – The HTTP headers to be added to the GET request

  • response_error_codes_allowlist (list[str] | None) – Error status codes for which poke() returns False instead of raising an exception. If None is passed, it defaults to ["404"] for backward compatibility. To make 404 Not Found raise an error as well, explicitly pass an empty list [].

  • response_check (Callable[Ellipsis, bool] | None) – A check against the ‘requests’ response object. The callable takes the response object as the first positional argument and optionally any number of keyword arguments available in the context dictionary. It should return True for ‘pass’ and False otherwise.

  • extra_options (dict[str, Any] | None) – Extra options for the ‘requests’ library, see the ‘requests’ documentation (options to modify timeout, ssl, etc.)

  • tcp_keep_alive (bool) – Enable TCP Keep Alive for the connection.

  • tcp_keep_alive_idle (int) – The TCP Keep Alive Idle parameter (corresponds to socket.TCP_KEEPIDLE).

  • tcp_keep_alive_count (int) – The TCP Keep Alive count parameter (corresponds to socket.TCP_KEEPCNT)

  • tcp_keep_alive_interval (int) – The TCP Keep Alive interval parameter (corresponds to socket.TCP_KEEPINTVL)

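  • deferrable (bool) – Whether to defer the task while waiting for the condition to be met, instead of occupying a worker slot. Defaults to the default_deferrable option in the operators config section, falling back to False.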
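As an illustration of how these parameters fit together, the sketch below collects a set of keyword arguments in a plain dictionary. Every value is hypothetical (task id, endpoint, header, timeout). Inside a DAG the dictionary would be unpacked into the constructor as HttpSensor(**sensor_kwargs); the "{{ ds }}" value relies on request_params being a templated field.

```python
# Hypothetical values throughout; adjust to your own connection and API.
sensor_kwargs = {
    "task_id": "wait_for_report",
    "http_conn_id": "http_default",
    "endpoint": "reports/daily",
    # Rendered by Jinja at run time because request_params is templated.
    "request_params": {"date": "{{ ds }}"},
    "headers": {"Accept": "application/json"},
    # Keep poking on 404 and 503 instead of failing the task.
    "response_error_codes_allowlist": ["404", "503"],
    # Forwarded to the requests library.
    "extra_options": {"timeout": 10},
}

# In a DAG file (requires the HTTP provider to be installed):
# from airflow.providers.http.sensors.http import HttpSensor
# sensor = HttpSensor(**sensor_kwargs)
```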

template_fields: Sequence[str] = ('endpoint', 'request_params', 'headers')[source]

poke(context)[source]

Override when deriving this class.

execute(context)[source]

Derive when creating an operator.

Context is the same dictionary used as when rendering jinja templates.

Refer to get_template_context for more context.

execute_complete(context, event=None)[source]
