airflow.providers.common.ai.example_dags.example_langchain_tool_agent

ReAct tool-calling agent with LangChain – research and report pipeline.

Demonstrates the “agent as a task” pattern using a LangChain ReAct agent that autonomously decides which tools to call, composed with common.ai’s LLMOperator for report formatting and AIP-90 HITL operators for human review.

Unlike RAG examples (fixed pipeline: retrieve then synthesize), this agent’s tool-call sequence is determined by the LLM at runtime. The agent might call zero tools or ten tools depending on the question. This is the canonical “agent as a task” pattern: Airflow handles scheduling, retry, connections, and the surrounding workflow; the LangChain agent handles internal reasoning.

example_langchain_tool_agent (manual trigger):

prompt_review (HITLEntryOperator)
    -> prepare_tools (@task)
    -> run_research_agent (@task)
    -> format_report (LLMOperator)
    -> report_approval (ApprovalOperator)

What the DAG makes visible that running the agent alone would hide:

  • The question goes through human review before the agent runs.

  • The agent’s raw findings are a visible XCom value between tasks.

  • Report formatting is a separate, independently retryable LLM call.

  • The formatted report requires human approval before delivery.
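The points above can be illustrated with a minimal plain-Python sketch. It deliberately imports neither Airflow nor LangChain: `run_pipeline` and the lambda stand-ins are hypothetical, the dictionary keys simply reuse the task ids from the graph, and `prepare_tools` is omitted for brevity. The point is only that each stage's output is kept as its own inspectable value, roughly the way XCom exposes it between tasks.

```python
from typing import Callable, Dict


def run_pipeline(
    question: str,
    agent: Callable[[str], str],
    formatter: Callable[[str], str],
    approve: Callable[[str], bool],
) -> Dict[str, object]:
    """Run the stages in order and keep every intermediate value."""
    # Stand-in for prompt_review: a human could edit the question here.
    reviewed = question.strip()
    # The agent's raw findings are captured on their own, before formatting.
    findings = agent(reviewed)
    # Formatting is a separate call, so it could be retried without
    # re-running the agent.
    report = formatter(findings)
    # Stand-in for report_approval: nothing is delivered unless this is True.
    approved = approve(report)
    return {
        "prompt_review": reviewed,
        "run_research_agent": findings,
        "format_report": report,
        "report_approval": approved,
    }


# Hypothetical stand-ins for the agent, the formatter, and the reviewer.
results = run_pipeline(
    "  How many users run on Kubernetes?  ",
    agent=lambda q: f"raw findings for: {q}",
    formatter=lambda findings: findings.upper(),
    approve=lambda report: bool(report),
)

for task_id, value in results.items():
    print(f"{task_id}: {value!r}")
```

In the real DAG each of these values would be produced by a separate task, so a failure in formatting or approval does not discard the agent's findings.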

Contrast with AIP-99’s AgentOperator:

AIP-99’s AgentOperator uses PydanticAI for agent execution. This example instead uses LangChain’s create_agent with LangChain-native @tool definitions, so users with existing LangChain tools (700+ integrations) can use them directly without rewriting them as PydanticAI tools.

Before running:

  1. Install LangChain packages:

    pip install langchain langchain-openai langchain-text-splitters \
                langchain-community faiss-cpu
    
  2. Create an LLM connection of type langchain named langchain_default (or the value of LLM_CONN_ID below) for your chosen model provider. Set password to your API key; the host field is optional and only needed for custom endpoints / Ollama.

  3. Optionally place a knowledge base directory at DOCS_PATH and a survey CSV at SURVEY_CSV_PATH. If DOCS_PATH is empty, sample documents about Apache Airflow are created automatically.
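One way to satisfy step 2 without the UI is to define the connection through an environment variable. The sketch below builds the variable name and a JSON value using only the standard library. It assumes your Airflow version accepts JSON-serialized connections in `AIRFLOW_CONN_<CONN_ID>` variables; `connection_env` is a hypothetical helper, and the API key and host shown are placeholders.

```python
import json
import os
from typing import Optional, Tuple

LLM_CONN_ID = "langchain_default"  # same value as the module attribute


def connection_env(
    conn_id: str, api_key: str, host: Optional[str] = None
) -> Tuple[str, str]:
    """Return the (variable name, JSON value) pair describing the connection."""
    payload = {"conn_type": "langchain", "password": api_key}
    if host:
        # Only needed for custom endpoints or Ollama, per step 2.
        payload["host"] = host
    return f"AIRFLOW_CONN_{conn_id.upper()}", json.dumps(payload)


# Placeholder key; substitute your provider's real API key.
name, value = connection_env(LLM_CONN_ID, "sk-placeholder")
os.environ[name] = value
print(name, value)
```

The variable must be set in the environment of the Airflow processes that run the tasks, not only in an interactive shell.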

Attributes

LLM_CONN_ID

LLM_MODEL

EMBEDDING_MODEL

DOCS_PATH

SURVEY_CSV_PATH

INDEX_PERSIST_DIR

DEFAULT_QUESTION

SAMPLE_DOCUMENTS

REPORT_SYSTEM_PROMPT

Functions

example_langchain_tool_agent()

Research agent with LangChain tools and human review.

Module Contents

airflow.providers.common.ai.example_dags.example_langchain_tool_agent.LLM_CONN_ID = 'langchain_default'
airflow.providers.common.ai.example_dags.example_langchain_tool_agent.LLM_MODEL
airflow.providers.common.ai.example_dags.example_langchain_tool_agent.EMBEDDING_MODEL
airflow.providers.common.ai.example_dags.example_langchain_tool_agent.DOCS_PATH
airflow.providers.common.ai.example_dags.example_langchain_tool_agent.SURVEY_CSV_PATH
airflow.providers.common.ai.example_dags.example_langchain_tool_agent.INDEX_PERSIST_DIR
airflow.providers.common.ai.example_dags.example_langchain_tool_agent.DEFAULT_QUESTION = 'What percentage of Airflow users are on Kubernetes? Also check what the documentation says...
airflow.providers.common.ai.example_dags.example_langchain_tool_agent.SAMPLE_DOCUMENTS
airflow.providers.common.ai.example_dags.example_langchain_tool_agent.REPORT_SYSTEM_PROMPT = 'You are a technical report writer. Format the research findings into a clear, well-structured...
airflow.providers.common.ai.example_dags.example_langchain_tool_agent.example_langchain_tool_agent()

Research agent with LangChain tools and human review.

Task graph:

prompt_review (HITLEntryOperator)
    -> prepare_tools (@task)
    -> run_research_agent (@task)
    -> format_report (LLMOperator)
    -> report_approval (ApprovalOperator)

The agent uses LangChain’s create_agent with a ReAct reasoning loop. It autonomously decides which tools to call – knowledge base search, survey data query, web search, or current-time lookup – based on the user’s question. The number and sequence of tool calls are determined by the LLM at runtime.
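To make the runtime-determined tool sequence concrete, here is a minimal sketch of a ReAct-style loop in plain Python. It is not LangChain's implementation and does not call create_agent: `scripted_model` is a hypothetical stand-in for the LLM, and the four tool functions are stubs named after the tools listed above.

```python
from datetime import datetime, timezone

# Stub tools; the real DAG would back these with an index, a CSV, and so on.
TOOLS = {
    "knowledge_base_search": lambda q: f"[kb stub] {q}",
    "survey_query": lambda q: f"[survey stub] {q}",
    "web_search": lambda q: f"[web stub] {q}",
    "current_time": lambda _q: datetime.now(timezone.utc).isoformat(),
}


def scripted_model(question, observations):
    """Hypothetical LLM stand-in: choose the next action from what was seen."""
    called = {name for name, _ in observations}
    text = question.lower()
    if "survey" in text and "survey_query" not in called:
        return ("call", "survey_query", question)
    if "documentation" in text and "knowledge_base_search" not in called:
        return ("call", "knowledge_base_search", question)
    return ("final", None, f"answer based on {len(observations)} observation(s)")


def run_agent(question, model, tools, max_steps=10):
    """Loop: ask the model for an action, run the tool, feed the result back."""
    observations = []
    for _ in range(max_steps):
        kind, name, arg = model(question, observations)
        if kind == "final":
            return arg, [tool_name for tool_name, _ in observations]
        observations.append((name, tools[name](arg)))
    raise RuntimeError("agent did not finish within max_steps")


answer, calls = run_agent(
    "What does the survey say, and what does the documentation say?",
    scripted_model,
    TOOLS,
)
print(calls)

answer_none, calls_none = run_agent("Hello", scripted_model, TOOLS)
print(calls_none)
```

The same loop makes two tool calls for the first question and none for the second, which is the behavior the paragraph above describes: the caller does not fix the sequence in advance. The `max_steps` bound is an assumption of this sketch, added so a misbehaving model cannot loop forever.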

The surrounding Airflow DAG provides what the agent cannot: human review of the question (HITLEntryOperator), formatted report generation (LLMOperator), and human approval of the final output (ApprovalOperator).
