airflow.providers.common.ai.example_dags.example_langchain_tool_agent

ReAct tool-calling agent with LangChain – research and report pipeline.

Demonstrates the “agent as a task” pattern using a LangChain ReAct agent that autonomously decides which tools to call, composed with common.ai’s LLMOperator for report formatting and AIP-90 HITL operators for human review.

Unlike RAG examples (fixed pipeline: retrieve then synthesize), this agent’s tool-call sequence is determined by the LLM at runtime. The agent might call zero tools or ten tools depending on the question. This is the canonical “agent as a task” pattern: Airflow handles scheduling, retry, connections, and the surrounding workflow; the LangChain agent handles internal reasoning.

example_langchain_tool_agent (manual trigger):

prompt_review (HITLEntryOperator)
    -> prepare_tools (@task)
    -> run_research_agent (@task)
    -> format_report (LLMOperator)
    -> report_approval (ApprovalOperator)

What the DAG makes visible that running the agent on its own would hide:

The question goes through human review before the agent runs.
The agent’s raw findings are a visible XCom value between tasks.
Report formatting is a separate, independently retryable LLM call.
The formatted report requires human approval before delivery.

Contrast with AIP-99’s AgentOperator:

AIP-99’s AgentOperator uses PydanticAI for agent execution. This example instead uses LangChain’s create_agent with LangChain-native @tool definitions, so users with existing LangChain tools (700+ integrations) can use them directly without rewriting them as PydanticAI tools.

Before running:

Install LangChain packages:

pip install langchain langchain-openai langchain-text-splitters \
            langchain-community faiss-cpu

Create an LLM connection of type langchain named langchain_default (or the value of LLM_CONN_ID below) for your chosen model provider. Set password to your API key; the host field is optional and only needed for custom endpoints or Ollama.
Optionally place a knowledge base directory at DOCS_PATH and a survey CSV at SURVEY_CSV_PATH. If DOCS_PATH is empty, sample documents about Apache Airflow are created automatically.
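
One way to create the connection described above is through an environment variable, which Airflow reads when the variable is named AIRFLOW_CONN_<CONN_ID> and holds a JSON document. The sketch below builds that value with the standard library only. The helper name build_langchain_conn_env and the placeholder key are hypothetical; the connection type, the default connection id, and the use of password and host come from the steps above.

```python
import json
import os


def build_langchain_conn_env(api_key, host=None, conn_id="langchain_default"):
    """Return (env_var_name, json_value) for an Airflow connection.

    Hypothetical helper for illustration; it is not part of the provider.
    """
    conn = {"conn_type": "langchain", "password": api_key}
    if host:
        # Only needed for custom endpoints or Ollama.
        conn["host"] = host
    return f"AIRFLOW_CONN_{conn_id.upper()}", json.dumps(conn)


# Placeholder key: replace with the real API key for your model provider.
name, value = build_langchain_conn_env("sk-placeholder")
os.environ[name] = value
print(name)
```

The variable has to be set in the environment of the Airflow components that run the tasks. Creating the same connection through the UI or the CLI works equally well.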

Attributes

`LLM_CONN_ID`
`LLM_MODEL`
`EMBEDDING_MODEL`
`DOCS_PATH`
`SURVEY_CSV_PATH`
`INDEX_PERSIST_DIR`
`DEFAULT_QUESTION`
`SAMPLE_DOCUMENTS`
`REPORT_SYSTEM_PROMPT`

Functions

example_langchain_tool_agent()

Research agent with LangChain tools and human review.

Module Contents

airflow.providers.common.ai.example_dags.example_langchain_tool_agent.LLM_CONN_ID = 'langchain_default'

airflow.providers.common.ai.example_dags.example_langchain_tool_agent.LLM_MODEL

airflow.providers.common.ai.example_dags.example_langchain_tool_agent.EMBEDDING_MODEL

airflow.providers.common.ai.example_dags.example_langchain_tool_agent.DOCS_PATH

airflow.providers.common.ai.example_dags.example_langchain_tool_agent.SURVEY_CSV_PATH

airflow.providers.common.ai.example_dags.example_langchain_tool_agent.INDEX_PERSIST_DIR

airflow.providers.common.ai.example_dags.example_langchain_tool_agent.DEFAULT_QUESTION = 'What percentage of Airflow users are on Kubernetes? Also check what the documentation says...

airflow.providers.common.ai.example_dags.example_langchain_tool_agent.SAMPLE_DOCUMENTS

airflow.providers.common.ai.example_dags.example_langchain_tool_agent.REPORT_SYSTEM_PROMPT = 'You are a technical report writer. Format the research findings into a clear, well-structured...

airflow.providers.common.ai.example_dags.example_langchain_tool_agent.example_langchain_tool_agent()

Research agent with LangChain tools and human review.

Task graph:

prompt_review (HITLEntryOperator)
    -> prepare_tools (@task)
    -> run_research_agent (@task)
    -> format_report (LLMOperator)
    -> report_approval (ApprovalOperator)

The agent uses LangChain’s create_agent with a ReAct reasoning loop. It autonomously decides which tools to call – knowledge base search, survey data query, web search, or current-time lookup – based on the user’s question. The number and sequence of tool calls are determined by the LLM at runtime.
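
To make the runtime-determined tool sequence concrete, here is a minimal, dependency-free sketch of a ReAct-style loop. It is not LangChain's implementation and does not use create_agent. The tool stubs, the scripted_model stand-in for the LLM, and the max_steps limit are all assumptions made for illustration.

```python
from datetime import datetime, timezone


def search_knowledge_base(query):
    # Stub: a real tool would query a vector index built from DOCS_PATH.
    return f"(stub) documents matching {query!r}"


def current_time(_=""):
    return datetime.now(timezone.utc).isoformat()


# Hypothetical tool registry: name -> callable taking one string argument.
TOOLS = {
    "search_knowledge_base": search_knowledge_base,
    "current_time": current_time,
}


def scripted_model(question, observations):
    """Stand-in for the LLM: decide the next step from what is known so far.

    Returns ("call", tool_name, argument) or ("final", answer).
    """
    if not observations:
        return ("call", "search_knowledge_base", question)
    return ("final", f"Answer based on {len(observations)} observation(s)")


def run_agent(question, model, tools, max_steps=10):
    """Alternate between model decisions and tool calls until a final answer."""
    observations = []
    for _ in range(max_steps):
        decision = model(question, observations)
        if decision[0] == "final":
            return decision[1], observations
        _, tool_name, argument = decision
        observations.append((tool_name, tools[tool_name](argument)))
    raise RuntimeError("agent exceeded max_steps without a final answer")


answer, trace = run_agent("How is Airflow deployed?", scripted_model, TOOLS)
print(answer)
print([name for name, _ in trace])
```

The loop itself never fixes the order or number of tool calls; that comes entirely from what the model returns at each step. A model that answers immediately produces an empty trace, which corresponds to the zero-tool case.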

The surrounding Airflow DAG provides what the agent cannot: human review of the question (HITLEntryOperator), formatted report generation (LLMOperator), and human approval of the final output (ApprovalOperator).