AI Agent and LLM Nodes
Build multi-step intelligent flows with n8n's AI Agent node connected to OpenAI, Anthropic Claude and Google Gemini.
In this chapter
Since 2024, the most powerful part of n8n has been its AI and LangChain node set. The AI Agent node turns an LLM (OpenAI, Claude, Gemini) into a real agent by combining a system prompt, tools and memory: instead of only answering, it calls tools and works through multi-step tasks. In this chapter you will learn the AI Agent's structure, model connections, tool calling, chat memory and structured output generation.
Topics
- AI Agent node: system prompt + tools + memory
- Connecting OpenAI, Anthropic and Gemini
- Tool calling: giving the agent capabilities
- Chat memory and conversation continuity
- Structured output with parsers (Zod / JSON schema)
- Multi-step reasoning and guardrails
AI Agent node: the building blocks
The AI Agent node is an orchestrator with three sub-connections: Chat Model (required: which LLM to use), Memory (optional: conversation continuity) and Tool (optional, but where the real value lies: the capabilities the agent can use). When you add an AI Agent node, three '+' icons appear below it, and you wire the matching node into each one. The System Message parameter is critical: it defines who the agent is, what it does and what it must not do.
Chat Models: OpenAI, Anthropic, Gemini, Ollama
n8n provides a Chat Model node for each major provider:
- OpenAI Chat Model (GPT-5, GPT-4.x): widest ecosystem, mature function calling.
- Anthropic Chat Model (Claude Opus/Sonnet/Haiku): long context, strong reasoning, native tool calling.
- Google Gemini Chat Model: strong multimodal support, fast, large context.
- Ollama Chat Model: runs local models (Llama, Qwen, Mistral) with zero API cost, which makes it a good fit for private data.
Switching providers is a small change: you don't rebuild the AI Agent, you just swap the Chat Model node.
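To make the wiring concrete, here is a minimal sketch that models the agent's building blocks as plain data. The type names (AgentConfig, ChatModelRef, AgentMemory, AgentTool), the function swapChatModel and the model identifiers are hypothetical illustrations, not n8n's internal API; the point is only that the chat model is one slot that can be replaced while the system message, memory and tools stay untouched.

```typescript
// Hypothetical types for illustration only; n8n does not expose this API.
type Provider = "openai" | "anthropic" | "gemini" | "ollama";

interface ChatModelRef { provider: Provider; model: string; }
interface AgentMemory { kind: string; sessionId: string; }
interface AgentTool { name: string; description: string; }

interface AgentConfig {
  systemMessage: string;
  chatModel: ChatModelRef; // required: which LLM answers
  memory?: AgentMemory;    // optional: conversation continuity
  tools: AgentTool[];      // optional in practice: an empty array means "no tools"
}

// Returns a copy of the agent in which only the chat model is replaced.
function swapChatModel(agent: AgentConfig, chatModel: ChatModelRef): AgentConfig {
  return { ...agent, chatModel: chatModel };
}

const supportAgent: AgentConfig = {
  systemMessage: "You are our company's support agent.",
  chatModel: { provider: "openai", model: "example-openai-model" }, // placeholder model name
  memory: { kind: "postgres", sessionId: "session-123" },
  tools: [{ name: "get_order", description: "get_order: looks up an order; args: order_id." }],
};

// Same system message, memory and tools, now pointed at a local model.
const localAgent = swapChatModel(supportAgent, { provider: "ollama", model: "example-local-model" });
```

In the editor the equivalent step is disconnecting one Chat Model node and connecting another; nothing else on the agent needs to change.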
Tool calling: giving the agent capabilities
What makes an LLM an agent is its ability to call tools. In n8n you can attach an HTTP Request Tool (calls any external API, e.g. 'fetch the weather at this URL'), a Code Tool (custom JavaScript) or a Custom n8n Workflow Tool (calls another workflow as a tool; the most powerful option). For each tool the 'description' field is critical, because the agent reads it to decide when to call the tool. Write it crisply: 'send_email: sends a notification to the user's email; args: email, subject, body.' With a vague description, the agent is likely to pick the wrong tool.
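The tool-calling loop described above can be sketched roughly as follows. This is a simplified illustration, not n8n's implementation: ToolDef, ModelStep, FakeModel and runAgent are hypothetical names, and the "model" is a scripted stand-in for an LLM so the example runs without any API.

```typescript
// Hypothetical shapes; n8n and the LLM providers define their own formats.
interface ToolDef {
  name: string;
  description: string; // what the model reads to decide when to call the tool
  run: (args: Record<string, string>) => string;
}

type ModelStep =
  | { type: "tool_call"; tool: string; args: Record<string, string> }
  | { type: "final"; text: string };

// Stand-in for the LLM: a real model picks the next step from the transcript and tool descriptions.
type FakeModel = (transcript: string[], tools: ToolDef[]) => ModelStep;

function runAgent(model: FakeModel, tools: ToolDef[], userMessage: string, maxSteps: number): string {
  const transcript: string[] = ["user: " + userMessage];
  for (let step = 0; step < maxSteps; step++) {
    const next = model(transcript, tools);
    if (next.type === "final") return next.text;
    const toolName = next.tool;
    const tool = tools.filter((t) => t.name === toolName)[0];
    const result = tool ? tool.run(next.args) : "error: unknown tool " + toolName;
    // The tool result goes back into the transcript so the model can use it on the next step.
    transcript.push("tool " + toolName + ": " + result);
  }
  return "Stopped: step limit reached.";
}

const sendEmail: ToolDef = {
  name: "send_email",
  description: "send_email: sends a notification to the user's email; args: email, subject, body.",
  run: (args) => "sent to " + args["email"],
};

// Scripted model: calls send_email once, then answers using the tool result.
const scriptedModel: FakeModel = (transcript) => {
  if (transcript.length === 1) {
    return { type: "tool_call", tool: "send_email", args: { email: "user@example.com", subject: "Hi", body: "Hello" } };
  }
  return { type: "final", text: "Done (" + transcript[transcript.length - 1] + ")" };
};

const reply = runAgent(scriptedModel, [sendEmail], "Email me a greeting", 5);
```

The maxSteps cap is a design choice for the sketch: it keeps a confused model from calling tools forever.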
Chat Memory: conversation continuity
Without Memory, the agent forgets everything between messages. There are three main options: Simple Memory (in-memory, lost on restart; testing only), Window Buffer Memory (keeps the last N messages; enough for a simple chatbot), and Postgres Chat Memory or Redis Chat Memory (production: persisted per user, survives restarts). The 'Session ID' parameter is critical: it is the key that groups one user's messages. In Telegram use chat.id; on the web use the session token. With the wrong Session ID, conversations get mixed up and one user's answer can go to another user.
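A minimal sketch of what a session-keyed window memory does, assuming an in-memory object in place of Postgres or Redis. The class WindowMemory, the ChatMessage shape and the "telegram:1001" style session IDs are hypothetical and only illustrate the idea: each Session ID owns its own history, and only the last N messages are kept.

```typescript
interface ChatMessage { role: "user" | "assistant"; text: string; }

// In-memory stand-in for a persistent store such as Postgres or Redis.
class WindowMemory {
  private sessions: Record<string, ChatMessage[]> = Object.create(null);
  private windowSize: number;

  constructor(windowSize: number) {
    this.windowSize = windowSize;
  }

  add(sessionId: string, message: ChatMessage): void {
    const messages = this.sessions[sessionId] || [];
    messages.push(message);
    // Keep only the last `windowSize` messages for this session.
    const start = Math.max(0, messages.length - this.windowSize);
    this.sessions[sessionId] = messages.slice(start);
  }

  get(sessionId: string): ChatMessage[] {
    // Return a copy so callers cannot modify the stored history.
    return (this.sessions[sessionId] || []).slice();
  }
}

const memory = new WindowMemory(2);
memory.add("telegram:1001", { role: "user", text: "My order is late" });
memory.add("telegram:1001", { role: "assistant", text: "Let me check" });
memory.add("telegram:1001", { role: "user", text: "Order 42" });
memory.add("telegram:2002", { role: "user", text: "Hello" });

const firstUserHistory = memory.get("telegram:1001");  // last two messages of user 1001
const secondUserHistory = memory.get("telegram:2002"); // only user 2002's message
```

If both users were stored under the same key, their messages would land in one shared history, which is exactly the mix-up a wrong Session ID causes.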
System prompt: the agent's identity
The System Message defines 'who' the agent is. A good system prompt has 4 parts: (1) Role — 'You are our company's support agent'; (2) Scope — 'Answer only product questions; redirect elsewhere otherwise'; (3) Format — 'Reply in Markdown, short paragraphs'; (4) Limits — 'Never promise a price; say so when you don't know'. Mention tools too: 'When the customer asks about an order, always call get_order.'
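One way to keep the four parts explicit is to assemble the System Message from named fields. The helper below is a hypothetical sketch (PromptParts and buildSystemMessage are not part of n8n); the resulting text is what you would paste into the System Message parameter.

```typescript
interface PromptParts {
  role: string;
  scope: string;
  format: string;
  limits: string;
  toolHints: string[]; // when the agent should call which tool
}

// Joins the four parts, plus optional tool hints, into one system message.
function buildSystemMessage(parts: PromptParts): string {
  const lines: string[] = [
    "Role: " + parts.role,
    "Scope: " + parts.scope,
    "Format: " + parts.format,
    "Limits: " + parts.limits,
  ];
  if (parts.toolHints.length > 0) {
    lines.push("Tools:");
    for (const hint of parts.toolHints) {
      lines.push("- " + hint);
    }
  }
  return lines.join("\n");
}

const systemMessage = buildSystemMessage({
  role: "You are our company's support agent.",
  scope: "Answer only product questions; redirect elsewhere otherwise.",
  format: "Reply in Markdown, short paragraphs.",
  limits: "Never promise a price; say so when you don't know.",
  toolHints: ["When the customer asks about an order, always call get_order."],
});
```

Keeping the parts separate makes it easier to review each one on its own, for example tightening the limits without touching the role.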
Structured output with JSON schemas
To get structured data from an agent instead of free text, attach a 'Structured Output Parser'. Provide a Zod-like schema: { intent: string, priority: 'high'|'medium'|'low', tags: string[] }. The agent returns JSON conforming to the schema; n8n parses it and downstream nodes use $json.intent, $json.priority. This pattern is gold for 'classify → IF branch → action' flows because you stop parsing free text.
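To show what "conforming to the schema" buys you, here is a hand-written validator for the example schema, followed by a tiny routing step. It is a sketch that deliberately uses plain TypeScript checks instead of Zod or n8n's parser; parseClassification, asPriority and route are hypothetical names, and the "escalate" / "queue" branch labels are assumptions for illustration.

```typescript
type Priority = "high" | "medium" | "low";

interface Classification {
  intent: string;
  priority: Priority;
  tags: string[];
}

const PRIORITIES: Priority[] = ["high", "medium", "low"];

function asPriority(value: unknown): Priority {
  for (const p of PRIORITIES) {
    if (p === value) return p;
  }
  throw new Error("priority must be one of: " + PRIORITIES.join(", "));
}

// Parses the model's raw JSON text and rejects anything that does not match the schema.
function parseClassification(raw: string): Classification {
  const data: unknown = JSON.parse(raw);
  if (typeof data !== "object" || data === null || Array.isArray(data)) {
    throw new Error("expected a JSON object");
  }
  const obj = data as Record<string, unknown>;
  const intent = obj["intent"];
  if (typeof intent !== "string") throw new Error("intent must be a string");
  const tags = obj["tags"];
  if (!Array.isArray(tags)) throw new Error("tags must be an array of strings");
  const tagList: string[] = [];
  for (const tag of tags) {
    if (typeof tag !== "string") throw new Error("tags must be an array of strings");
    tagList.push(tag);
  }
  return { intent: intent, priority: asPriority(obj["priority"]), tags: tagList };
}

// The "IF branch" step: decide the action from a typed field, not from free text.
function route(c: Classification): string {
  return c.priority === "high" ? "escalate" : "queue";
}

const parsed = parseClassification('{"intent":"refund_request","priority":"high","tags":["billing"]}');
const branch = route(parsed);
```

Once the output is validated, the branching logic is a plain comparison on priority, which mirrors an IF node reading $json.priority.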
Cost, latency and model selection
AI Agent calls are typically the most expensive nodes in a workflow, in both money and time. Practical rules: small or medium classification tasks → small models (Haiku / GPT-mini), which are usually good enough and around 10x cheaper; long-document reasoning → Opus/Sonnet; private internal data → Ollama (local, no API cost). You can see token usage in $json; build a 'daily token counter' from IF + Set nodes and send an alert when it crosses a threshold. Every tool call is another round trip to the model, so too many tool definitions multiply latency; attach only the tools you actually use.
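The selection rules and the daily token counter can be sketched as two small functions. Everything here is an assumption for illustration: the tier names, the DailyUsage shape, the dates and the 100000-token limit are made up, and in n8n the same logic would live in IF and Set nodes or a Code node rather than in standalone TypeScript.

```typescript
type TaskKind = "classification" | "long_document" | "private_data";
type ModelTier = "small" | "large" | "local";

// Mirrors the practical rules: cheap model for classification, large for long documents, local for private data.
function pickModelTier(task: TaskKind): ModelTier {
  if (task === "classification") return "small";
  if (task === "long_document") return "large";
  return "local";
}

interface DailyUsage { date: string; tokens: number; }

// Adds tokens to today's counter; the counter starts again from zero when the date changes.
function addUsage(usage: DailyUsage, date: string, tokens: number): DailyUsage {
  const carried = usage.date === date ? usage.tokens : 0;
  return { date: date, tokens: carried + tokens };
}

function shouldAlert(usage: DailyUsage, dailyLimit: number): boolean {
  return usage.tokens > dailyLimit;
}

const dailyLimit = 100000; // hypothetical threshold
let usage: DailyUsage = { date: "2025-01-01", tokens: 0 };
usage = addUsage(usage, "2025-01-01", 40000);
usage = addUsage(usage, "2025-01-01", 70000);
const alertToday = shouldAlert(usage, dailyLimit);   // 110000 tokens is over the limit
usage = addUsage(usage, "2025-01-02", 500);
const alertNextDay = shouldAlert(usage, dailyLimit); // counter restarted for the new day
```

Resetting by date keeps the counter stateless apart from one stored record, which is easy to hold in workflow static data or a small table.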
This chapter's workflow (n8n editor view)