thinking blocks between tool calls across multiple turns. The model reasons before each action and again after receiving tool results, giving you visibility into its step-by-step thought process.
Adaptive Thinking (Recommended)
For Claude 4.6+ models, use{"type": "adaptive"}. The model automatically decides when and how much to think — no budget or beta header required.
Enabled with Budget Tokens (Legacy)
For pre-4.6 models or when you need explicit control over the thinking token budget. Requires theinterleaved-thinking-2025-05-14 beta header and max_tokens must be greater than budget_tokens.
For Claude 4.6 models,
{"type": "adaptive"} is recommended over {"type": "enabled", "budget_tokens": ...}. Adaptive thinking lets the model decide how much reasoning is needed, resulting in better performance and lower costs.Adaptive Thinking Response
When using adaptive thinking, the response includesthinking blocks with a signature field alongside tool_use blocks:
The
signature field on thinking blocks is used for verification when passing thinking blocks back in multi-turn conversations. Always include it when sending the assistant’s response back.Legacy Thinking Response
With legacy thinking (budget_tokens), the model may produce longer reasoning. The response structure is the same but thinking content tends to be more detailed:
How Interleaved Thinking Works
When thinking is enabled with tool calling, the model producesthinking blocks between tool calls across multiple turns. The model reasons before each action and again after receiving tool results — this is what makes it “interleaved.”
Turn 1 — Think, then call a tool
The model receives the user’s question, thinks about its approach, and calls the first tool:Turn 2 — Think again after tool result, then call another tool
After receiving the tool result, the model thinks again (this is the interleaving) and decides on the next step:Turn 3 — Think and return final answer
The model thinks one final time after the last tool result and produces the answer:Passing Thinking Blocks Back in Multi-Turn
Here’s how the full message array looks with thinking blocks included:Thinking Configuration Reference
| Mode | Config | Header Required | Best For |
|---|---|---|---|
| Adaptive | {"type": "adaptive"} | No | Claude 4.6+ models. Model decides when and how much to think. |
| Enabled (Legacy) | {"type": "enabled", "budget_tokens": N} | anthropic-beta: interleaved-thinking-2025-05-14 | Pre-4.6 models or explicit budget control. max_tokens must exceed budget_tokens. |