Interleaved Thinking - BLACKBOX AI

Interleaved thinking allows the model to produce thinking blocks between tool calls across multiple turns. The model reasons before each action and again after receiving tool results, giving you visibility into its step-by-step thought process.

The Messages API (/v1/messages) is fully supported on the Enterprise plan using https://enterprise.blackbox.ai. On standard plans (https://api.blackbox.ai), this endpoint may not work as expected. For the best experience, use an Enterprise API key.

Adaptive Thinking (Recommended)

For Claude 4.6+ models, use {"type": "adaptive"}. The model automatically decides when and how much to think — no budget or beta header required.

import os
import requests

url = 'https://enterprise.blackbox.ai/v1/messages'
headers = {
    'Content-Type': 'application/json',
    'Authorization': f"Bearer {os.environ['BLACKBOX_API_KEY']}",
    'anthropic-version': '2023-06-01',
}

tools = [{
    'name': 'calculator',
    'description': 'Perform arithmetic operations',
    'input_schema': {
        'type': 'object',
        'properties': {
            'operation': {'type': 'string'},
            'a': {'type': 'number'},
            'b': {'type': 'number'},
        },
        'required': ['operation', 'a', 'b'],
    },
}]

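def calculate(operation, a, b):
    # Evaluate lazily so that b == 0 only matters for 'divide'
    ops = {
        'add': lambda: a + b,
        'subtract': lambda: a - b,
        'multiply': lambda: a * b,
        'divide': lambda: a / b if b != 0 else 'Error: division by zero',
    }
    return str(ops[operation]()) if operation in ops else 'Unknown'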

messages = [
    {'role': 'user', 'content': 'Calculate (15 + 27) * 3. Use the calculator for each step.'}
]

for turn in range(10):
    result = requests.post(url, headers=headers, json={
        'model': 'blackboxai/anthropic/claude-opus-4.6',
        'max_tokens': 4000,
        'system': 'Use tools to solve problems step by step.',
        'thinking': {'type': 'adaptive'},
        'tools': tools,
        'messages': messages,
    }).json()

    content = result['content']

    # Inspect thinking, tool, and text blocks
    for block in content:
        if block['type'] == 'thinking':
            print(f"[Thinking] {block['thinking'][:100]}...")
        elif block['type'] == 'tool_use':
            print(f"[Tool] {block['name']}({block['input']})")
        elif block['type'] == 'text':
            print(f"[Answer] {block['text'][:100]}")

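    # Stop on the final answer ('end_turn') or on any other non-tool stop
    # reason (e.g. 'max_tokens'), so the same request is never resent unchanged
    if result['stop_reason'] != 'tool_use':
        break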

    if result['stop_reason'] == 'tool_use':
        messages.append({'role': 'assistant', 'content': content})
        tool_results = []
        for block in content:
            if block['type'] == 'tool_use':
                answer = calculate(**block['input'])
                tool_results.append({
                    'type': 'tool_result',
                    'tool_use_id': block['id'],
                    'content': answer,
                })
        messages.append({'role': 'user', 'content': tool_results})

Enabled with Budget Tokens (Legacy)

Use this mode for pre-4.6 models or when you need explicit control over the thinking token budget. It requires the interleaved-thinking-2025-05-14 beta header, and max_tokens must be greater than budget_tokens.

import os
import requests

url = 'https://enterprise.blackbox.ai/v1/messages'
headers = {
    'Content-Type': 'application/json',
    'Authorization': f"Bearer {os.environ['BLACKBOX_API_KEY']}",
    'anthropic-version': '2023-06-01',
    'anthropic-beta': 'interleaved-thinking-2025-05-14',
}

tools = [{
    'name': 'calculator',
    'description': 'Perform arithmetic operations',
    'input_schema': {
        'type': 'object',
        'properties': {
            'operation': {'type': 'string'},
            'a': {'type': 'number'},
            'b': {'type': 'number'},
        },
        'required': ['operation', 'a', 'b'],
    },
}]

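def calculate(operation, a, b):
    # Evaluate lazily so that b == 0 only matters for 'divide'
    ops = {
        'add': lambda: a + b,
        'subtract': lambda: a - b,
        'multiply': lambda: a * b,
        'divide': lambda: a / b if b != 0 else 'Error: division by zero',
    }
    return str(ops[operation]()) if operation in ops else 'Unknown'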

messages = [
    {'role': 'user', 'content': 'Calculate (15 + 27) * 3. Use the calculator for each step.'}
]

for turn in range(10):
    result = requests.post(url, headers=headers, json={
        'model': 'blackboxai/anthropic/claude-sonnet-4.5',
        'max_tokens': 16000,
        'system': 'Use tools to solve problems step by step.',
        'thinking': {'type': 'enabled', 'budget_tokens': 10000},
        'tools': tools,
        'messages': messages,
    }).json()

    content = result['content']

    for block in content:
        if block['type'] == 'thinking':
            print(f"[Thinking] {block['thinking'][:100]}...")
        elif block['type'] == 'tool_use':
            print(f"[Tool] {block['name']}({block['input']})")
        elif block['type'] == 'text':
            print(f"[Answer] {block['text'][:100]}")

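    # Stop on the final answer ('end_turn') or on any other non-tool stop
    # reason (e.g. 'max_tokens'), so the same request is never resent unchanged
    if result['stop_reason'] != 'tool_use':
        break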

    if result['stop_reason'] == 'tool_use':
        messages.append({'role': 'assistant', 'content': content})
        tool_results = []
        for block in content:
            if block['type'] == 'tool_use':
                answer = calculate(**block['input'])
                tool_results.append({
                    'type': 'tool_result',
                    'tool_use_id': block['id'],
                    'content': answer,
                })
        messages.append({'role': 'user', 'content': tool_results})

For Claude 4.6+ models, {"type": "adaptive"} is recommended over {"type": "enabled", "budget_tokens": ...}. Adaptive thinking lets the model decide how much reasoning each step needs, resulting in better performance and lower costs.

Adaptive Thinking Response

When using adaptive thinking, the response includes thinking blocks with a signature field alongside tool_use blocks:

{
  "id": "gen_01KJRNFCEYNR4J398P8KN7MR6B",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "thinking",
      "thinking": "The user wants to add 15 and 27 using the calculator tool.",
      "signature": "EuIBCkYICxgCKkAwbYc2xREz..."
    },
    {
      "type": "tool_use",
      "id": "toolu_01BDbhXhx64gVCxzrKMAp1iZ",
      "name": "calculator",
      "input": {
        "operation": "add",
        "a": 15,
        "b": 27
      }
    }
  ],
  "model": "blackboxai/anthropic/claude-opus-4.6",
  "stop_reason": "tool_use",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 610,
    "output_tokens": 118
  }
}

The signature field on thinking blocks is used for verification when passing thinking blocks back in multi-turn conversations. Always include it when sending the assistant’s response back.

If the signature field gets corrupted (e.g. set to null by your ORM, truncated by a column limit, or modified during serialization), the API will reject the request with a 400 error. See Avoiding Invalid Thinking Signatures for common causes, reproduction examples, and fixes.
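As a rough pre-flight check for the failure modes above, you can scan the message history before sending it. This is a minimal sketch: the helper name find_bad_signatures is hypothetical and not part of any SDK, and it only confirms that each thinking block still carries a non-empty string signature. It cannot detect a truncated or modified signature; only the API can verify that.

```python
def find_bad_signatures(messages):
    """Return (message_index, block_index) pairs for thinking blocks whose
    signature is missing, None, or not a non-empty string.

    Hypothetical helper for illustration only.
    """
    bad = []
    for m_idx, message in enumerate(messages):
        content = message.get('content')
        # User messages may carry a plain string instead of a list of blocks
        if message.get('role') != 'assistant' or not isinstance(content, list):
            continue
        for b_idx, block in enumerate(content):
            if block.get('type') != 'thinking':
                continue
            signature = block.get('signature')
            if not isinstance(signature, str) or not signature:
                bad.append((m_idx, b_idx))
    return bad


# Placeholder history: the signature was nulled, e.g. by an ORM
history = [
    {'role': 'user', 'content': 'Calculate 15 + 27'},
    {'role': 'assistant', 'content': [
        {'type': 'thinking', 'thinking': 'Use the calculator.', 'signature': None},
        {'type': 'tool_use', 'id': 'toolu_example', 'name': 'calculator',
         'input': {'operation': 'add', 'a': 15, 'b': 27}},
    ]},
]
print(find_bad_signatures(history))
```

An empty list means every thinking block at least has a signature present; a non-empty list points at the blocks to investigate before the request is sent.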

Legacy Thinking Response

With legacy thinking (budget_tokens), the model may produce longer reasoning. The response structure is the same but thinking content tends to be more detailed:

{
  "id": "gen_01KJRNFGD88W8A4NEK70CM60M3",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "thinking",
      "thinking": "The user wants me to calculate 15 + 27 using the calculator function. I need to use the calculator function with operation: \"add\", a: 15, b: 27.",
      "signature": "EoUECkYICxgCKkAHWkVekW3st3Nw..."
    },
    {
      "type": "tool_use",
      "id": "toolu_01TJ4f6X6dudFQ4LKKEpwNav",
      "name": "calculator",
      "input": {
        "operation": "add",
        "a": 15,
        "b": 27
      }
    }
  ],
  "model": "blackboxai/anthropic/claude-sonnet-4.5",
  "stop_reason": "tool_use",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 769,
    "output_tokens": 188
  }
}

How Interleaved Thinking Works

When thinking is enabled with tool calling, the model produces thinking blocks between tool calls across multiple turns. The model reasons before each action and again after receiving tool results — this is what makes it “interleaved.”

Turn 1 — Think, then call a tool

The model receives the user’s question, thinks about its approach, and calls the first tool:

{
  "id": "gen_01ABC...",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "thinking",
      "thinking": "I need to calculate (15 + 27) * 3. Let me start with the addition...",
      "signature": "EuIBCkYICxgC..."
    },
    {
      "type": "text",
      "text": "I'll solve this step by step. First, the addition:"
    },
    {
      "type": "tool_use",
      "id": "toolu_01ABC...",
      "name": "calculator",
      "input": {"operation": "add", "a": 15, "b": 27}
    }
  ],
  "model": "blackboxai/anthropic/claude-opus-4.6",
  "stop_reason": "tool_use",
  "stop_sequence": null,
  "usage": { "input_tokens": 610, "output_tokens": 145 }
}

You then execute the tool and send back:

{"role": "user", "content": [
  {"type": "tool_result", "tool_use_id": "toolu_01ABC...", "content": "42"}
]}
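The tool execution step can be sketched as a small helper that turns an assistant response into the follow-up user message. The names build_tool_results and handlers are assumptions made for this illustration, and the id toolu_example and the signature value are placeholders rather than real values.

```python
def build_tool_results(content, handlers):
    """Run every tool_use block in an assistant response and return the
    user message carrying the matching tool_result blocks."""
    results = []
    for block in content:
        if block.get('type') != 'tool_use':
            continue  # thinking and text blocks need no reply
        handler = handlers[block['name']]
        results.append({
            'type': 'tool_result',
            'tool_use_id': block['id'],  # must match the id of the tool_use block
            'content': str(handler(**block['input'])),
        })
    return {'role': 'user', 'content': results}


def calculator(operation, a, b):
    # Toy handler covering only the two operations used in this walkthrough
    return a + b if operation == 'add' else a * b


turn_1_content = [
    {'type': 'thinking', 'thinking': 'Start with the addition.',
     'signature': 'sig_placeholder'},
    {'type': 'tool_use', 'id': 'toolu_example', 'name': 'calculator',
     'input': {'operation': 'add', 'a': 15, 'b': 27}},
]
reply = build_tool_results(turn_1_content, {'calculator': calculator})
print(reply)
```

Keying the handlers by tool name keeps the helper usable when a response contains several tool_use blocks for different tools.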

Turn 2 — Think again after tool result, then call another tool

After receiving the tool result, the model thinks again (this is the interleaving) and decides on the next step:

{
  "id": "gen_02DEF...",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "thinking",
      "thinking": "The addition gave me 42. Now I need to multiply 42 by 3...",
      "signature": "EoUECkYICxgC..."
    },
    {
      "type": "text",
      "text": "Now let me multiply the result by 3:"
    },
    {
      "type": "tool_use",
      "id": "toolu_02DEF...",
      "name": "calculator",
      "input": {"operation": "multiply", "a": 42, "b": 3}
    }
  ],
  "model": "blackboxai/anthropic/claude-opus-4.6",
  "stop_reason": "tool_use",
  "stop_sequence": null,
  "usage": { "input_tokens": 820, "output_tokens": 132 }
}

Turn 3 — Think and return final answer

The model thinks one final time after the last tool result and produces the answer:

{
  "id": "gen_03GHI...",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "thinking",
      "thinking": "42 * 3 = 126. That's the final answer.",
      "signature": "EpQBCkYICxgC..."
    },
    {
      "type": "text",
      "text": "The result of (15 + 27) × 3 = **126**"
    }
  ],
  "model": "blackboxai/anthropic/claude-opus-4.6",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": { "input_tokens": 990, "output_tokens": 78 }
}
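To make the pattern in the three turns easier to see, the sketch below lists the block order of each turn and sums the usage figures shown above. The helper summarize_turns is hypothetical, and the turn data is reduced to block types and the usage numbers from the example responses.

```python
def summarize_turns(responses):
    """Return the block-type order of each turn plus summed token usage."""
    orders = [[block['type'] for block in r['content']] for r in responses]
    totals = {
        'input_tokens': sum(r['usage']['input_tokens'] for r in responses),
        'output_tokens': sum(r['usage']['output_tokens'] for r in responses),
    }
    return orders, totals


# Block types and usage figures taken from the three example turns
turns = [
    {'content': [{'type': 'thinking'}, {'type': 'text'}, {'type': 'tool_use'}],
     'usage': {'input_tokens': 610, 'output_tokens': 145}},
    {'content': [{'type': 'thinking'}, {'type': 'text'}, {'type': 'tool_use'}],
     'usage': {'input_tokens': 820, 'output_tokens': 132}},
    {'content': [{'type': 'thinking'}, {'type': 'text'}],
     'usage': {'input_tokens': 990, 'output_tokens': 78}},
]

orders, totals = summarize_turns(turns)
for number, order in enumerate(orders, start=1):
    print(f"Turn {number}: {' -> '.join(order)}")
print(totals)
```

Every turn starts with a thinking block, and input_tokens rises from turn to turn in this example, which is consistent with the full history being sent on each request.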

Passing Thinking Blocks Back in Multi-Turn

When passing the assistant’s response back in a multi-turn conversation, include all content blocks, including thinking blocks with their signature field. The API expects the full content array exactly as returned. Omitting thinking blocks or signatures will result in errors.

Here’s how the full message array looks with thinking blocks included:

[
  {"role": "user", "content": "Calculate (15 + 27) * 3"},

  {"role": "assistant", "content": [
    {"type": "thinking", "thinking": "I need to start with addition...", "signature": "EuIBCkYICxgC..."},
    {"type": "text", "text": "First, the addition:"},
    {"type": "tool_use", "id": "toolu_01ABC...", "name": "calculator", "input": {"operation": "add", "a": 15, "b": 27}}
  ]},

  {"role": "user", "content": [
    {"type": "tool_result", "tool_use_id": "toolu_01ABC...", "content": "42"}
  ]},

  {"role": "assistant", "content": [
    {"type": "thinking", "thinking": "Got 42. Now multiply by 3...", "signature": "EoUECkYICxgC..."},
    {"type": "text", "text": "Now multiply by 3:"},
    {"type": "tool_use", "id": "toolu_02DEF...", "name": "calculator", "input": {"operation": "multiply", "a": 42, "b": 3}}
  ]},

  {"role": "user", "content": [
    {"type": "tool_result", "tool_use_id": "toolu_02DEF...", "content": "126"}
  ]}
]
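One way to keep the history in this shape is to append each assistant turn without filtering or rebuilding its blocks. The sketch below assumes a hypothetical helper named append_turn; the response content, signature, and tool id are placeholders.

```python
import copy


def append_turn(messages, response, tool_results):
    """Append the assistant turn exactly as returned, then the tool results."""
    # Deep-copy so later mutation of `response` cannot alter the stored history
    messages.append({
        'role': 'assistant',
        'content': copy.deepcopy(response['content']),
    })
    messages.append({'role': 'user', 'content': tool_results})
    return messages


response = {
    'content': [
        {'type': 'thinking', 'thinking': 'I need to start with addition.',
         'signature': 'sig_placeholder'},
        {'type': 'text', 'text': 'First, the addition:'},
        {'type': 'tool_use', 'id': 'toolu_example', 'name': 'calculator',
         'input': {'operation': 'add', 'a': 15, 'b': 27}},
    ],
    'stop_reason': 'tool_use',
}
tool_results = [
    {'type': 'tool_result', 'tool_use_id': 'toolu_example', 'content': '42'},
]

messages = [{'role': 'user', 'content': 'Calculate (15 + 27) * 3'}]
append_turn(messages, response, tool_results)
print([block['type'] for block in messages[1]['content']])
```

The helper deliberately does no filtering: dropping the thinking block, or rebuilding it without its signature, is the change that leads to rejected requests.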

Thinking Configuration Reference

Mode	Config	Header Required	Best For
Adaptive	`{"type": "adaptive"}`	No	Claude 4.6+ models. Model decides when and how much to think.
Enabled (Legacy)	`{"type": "enabled", "budget_tokens": N}`	`anthropic-beta: interleaved-thinking-2025-05-14`	Pre-4.6 models or explicit budget control. `max_tokens` must exceed `budget_tokens`.
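
The two rows above can be captured in a small request helper. This is a sketch under stated assumptions: the function name thinking_request_options and its signature are hypothetical, while the config shapes, the beta header value, and the max_tokens rule come from the table.

```python
BETA_HEADER = 'interleaved-thinking-2025-05-14'


def thinking_request_options(max_tokens, budget_tokens=None):
    """Return (thinking_config, extra_headers) for a Messages API request.

    budget_tokens=None selects adaptive mode; an integer selects the legacy
    enabled mode and enforces max_tokens > budget_tokens.
    """
    if budget_tokens is None:
        # Adaptive mode: no budget and no beta header
        return {'type': 'adaptive'}, {}
    if max_tokens <= budget_tokens:
        raise ValueError(
            f'max_tokens ({max_tokens}) must be greater than '
            f'budget_tokens ({budget_tokens})'
        )
    return (
        {'type': 'enabled', 'budget_tokens': budget_tokens},
        {'anthropic-beta': BETA_HEADER},
    )


adaptive_config, adaptive_headers = thinking_request_options(max_tokens=4000)
legacy_config, legacy_headers = thinking_request_options(
    max_tokens=16000, budget_tokens=10000
)
print(adaptive_config, adaptive_headers)
print(legacy_config, legacy_headers)
```

Merge the returned headers into the request headers and pass the config as the thinking field of the request body.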

