Request Body Examples
Tool calling with BLACKBOX AI involves three key steps. Here are the essential request body formats for each step.

Step 1: Inference Request with Tools
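As a sketch, a Step 1 request body might look like the following. The model id and the weather tool are illustrative placeholders, not required names; the `tools` schema uses the OpenAI-compatible nested format.

```typescript
// Step 1: a chat completion request that advertises one tool.
// The model id and the weather tool are illustrative placeholders.
const step1Request = {
  model: "openai/gpt-4o", // substitute any tool-capable model
  messages: [
    { role: "user", content: "What is the weather like in Boston today?" },
  ],
  tools: [
    {
      type: "function",
      function: {
        name: "get_current_weather",
        description: "Get the current weather in a given location",
        parameters: {
          type: "object",
          properties: {
            location: {
              type: "string",
              description: "The city and state, e.g. Boston, MA",
            },
          },
          required: ["location"],
        },
      },
    },
  ],
};
```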
Step 2: Tool Execution (Client-Side)
After receiving the model’s response with tool_calls, execute the requested tool locally and prepare the result:
Step 3: Inference Request with Tool Results
The tools parameter must be included in every request (Steps 1 and 3) so that the tool schema is validated on each call.
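A Step 3 request, sketched with illustrative ids and content (the tools array is elided in a comment here, but must be the same one sent in Step 1):

```typescript
// Step 3: the assistant's tool_calls message, then the tool result,
// appended to the original conversation. Ids here are illustrative.
const step3Request = {
  model: "openai/gpt-4o", // placeholder model id
  messages: [
    { role: "user", content: "What is the weather like in Boston today?" },
    {
      role: "assistant",
      content: null,
      tool_calls: [
        {
          id: "call_abc123",
          type: "function",
          function: {
            name: "get_current_weather",
            arguments: '{"location": "Boston, MA"}', // arguments arrive as a JSON string
          },
        },
      ],
    },
    {
      role: "tool",
      tool_call_id: "call_abc123", // must match the id above exactly
      content: JSON.stringify({ temperature: 57, unit: "F" }), // always a string
    },
  ],
  tools: [], // send the same tools array as in Step 1 (elided here)
};
```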
Tool Calling Example
Here is TypeScript code that gives LLMs the ability to call an external API — in this case Project Gutenberg, to search for books. First, let’s do some basic setup.

Define the Tool
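A sketch of the tool definition and its client-side executor. The Gutendex endpoint used in searchGutenbergBooks is an assumption (a public Project Gutenberg JSON API); swap in whichever Gutenberg search service you prefer.

```typescript
// Tool definition advertised to the model. The model requests the call;
// our code below is responsible for actually executing it.
const tools = [
  {
    type: "function",
    function: {
      name: "search_gutenberg_books",
      description: "Search for books in the Project Gutenberg library",
      parameters: {
        type: "object",
        properties: {
          search_terms: {
            type: "array",
            items: { type: "string" },
            description: "Search terms, e.g. author names or book titles",
          },
        },
        required: ["search_terms"],
      },
    },
  },
];

// Client-side executor for the tool above. The gutendex.com endpoint is
// an assumption; substitute your preferred Gutenberg search API.
async function searchGutenbergBooks(searchTerms: string[]) {
  const url = `https://gutendex.com/books?search=${encodeURIComponent(searchTerms.join(" "))}`;
  const response = await fetch(url);
  const data = await response.json();
  return data.results.map((book: any) => ({
    id: book.id,
    title: book.title,
    authors: book.authors,
  }));
}
```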
Next, we define the tool that we want to call. Remember, the tool is going to get requested by the LLM, but the code we are writing here is ultimately responsible for executing the call and returning the results to the LLM.

Tool use and tool results
Let’s make the first BLACKBOX AI API call to the model. The response will include a finish_reason of tool_calls and a tool_calls array. In a generic LLM response handler, you would want to check the finish_reason before processing tool calls, but here we will assume a tool call was made. Let’s keep going by processing the tool call. The follow-up request contains:
- Our original request
- The LLM’s response (containing a tool call request)
- The result of the tool call (a JSON object returned from the Project Gutenberg API)
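Assembled into the second request’s messages array, those three pieces might look like this (the id and the trimmed book result are illustrative stand-ins):

```typescript
// The second request carries the full exchange so far: user message,
// the assistant's tool call, and the tool result, in order.
const followUpMessages = [
  { role: "user", content: "Find books by James Joyce" },
  {
    role: "assistant",
    content: null,
    tool_calls: [
      {
        id: "call_xyz789",
        type: "function",
        function: {
          name: "search_gutenberg_books",
          arguments: '{"search_terms": ["James Joyce"]}',
        },
      },
    ],
  },
  {
    role: "tool",
    tool_call_id: "call_xyz789", // matches the id in the assistant message
    content: JSON.stringify([{ id: 4300, title: "Ulysses" }]), // illustrative result
  },
];
```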
Interleaved Thinking
Interleaved thinking allows models to reason between tool calls, enabling more sophisticated decision-making after receiving tool results. This feature helps models chain multiple tool calls with reasoning steps in between and make nuanced decisions based on intermediate results. For comprehensive information about reasoning tokens and configuration, see the Reasoning and Interleaved Thinking documentation.

How Interleaved Thinking Works
With interleaved thinking, the model can:
- Reason about the results of a tool call before deciding what to do next
- Chain multiple tool calls with reasoning steps in between
- Make more nuanced decisions based on intermediate results
- Provide transparent reasoning for its tool selection process
Enabling Reasoning with Tool Calls
To enable reasoning with tool calls, include the reasoning parameter in your request:
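A sketch of such a request. The { effort: ... } shape mirrors the effort levels discussed below, but it is an assumption; check your provider’s reasoning reference for the exact schema.

```typescript
// Enable interleaved thinking alongside tools. The model id is a
// placeholder and the reasoning schema is an assumed shape.
const reasoningRequest = {
  model: "openai/gpt-4o", // substitute a reasoning-capable model
  messages: [
    { role: "user", content: "Research the environmental impact of electric vehicles." },
  ],
  tools: [], // same tool definitions as in earlier examples (elided)
  reasoning: { effort: "medium" }, // alternatively a max_tokens budget
};
```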
Example: Multi-Step Research with Reasoning
Here’s an example showing how a model might use interleaved thinking to research a topic across multiple sources:
- Initial Thinking: “I need to research electric vehicle environmental impact. Let me start with academic papers to get peer-reviewed research.”
- First Tool Call: search_academic_papers({"query": "electric vehicle lifecycle environmental impact", "field": "environmental science"})
- After First Tool Result: “The papers show mixed results on manufacturing impact. I need current statistics to complement this academic research.”
- Second Tool Call: get_latest_statistics({"topic": "electric vehicle carbon footprint", "year": 2024})
- After Second Tool Result: “Now I have both academic research and current data. Let me search for manufacturing-specific studies to address the gaps I found.”
- Third Tool Call: search_academic_papers({"query": "electric vehicle battery manufacturing environmental cost", "field": "materials science"})
- Final Analysis: Synthesizes all gathered information into a comprehensive response.
Preserving Reasoning Context
When using tools with reasoning models, you can preserve reasoning context across multiple API calls. This is particularly useful for complex workflows where the model needs to maintain its reasoning chain.

Best Practices for Interleaved Thinking
- Clear Tool Descriptions: Provide detailed descriptions so the model can reason about when to use each tool
- Structured Parameters: Use well-defined parameter schemas to help the model make precise tool calls
- Context Preservation: Maintain conversation context across multiple tool interactions using reasoning_details
- Error Handling: Design tools to provide meaningful error messages that help the model adjust its approach
- Reasoning Budget: Consider setting appropriate max_tokens or effort levels based on task complexity
Implementation Considerations
When implementing interleaved thinking:
- Models may take longer to respond due to additional reasoning steps
- Token usage will be higher due to the reasoning process
- The quality of reasoning depends on the model’s capabilities
- Some models may be better suited for this approach than others
- Reasoning tokens are charged as output tokens
A Simple Agentic Loop
In the example above, the calls are made explicitly and sequentially. To handle a wide variety of user inputs and tool calls, you can use an agentic loop. Here’s an example of a simple agentic loop (using the same tools and initial messages as above):
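A minimal sketch of such a loop. Here callModel stands in for your API client and executeTool for a local dispatcher that routes to your tool functions; both are assumptions, not provided SDK helpers.

```typescript
// A minimal agentic loop: call the model, run any requested tools,
// feed results back, and stop when the model replies with plain text.
type Message = {
  role: string;
  content: string | null;
  tool_calls?: any[];
  tool_call_id?: string;
};

async function agenticLoop(
  messages: Message[],
  tools: any[],
  callModel: (messages: Message[], tools: any[]) => Promise<any>,
  executeTool: (name: string, args: any) => Promise<any>,
  maxTurns = 10, // guard against runaway loops
): Promise<Message[]> {
  for (let turn = 0; turn < maxTurns; turn++) {
    const assistantMessage = await callModel(messages, tools);
    messages.push(assistantMessage);
    if (!assistantMessage.tool_calls?.length) break; // plain text: we're done
    for (const toolCall of assistantMessage.tool_calls) {
      const args = JSON.parse(toolCall.function.arguments);
      const result = await executeTool(toolCall.function.name, args);
      messages.push({
        role: "tool",
        tool_call_id: toolCall.id, // must match the call's id
        content: JSON.stringify(result), // tool content must be a string
      });
    }
  }
  return messages;
}
```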
Best Practices and Advanced Patterns
Function Definition Guidelines
When defining tools for LLMs, follow these best practices:
- Clear and Descriptive Names: Use function names that clearly indicate the tool’s purpose.

Streaming with Tool Calls
When using streaming responses with tool calls, handle the different content types appropriately:

Tool Choice Configuration
Control tool usage with the tool_choice parameter:
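For illustration, the common tool_choice shapes. The forced-tool object format follows the OpenAI-compatible convention; the function name reuses the earlier Gutenberg example.

```typescript
// The three string modes, plus forcing one specific tool via an object.
const autoChoice = { tool_choice: "auto" }; // model decides (default)
const requiredChoice = { tool_choice: "required" }; // must call at least one tool
const noneChoice = { tool_choice: "none" }; // never call tools
const forcedChoice = {
  tool_choice: {
    type: "function",
    function: { name: "search_gutenberg_books" }, // always call this tool
  },
};
```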
Parallel Tool Calls
Control whether multiple tools can be called simultaneously with the parallel_tool_calls parameter (default is true for most models):
When parallel_tool_calls is false, the model will only request one tool call at a time instead of potentially multiple calls in parallel.
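A sketch of a request with parallel tool calls disabled (model id is a placeholder):

```typescript
// With parallel_tool_calls disabled, the model returns at most one
// tool call per response turn.
const sequentialRequest = {
  model: "openai/gpt-4o", // placeholder model id
  messages: [
    { role: "user", content: "Compare the weather in Boston and New York" },
  ],
  tools: [], // tool definitions elided
  parallel_tool_calls: false,
};
```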
Multi-Tool Workflows
Design tools that work well together.

Example Use Case: Tool Calling with gpt-5.3-codex
GPT-5.3-Codex is OpenAI’s most capable agentic coding model, designed specifically for multi-turn tool calling workflows. This section covers how to use it correctly with both the Chat Completions API and the Responses API to avoid common pitfalls.

Chat Completions: Complete Multi-Turn Example
Here is a complete, working example of a multi-turn tool calling loop with gpt-5.3-codex using the Chat Completions API. This pattern is what coding agents (like Codex CLI) use internally.
Step-by-Step Turn Walkthrough
The agentic loop above handles everything automatically. This section breaks down what happens at each turn so you can see exactly how messages flow.

Turn 1 — Initial request
Send the user’s message and tool definitions. The response will include finish_reason: "tool_calls" and a tool_calls array containing the function name, arguments (as a JSON string), and a unique id.
Turn 2 — Execute the tool and send the result
Append the assistant message (with tool_calls) to the messages array, execute the tool locally, then append the tool result and send the next request:
Three critical rules:
- The assistant message with tool_calls must be appended before the tool result
- The tool_call_id in the tool result must exactly match the id from the tool call
- Tool result content must be a string — use JSON.stringify() for objects
Conversation History Shape
After two tool calls (e.g., read → edit), the messages array looks like this:
Each exchange follows the pattern assistant (with tool_calls) → tool (with matching tool_call_id). Dropping any message in the chain will cause the model to fall back to text responses.
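A sketch of that history after a read followed by an edit; file names, ids, and tool results are illustrative.

```typescript
// History after a read_file call followed by an edit_file call. Every
// assistant tool_calls message is immediately followed by a tool message
// whose tool_call_id matches the corresponding call id.
const history = [
  { role: "system", content: "You are a coding agent. Use tools to complete tasks." },
  { role: "user", content: "Flip DEBUG in config.py" },
  {
    role: "assistant",
    content: null,
    tool_calls: [{ id: "call_1", type: "function", function: { name: "read_file", arguments: '{"path": "config.py"}' } }],
  },
  { role: "tool", tool_call_id: "call_1", content: "DEBUG = False" },
  {
    role: "assistant",
    content: null,
    tool_calls: [{ id: "call_2", type: "function", function: { name: "edit_file", arguments: '{"path": "config.py", "old": "DEBUG = False", "new": "DEBUG = True"}' } }],
  },
  { role: "tool", tool_call_id: "call_2", content: "ok" },
  { role: "assistant", content: "Done. DEBUG is now True." },
];
```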
Multi-Turn with User Follow-up Messages
After the model completes a task and responds with text, you can continue the conversation by appending a new user message. This lets users ask follow-up questions based on tool results without starting over.

The follow-up user message goes after the assistant’s text response, not in the middle of a tool call sequence. The model sees the full conversation context, including previous tool results.
Tool Choice Options
Control how the model uses tools with the tool_choice parameter:
| Value | Behavior |
|---|---|
| "auto" | The model decides whether to call a tool (recommended) |
| "required" | The model must call at least one tool |
| "none" | The model cannot call any tools |
Use Case: Coding Agent
A coding agent gives the model a set of file system and terminal tools and runs an agentic loop — calling the API, executing whatever tools the model requests, and feeding the results back — until the model returns a plain text response with no further tool calls. Define seven SWE tools using the Chat Completions nested format:
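Two of the seven tools, sketched in the nested format; the remaining tools (edit, search, run command, list directory, and so on) follow the same shape, and the names here are illustrative.

```typescript
// Two representative SWE tools in the Chat Completions nested format.
const sweTools = [
  {
    type: "function",
    function: {
      name: "read_file",
      description: "Read the contents of a file at the given path",
      parameters: {
        type: "object",
        properties: {
          path: { type: "string", description: "Path to the file to read" },
        },
        required: ["path"],
      },
    },
  },
  {
    type: "function",
    function: {
      name: "write_file",
      description: "Write content to a file, creating it if needed",
      parameters: {
        type: "object",
        properties: {
          path: { type: "string", description: "Path to the file to write" },
          content: { type: "string", description: "Content to write" },
        },
        required: ["path", "content"],
      },
    },
  },
];
```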
The agent loop continues until the model returns a response with no tool_calls. Always set a max_turns guard to prevent runaway loops. Example prompts to try:
- "Read main.py and tell me what the entry point function does."
- "Write a file /tmp/utils.py with a helper function for parsing JSON, then read it back to confirm."
- "Search app.py for all lines containing 'TODO' and list their line numbers."
- "Edit config.py: replace DEBUG = False with DEBUG = True, then verify the change."
- "Run python3 tests/test_api.py and report any failures."
- "List the project root and find all TypeScript files under src/."
Key Requirements for gpt-5.3-codex Tool Calling
Follow these requirements to ensure reliable tool calling:

1. Always include tools in every request
The tools array must be sent with every API call in the loop, not just the first one. The model needs to see the available tools on each turn.
2. Preserve the full message chain
Every assistant message (including those with tool_calls) and every tool result must be appended to the messages array. Dropping any message breaks the conversation chain and causes the model to fall back to text.
3. Match the tool_call_id exactly
Each tool result must reference the exact id from the corresponding tool call. A mismatch causes the model to ignore the result.
4. Tool result content must be a string
The content field in tool result messages must be a string. If your tool returns an object, serialize it with JSON.stringify().
5. Use tool_choice: "auto" (recommended)
For most use cases, tool_choice: "auto" gives the best results. The model will call tools when appropriate and respond with text when the task is complete. Use tool_choice: "required" only when you want to force a tool call on every turn.
Using Reasoning with Tool Calling
For complex multi-step tasks, enabling reasoning improves tool calling reliability. The model will think through its approach before deciding which tool to call.

| Reasoning Effort | Best For |
|---|---|
| "low" | Simple, single-tool tasks |
| "medium" | General coding tasks (recommended default) |
| "high" | Complex multi-file refactors, debugging |
Format Comparison
| Aspect | Chat Completions (/chat/completions) | Responses API (/v1/responses) |
|---|---|---|
| Tool definition | Nested: tools[].function.name | Flat: tools[].name |
| Tool result role | "role": "tool" | "type": "function_call_output" |
| Tool result ID field | "tool_call_id" | "call_id" |
| Message format | {"role": "user", "content": "..."} | {"type": "message", "role": "user", "content": [...]} |
| System prompt | {"role": "system", "content": "..."} | "instructions": "..." |
Chat Completions Tool Format
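A sketch of the nested shape used by /chat/completions (the read_file tool is illustrative):

```typescript
// Nested format: name, description, and parameters live under "function".
const chatCompletionsTool = {
  type: "function",
  function: {
    name: "read_file",
    description: "Read the contents of a file at the given path",
    parameters: {
      type: "object",
      properties: {
        path: { type: "string", description: "Path to the file" },
      },
      required: ["path"],
    },
  },
};
```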
Responses API Tool Format
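The same tool in the flat shape used by /v1/responses (again illustrative):

```typescript
// Flat format: the same fields sit at the top level of the tool object.
const responsesApiTool = {
  type: "function",
  name: "read_file",
  description: "Read the contents of a file at the given path",
  parameters: {
    type: "object",
    properties: {
      path: { type: "string", description: "Path to the file" },
    },
    required: ["path"],
  },
};
```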
Troubleshooting: Model Returns Text Instead of Tool Calls
If gpt-5.3-codex responds with plain text instead of calling tools:
| Issue | Solution |
|---|---|
| Wrong tool format for endpoint | Use nested format for /chat/completions, flat format for /v1/responses |
| tools array missing from request | Include tools in every request, not just the first one |
| Missing tool_call_id in tool results | Ensure each tool result has tool_call_id matching the call’s id |
| Tool result content is not a string | Use JSON.stringify() to convert objects to strings |
| Broken message chain | Append every assistant message (including tool_calls) before tool results |
| Vague user message | Be specific: “Read src/auth.py” instead of “Look at the code” |
| No system prompt | Include a system prompt like “You are a coding agent. Use tools to complete tasks.” |