Request Body Examples
Tool calling with BLACKBOX AI involves three key steps. Here are the essential request body formats for each step.

Step 1: Inference Request with Tools
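A sketch of the Step 1 body, assuming the OpenAI-compatible chat completions schema; the model slug and the `get_weather` tool are placeholders for illustration:

```typescript
// Step 1: send the user's message together with the tool definitions.
const step1Body = {
  model: "your-model-slug", // placeholder: use a tool-capable model
  messages: [
    { role: "user", content: "What is the weather like in Boston?" },
  ],
  tools: [
    {
      type: "function",
      function: {
        name: "get_weather", // hypothetical tool, for illustration only
        description: "Get the current weather for a city",
        parameters: {
          type: "object",
          properties: {
            city: { type: "string", description: "City name, e.g. Boston" },
          },
          required: ["city"],
        },
      },
    },
  ],
};
```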
Step 2: Tool Execution (Client-Side)
After receiving the model’s response with `tool_calls`, execute the requested tool locally and prepare the result:
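A minimal sketch of the client-side handling, continuing from Step 1; `getWeather` is a hypothetical local implementation and the response shape is assumed to follow the OpenAI-compatible format:

```typescript
// Step 2: the model's reply contains an assistant message whose
// tool_calls array names the function and its JSON-encoded arguments.
const assistantMessage = step1Response.choices[0].message; // parsed Step 1 response
const toolCall = assistantMessage.tool_calls[0];
const args = JSON.parse(toolCall.function.arguments);

// getWeather is a hypothetical local implementation of the tool.
const result = await getWeather(args.city);

// Wrap the result in a "tool" role message, echoing the call id.
const toolMessage = {
  role: "tool",
  tool_call_id: toolCall.id,
  content: JSON.stringify(result),
};
```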
Step 3: Inference Request with Tool Results
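Continuing the sketch, the Step 3 body sends the full history back, together with the same `tools` array:

```typescript
// Step 3: send the complete history back, including the assistant's
// tool_calls message and the tool result, plus the same tools array.
const step3Body = {
  model: "your-model-slug",
  messages: [
    { role: "user", content: "What is the weather like in Boston?" },
    assistantMessage, // contains the tool_calls request
    toolMessage,      // the tool result prepared in Step 2
  ],
  tools: step1Body.tools, // re-sent so the schema is validated on this call too
};
```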
The `tools` parameter must be included in every request (Steps 1 and 3) so the tool schema can be validated on each call.
Tool Calling Example
Here is TypeScript code that gives LLMs the ability to call an external API, in this case Project Gutenberg, to search for books. First, let’s do some basic setup:
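Something like the following; the endpoint URL and model slug are assumptions to adjust for your deployment:

```typescript
// Basic setup: API key, endpoint, model, and the starting conversation.
const API_KEY = process.env.BLACKBOX_API_KEY!;
const API_URL = "https://api.blackbox.ai/chat/completions"; // assumed endpoint
const MODEL = "your-model-slug";

const messages: any[] = [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "What are the titles of some James Joyce books?" },
];
```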
Define the Tool

Next, we define the tool that we want to call. Remember, the tool is going to get requested by the LLM, but the code we are writing here is ultimately responsible for executing the call and returning the results to the LLM.
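A sketch of the tool implementation and its schema; Gutendex (gutendex.com) is a public JSON API for the Project Gutenberg catalog:

```typescript
// The local implementation that the LLM's tool call will be routed to.
async function searchGutenbergBooks(searchTerms: string[]): Promise<any[]> {
  const url = `https://gutendex.com/books?search=${encodeURIComponent(searchTerms.join(" "))}`;
  const response = await fetch(url);
  const data = await response.json();
  return data.results.map((book: any) => ({
    id: book.id,
    title: book.title,
    authors: book.authors,
  }));
}

// The JSON schema the model sees. Keep the description specific so the
// model knows when (and how) to call the tool.
const tools = [
  {
    type: "function",
    function: {
      name: "search_gutenberg_books",
      description: "Search for books in the Project Gutenberg library",
      parameters: {
        type: "object",
        properties: {
          search_terms: {
            type: "array",
            items: { type: "string" },
            description: "Search terms to find books, e.g. author or title",
          },
        },
        required: ["search_terms"],
      },
    },
  },
];

// Maps tool names requested by the model to local implementations.
const TOOL_MAPPING: Record<string, (args: any) => Promise<any>> = {
  search_gutenberg_books: (args) => searchGutenbergBooks(args.search_terms),
};
```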
Tool use and tool results

Let’s make the first BLACKBOX AI API call to the model:
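A minimal sketch, reusing the setup above (the endpoint URL remains an assumption):

```typescript
// First call: the model sees the user's question plus the tool schema.
const request1 = await fetch(API_URL, {
  method: "POST",
  headers: {
    Authorization: `Bearer ${API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({ model: MODEL, messages, tools }),
});
const response1 = (await request1.json()).choices[0].message;
```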
The model responds with a finish_reason of tool_calls and a tool_calls array. In a generic LLM response handler, you would want to check the finish_reason before processing tool calls, but here we will assume that is the case. Let’s keep going by processing the tool call:
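One way to process it, using the `TOOL_MAPPING` lookup defined earlier:

```typescript
// Append the assistant's message (which contains tool_calls) to the
// history, then execute each requested tool and append its result
// as a "tool" role message.
messages.push(response1);

for (const toolCall of response1.tool_calls) {
  const toolName = toolCall.function.name;
  const toolArgs = JSON.parse(toolCall.function.arguments);
  const toolResult = await TOOL_MAPPING[toolName](toolArgs);

  messages.push({
    role: "tool",
    tool_call_id: toolCall.id,
    content: JSON.stringify(toolResult),
  });
}
```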
The messages array now contains:

- Our original request
- The LLM’s response (containing a tool call request)
- The result of the tool call (a JSON object returned from the Project Gutenberg API)

Now we can make a second BLACKBOX AI API call, and hopefully get our result:
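A sketch of the second call, mirroring the first but with the enriched `messages` array:

```typescript
// Second call: the model now sees the tool result and answers in prose.
const request2 = await fetch(API_URL, {
  method: "POST",
  headers: {
    Authorization: `Bearer ${API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({ model: MODEL, messages, tools }),
});
console.log((await request2.json()).choices[0].message.content);
```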
Interleaved Thinking
Interleaved thinking allows models to reason between tool calls, enabling more sophisticated decision-making after receiving tool results. This feature helps models chain multiple tool calls with reasoning steps in between and make nuanced decisions based on intermediate results.

Important: Interleaved thinking increases token usage and response latency. Consider your budget and performance requirements when enabling this feature.

How Interleaved Thinking Works
With interleaved thinking, the model can:

- Reason about the results of a tool call before deciding what to do next
- Chain multiple tool calls with reasoning steps in between
- Make more nuanced decisions based on intermediate results
- Provide transparent reasoning for its tool selection process
Example: Multi-Step Research with Reasoning
Here’s an example showing how a model might use interleaved thinking to research a topic across multiple sources, starting from an initial request:
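The initial request might look something like this; both research tools are hypothetical and shown with minimal schemas:

```typescript
// Hypothetical initial request for the multi-step research example.
// Define the parameters to match your own backends.
const researchRequest = {
  model: "your-model-slug",
  messages: [
    { role: "user", content: "Research the environmental impact of electric vehicles." },
  ],
  tools: [
    {
      type: "function",
      function: {
        name: "search_academic_papers",
        description: "Search peer-reviewed academic papers by query and field",
        parameters: {
          type: "object",
          properties: {
            query: { type: "string" },
            field: { type: "string" },
          },
          required: ["query"],
        },
      },
    },
    {
      type: "function",
      function: {
        name: "get_latest_statistics",
        description: "Get recent statistics on a topic for a given year",
        parameters: {
          type: "object",
          properties: {
            topic: { type: "string" },
            year: { type: "number" },
          },
          required: ["topic"],
        },
      },
    },
  ],
};
```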
- Initial Thinking: “I need to research electric vehicle environmental impact. Let me start with academic papers to get peer-reviewed research.”
- First Tool Call: `search_academic_papers({"query": "electric vehicle lifecycle environmental impact", "field": "environmental science"})`
- After First Tool Result: “The papers show mixed results on manufacturing impact. I need current statistics to complement this academic research.”
- Second Tool Call: `get_latest_statistics({"topic": "electric vehicle carbon footprint", "year": 2024})`
- After Second Tool Result: “Now I have both academic research and current data. Let me search for manufacturing-specific studies to address the gaps I found.”
- Third Tool Call: `search_academic_papers({"query": "electric vehicle battery manufacturing environmental cost", "field": "materials science"})`
- Final Analysis: The model synthesizes all gathered information into a comprehensive response.
Best Practices for Interleaved Thinking
- Clear Tool Descriptions: Provide detailed descriptions so the model can reason about when to use each tool
- Structured Parameters: Use well-defined parameter schemas to help the model make precise tool calls
- Context Preservation: Maintain conversation context across multiple tool interactions
- Error Handling: Design tools to provide meaningful error messages that help the model adjust its approach
Implementation Considerations
When implementing interleaved thinking:

- Models may take longer to respond due to additional reasoning steps
- Token usage will be higher due to the reasoning process
- The quality of reasoning depends on the model’s capabilities
- Some models may be better suited for this approach than others
A Simple Agentic Loop
In the example above, the calls are made explicitly and sequentially. To handle a wide variety of user inputs and tool calls, you can use an agentic loop. Here’s an example of a simple agentic loop (using the same `tools` and initial messages as above):
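A sketch of such a loop, under the same endpoint assumptions as the earlier example; `maxIterations` guards against runaway loops:

```typescript
// A simple agentic loop: keep calling the model until it stops asking
// for tools, executing any requested tool calls between iterations.
async function runAgentLoop(messages: any[], maxIterations = 10): Promise<string> {
  for (let i = 0; i < maxIterations; i++) {
    const response = await fetch(API_URL, {
      method: "POST",
      headers: {
        Authorization: `Bearer ${API_KEY}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ model: MODEL, messages, tools }),
    });
    const message = (await response.json()).choices[0].message;
    messages.push(message);

    // No tool calls means the model has produced its final answer.
    if (!message.tool_calls?.length) {
      return message.content;
    }

    // Otherwise, execute every requested tool and feed the results back.
    for (const toolCall of message.tool_calls) {
      const args = JSON.parse(toolCall.function.arguments);
      const result = await TOOL_MAPPING[toolCall.function.name](args);
      messages.push({
        role: "tool",
        tool_call_id: toolCall.id,
        content: JSON.stringify(result),
      });
    }
  }
  throw new Error("Agent loop did not converge within the iteration limit");
}
```

Calling `runAgentLoop(messages)` then drives the whole conversation until the model returns plain content.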
Best Practices and Advanced Patterns
Function Definition Guidelines
When defining tools for LLMs, follow these best practices:

- Clear and Descriptive Names: Use descriptive function names that clearly indicate the tool’s purpose.
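For illustration, a hypothetical weather tool showing how a specific name and a tightly specified parameter schema give the model more to reason with than a generic `search` function would:

```typescript
// A descriptive name and specific parameter schema help the model
// decide when to call the tool and how to fill in its arguments.
const weatherTool = {
  type: "function",
  function: {
    name: "get_current_weather_by_city",
    description:
      "Get the current weather conditions for a specific city. " +
      "Use this when the user asks about present-day weather.",
    parameters: {
      type: "object",
      properties: {
        city: { type: "string", description: "City name, e.g. 'Boston'" },
        units: {
          type: "string",
          enum: ["celsius", "fahrenheit"],
          description: "Temperature units to return",
        },
      },
      required: ["city"],
    },
  },
};
```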
Streaming with Tool Calls

When using streaming responses with tool calls, handle the different content types appropriately:
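A sketch of one way to accumulate streamed tool-call fragments, assuming OpenAI-style delta chunks; `streamChunks` is a hypothetical async iterator over parsed SSE events:

```typescript
// Each delta carries an index plus partial name/arguments strings
// that must be concatenated until the stream ends.
const toolCalls: { id?: string; name: string; arguments: string }[] = [];

for await (const chunk of streamChunks()) {
  const delta = chunk.choices[0].delta;

  if (delta.content) {
    process.stdout.write(delta.content); // ordinary text content
  }

  for (const tc of delta.tool_calls ?? []) {
    toolCalls[tc.index] ??= { name: "", arguments: "" };
    if (tc.id) toolCalls[tc.index].id = tc.id;
    if (tc.function?.name) toolCalls[tc.index].name += tc.function.name;
    if (tc.function?.arguments) toolCalls[tc.index].arguments += tc.function.arguments;
  }
}
// After the stream ends, JSON.parse each accumulated arguments string.
```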
Tool Choice Configuration

Control tool usage with the `tool_choice` parameter:
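For example, assuming the OpenAI-compatible values for this parameter:

```typescript
// "auto" (the default) lets the model decide; "none" disables tool calls;
// a function object forces a call to that specific tool.
const auto = { model: MODEL, messages, tools, tool_choice: "auto" };
const none = { model: MODEL, messages, tools, tool_choice: "none" };
const forced = {
  model: MODEL,
  messages,
  tools,
  tool_choice: { type: "function", function: { name: "search_gutenberg_books" } },
};
```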
Parallel Tool Calls
Control whether multiple tools can be called simultaneously with the `parallel_tool_calls` parameter (default is true for most models):
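A sketch of disabling parallel calls, reusing the earlier request shape:

```typescript
// With parallel_tool_calls set to false, the model is limited to
// one tool call per assistant turn.
const body = {
  model: MODEL,
  messages,
  tools,
  parallel_tool_calls: false,
};
```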
When `parallel_tool_calls` is false, the model will only request one tool call at a time instead of potentially multiple calls in parallel.