BLACKBOX AI routes requests to the best available providers for your model. By default, requests are load balanced across top providers to maximize uptime. You can customize routing behavior in two ways:Documentation Index
Fetch the complete documentation index at: https://docs.blackbox.ai/llms.txt
Use this file to discover all available pages before exploring further.
- Provider Prefix Shorthand — Prepend a provider slug to the model name:
{provider}/{model}example:azure/gpt-5.2 - Provider Object — Pass a
providerobject in the request body for fine-grained control
Provider Prefix Shorthand
The simplest way to route a request to a specific provider is to prefix the model name with the provider slug:claude-sonnet-4.5 (which is load-balanced across all available providers), you can send it to bedrock/claude-sonnet-4.5 to force the request through Amazon Bedrock.
Provider prefix routing is currently supported on enterprise.blackbox.ai. Use
https://enterprise.blackbox.ai/chat/completions as the base URL.Supported Provider Prefixes
| Prefix | Routes to |
|---|---|
anthropic/ | Anthropic |
openai/ | OpenAI |
azure/ | Azure |
google/ | |
google-vertex/ | Google (Vertex AI) |
vertex/ | Google (Vertex AI) |
together/ | Together AI |
deepinfra/ | DeepInfra |
fireworks/ | Fireworks AI |
groq/ | Groq |
bedrock/ | Amazon Bedrock |
aws-bedrock/ | Amazon Bedrock |
mistral/ | Mistral |
Examples
Route Claude to Amazon Bedrock
Route Llama to Together AI
Route GPT-4o Mini to Azure
Route Gemini to Google Vertex
Route Mistral Large to Mistral
How It Works
When you use the provider prefix shorthand:- The provider prefix is stripped from the model name (e.g.,
bedrock/claude-sonnet-4.5→claude-sonnet-4.5) - The stripped model name is matched against available model configurations
- A provider preference is automatically added to route the request to the specified provider
provider object with order:
Using with Full Model Paths
You can also use the provider prefix with fullblackboxai/ model paths:
blackboxai/anthropic/claude-sonnet-4.5 through Amazon Bedrock.
Provider availability varies by model. Not all providers host all models. If the specified provider doesn’t host your model, the request may fall back to another provider or return an error.
Provider Object (Advanced)
For more fine-grained control over routing, use theprovider object in your request body. This gives you access to sorting, fallback policies, performance thresholds, and more.
Available Providers
Common provider slugs you can use withorder, only, and ignore:
| Provider | Slug |
|---|---|
| Anthropic | anthropic |
| OpenAI | openai |
| Azure | azure |
google-vertex | |
| Together AI | together |
| DeepInfra | deepinfra |
| Fireworks AI | fireworks |
| Groq | groq |
| AWS Bedrock | bedrock |
| Mistral | mistral |
Provider availability varies by model. Not all providers host all models. If you specify a provider that doesn’t host your requested model, it will be skipped.
Provider Object
Theprovider object can contain the following fields:
| Field | Type | Default | Description |
|---|---|---|---|
sort | string | object | - | Sort providers by "price", "throughput", or "latency" |
order | string[] | - | List of provider slugs to try in order (e.g., ["anthropic", "openai"]) |
only | string[] | - | List of provider slugs to allow for this request |
ignore | string[] | - | List of provider slugs to skip for this request |
allow_fallbacks | boolean | true | Whether to allow backup providers when the primary is unavailable |
require_parameters | boolean | false | Only use providers that support all parameters in your request |
data_collection | ”allow” | “deny" | "allow” | Control whether to use providers that may store data |
quantizations | string[] | - | List of quantization levels to filter by (e.g., ["int4", "int8"]) |
preferred_min_throughput | number | object | - | Preferred minimum throughput (tokens/sec) |
preferred_max_latency | number | object | - | Preferred maximum latency (seconds) |
Provider Sorting
Control how providers are prioritized for your request. By default, BLACKBOX AI load balances based on price while accounting for uptime.Sort by Price
Route to the lowest-cost provider:Sort by Throughput
Route to the highest-throughput provider for faster token generation:Sort by Latency
Route to the lowest-latency provider for faster time-to-first-token:Ordering Specific Providers
Use theorder field to specify which providers to try first, in order of preference:
Allowing Only Specific Providers
Use theonly field to restrict requests to specific providers:
Ignoring Providers
Use theignore field to exclude specific providers from routing:
Disabling Fallbacks
By default, if your preferred provider fails, BLACKBOX AI will try other available providers. To disable this behavior:allow_fallbacks: false, if the specified provider fails, the request will return an error instead of trying other providers.
Data Collection Policy
Control whether to use providers that may store or train on your data:"allow"(default): Allow providers that may store data non-transiently"deny": Only use providers that do not collect user data
Requiring Parameter Support
Ensure your request only goes to providers that support all specified parameters:Performance Thresholds
Set minimum throughput or maximum latency preferences to filter providers based on performance:Minimum Throughput
Prefer providers with at least a certain throughput (tokens per second):Maximum Latency
Prefer providers with latency below a certain threshold (in seconds):Percentile Options
Performance thresholds support the following percentile cutoffs:p50- Median performance (50% of requests perform better)p75- 75th percentilep90- 90th percentile (recommended for most use cases)p99- 99th percentile (strictest)
Performance thresholds are preferences, not hard requirements. Providers that don’t meet the threshold are deprioritized but not excluded entirely.
Quantization Filtering
Filter providers by model quantization level:Available Quantization Levels
int4- Integer 4-bitint8- Integer 8-bitfp4- Floating point 4-bitfp6- Floating point 6-bitfp8- Floating point 8-bitfp16- Floating point 16-bitbf16- Brain floating point 16-bitfp32- Floating point 32-bit
Combining Options
You can combine multiple provider options for fine-grained control:- Sorts providers by price (lowest first)
- Excludes Azure from consideration
- Only uses providers that don’t collect data
- Prefers providers with at least 30 tokens/sec throughput at p90
Response Provider Field
The response includes aprovider field indicating which provider served the request: