BLACKBOX AI routes requests to the best available providers for your model. By default, requests are load balanced across top providers to maximize uptime. You can customize routing behavior using the provider object in your request body.
Available Providers
Common provider slugs you can use with order, only, and ignore:
| Provider | Slug |
|---|---|
| Anthropic | anthropic |
| OpenAI | openai |
| Azure | azure |
| Google | google |
| Together AI | together |
| DeepInfra | deepinfra |
| Fireworks AI | fireworks |
| Groq | groq |
| AWS Bedrock | bedrock |
| Mistral | mistral |
Provider availability varies by model. Not all providers host all models. If you specify a provider that doesn’t host your requested model, it will be skipped.
Provider Object
The provider object can contain the following fields:
| Field | Type | Default | Description |
|---|---|---|---|
| sort | string \| object | - | Sort providers by "price", "throughput", or "latency" |
| order | string[] | - | List of provider slugs to try in order (e.g., ["anthropic", "openai"]) |
| only | string[] | - | List of provider slugs to allow for this request |
| ignore | string[] | - | List of provider slugs to skip for this request |
| allow_fallbacks | boolean | true | Whether to allow backup providers when the primary is unavailable |
| require_parameters | boolean | false | Only use providers that support all parameters in your request |
| data_collection | "allow" \| "deny" | "allow" | Control whether to use providers that may store data |
| quantizations | string[] | - | List of quantization levels to filter by (e.g., ["int4", "int8"]) |
| preferred_min_throughput | number \| object | - | Preferred minimum throughput (tokens/sec) |
| preferred_max_latency | number \| object | - | Preferred maximum latency (seconds) |
Provider Sorting
Control how providers are prioritized for your request. By default, BLACKBOX AI load balances based on price while accounting for uptime.
Sort by Price
Route to the lowest-cost provider:
```bash
curl -X POST https://api.blackbox.ai/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "blackboxai/meta-llama/llama-3.3-70b-instruct",
    "messages": [{"role": "user", "content": "Hello"}],
    "provider": {
      "sort": "price"
    }
  }'
```
Sort by Throughput
Route to the highest-throughput provider for faster token generation:
```bash
curl -X POST https://api.blackbox.ai/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "blackboxai/meta-llama/llama-3.3-70b-instruct",
    "messages": [{"role": "user", "content": "Hello"}],
    "provider": {
      "sort": "throughput"
    }
  }'
```
Sort by Latency
Route to the lowest-latency provider for faster time-to-first-token:
```bash
curl -X POST https://api.blackbox.ai/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "blackboxai/meta-llama/llama-3.3-70b-instruct",
    "messages": [{"role": "user", "content": "Hello"}],
    "provider": {
      "sort": "latency"
    }
  }'
```
Ordering Specific Providers
Use the order field to specify which providers to try first, in order of preference:
```bash
curl -X POST https://api.blackbox.ai/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "blackboxai/meta-llama/llama-3.3-70b-instruct",
    "messages": [{"role": "user", "content": "Hello"}],
    "provider": {
      "order": ["together", "deepinfra", "fireworks"]
    }
  }'
```
The router will try providers in the specified order. If none are available, it will fall back to other providers unless fallbacks are disabled.
Allowing Only Specific Providers
Use the only field to restrict requests to specific providers:
```bash
curl -X POST https://api.blackbox.ai/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "blackboxai/anthropic/claude-sonnet-4",
    "messages": [{"role": "user", "content": "Hello"}],
    "provider": {
      "only": ["anthropic"]
    }
  }'
```
Restricting to specific providers may reduce fallback options and limit request recovery if the specified provider is unavailable.
Ignoring Providers
Use the ignore field to exclude specific providers from routing:
```bash
curl -X POST https://api.blackbox.ai/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "blackboxai/meta-llama/llama-3.3-70b-instruct",
    "messages": [{"role": "user", "content": "Hello"}],
    "provider": {
      "ignore": ["azure", "bedrock"]
    }
  }'
```
Disabling Fallbacks
By default, if your preferred provider fails, BLACKBOX AI will try other available providers. To disable this behavior:
```bash
curl -X POST https://api.blackbox.ai/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "blackboxai/meta-llama/llama-3.3-70b-instruct",
    "messages": [{"role": "user", "content": "Hello"}],
    "provider": {
      "order": ["together"],
      "allow_fallbacks": false
    }
  }'
```
With allow_fallbacks: false, if the specified provider fails, the request will return an error instead of trying other providers.
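A common client-side pattern is to pin a provider with fallbacks disabled, then re-enable fallbacks on a retry if the pinned provider errors. This sketch shows only the payload-rewriting step; `retry_payload` is a hypothetical helper, not part of the API:

```python
import copy

def retry_payload(payload: dict) -> dict:
    """Return a copy of a failed request's payload with allow_fallbacks
    re-enabled, so the retry may be routed to other providers."""
    retry = copy.deepcopy(payload)
    retry.setdefault("provider", {})["allow_fallbacks"] = True
    return retry

# First attempt: pin to a single provider, no fallbacks.
strict = {
    "model": "blackboxai/meta-llama/llama-3.3-70b-instruct",
    "messages": [{"role": "user", "content": "Hello"}],
    "provider": {"order": ["together"], "allow_fallbacks": False},
}

# If the strict request fails, send this relaxed payload instead.
relaxed = retry_payload(strict)
```

The deep copy keeps the original strict payload intact, so both variants can be logged or retried independently.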
Data Collection Policy
Control whether to use providers that may store or train on your data:
```bash
curl -X POST https://api.blackbox.ai/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "blackboxai/meta-llama/llama-3.3-70b-instruct",
    "messages": [{"role": "user", "content": "Hello"}],
    "provider": {
      "data_collection": "deny"
    }
  }'
```
"allow" (default): Allow providers that may store data non-transiently
"deny": Only use providers that do not collect user data
Requiring Parameter Support
Ensure your request only goes to providers that support all specified parameters:
```bash
curl -X POST https://api.blackbox.ai/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "blackboxai/meta-llama/llama-3.3-70b-instruct",
    "messages": [{"role": "user", "content": "Hello"}],
    "response_format": {"type": "json_object"},
    "provider": {
      "require_parameters": true
    }
  }'
```
This is useful when using features like JSON mode or specific sampling parameters that not all providers support.
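One way to apply this consistently is a small client-side helper that switches on require_parameters whenever the request uses a feature with uneven provider support, such as response_format. The helper below is a hypothetical convention for your own client code, not API behavior:

```python
def with_parameter_guard(payload: dict) -> dict:
    """Return a copy of the request payload, setting
    provider.require_parameters when response_format is used."""
    # Shallow-copy the payload and its provider object so the
    # original request dict is left untouched.
    guarded = dict(payload, provider=dict(payload.get("provider", {})))
    # response_format (JSON mode) is not supported by every provider,
    # so route only to providers that support all request parameters.
    if "response_format" in payload:
        guarded["provider"]["require_parameters"] = True
    return guarded

request = {
    "model": "blackboxai/meta-llama/llama-3.3-70b-instruct",
    "messages": [{"role": "user", "content": "Hello"}],
    "response_format": {"type": "json_object"},
}
guarded = with_parameter_guard(request)
```

The same guard could be extended to other feature flags (tools, logprobs, and so on) that only some providers implement.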
Performance Preferences
Set minimum throughput or maximum latency preferences to prioritize providers based on measured performance:
Minimum Throughput
Prefer providers with at least a certain throughput (tokens per second):
```bash
curl -X POST https://api.blackbox.ai/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "blackboxai/meta-llama/llama-3.3-70b-instruct",
    "messages": [{"role": "user", "content": "Hello"}],
    "provider": {
      "preferred_min_throughput": {
        "p90": 50
      }
    }
  }'
```
Maximum Latency
Prefer providers with latency below a certain threshold (in seconds):
```bash
curl -X POST https://api.blackbox.ai/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "blackboxai/meta-llama/llama-3.3-70b-instruct",
    "messages": [{"role": "user", "content": "Hello"}],
    "provider": {
      "preferred_max_latency": {
        "p90": 3
      }
    }
  }'
```
Percentile Options
Performance thresholds support the following percentile cutoffs:
p50 - Median performance (50% of requests perform better)
p75 - 75th percentile
p90 - 90th percentile (recommended for most use cases)
p99 - 99th percentile (strictest)
Performance thresholds are preferences, not hard requirements. Providers that don’t meet the threshold are deprioritized but not excluded entirely.
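If you build these threshold objects programmatically, it is easy to typo an unsupported percentile key. A minimal sketch of a client-side builder that validates against the four percentiles listed above (`throughput_preference` is a hypothetical helper, not part of the API):

```python
# Percentile cutoffs supported by performance preferences.
VALID_PERCENTILES = {"p50", "p75", "p90", "p99"}

def throughput_preference(tokens_per_sec: float, percentile: str = "p90") -> dict:
    """Build a preferred_min_throughput object, rejecting percentile
    keys the API does not document."""
    if percentile not in VALID_PERCENTILES:
        raise ValueError(f"unsupported percentile: {percentile!r}")
    return {percentile: tokens_per_sec}

# Prefer providers sustaining at least 50 tokens/sec at the 90th percentile.
provider = {"preferred_min_throughput": throughput_preference(50)}
```

An analogous builder works for preferred_max_latency, since both fields share the same percentile-keyed shape.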
Quantization Filtering
Filter providers by model quantization level:
```bash
curl -X POST https://api.blackbox.ai/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "blackboxai/meta-llama/llama-3.1-8b-instruct",
    "messages": [{"role": "user", "content": "Hello"}],
    "provider": {
      "quantizations": ["fp8", "fp16"]
    }
  }'
```
Available Quantization Levels
int4 - Integer 4-bit
int8 - Integer 8-bit
fp4 - Floating point 4-bit
fp6 - Floating point 6-bit
fp8 - Floating point 8-bit
fp16 - Floating point 16-bit
bf16 - Brain floating point 16-bit
fp32 - Floating point 32-bit
Quantized models may exhibit degraded performance for certain prompts. Lower quantization levels reduce memory requirements but may affect output quality.
Combining Options
You can combine multiple provider options for fine-grained control:
```bash
curl -X POST https://api.blackbox.ai/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "blackboxai/meta-llama/llama-3.3-70b-instruct",
    "messages": [{"role": "user", "content": "Hello"}],
    "provider": {
      "sort": "price",
      "ignore": ["azure"],
      "data_collection": "deny",
      "preferred_min_throughput": {"p90": 30}
    }
  }'
```
This example:
- Sorts providers by price (lowest first)
- Excludes Azure from consideration
- Only uses providers that don’t collect data
- Prefers providers with at least 30 tokens/sec throughput at p90
Response Provider Field
The response includes a provider field indicating which provider served the request:
```json
{
  "id": "gen-...",
  "model": "meta-llama/llama-3.3-70b-instruct",
  "choices": [...],
  "usage": {...},
  "provider": "Together"
}
```
This lets you track which provider served each request, which is especially useful when relying on load balancing or fallbacks.
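Reading the provider field from a parsed response is a one-liner; a defensive client can default it when absent. The response body below is illustrative (choices and usage values are made up for the example):

```python
import json

# Example response body, shaped like the API's chat completion response.
raw = """
{
  "id": "gen-...",
  "model": "meta-llama/llama-3.3-70b-instruct",
  "choices": [{"message": {"role": "assistant", "content": "Hi!"}}],
  "usage": {"prompt_tokens": 5, "completion_tokens": 3, "total_tokens": 8},
  "provider": "Together"
}
"""

response = json.loads(raw)

# Record which provider actually served the request, e.g. to spot
# unexpected fallbacks in production logs; default if the field is absent.
served_by = response.get("provider", "unknown")
```

Aggregating served_by values over time shows how often your requests fall back away from your preferred providers.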