Rate Limiting
Overview
Each endpoint has its own rate limits based on requests per minute, tokens per minute, and requests per day. Exceeding these limits may result in request throttling or rejection. If you need higher limits, please contact us.
Rate Limit Tiers
We offer five tiers with increasing limits. Here's a breakdown for the Embeddings & Reranking endpoint:
Tier | Requests/Min | Tokens/Min | Requests/Day | Burst |
---|---|---|---|---|
Home Baker (Free) | 100 | 100,000 | 5,000 | 10 |
Professional Baker | 300 | 500,000 | - | 30 |
Bakery Shop | 500 | 1,000,000 | - | 50 |
Bakery Chain | 1,000 | 5,000,000 | - | 100 |
Bakery Franchise | 2,000 | 10,000,000 | - | 200 |
Custom tiers are available upon request.
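A client can avoid most rejections by pacing its own requests. The sketch below is a minimal token bucket sized with the Home Baker (Free) numbers from the table (100 requests per minute, burst of 10). It assumes that "Burst" means the number of requests that may be sent back-to-back before pacing applies; the table does not define the term, so treat that reading as an assumption. The `TokenBucket` class and its method names are illustrative and not part of any Mixedbread SDK.

```python
import time


class TokenBucket:
    """Client-side request pacer (illustrative, not an official SDK class)."""

    def __init__(self, rate_per_minute, burst, clock=time.monotonic):
        self.rate = rate_per_minute / 60.0  # tokens added per second
        self.capacity = float(burst)        # assumed meaning of the "Burst" column
        self.tokens = float(burst)          # start with a full bucket
        self.clock = clock
        self.last = clock()

    def _refill(self):
        # Add tokens for the time elapsed since the last check, capped at capacity.
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def try_acquire(self):
        """Take one token if available; return True when a request may be sent."""
        self._refill()
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

    def seconds_until_ready(self):
        """Return how long to wait before the next token becomes available."""
        self._refill()
        if self.tokens >= 1.0:
            return 0.0
        return (1.0 - self.tokens) / self.rate


# Free-tier values taken from the table above.
bucket = TokenBucket(rate_per_minute=100, burst=10)
```

Before each request, call `bucket.try_acquire()`; if it returns `False`, sleep for `bucket.seconds_until_ready()` and try again. This only paces requests per minute; token-per-minute and daily limits would need their own counters.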
Handling Rate Limits
When you hit a rate limit:
- You'll receive a `429 Too Many Requests` response
- The response will include a `Retry-After` header
- Wait for the specified time before retrying
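The steps above can be sketched as a small retry loop. This is a minimal illustration, not an official client: `call_with_retry` and `parse_retry_after` are hypothetical helpers, and `send` stands in for whatever function performs your HTTP request and returns a `(status, headers, body)` tuple. The sketch assumes `Retry-After` carries a number of seconds; HTTP also allows a date in that header, in which case this code falls back to a default wait.

```python
import time


def parse_retry_after(headers, default=1.0):
    """Return the wait in seconds from a Retry-After header.

    Assumes the header holds a number of seconds. A missing or
    non-numeric value (such as an HTTP date) yields `default`.
    """
    for name, value in headers.items():
        if name.lower() == "retry-after":
            try:
                return max(0.0, float(value))
            except ValueError:
                return default
    return default


def call_with_retry(send, max_attempts=5, sleep=time.sleep):
    """Call `send()` until it returns a non-429 status or attempts run out.

    `send` is a caller-supplied function returning (status, headers, body).
    """
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status != 429:
            return status, headers, body
        if attempt < max_attempts:
            # Wait for the time the server asked for before retrying.
            sleep(parse_retry_after(headers))
    raise RuntimeError(f"still rate limited after {max_attempts} attempts")
```

Capping the number of attempts keeps a persistently throttled client from retrying forever; the caller can catch the `RuntimeError` and decide whether to queue the work or surface the failure.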
Example error response:
Last updated: September 13, 2025