Changelog
Bring Your Own Bucket
Enterprise organizations can now keep all their content in object storage they own. With bring your own bucket, Mixedbread indexes and searches your AWS S3 bucket with ephemeral compute, retaining nothing beyond memory. Your documents and every derived artifact stay in your cloud account. Authenticate with an IAM role or access keys, optionally with your own KMS key, and contact us to enable it.
Store Chunk Grep and Listing
We added two lower-level chunk retrieval APIs:
- stores.grep for exact regex matching against chunk text and generated content.
- stores.list_chunks for retrieving chunks by metadata filters with optional numeric sorting.
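To make the difference between the two concrete, here is a local, in-memory sketch of the semantics described above. It does not call the Mixedbread SDK: the `Chunk` type and the `grep_chunks` and `list_chunks` helpers are hypothetical stand-ins, and the real `stores.grep` and `stores.list_chunks` signatures may differ.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """Hypothetical stand-in for a stored chunk (not the SDK's type)."""
    text: str
    metadata: dict = field(default_factory=dict)


def grep_chunks(chunks, pattern):
    """Return chunks whose text matches the regex (literal matching, no semantic ranking)."""
    regex = re.compile(pattern)
    return [c for c in chunks if regex.search(c.text)]


def list_chunks(chunks, filters, sort_by=None, descending=False):
    """Return chunks whose metadata equals every filter, optionally sorted by a numeric field."""
    matched = [
        c for c in chunks
        if all(c.metadata.get(key) == value for key, value in filters.items())
    ]
    if sort_by is not None:
        matched.sort(key=lambda c: c.metadata[sort_by], reverse=descending)
    return matched


# Toy data for illustration only.
chunks = [
    Chunk("Error E1042: disk full", {"source": "logs", "page": 3}),
    Chunk("Quarterly revenue grew 12%", {"source": "report", "page": 1}),
    Chunk("Error E2001: timeout", {"source": "logs", "page": 1}),
]

errors = grep_chunks(chunks, r"E\d{4}")                        # regex over chunk text
log_pages = list_chunks(chunks, {"source": "logs"}, sort_by="page")  # filter, then numeric sort
```

Grep-style retrieval suits identifiers and error codes that must match character for character, while metadata listing suits cases where you already know which chunks you want and only need them in a stable order.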
Agentic Search Observability
We improved observability for agentic search so you can see what the agent actually did and trace the quality of its results. Inspect agent runs directly from the dashboard to understand and tune retrieval behavior.

Startup Accelerator Credits
Mixedbread is partnering with the Startup Accelerator to offer credits for eligible early-stage startups. Apply through the program to get Mixedbread Search credits and build AI-native search into your product from day one.
Mixedbread Search Skill
We released the Mixedbread Search skill, an agent skill that gives your coding agent the context it needs to integrate Mixedbread perfectly into your apps. Drop it into your agent and skip the back-and-forth of piecing together SDK patterns, shortening development time from exploration to working integration.
Wholembed v3
We released Wholembed v3, our new unified omnimodal multilingual late-interaction retrieval model for Mixedbread Search. It is now the default model for new stores and improves retrieval quality across languages, modalities, and real-world search tasks, with built-in support for audio and video.
Key Features
- Higher retrieval quality: Wholembed v3 delivers state-of-the-art retrieval performance across benchmarks and real-world industrial test cases, including strong results on LIMIT and BrowseComp-Plus.
- Audio and video support: Mixedbread Search can now index and retrieve audio and video content directly, alongside text and images, in new stores by default.
- Unified omnimodal retrieval: Wholembed v3 is designed to retrieve across text, audio, and vision in one system, improving robustness on noisy, heterogeneous real-world data.
Read more in the Wholembed v3 release post.
New Cookbooks
We published three new cookbooks to help you build with Mixedbread Search. The Image Search cookbook guides you through building visual search applications. Chat with PDFs demonstrates how to create conversational interfaces over your documents. Agent Memory shows how to use Mixedbread as persistent memory for AI agents.
Agentic Search
Agentic search is now available, designed for AI agents that need to retrieve information autonomously. The rewritten query is exposed in responses so you can see how the system interprets your searches. Read more in the agentic search documentation.
Web Search
Stores now support web search as a data source. You can retrieve web results directly through the store-compatible API, so a single query can search both uploaded documents and web content.
Search-only API Keys
You can now create API keys with search-only scope. These restricted keys can perform searches but cannot modify stores, upload files, or access other endpoints. This is useful for client-side applications where you want to limit what the key can do.
Free Tier
We launched a free plan that lets you explore Mixedbread Search without commitment. Free tier users get up to 1000 store files and access to core platform features. Token-based pricing is now displayed on the pricing page with an interactive calculator to estimate costs.
Spending Limits
Spending limits give you control over your API costs. Set a maximum spend and the system will automatically pause requests when you approach the limit, preventing unexpected charges. Cost tracking is visible in the dashboard with breakdowns by store and product.
MCP OAuth
The MCP integration now supports OAuth authentication, making it easier to connect Claude Desktop and other MCP-compatible tools to your Mixedbread stores securely. Users can authorize access through the standard OAuth flow rather than manually configuring API keys.
Vercel Marketplace Integration
Mixedbread is now available on the Vercel Marketplace as a Vercel Native integration in the Searching (and Agents) categories. Install it to:
- Connect projects in one click: we’ll add the required MXBAI_API_KEY and MXBAI_STORE_ID env vars to your Vercel projects automatically.
- Manage from the Vercel dashboard: monitor usage and costs with unified Vercel billing and access controls alongside your other integrations. Read more about it in our Vercel integration docs.
- Ship faster with a starter: deploy our example Next.js app to see Mixedbread Search in action and customize from there.
Mixedbread remains fully available as a standalone platform. Use it directly or via Vercel, whichever fits your workflow.
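Once the integration has set the two environment variables named above, application code can read them at startup. A minimal sketch, assuming only those two variable names; the `load_mixedbread_config` helper is hypothetical and not part of any Mixedbread SDK.

```python
import os


def load_mixedbread_config(env=os.environ):
    """Read the two variables the Vercel integration sets; fail early if one is missing."""
    required = ("MXBAI_API_KEY", "MXBAI_STORE_ID")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {"api_key": env["MXBAI_API_KEY"], "store_id": env["MXBAI_STORE_ID"]}
```

Failing at startup rather than on the first search request makes a misconfigured deployment easier to spot.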
Mixedbread Search Public Beta
We're excited to announce the public beta of Mixedbread Search, the easy-to-use search API built from the ground up for the AI era. It is a fully-managed search engine that allows you to upload your data and start searching in minutes.
Key Features
- AI-native: Built for the AI era, with both humans and AI in mind
- Multi-modal: Search through text, images, tables, audio, and complex layouts. Video is coming soon
- Multi-lingual: Support for 100+ languages
- Fully-managed: No complex configuration, no complex setup, no complex code
- Low latency: Because you need results now, not in 5700 milliseconds
- Meaningfully state-of-the-art Search: On realistic BrowseComp-Plus benchmarks, LLM assistants are able to reach significantly better response accuracy with Mixedbread Search over existing search systems
Ingestion Speed Optimization (fast track)
Today we're excited to announce that we've optimized the ingestion speed and concurrency of Mixedbread Search. We can now ingest thousands of files concurrently with sub-second latency, a significant improvement over the previous concurrency limit of 20 files.
Key Features
- High concurrency: thousands of files can be ingested concurrently
- Minimal latency: most files are ingested within one second of upload
Public Stores
Mixedbread public stores allow users to make stores publicly accessible, so that anyone with an API key can search them. This is useful for public documentation, public knowledge bases, and other use cases where you want to share your data with the world.
Search Latency Optimization
We've optimized the hot path of our search system, bringing end-to-end latency under 90 ms without reranking and under 120 ms with reranking. That figure includes embedding generation, first-stage single-vector retrieval, second-stage multi-vector retrieval, and (optionally) reranking.
Key Features
- Minimal latency: latency so low that it's virtually undetectable by users in most scenarios.
MaxSim CPU
MaxSim CPU is a CPU-optimized version of MaxSim, a state-of-the-art similarity search operator for late-interaction models such as ColBERT and ColPali. It powers our multi-vector retrieval system, achieving sub 5ms latency on AVX2 machines.
Key Features
- CPU-optimized: Optimized for CPU-based multi-vector retrieval.
- Low latency: Sub 5ms latency on AVX2 machines.
- High throughput: 10x speedup over existing CPU-based MaxSim implementations.
- Open-source: Fully accessible and customizable.
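For readers new to late interaction, the MaxSim operator itself is simple to state: for each query token vector, take its best-matching document token vector, then sum those maxima. The sketch below is a plain-Python reference definition for illustration only. It is not the MaxSim CPU implementation or its API, and the toy vectors are made up.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def maxsim(query_vectors, doc_vectors):
    """Late-interaction score: sum over query tokens of the max similarity to any document token."""
    return sum(max(dot(q, d) for d in doc_vectors) for q in query_vectors)


# Toy 2-dimensional token embeddings (illustrative values, not model output).
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[0.9, 0.1], [0.2, 0.8]]   # has a close match for each query token
doc_b = [[0.5, 0.5]]               # a single token that partially matches both

score_a = maxsim(query, doc_a)     # 0.9 + 0.8
score_b = maxsim(query, doc_b)     # 0.5 + 0.5
```

The cost grows with the number of query tokens times the number of document tokens for every candidate, which is why a vectorized implementation matters at retrieval time.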
Mixedbread MCP
Introducing the Mixedbread Model Context Protocol (MCP) server, a TypeScript-based integration that exposes powerful store capabilities to AI assistants like Claude Desktop. Built as an open standard, it enables secure and controlled access to external data sources.
Key Features
- Claude Desktop Integration: Seamless integration with Claude and other MCP-compatible AI assistants
- Store Operations: Direct interaction with stores through standardized MCP tools
- Semantic Search: Enable AI assistants to search and retrieve information using natural language
- File Management: Upload, manage, and search through documents in stores
- Secure Data Access: Controlled interface for AI systems to access external data sources
Installation
npm install -g @mixedbread/mcp
mxbai CLI
Introducing the Mixedbread CLI, a command-line interface for managing Mixedbread's services directly from your terminal. Built on top of the Mixedbread SDK, it provides efficient command-line access to all core platform features.
Key Features
- Store Management: Create, list, update, and manage stores with comprehensive control
- Store File Upload & Processing: Upload files with intelligent processing strategies, metadata, and batch operations
- Semantic Search: Search through stores using natural language queries with advanced filtering
- Question Answering: Get AI-powered answers based on your store content
- Intelligent Sync: Sync files with change detection and smart processing strategies to stores
Installation
npm install -g @mixedbread/cli
Platform Alpha
Today we're excited to announce the alpha of the Mixedbread platform, bringing together all our capabilities in a unified API. Experience state-of-the-art embeddings, reranking, document parsing, and stores in one integrated solution.
Now Available in Alpha
- Embeddings API: Transform text into semantic vectors with our award-winning models
- Reranking API: Boost search relevance with advanced cross-encoder models
- Document Parsing API: Extract LLM-ready content from PDFs, DOCX, PPTX, and more
- Stores API: Fully-managed multi-modal search with automatic ingestion pipelines
mxbai-rerank-v2
The second generation of our reranking models features reinforcement learning (GRPO), extended context handling, and support for 100+ languages. These models are 8x faster than comparable alternatives while achieving higher accuracy across all benchmarks.
Models
- mxbai-rerank-base-v2 (0.5B): Balanced performance for production use
- mxbai-rerank-large-v2 (1.5B): Maximum accuracy for critical applications
mxbai-embed-xsmall-v1
Our smallest and most efficient English embedding model. Perfect for edge deployments and resource-constrained environments, it delivers competitive performance in an extra small footprint.
Batched - Dynamic Batching Library
We've open-sourced Batched, our dynamic batching library that powers Mixedbread's inference infrastructure. Achieve up to 10x throughput improvements with minimal latency impact.
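The general idea behind dynamic batching can be sketched in a few lines: concurrent single-item calls are held briefly and then sent to the model as one batch, either when the batch is full or when a short wait expires. The `DynamicBatcher` class below is a hypothetical illustration built on `asyncio`. It is not the Batched library's API, and the batch size and wait time are arbitrary assumptions.

```python
import asyncio


class DynamicBatcher:
    """Groups concurrent single-item calls into one batched call (illustrative sketch)."""

    def __init__(self, batch_fn, max_batch_size=8, max_wait_s=0.01):
        self.batch_fn = batch_fn              # takes a list of inputs, returns a list of outputs
        self.max_batch_size = max_batch_size
        self.max_wait_s = max_wait_s
        self._pending = []                    # (input, future) pairs waiting for a flush
        self._timer = None

    async def submit(self, item):
        loop = asyncio.get_running_loop()
        future = loop.create_future()
        self._pending.append((item, future))
        if len(self._pending) >= self.max_batch_size:
            self._flush()                     # batch is full: run it now
        elif self._timer is None:
            # First item of a new batch: flush after at most max_wait_s.
            self._timer = loop.call_later(self.max_wait_s, self._flush)
        return await future

    def _flush(self):
        if self._timer is not None:
            self._timer.cancel()
            self._timer = None
        batch, self._pending = self._pending, []
        if not batch:
            return
        outputs = self.batch_fn([item for item, _ in batch])
        for (_, future), output in zip(batch, outputs):
            if not future.done():
                future.set_result(output)


calls = []  # records the size of each batch that reaches the "model"


def embed_batch(texts):
    """Stand-in for a batched model call; returns one value per input."""
    calls.append(len(texts))
    return [len(t) for t in texts]


async def main():
    batcher = DynamicBatcher(embed_batch, max_batch_size=4, max_wait_s=0.01)
    inputs = ["a", "bb", "ccc", "dddd", "eeeee"]
    return await asyncio.gather(*(batcher.submit(t) for t in inputs))


results = asyncio.run(main())
```

Here five concurrent requests reach the stand-in model as two calls: a full batch of four, then the remaining item once the wait expires. The wait time is the knob that trades a little latency for throughput.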
Baguetter - Retrieval Testing Framework
Introducing Baguetter, our open-source framework for testing and evaluating retrieval systems. Make your search better with comprehensive testing tools designed for real-world scenarios.
BMX Algorithm
We've developed BMX, a modern take on the classic BM25 algorithm. Combining lexical and semantic signals, BMX delivers superior hybrid search performance.
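For context, the classic BM25 score that BMX takes as its starting point can be sketched as follows. This is standard Okapi BM25 only, with the commonly used defaults `k1=1.5` and `b=0.75` as assumptions. It does not implement any of BMX's additions, and the `bm25_scores` helper and toy corpus are illustrative.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each tokenized document against the query with Okapi BM25."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates via k1; b controls length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "the quick brown fox".split(),
    "the lazy dog".split(),
    "quick quick fox jumps".split(),
]
scores = bm25_scores(["quick", "fox"], docs)
```

BM25 only rewards exact term overlap, so the second document scores zero here.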
deepset-mxbai-embed-de-large-v1
In collaboration with deepset, we've released a German/English embedding model that sets new performance standards among open source alternatives.
mxbai-colbert-large-v1
Our ColBERT model brings late interaction capabilities. Built on our mxbai-embed-large-v1 architecture, it achieves state-of-the-art performance on 13 BEIR benchmarks.
mxbai-embed-large-v1
Our flagship English embedding model delivers state-of-the-art performance, outperforming closed source alternatives like OpenAI's text-embedding-3-large.
mxbai-embed-2d-large-v1
The world's first 2D-Matryoshka embedding model. This innovative approach allows you to reduce both the number of layers and dimensions while maintaining competitive performance.
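The dimension half of this trade-off is easy to picture: a Matryoshka-style embedding is shortened by keeping its leading components and re-normalizing. The sketch below assumes the embedding is already available as a list of floats; `truncate_embedding` is a hypothetical helper and the vector is a toy value, not model output. Reducing the number of layers happens inside the model and is not shown.

```python
import math


def truncate_embedding(embedding, dims):
    """Keep the first `dims` components and L2-normalize the result."""
    head = embedding[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0 else head


full = [3.0, 4.0, 1.0, 2.0]          # toy 4-dimensional embedding
small = truncate_embedding(full, 2)  # keep the first 2 dimensions, then normalize
```

Re-normalizing after truncation keeps cosine and dot-product similarity equivalent for the shortened vectors.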
mxbai-rerank-v1
We're launching our first generation of reranking models, available in three sizes. These models add a powerful semantic layer to existing search systems, dramatically improving result relevance.
Models
- mxbai-rerank-xsmall-v1: Ultra-efficient for high-volume applications
- mxbai-rerank-base-v1: Balanced performance and speed
- mxbai-rerank-large-v1: Maximum accuracy for critical applications