mxbai-embed-xsmall-v1
Model Description
mxbai-embed-xsmall-v1 is Mixedbread AI's smallest and most efficient English embedding model, specifically optimized for retrieval tasks. Despite its compact size, with only 22.7 million parameters and 384-dimensional embeddings, it delivers competitive performance, making it an ideal choice for applications where computational resources are limited. It is licensed under Apache 2.0.
The model is based on all-MiniLM-L6-v2 and was fine-tuned using the AnglE loss function and Espresso to enhance its capabilities for generating high-quality embeddings, particularly for retrieval scenarios like search, recommendation systems, and Retrieval-Augmented Generation (RAG).
On the Massive Text Embedding Benchmark (MTEB), mxbai-embed-xsmall-v1 improves on its base model across retrieval tasks (average 42.80 vs. 41.56). It also shows significant gains on long-context benchmarks compared to all-MiniLM-L6-v2, averaging 76.34 vs. 67.34 on LoCo and 45.94 vs. 36.10 on LongEmbed. The small size translates to faster inference, lower resource consumption, and cost-effectiveness, which is especially beneficial for edge devices and large-scale deployments.
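A minimal retrieval sketch using sentence-transformers is shown below. The Hugging Face model id "mixedbread-ai/mxbai-embed-xsmall-v1" and the query prefix ("Represent this sentence for searching relevant passages: ") are assumptions based on common Mixedbread conventions; check the model card for the exact usage.

```python
# Sketch: embed a query and candidate documents, then rank by cosine similarity.
# Model id and query prompt are assumptions -- verify against the model card.
from sentence_transformers import SentenceTransformer
from sentence_transformers.util import cos_sim

model = SentenceTransformer("mixedbread-ai/mxbai-embed-xsmall-v1")

# Queries are typically prefixed with a retrieval instruction; documents are embedded as-is.
query = "Represent this sentence for searching relevant passages: How do I bake sourdough bread?"
docs = [
    "A step-by-step guide to baking sourdough at home.",
    "The history of bread in ancient Egypt.",
]

query_emb = model.encode(query)   # shape: (384,)
doc_embs = model.encode(docs)     # shape: (2, 384)

# Higher cosine similarity = more relevant document.
print(cos_sim(query_emb, doc_embs))
```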
mxbai-colbert-large-v1
A state-of-the-art ColBERT model for reranking and retrieval tasks. This model combines efficient vector search with nuanced token-level matching, making it ideal for advanced information retrieval applications.
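To illustrate the token-level matching idea, here is a simplified sketch of ColBERT-style late interaction (MaxSim) scoring. Loading the checkpoint with plain transformers and using raw last-layer token embeddings is an assumption made to keep the example self-contained; the released model includes a projection head and is normally used through a dedicated ColBERT library, so treat this as a conceptual sketch rather than the official pipeline.

```python
# Sketch of MaxSim scoring: for each query token, take its best-matching
# document token, then sum over query tokens (late interaction).
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_ID = "mixedbread-ai/mxbai-colbert-large-v1"  # assumed Hugging Face id
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModel.from_pretrained(MODEL_ID)

def token_embeddings(text: str) -> torch.Tensor:
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]  # (num_tokens, dim)
    return torch.nn.functional.normalize(hidden, dim=-1)

def maxsim_score(query: str, doc: str) -> float:
    q, d = token_embeddings(query), token_embeddings(doc)
    return (q @ d.T).max(dim=1).values.sum().item()

docs = [
    "ColBERT performs fine-grained token-level matching.",
    "Bread recipes for beginners.",
]
ranked = sorted(docs, key=lambda d: maxsim_score("What is late interaction retrieval?", d), reverse=True)
print(ranked)
```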
Our Reranking Models
The Mixedbread rerank family is a collection of state-of-the-art, open-source reranking models designed to significantly enhance search accuracy across various domains. These models can be seamlessly integrated into existing search systems, offering best-in-class performance and easy implementation for improved user satisfaction in search results.
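As a sketch of how a reranker slots into an existing search pipeline, the example below scores (query, document) pairs with sentence-transformers' CrossEncoder and reorders first-stage results. The model id "mixedbread-ai/mxbai-rerank-large-v1" is an assumption; pick the family member that fits your latency budget.

```python
# Sketch: rerank first-stage candidates with a cross-encoder reranker.
from sentence_transformers import CrossEncoder

reranker = CrossEncoder("mixedbread-ai/mxbai-rerank-large-v1")  # assumed model id

query = "Who wrote 'To Kill a Mockingbird'?"
candidates = [
    "Harper Lee wrote 'To Kill a Mockingbird', published in 1960.",
    "'To Kill a Mockingbird' is set in the fictional town of Maycomb.",
    "Mockingbirds are known for mimicking the songs of other birds.",
]

# Score each (query, document) pair, then sort candidates by score, highest first.
scores = reranker.predict([(query, doc) for doc in candidates])
reranked = [doc for _, doc in sorted(zip(scores, candidates), reverse=True)]
print(reranked)
```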