# mxbai-embed-large-v1

## Model Description
mxbai-embed-large-v1 is our powerful English embedding model that delivers state-of-the-art performance among efficiently sized models. It outperforms closed-source models such as OpenAI's text-embedding-ada-002.
The model was trained on a large dataset of over 700 million pairs using contrastive training, then fine-tuned on more than 30 million high-quality triplets using the AnglE loss function. This extensive training enables the model to generalize across a wide range of topics and domains, making it suitable for various real-world applications and Retrieval-Augmented Generation (RAG) use cases.
mxbai-embed-large-v1 is well-suited for binary embeddings. Binary quantization cuts storage requirements by a factor of 32 and makes retrieval up to 40x faster, while retaining over 96% of the model's performance.
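To illustrate where the 32x figure comes from: a float32 embedding uses 4 bytes (32 bits) per dimension, while a binary embedding uses 1 bit. A minimal sketch with NumPy, using random 1024-dimensional vectors as stand-ins for real model outputs:

```python
import numpy as np

# Random 1024-dim float32 vectors standing in for mxbai-embed-large-v1 outputs.
rng = np.random.default_rng(0)
embeddings = rng.standard_normal((4, 1024)).astype(np.float32)

# Binary quantization: threshold at zero, then pack 8 bits per byte.
# 1024 dims -> 128 bytes instead of 4096 bytes.
binary = np.packbits(embeddings > 0, axis=1)

print(embeddings.nbytes // binary.nbytes)  # -> 32 (storage ratio)

# Retrieval then uses Hamming distance (XOR + popcount) instead of
# float dot products, which is where the speedup comes from.
query = binary[0]
hamming = np.unpackbits(binary ^ query, axis=1).sum(axis=1)
print(hamming[0])  # -> 0 (distance of the query to itself)
```

This is a sketch of the general binary-quantization technique, not the exact pipeline used to produce the 40x/96% numbers above.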
mxbai-embed-large-v1 achieves top performance on the Massive Text Embedding Benchmark (MTEB), which measures embedding models across seven tasks: classification, clustering, pair classification, re-ranking, retrieval, semantic textual similarity, and summarization. The model's strong performance across these diverse tasks demonstrates its versatility and robustness.
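Several of the MTEB tasks above (retrieval, semantic textual similarity) are scored by comparing embedding vectors, most commonly with cosine similarity. A minimal sketch, using toy 1024-dimensional vectors in place of real model outputs:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two vectors: dot product of unit-normalized inputs."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy vectors standing in for embeddings of three sentences (hypothetical data).
rng = np.random.default_rng(1)
v1 = rng.standard_normal(1024)
v2 = v1 + 0.1 * rng.standard_normal(1024)  # a near-paraphrase of v1
v3 = rng.standard_normal(1024)             # an unrelated sentence

# The near-paraphrase scores far higher than the unrelated vector.
print(cosine_similarity(v1, v2) > cosine_similarity(v1, v3))  # -> True
```

In practice you would obtain `v1`, `v2`, `v3` from the embedding model and rank documents by this score.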
## Compare with other models
| Model | Context Window | Dimensions | Price / 1M tokens |
|---|---|---|---|
| mxbai-embed-large-v1 | 512 | 1024 | $0.10 |
| deepset-mxbai-embed-de-large-v1 | 512 | 1024 | $0.10 |
| mxbai-embed-2d-large-v1 | 512 | 1024 | $0.10 |
| mxbai-embed-xsmall-v1 | 4.1K | 384 | - |
| mxbai-colbert-large-v1 | 512 | 1024 | - |
Last updated: July 15, 2025