mxbai-embed-2d-large-v1
Model Description
mxbai-embed-2d-large-v1 is the world's first 2D-Matryoshka embedding model. The 2D-Matryoshka model introduces a novel approach that enables you to reduce both the number of layers and the dimensions of embeddings within the model. This dual reduction strategy allows for a more compact model size while still delivering performance on par with that of leading models such as Nomic's embedding model. Specifically, reducing the model's layers by approximately 50% retains up to 85% of its original performance, even without additional training.
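The layer-reduction idea can be illustrated with a toy sketch. The stack of random layers below is purely a stand-in for the real transformer encoder (the names `encode`, `NUM_LAYERS`, and `DIM` are illustrative, not part of the model's API): the point is simply that an embedding can be read off an intermediate layer instead of the final one, halving compute.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a transformer encoder: a stack of 24 "layers",
# each a random linear map plus a nonlinearity. NOT the real model --
# only an illustration of reading the hidden state at an intermediate
# layer instead of running the full stack.
NUM_LAYERS = 24
DIM = 64
layers = [rng.normal(scale=DIM ** -0.5, size=(DIM, DIM)) for _ in range(NUM_LAYERS)]

def encode(x, num_layers=NUM_LAYERS):
    """Run only the first `num_layers` layers and return the hidden state."""
    h = x
    for W in layers[:num_layers]:
        h = np.tanh(h @ W)
    return h

x = rng.normal(size=DIM)
full = encode(x)                 # all 24 layers
half = encode(x, num_layers=12)  # ~50% of the layers, roughly half the compute
print(full.shape, half.shape)    # both are valid DIM-sized embeddings
```

In the real model, 2D-Matryoshka training is what makes these intermediate-layer embeddings usable; without that training objective, an intermediate hidden state would not be a good embedding.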
The model was pretrained with contrastive training on over 700 million pairs covering a wide variety of topics from across the internet, then fine-tuned on over 30 million high-quality triplets using novel loss functions. mxbai-embed-2d-large-v1 effectively provides multiple models in one: users can choose among different layer counts and embedding sizes, giving full control over the trade-offs between speed, storage consumption, and model performance.
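The dimension side of the trade-off can be sketched as follows. This uses random vectors as stand-ins for real 1024-dimensional embeddings (the `truncate` helper is hypothetical, not a library function): Matryoshka-style embeddings are truncated to the first `dim` dimensions and re-normalized before computing cosine similarity.

```python
import numpy as np

rng = np.random.default_rng(42)

def truncate(emb, dim):
    """Keep the first `dim` dimensions and L2-normalize the result."""
    cut = emb[..., :dim]
    return cut / np.linalg.norm(cut, axis=-1, keepdims=True)

# Random stand-ins for real 1024-dim embeddings of a query and a document.
query = rng.normal(size=1024)
doc = rng.normal(size=1024)

for dim in (1024, 512, 64):  # 64 dims = a 16x reduction in storage
    q, d = truncate(query, dim), truncate(doc, dim)
    sim = float(q @ d)  # cosine similarity of the truncated embeddings
    print(f"dim={dim:4d}  cosine={sim:+.3f}")
```

Smaller `dim` means proportionally less storage and faster similarity search; with Matryoshka-trained embeddings the leading dimensions carry most of the signal, which is why truncation degrades quality only gradually.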
On the Massive Text Embedding Benchmark (MTEB), mxbai-embed-2d-large-v1 performs on par with current embedding models of various sizes. Its performance remains competitive even when the embedding size is reduced by a factor of 16, and the model retains about 75% of its performance after cutting half of its layers, demonstrating the effectiveness of the 2D-Matryoshka approach.
Compare with other models
| Model | Context Window | Dimensions | Price / 1M tokens |
|---|---|---|---|
| mxbai-embed-2d-large-v1 | 512 | 1024 | $0.10 |
| mxbai-embed-large-v1 | 512 | 1024 | $0.10 |
| deepset-mxbai-embed-de-large-v1 | 512 | 1024 | $0.10 |
| mxbai-embed-xsmall-v1 | 4.1K | 384 | - |
| mxbai-colbert-large-v1 | 512 | 1024 | - |
Last updated: July 15, 2025