deepset-mxbai-embed-de-large-v1
Discover deepset-mxbai-embed-de-large-v1, a state-of-the-art open-source German/English embedding model from deepset and Mixedbread. It delivers strong retrieval performance, supports binary quantization and Matryoshka representation learning, and enables significant cost reductions in real-world applications.
Model Description
deepset-mxbai-embed-de-large-v1 is a powerful German/English embedding model developed through collaboration between deepset and Mixedbread. It sets a new performance standard among open-source embedding models, outperforming domain-specific alternatives in real-world applications.
The model was initialized from the multilingual-e5-large model and fine-tuned on over 30 million pairs of high-quality German data using the AnglE loss function. This extensive training enables the model to adapt to a wide range of topics and domains, making it suitable for various real-world applications and Retrieval-Augmented Generation (RAG) use cases.
deepset-mxbai-embed-de-large-v1 supports both binary quantization and Matryoshka representation learning (MRL). This allows for significant reductions in storage and infrastructure costs, with the potential for 97%+ cost savings through binary MRL.
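To make the savings concrete, here is a minimal NumPy sketch (not the library API) of how the two techniques combine: MRL lets you truncate the 1024-dimensional embedding to a shorter prefix, and binary quantization keeps only the sign of each remaining dimension, packed one bit per dimension. Truncating to 512 dimensions and binarizing shrinks each vector from 4096 bytes (float32) to 64 bytes, a 98.4% reduction, consistent with the 97%+ figure above.

```python
import numpy as np

rng = np.random.default_rng(0)
emb = rng.standard_normal(1024).astype(np.float32)  # stand-in for a full-precision embedding

# Matryoshka truncation: keep only the first 512 dimensions
mrl = emb[:512]

# Binary quantization: one bit per dimension (its sign), packed into bytes
binary = np.packbits(mrl > 0)

full_bytes = emb.nbytes       # 1024 dims * 4 bytes = 4096
binary_bytes = binary.nbytes  # 512 bits / 8 = 64
savings = 1 - binary_bytes / full_bytes
print(f"{full_bytes} B -> {binary_bytes} B ({savings:.1%} saved)")
# prints: 4096 B -> 64 B (98.4% saved)
```

Binary vectors are compared with Hamming distance, which is also much cheaper to compute than cosine similarity over floats, so the savings extend from storage to query-time compute.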
The model achieves top performance on various benchmarks, including private and public datasets created in collaboration with deepset's clients. It demonstrates strong performance across diverse tasks, showcasing its versatility and robustness.
Compare with other models
Model | Context Window | Dimensions | Input Price (/1M tokens)
---|---|---|---
deepset mxbai embed german large v1 | 512 | 1024 | $0.00
mxbai Embed Large v1 | 512 | 1024 | $0.00
mxbai embed 2d large v1 | 512 | 1024 | $0.00
mxbai embed xsmall v1 | 4.1K | 384 | $0.00
mxbai colbert large v1 | 512 | 1024 | $0.00
Semantic Search
The following code illustrates how to retrieve relevant passages for a given query using semantic_search.
Last updated: May 7, 2025