Embedding Models

Introduction

Mixedbread embed is our flagship family of embedding models. The models are easy to access and perform strongly on benchmarks, so they can help you improve your retrieval pipeline. Use the embeddings for search, classification, recommendation, and similar tasks.
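As a rough illustration of the search use case, the sketch below ranks documents by cosine similarity to a query. The 4-dimensional vectors and the names `cosine_similarity`, `rank_documents`, and the `doc_*` ids are hypothetical stand-ins for this example; in practice the vectors would come from one of the embedding models, and nothing here reflects a specific Mixedbread API.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors (assumption): real embeddings would have hundreds of dimensions.
docs = {
    "doc_baking": [0.9, 0.1, 0.0, 0.1],
    "doc_finance": [0.0, 0.2, 0.9, 0.1],
    "doc_bread": [0.8, 0.3, 0.1, 0.0],
}
query = [1.0, 0.2, 0.0, 0.0]

ranking = rank_documents(query, docs)
for doc_id, score in ranking:
    print(f"{doc_id}: {score:.3f}")
```

The same ranking step applies to recommendation (rank items against a user or item vector), while classification would typically feed the vectors into a separate classifier.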

What's new in the Mixedbread embed family?

The Mixedbread embed family has recently seen several developments:

Release of our German/English embedding model deepset-mxbai-embed-de-large-v1
Release of our English embedding model mxbai-embed-large-v1
Introduction of the 2D embedding model mxbai-embed-2d-large-v1

Coming Soon: We are currently working on specialized models to extend the family! Please feel free to contact us for more information.

Model Family

Here's an overview of our current model lineup:

Model	Card	Context Length	Dimension	MTEB Average
mxbai-embed-large-v1	Link	512	1024	64.68
mxbai-embed-2d-large-v1	Link	512	1024 (base)	63.25 (base)
deepset-mxbai-embed-de-large-v1	Link	512	1024	-
mxbai-embed-xsmall-v1	Link	4096	384	42.80

Why Choose Mixedbread Embeddings

The Mixedbread embed family offers several advantages:

Powerful Performance: Strong benchmark results, for example an MTEB average of 64.68 for mxbai-embed-large-v1
Size Efficiency: Optimized for resource utilization
Open-Source: Fully accessible and customizable
Versatility: Suitable for various NLP tasks

Performance Comparison

Our new mxbai-embed-large-v1 model outperforms other similarly sized open models and even surpasses some closed-source models on the MTEB benchmark:

Model	Avg (56 datasets)
mxbai-embed-large-v1	64.68
bge-large-en-v1.5	64.23
jina-embeddings-v2-base-en	60.38
OpenAI text-embedding-3-large (Proprietary)	64.58
Cohere embed-english-v3.0 (Proprietary)	64.47

API Benefits

While you can use our open-source models directly, our API offers additional advantages:

Enhanced Performance: API-exclusive versions offer improvements such as better int8 quantization and run on our optimized inference pipeline
Calibration Data: Generated using over 50 million samples for a more accurate float32-to-int8 mapping
Faster Response Times: Optimized for low-latency retrieval tasks
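To make the role of calibration data concrete, here is a minimal sketch of one common way to map float32 values to int8: per-dimension minimum and maximum values are estimated from calibration samples, and each value is then rescaled linearly into the range -128 to 127. This is a generic scheme chosen for illustration, not a description of the API's actual quantization pipeline; the function names `calibrate` and `quantize_int8` and the tiny calibration set are assumptions.

```python
def calibrate(samples):
    """Estimate per-dimension (low, high) ranges from calibration vectors."""
    dims = len(samples[0])
    lows = [min(s[d] for s in samples) for d in range(dims)]
    highs = [max(s[d] for s in samples) for d in range(dims)]
    return lows, highs


def quantize_int8(vec, lows, highs):
    """Map each float to an integer in [-128, 127] using the calibrated ranges."""
    out = []
    for x, lo, hi in zip(vec, lows, highs):
        if hi == lo:
            # Degenerate range: no information to scale by, so emit 0.
            out.append(0)
            continue
        clipped = min(max(x, lo), hi)        # clamp values outside the range
        scaled = (clipped - lo) / (hi - lo)  # now in [0, 1]
        out.append(int(round(scaled * 255)) - 128)
    return out


# Hypothetical calibration set: three 2-dimensional vectors.
calibration = [[-1.0, 0.0], [1.0, 0.5], [0.0, 1.0]]
lows, highs = calibrate(calibration)

print(quantize_int8([-1.0, 1.0], lows, highs))
print(quantize_int8([5.0, -5.0], lows, highs))  # out-of-range values are clamped
```

The more representative the calibration samples, the closer the estimated ranges are to the values seen in production, which is why a large calibration set helps the int8 vectors preserve the ordering of similarity scores.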

Use Cases

Discover specific products and features you can build with embeddings.

mxbai-embed-large-v1

Discover mxbai-embed-large-v1, our state-of-the-art English embedding model. Learn about its powerful performance, versatility across various NLP tasks, and how to effectively use it for semantic search, information retrieval, and other applications.

Last updated: July 2, 2025