Mixedbread

mxbai-embed-xsmall-v1

Explore mxbai-embed-xsmall-v1, Mixedbread AI's smallest and most efficient English embedding model optimized for retrieval. Discover its competitive performance, long context support and capabilities in resource-constrained applications.

Parameters: 22.7M
Context Window: 4.1K
Price / 1M tokens: $0.00
Languages: EN

Model Description

mxbai-embed-xsmall-v1 is Mixedbread AI's smallest and most efficient English embedding model, optimized specifically for retrieval tasks. With only 22.7 million parameters and 384-dimensional embeddings, it still delivers competitive performance, which makes it a good fit for applications where computational resources are limited. It is licensed under Apache 2.0.

The model is based on all-MiniLM-L6-v2 and was fine-tuned using the AnglE loss function and Espresso to improve the quality of its embeddings, particularly for retrieval scenarios such as search and recommendation systems.

On the Massive Text Embedding Benchmark (MTEB), mxbai-embed-xsmall-v1 averages 42.80 across retrieval tasks, compared with 41.56 for its base model, all-MiniLM-L6-v2. The gains are larger on long-context benchmarks: LoCo (avg. 76.34 vs 67.34) and LongEmb (avg. 45.94 vs 36.10). The small size translates to faster inference and lower resource consumption, which is especially useful on edge devices and in large-scale deployments.
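
To put the benchmark numbers above in proportion, the following sketch computes the relative gain of mxbai-embed-xsmall-v1 over all-MiniLM-L6-v2 for each benchmark. The scores are the averages quoted in the paragraph above; the helper name relative_gain is ours and purely illustrative.

```python
def relative_gain(new_score: float, base_score: float) -> float:
    """Relative improvement of new_score over base_score, in percent."""
    return (new_score - base_score) / base_score * 100.0


# (mxbai-embed-xsmall-v1 average, all-MiniLM-L6-v2 average), as quoted above.
scores = {
    "MTEB retrieval": (42.80, 41.56),
    "LoCo": (76.34, 67.34),
    "LongEmb": (45.94, 36.10),
}

gains = {name: relative_gain(new, base) for name, (new, base) in scores.items()}

for name, gain in gains.items():
    print(f"{name}: {gain:+.1f}%")
```

In relative terms this works out to roughly 3% on MTEB retrieval, 13% on LoCo, and 27% on LongEmb, so most of the improvement shows up on long inputs.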

Compare with other models

Model                                  Context Window  Dimensions  Input Price (/1M tokens)
mxbai embed xsmall v1                  4.1K            384         $0.00
mxbai Embed Large v1                   512             1024        $0.00
deepset mxbai embed german large v1    512             1024        $0.00
mxbai embed 2d large v1                512             1024        $0.00
mxbai colbert large v1                 512             1024        $0.00
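
The dimension column above has a direct effect on index size. As a rough sketch, assuming each embedding is stored as uncompressed float32 values (4 bytes per dimension) and ignoring any index overhead, the footprint of a vector collection can be estimated as follows. The corpus size of one million documents and the helper index_size_mib are hypothetical and chosen only for illustration.

```python
BYTES_PER_FLOAT32 = 4  # assumption: embeddings stored as uncompressed float32


def index_size_mib(num_vectors: int, dimensions: int,
                   bytes_per_value: int = BYTES_PER_FLOAT32) -> float:
    """Raw storage for num_vectors embeddings, in MiB (no index overhead)."""
    return num_vectors * dimensions * bytes_per_value / (1024 ** 2)


num_docs = 1_000_000  # hypothetical corpus size

xsmall_mib = index_size_mib(num_docs, 384)   # 384-dimensional model
large_mib = index_size_mib(num_docs, 1024)   # 1024-dimensional models

print(f"384 dims:  {xsmall_mib:,.0f} MiB")
print(f"1024 dims: {large_mib:,.0f} MiB")
print(f"ratio:     {xsmall_mib / large_mib:.3f}")
```

Under these assumptions the 384-dimensional vectors need 37.5% of the space of 1024-dimensional ones. Quantization or a different storage format would change the absolute numbers but not the ratio.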

Calculate Sentence Similarities

The following code embeds two sentences through the Mixedbread API and computes their similarity with semantic_search from sentence-transformers, which scores candidates by cosine similarity by default.

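import torch
from mixedbread import Mixedbread
from sentence_transformers.util import semantic_search

# Create the API client; replace the placeholder with your own key.
mxbai = Mixedbread(api_key="YOUR_API_KEY")
model = "mixedbread-ai/mxbai-embed-xsmall-v1"

# The two sentences to compare.
docs = [
    "A man is eating food.",
    "A man is eating pasta.",
]

# Request normalized float embeddings for both sentences in one call.
res = mxbai.embed(
    model=model,
    input=docs,
    normalized=True,
    encoding_format='float'
)

# Extract the raw vectors, in the same order as docs.
embeddings_list = [item.embedding for item in res.data]

# Treat the first sentence as the query and the second as a one-item corpus.
# Each tensor has shape (1, embedding_dim).
query_tensor = torch.tensor([embeddings_list[0]])
corpus_tensor = torch.tensor([embeddings_list[1]])

# semantic_search returns, per query, a list of {'corpus_id', 'score'} hits.
hits = semantic_search(query_tensor, corpus_tensor, top_k=1)

# Fall back to 0.0 if no hit was returned.
similarity_score = 0.0
if hits and hits[0]:
    similarity_score = hits[0][0]['score']

print(f"Similarity (using semantic_search): {similarity_score:.4f}")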
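
For reference, the cosine similarity that the example above relies on can be written out by hand. The sketch below uses only the Python standard library and toy 3-dimensional vectors in place of real 384-dimensional embeddings; the function name cosine_similarity and the sample vectors are illustrative and not part of any Mixedbread API.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for real embeddings (both already have unit length).
vec_a = [0.6, 0.8, 0.0]
vec_b = [0.8, 0.6, 0.0]

score = cosine_similarity(vec_a, vec_b)
print(f"Cosine similarity: {score:.4f}")
```

When embeddings are already normalized to unit length, as requested with normalized=True in the API call above, both norms equal 1 and the cosine similarity reduces to a plain dot product.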

Last updated: May 6, 2025