Mixedbread

Unstructured

Integrate Mixedbread AI's embedding capabilities into your Unstructured projects.

Quick Start

  1. Install the package:
pip install unstructured "unstructured[embed-mixedbreadai]"
  2. Set up your API key:
import os
os.environ["MXBAI_API_KEY"] = "your_api_key_here"
  3. Start using Mixedbread AI in your Unstructured projects!
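Before creating an encoder, it can help to fail early when the key is missing. The following is a minimal sketch, not part of the Unstructured or Mixedbread APIs: `require_api_key` is a hypothetical helper that only reads the `MXBAI_API_KEY` variable set in step 2.

```python
import os


def require_api_key(env=os.environ, name="MXBAI_API_KEY"):
    """Return the key stored under `name`, or raise if it is missing or blank.

    Hypothetical helper for illustration; `env` can be any mapping,
    which makes the function easy to exercise without touching os.environ.
    """
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before creating the encoder")
    return key
```

Calling `require_api_key()` with no arguments reads the real environment; passing a plain dict such as `{"MXBAI_API_KEY": "abc"}` is handy in tests.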

Embeddings

Generate text embeddings for queries and documents.

import os
from unstructured.embed.mixedbreadai import (
    MixedbreadAIEmbeddingConfig,
    MixedbreadAIEmbeddingEncoder,
)
from unstructured.documents.elements import Text

embedding_encoder = MixedbreadAIEmbeddingEncoder(
    config=MixedbreadAIEmbeddingConfig(
        api_key=os.getenv("MXBAI_API_KEY"),
        model_name="mixedbread-ai/mxbai-embed-large-v1",
    )
)

elements = embedding_encoder.embed_documents(
    elements=[Text("Bread is life"), Text("Bread is love")]
)

query = "Represent this sentence for searching relevant passages: What is bread?"
query_embedding = embedding_encoder.embed_query(query)

for element in elements:
    print(element.embedding)
print(query_embedding)
print(embedding_encoder.is_unit_vector, embedding_encoder.num_of_dimensions)
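Once you have a query embedding and document embeddings, a typical next step is to rank the documents by similarity to the query. The sketch below uses plain Python and toy two-dimensional vectors so it runs without an API key; `cosine_similarity` and `rank_by_similarity` are illustrative helpers, not functions provided by Unstructured. It assumes the embeddings are lists of floats.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length float vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return (index, score) pairs sorted from most to least similar."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for real embeddings.
toy_query = [1.0, 0.0]
toy_docs = [[0.0, 1.0], [0.6, 0.8], [1.0, 0.0]]
ranking = rank_by_similarity(toy_query, toy_docs)
print(ranking)  # the document at index 2 ranks first, index 0 last
```

With real data you would pass `query_embedding` as the query vector and `[element.embedding for element in elements]` as the document vectors.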

Advanced Usage

Custom Embedding Parameters

You can customize the embedding process with additional parameters:

import os
from unstructured.embed.mixedbreadai import (
    MixedbreadAIEmbeddingConfig,
    MixedbreadAIEmbeddingEncoder,
)
from unstructured.documents.elements import Text

embedding_encoder = MixedbreadAIEmbeddingEncoder(
    config=MixedbreadAIEmbeddingConfig(
        api_key=os.getenv("MXBAI_API_KEY"),
        model_name="mixedbread-ai/mxbai-embed-large-v1",
        batch_size=64,
        normalized=True,
        dimensions=512,
        encoding_format="binary"
    )
)

elements = embedding_encoder.embed_documents(
    elements=[Text("Bread is life"), Text("Bread is love")]
)

query = "Represent this sentence for searching relevant passages: What is bread?"
query_embedding = embedding_encoder.embed_query(query)

for element in elements:
    print(element.embedding)
print(query_embedding)
print(embedding_encoder.is_unit_vector, embedding_encoder.num_of_dimensions)
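When you change `dimensions` or `normalized`, a quick local check can confirm the returned vectors look the way you expect. This is a minimal sketch under the assumption that embeddings arrive as lists of floats; binary or other quantized encodings will not pass the unit-length test. `check_embedding` is a hypothetical helper, not part of the library.

```python
import math


def check_embedding(vector, expected_dims, tolerance=1e-6):
    """Report whether a float vector has the expected size and unit length."""
    length = math.sqrt(sum(x * x for x in vector))
    return {
        "dims_ok": len(vector) == expected_dims,
        "unit_length": abs(length - 1.0) <= tolerance,
    }


# Toy vector standing in for a real embedding.
report = check_embedding([0.6, 0.8], expected_dims=2)
print(report)
```

For the configuration above you would call it with `expected_dims=512` on each `element.embedding`.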

Documentation

For detailed information on using Mixedbread AI with Unstructured, check out:

Need Help?

Happy baking with Mixedbread AI and Unstructured! 🍞🚀

Last updated: July 2, 2025