Overview

Introduction

The Mixedbread Parsing API transforms complex documents into clean text. It goes beyond simple text extraction: it analyzes the document layout and returns detailed information about each layout element. You receive the content, the element information, and the bounding boxes, all from a single parsing job.

Typical Workflow: Parsing a Document

1. Initiate a parsing job by providing your document.

2. Retrieve the parsed results from the job once it's complete.
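
The two steps above follow a common create-then-poll pattern for asynchronous jobs. The sketch below illustrates that pattern only; it is not the Mixedbread SDK. The callables `create_job` and `get_job`, the `id` and `status` fields, and the status names are all assumptions made for illustration, so check the Parsing API reference for the actual endpoints and response shape.

```python
import time
from typing import Any, Callable, Dict

# Hypothetical status names; the real API may use different values.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def parse_document(
    create_job: Callable[[], Dict[str, Any]],
    get_job: Callable[[str], Dict[str, Any]],
    poll_interval: float = 2.0,
    timeout: float = 300.0,
) -> Dict[str, Any]:
    """Start a parsing job, then poll until it reaches a terminal status.

    create_job and get_job are placeholders for whatever client calls
    start a job and fetch its current state.
    """
    job = create_job()  # step 1: initiate the parsing job
    deadline = time.monotonic() + timeout

    # Step 2: keep fetching the job until it is finished or we give up.
    while job["status"] not in TERMINAL_STATUSES:
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job['id']} did not finish within {timeout}s")
        time.sleep(poll_interval)
        job = get_job(job["id"])

    if job["status"] != "completed":
        raise RuntimeError(f"job {job['id']} ended with status {job['status']!r}")
    return job
```

Passing the two calls in as functions keeps the polling logic independent of any particular client library, so the real SDK or HTTP calls can be swapped in once their names are confirmed.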

Key Features

Multi-Format Support: Handles various document types including PDF, PPTX, HTML, and more.
Layout-Aware Extraction: Understands document structure beyond raw text.
Structured Output: Provides detailed information about content elements.
Multiple Output Formats: Returns JSON, Markdown, or plain text, depending on your needs.
Asynchronous Processing: Handles large or complex documents efficiently.
Improves Downstream Quality: Creates better input for embedding, RAG, and indexing.
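
As a rough illustration of how structured, layout-aware output can feed embedding or RAG pipelines, the sketch below filters a list of parsed elements and joins the remaining content into clean text. The element schema here (`type`, `content`, `page`, `bbox`) and the sample values are hypothetical and chosen only for the example; the actual response format may differ.

```python
from typing import Any, Dict, FrozenSet, List

# Hypothetical parsed output: these field names and values are illustrative
# assumptions, not the documented response schema.
sample_elements: List[Dict[str, Any]] = [
    {"type": "title", "content": "Quarterly Report", "page": 1, "bbox": [72, 60, 540, 96]},
    {"type": "text", "content": "Revenue grew in Q2.", "page": 1, "bbox": [72, 120, 540, 180]},
    {"type": "footer", "content": "Page 1", "page": 1, "bbox": [72, 760, 540, 780]},
]


def elements_to_text(
    elements: List[Dict[str, Any]],
    skip_types: FrozenSet[str] = frozenset({"header", "footer"}),
) -> str:
    """Join element contents in order, dropping layout noise such as footers."""
    kept = [e["content"].strip() for e in elements if e["type"] not in skip_types]
    # Skip empty strings so blank elements do not create extra gaps.
    return "\n\n".join(text for text in kept if text)


print(elements_to_text(sample_elements))
```

Because each element carries its type, repeated page furniture can be dropped before the text is embedded or indexed, which is something raw text extraction cannot do.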

Check out the Parsing API for detailed endpoints and code examples.

Last updated: June 25, 2025