Overview
Introduction
The Mixedbread Parsing API transforms complex documents into clean text, and it goes beyond simple text extraction: it understands the document layout and returns detailed information about the layout elements it finds. You receive the content, element information, and even bounding boxes, all from a single API call.
Typical Workflow: Parsing a Document
1. Initiate a parsing job by providing your document.
2. Retrieve the parsed results from the job once it's complete.
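The two steps above amount to a create-then-poll loop. The sketch below is a minimal illustration of that pattern, not the Mixedbread SDK: `create_job` and `get_job` are hypothetical callables standing in for whatever client calls start and fetch a job, and the `status` values, `id`, and `result` fields are assumed names. See the Parsing API reference for the real endpoints and response fields.

```python
import time
from typing import Any, Callable, Dict

# Assumed job states; the real API may name them differently.
TERMINAL_STATES = {"completed", "failed", "cancelled"}


def parse_document(
    create_job: Callable[[str], Dict[str, Any]],  # hypothetical: starts a job
    get_job: Callable[[str], Dict[str, Any]],     # hypothetical: fetches a job by id
    document_ref: str,
    poll_interval: float = 1.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Start a parsing job, then poll until it reaches a terminal state."""
    job = create_job(document_ref)  # step 1: initiate the parsing job
    deadline = time.monotonic() + timeout

    while job["status"] not in TERMINAL_STATES:
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job['id']} did not finish within {timeout}s")
        sleep(poll_interval)
        job = get_job(job["id"])  # step 2: check on the job

    if job["status"] != "completed":
        raise RuntimeError(f"job {job['id']} ended with status {job['status']}")
    return job["result"]
```

Passing the two calls in as parameters keeps the loop independent of any particular client library and makes it easy to exercise with stubs before wiring in real requests.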
Key Features
- Multi-Format Support: Handles various document types including PDF, PPTX, HTML, and more.
- Layout-Aware Extraction: Understands document structure beyond raw text.
- Structured Output: Provides detailed information about content elements.
- Multiple Output Formats: Choose from JSON, Markdown, or clean Text based on your needs.
- Asynchronous Processing: Efficiently handle large or complex documents.
- Improves Downstream Quality: Creates better input for embedding, RAG, and indexing.
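To illustrate how layout-aware, structured output can feed embedding or RAG pipelines, here is a rough sketch that groups parsed elements into heading-delimited chunks. The `Element` shape (a type label, content, page number, and bounding box) and the type names `"title"` and `"section-header"` are assumptions made for this example, not the documented response schema.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class Element:
    """Hypothetical parsed layout element; field names are assumptions."""
    type: str                                  # e.g. "title", "text", "table"
    content: str
    page: int
    bbox: Tuple[float, float, float, float]    # assumed (x0, y0, x1, y1)


def chunk_by_heading(
    elements: Iterable[Element],
    heading_types: Tuple[str, ...] = ("title", "section-header"),
) -> List[str]:
    """Group elements into chunks, starting a new chunk at each heading."""
    chunks: List[str] = []
    current: List[str] = []
    for el in elements:
        # A heading closes the chunk in progress (if any) and opens a new one.
        if el.type in heading_types and current:
            chunks.append("\n".join(current))
            current = []
        current.append(el.content)
    if current:
        chunks.append("\n".join(current))
    return chunks
```

Splitting on headings rather than fixed character counts keeps each chunk aligned with a section of the document, which is the kind of benefit element-level type information makes possible.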
Check out the Parsing API for detailed endpoints and code examples.
Reranking
Leverage Mixedbread's Reranking API to access state-of-the-art models that re-order search results or candidate lists based on deep semantic relevance. Improve the precision of search, RAG, and recommendation systems.
Use Cases
Discover how the Mixedbread Parsing API uses layout-aware document analysis to enable key applications such as data preparation for RAG, structured data extraction, document understanding pipelines, and content migration.
Last updated: August 18, 2025