Stores natively understand a wide range of file formats through multimodal AI. No extraction needed, the system directly understands text, images, and complex layouts for semantic search.
The Mixedbread JSON format allows direct ingestion of pre-chunked content. Use it when you have custom chunking logic, need to preserve specific chunk boundaries, or want to include pre-computed metadata like OCR text or transcriptions.