Supported File Types

Stores natively understand a wide range of file formats through multimodal AI. No separate extraction step is needed: the system directly interprets text, images, and complex layouts for semantic search.

Documents

Format | Extensions
PDF | .pdf
Word | .doc, .docx, .dotx, .docm, .dotm
OpenDocument Text | .odt
Rich Text Format | .rtf
Text | .txt
Markdown | .md, .markdown, .mdx

Presentations

Format | Extensions
PowerPoint | .ppt, .pptx
PowerPoint Slideshow | .ppsx
PowerPoint Add-in | .ppam
PowerPoint Macro-Enabled | .pptm, .potm, .ppsm
OpenDocument Presentation | .odp

Code

Language | Extensions
Python | .py
JavaScript | .js
TypeScript | .ts
Java | .java
C# | .cs

Images

Format | Extensions
JPEG | .jpg, .jpeg
PNG | .png
WebP | .webp

Specialized Formats

Format | Extensions | Notes
Mixedbread JSON | .mxjson |
Mixedbread JSONL | .mxjsonl |

The Mixedbread JSON format allows direct ingestion of pre-chunked content. Use it when you have custom chunking logic, need to preserve specific chunk boundaries, or want to include pre-computed metadata such as OCR text or transcriptions.
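
The extension lists above can be used for a quick client-side check before uploading. The following is a minimal sketch, not part of any Mixedbread SDK: the names SUPPORTED_EXTENSIONS and classify_file are hypothetical, the category labels are chosen for illustration, and the case-insensitive matching is an assumption about how extensions should be compared.

```python
from pathlib import Path
from typing import Optional

# Extensions copied from the tables on this page, grouped by section.
# The category names are illustrative labels, not official identifiers.
SUPPORTED_EXTENSIONS = {
    "document": {
        ".pdf", ".doc", ".docx", ".dotx", ".docm", ".dotm",
        ".odt", ".rtf", ".txt", ".md", ".markdown", ".mdx",
    },
    "presentation": {
        ".ppt", ".pptx", ".ppsx", ".ppam", ".pptm", ".potm", ".ppsm", ".odp",
    },
    "code": {".py", ".js", ".ts", ".java", ".cs"},
    "image": {".jpg", ".jpeg", ".png", ".webp"},
    "specialized": {".mxjson", ".mxjsonl"},
}


def classify_file(path: str) -> Optional[str]:
    """Return the category for a file path, or None if its extension is not listed.

    Only the final suffix is inspected, and it is lowercased first
    (assumption: extensions are treated case-insensitively).
    """
    suffix = Path(path).suffix.lower()
    for category, extensions in SUPPORTED_EXTENSIONS.items():
        if suffix in extensions:
            return category
    return None


if __name__ == "__main__":
    for name in ["report.PDF", "slides.pptx", "chunks.mxjsonl", "talk.mp3"]:
        print(name, "->", classify_file(name))
```

A file whose extension is not in the tables, including the audio and video types listed under Coming Soon, yields None, so a caller can skip or flag it instead of attempting an upload.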

Coming Soon

Audio Files: .mp3, .wav, .flac, .ogg, .m4a

Video Files: .mp4, .avi, .mov, .wmv, .webm

Last updated: January 7, 2026