Supported File Types
Stores natively understand a wide range of file formats through multimodal AI. No separate extraction step is needed: the system directly interprets text, images, and complex layouts for semantic search.
Documents
| Format | Extensions |
|---|---|
| PDF | .pdf |
| Word | .doc, .docx, .dotx, .docm, .dotm |
| OpenDocument Text | .odt |
| Rich Text Format | .rtf |
| Text | .txt |
| Markdown | .md, .markdown, .mdx |
Presentations
| Format | Extensions |
|---|---|
| PowerPoint | .ppt, .pptx |
| PowerPoint Slideshow | .ppsx |
| PowerPoint Add-in | .ppam |
| PowerPoint Macro-Enabled | .pptm, .potm, .ppsm |
| OpenDocument Presentation | .odp |
Code
| Language | Extensions |
|---|---|
| Python | .py |
| JavaScript | .js |
| TypeScript | .ts |
| Java | .java |
| C# | .cs |
Images
| Format | Extensions |
|---|---|
| JPEG | .jpg, .jpeg |
| PNG | .png |
| WebP | .webp |
Specialized Formats
| Format | Extensions | Notes |
|---|---|---|
| Mixedbread JSON | .mxjson | Pre-chunked content format |
| Mixedbread JSONL | .mxjsonl | Pre-chunked content format |
The Mixedbread JSON format allows direct ingestion of pre-chunked content. Use it when you have custom chunking logic, need to preserve specific chunk boundaries, or want to include pre-computed metadata like OCR text or transcriptions.
Coming Soon
Audio Files: .mp3, .wav, .flac, .ogg, .m4a
Video Files: .mp4, .avi, .mov, .wmv, .webm
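Before uploading, it can help to check locally whether a file's extension appears in the tables above. The following is a minimal sketch of such a check, not part of any Mixedbread SDK: the `classify` function, the category names, and the `"coming_soon"` / `"unsupported"` labels are hypothetical and chosen for illustration, and the extension sets simply mirror what this page lists.

```python
from pathlib import Path

# Extension sets transcribed from the tables on this page.
# The category names are illustrative labels, not official identifiers.
SUPPORTED = {
    "document": {".pdf", ".doc", ".docx", ".dotx", ".docm", ".dotm",
                 ".odt", ".rtf", ".txt", ".md", ".markdown", ".mdx"},
    "presentation": {".ppt", ".pptx", ".ppsx", ".ppam",
                     ".pptm", ".potm", ".ppsm", ".odp"},
    "code": {".py", ".js", ".ts", ".java", ".cs"},
    "image": {".jpg", ".jpeg", ".png", ".webp"},
    "specialized": {".mxjson", ".mxjsonl"},
}

# Audio and video formats listed under "Coming Soon" (not yet ingestible).
COMING_SOON = {".mp3", ".wav", ".flac", ".ogg", ".m4a",
               ".mp4", ".avi", ".mov", ".wmv", ".webm"}


def classify(filename: str) -> str:
    """Return the category for a filename based on its extension.

    Hypothetical helper: returns one of the SUPPORTED keys,
    "coming_soon", or "unsupported".
    """
    ext = Path(filename).suffix.lower()  # case-insensitive match
    for category, extensions in SUPPORTED.items():
        if ext in extensions:
            return category
    if ext in COMING_SOON:
        return "coming_soon"
    return "unsupported"
```

For example, `classify("Report.PDF")` yields `"document"` and `classify("talk.mp4")` yields `"coming_soon"`. Matching on the lowercased suffix keeps the check simple, but it only inspects the file name, so it says nothing about whether the file contents are valid.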
Last updated: January 7, 2026