File Ingestion

Upload files to your Stores to make them searchable. Stores support a variety of file formats and languages: you upload your data and we take care of processing it. This page starts with the upload method.

Basic File Ingestion

To upload a single file to your Store, call the SDK's upload method.

For complete details on the file object structure, including all properties and status values, see the file object reference.

Ingestion Options

External ID

You can optionally assign a unique external ID to the file. This identifier lets you reference the file later and makes it easy to replace or update the same file by using the same external ID. The ID can be any string value, though we recommend using a relative file path to mirror your repository's structure.
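
One way to follow the relative-path recommendation is to derive the ID from the file's location under a root directory. The sketch below is only an illustration: `external_id_for` is a hypothetical helper, not part of the SDK, and it assumes you want forward-slash separators regardless of platform.

```python
from pathlib import PurePath


def external_id_for(file_path: PurePath, root: PurePath) -> str:
    """Return file_path relative to root, using forward slashes (hypothetical helper)."""
    # relative_to raises ValueError if file_path is not located under root
    return file_path.relative_to(root).as_posix()


print(external_id_for(PurePath("/repo/docs/guide/intro.md"), PurePath("/repo")))
# prints "docs/guide/intro.md"
```

Because the ID depends only on the path below the root, re-uploading the same file from a different checkout location would produce the same external ID.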

Overwrite Behavior: By default, uploading a file with an existing external_id will overwrite the previous file. You can control this behavior with the overwrite parameter:

  • overwrite: true (default): Replaces the existing file with the same external_id
  • overwrite: false: Prevents overwriting and returns an error if the external_id already exists

This makes it easy to update files in your Store while preventing accidental overwrites when needed.
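
The overwrite rules can be modelled with a small in-memory stand-in. This is a sketch of the semantics described above, not the SDK: `ToyStore` and `ExternalIdConflict` are hypothetical names, and the real service may report the conflict differently.

```python
class ExternalIdConflict(Exception):
    """Raised when overwrite is False and the external_id is already taken."""


class ToyStore:
    """In-memory stand-in for a Store, keyed by external_id (illustration only)."""

    def __init__(self) -> None:
        self.files: dict[str, bytes] = {}

    def upload(self, external_id: str, content: bytes, overwrite: bool = True) -> None:
        if external_id in self.files and not overwrite:
            raise ExternalIdConflict(f"external_id already exists: {external_id}")
        # Either the ID is new, or overwrite=True replaces the previous file
        self.files[external_id] = content


store = ToyStore()
store.upload("docs/intro.md", b"v1")
store.upload("docs/intro.md", b"v2")  # default overwrite=True replaces v1
try:
    store.upload("docs/intro.md", b"v3", overwrite=False)
except ExternalIdConflict as exc:
    print(exc)  # the stored content is still v2
```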

File Types

Stores handle a wide range of data formats:

  • Documents: PDFs (including scanned), Word docs, text files, Markdown
  • Presentations: PowerPoint slides with images and text
  • Code: Python, JavaScript, TypeScript, Java, C#
  • Images: Photos, diagrams, charts with OCR and visual understanding
  • Specialized: Pre-structured JSON/JSONL formats

Each file type is automatically processed to understand both text and visual content, preserving context and meaning.

For the complete list of supported formats and processing details, see the supported formats documentation.

Metadata

Metadata is structured information about your files that enables filtering, organization, and search. It is inherited by all chunks created from the file, so you can filter precisely during search operations.

Supported Metadata Types:

  • String values: Categories, names, tags, status values
  • Numeric values: Scores, versions, counts, measurements
  • Boolean values: Flags, permissions, feature toggles
  • Date/time values: Timestamps, deadlines, publishing dates
  • Array values: Multiple values, collections, tags

For full details on each metadata type, see the metadata reference.

Learn more about how to search with metadata filtering in the search documentation.
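
To make the inheritance idea concrete, here is a toy model in which every chunk carries a copy of its file's metadata, so a filter can be applied at the chunk level. Everything in it is an assumption made for illustration: `Chunk`, `chunk_file`, and `filter_chunks` are hypothetical, the fixed-size splitting says nothing about how files are actually chunked, the filter is plain equality, and the date is written as an ISO 8601 string only as an example.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Chunk:
    text: str
    metadata: dict[str, Any] = field(default_factory=dict)


def chunk_file(text: str, metadata: dict[str, Any], size: int) -> list[Chunk]:
    """Split text into fixed-size pieces; each piece gets a copy of the file metadata."""
    return [Chunk(text[i:i + size], dict(metadata)) for i in range(0, len(text), size)]


def filter_chunks(chunks: list[Chunk], **equals: Any) -> list[Chunk]:
    """Keep chunks whose metadata matches every given key/value pair exactly."""
    return [c for c in chunks if all(c.metadata.get(k) == v for k, v in equals.items())]


file_metadata = {
    "category": "handbook",        # string
    "version": 3,                  # numeric
    "published": True,             # boolean
    "updated_at": "2025-11-18",    # date, written here as an ISO 8601 string
    "tags": ["onboarding", "hr"],  # array
}
chunks = chunk_file("Welcome to the team. Read this first.", file_metadata, size=20)
matches = filter_chunks(chunks, category="handbook", published=True)
print(len(chunks), len(matches))
```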

Parsing Strategy

Control how your files are processed by choosing a parsing strategy at upload time.

Available Parsing Strategies:

  • fast (default): Optimized for speed, good quality for most documents
  • high_quality: Enhanced processing for complex layouts and formatting

Upload Helpers

The SDK provides two convenient helpers for file ingestion:

  • upload: Uploads the file and immediately returns the created Store File. The file's status begins as pending, then progresses to in_progress and completed in the background. Use this when you want to upload files asynchronously and as quickly as possible.
  • uploadAndPoll / upload_and_poll: Uploads the file and polls until processing completes. Use this when you want to proceed only after processing has finished, for example when the upload is directly followed by a search.

To monitor progress without blocking, retrieve the file by ID and check its status.
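
If you upload without polling and later need to wait for processing, a generic polling loop is one option. The sketch below is not the SDK's implementation: `wait_until_processed` is a hypothetical helper, `get_status` stands in for whatever call retrieves the file by ID and reads its status, and the interval and timeout defaults are arbitrary.

```python
import time
from typing import Callable

# Statuses after which no further processing is expected
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def wait_until_processed(
    get_status: Callable[[], str],
    interval_s: float = 1.0,
    timeout_s: float = 300.0,
) -> str:
    """Call get_status repeatedly until it returns a terminal status or time runs out."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still {status!r} after {timeout_s} seconds")
        time.sleep(interval_s)


# Stand-in for a real lookup: replays a typical status progression
fake_statuses = iter(["pending", "in_progress", "in_progress", "completed"])
print(wait_until_processed(lambda: next(fake_statuses), interval_s=0.01))
# prints "completed"
```

Using a monotonic clock for the deadline keeps the timeout unaffected by system clock adjustments.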

Store File Status

  • pending: File uploaded and queued for processing
  • in_progress: Active processing (extraction, chunking, embedding, indexing)
  • completed: Successfully processed and searchable
  • failed: Processing error (check last_error for details)
  • cancelled: Processing stopped manually or automatically

You can check the status of any file by retrieving it by ID.
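
Once you have a file record, the statuses above map naturally onto a small decision function. This is a hedged sketch: the record is modelled as a plain mapping, only the `status` and `last_error` field names come from the list above, and `StoreFileStatus`, `summarize`, and the sample error message are hypothetical.

```python
from enum import Enum
from typing import Any, Mapping


class StoreFileStatus(str, Enum):
    PENDING = "pending"
    IN_PROGRESS = "in_progress"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELLED = "cancelled"


def summarize(file: Mapping[str, Any]) -> str:
    """Turn a file record into a one-line, human-readable summary."""
    status = StoreFileStatus(file["status"])  # raises ValueError on an unknown status
    if status is StoreFileStatus.COMPLETED:
        return "searchable"
    if status is StoreFileStatus.FAILED:
        return f"failed: {file.get('last_error') or 'no error details'}"
    if status is StoreFileStatus.CANCELLED:
        return "cancelled"
    return "still processing"


# The error text is a made-up example value
print(summarize({"status": "failed", "last_error": "unsupported file type"}))
# prints "failed: unsupported file type"
```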

Last updated: November 18, 2025