Mixedbread

File Ingestion

Upload files in a wide range of formats to your Vector Store and make them searchable as soon as processing completes. Vector Stores automatically process PDFs, images, documents, code, and more, turning your unstructured data into searchable content.

Basic File Ingestion

Upload a single file to your Vector Store. Once the file is uploaded, the Vector Store processes it automatically (extraction, chunking, embedding, and indexing) and makes it searchable with natural language queries.
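As a rough illustration of the upload step, here is a minimal sketch in Python. It does not use the real Mixedbread SDK: the `VectorStoreClient` interface, its `upload_file` method, the `InMemoryClient` stand-in, and the fields of the returned record (other than the `pending` status) are all hypothetical names chosen for this example.

```python
from pathlib import Path
from typing import Any, Dict, Protocol


class VectorStoreClient(Protocol):
    """Hypothetical client interface (an assumption, not the real SDK)."""

    def upload_file(self, store_id: str, filename: str, content: bytes) -> Dict[str, Any]:
        ...


def ingest_file(client: VectorStoreClient, store_id: str, path: str) -> Dict[str, Any]:
    """Read a local file and hand its bytes to the client for ingestion."""
    file_path = Path(path)
    if not file_path.is_file():
        raise FileNotFoundError(f"No such file: {file_path}")
    return client.upload_file(store_id, file_path.name, file_path.read_bytes())


class InMemoryClient:
    """Stand-in client so the sketch runs without network access."""

    def __init__(self) -> None:
        self.files: Dict[str, Dict[str, Any]] = {}

    def upload_file(self, store_id: str, filename: str, content: bytes) -> Dict[str, Any]:
        file_id = f"file_{len(self.files) + 1}"
        record = {
            "id": file_id,
            "vector_store_id": store_id,
            "filename": filename,
            "size_bytes": len(content),
            # A newly uploaded file starts out queued for processing.
            "status": "pending",
        }
        self.files[file_id] = record
        return record
```

Calling `ingest_file(InMemoryClient(), "my-store", "report.pdf")` on an existing file would return a record whose `status` is `pending`. With a real client, the equivalent call would go through the SDK or HTTP API instead.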

For complete details on the file object structure, including all properties and status values, see the API reference.

Ingestion Options

File Types

Vector Stores handle a wide range of data formats:

  • Documents: PDFs (including scanned), Word docs, text files, Markdown
  • Presentations: PowerPoint slides with images and text
  • Code: Python, JavaScript, TypeScript, Java, C#
  • Images: Photos, diagrams, charts with OCR and visual understanding
  • Specialized: Pre-structured JSON/JSONL formats

Each file type is automatically processed to understand both text and visual content, preserving context and meaning.

For the complete list of supported formats and processing details, see the supported file types documentation.

Metadata

Attach structured metadata to your files for better organization and filtering. Metadata is inherited by all chunks created from the file, which enables precise filtering during search operations.

Supported Metadata Types:

  • String values: Categories, names, tags, status values
  • Numeric values: Scores, versions, counts, measurements
  • Boolean values: Flags, permissions, feature toggles
  • Date/time values: Timestamps, deadlines, publishing dates
  • Array values: Multiple values, collections, tags
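To make the value kinds above concrete, here is a minimal Python sketch that checks a metadata dictionary before upload. The helper names (`normalize_metadata`, `_normalize_value`) are hypothetical, and two behaviors are assumptions of this sketch rather than documented rules: dates are serialized as ISO 8601 strings, and arrays may not contain nested arrays.

```python
from datetime import date, datetime
from typing import Any, Dict


def _normalize_value(key: str, value: Any, in_array: bool = False) -> Any:
    # Strings, booleans, and numbers pass through unchanged.
    if isinstance(value, (str, bool, int, float)):
        return value
    # Assumption: date/time values are sent as ISO 8601 strings.
    if isinstance(value, (datetime, date)):
        return value.isoformat()
    # Assumption: arrays hold scalar values only (no nested arrays).
    if isinstance(value, (list, tuple)) and not in_array:
        return [_normalize_value(key, item, in_array=True) for item in value]
    raise TypeError(f"Unsupported metadata value for {key!r}: {type(value).__name__}")


def normalize_metadata(metadata: Dict[str, Any]) -> Dict[str, Any]:
    """Validate each value and return a JSON-ready copy of the metadata."""
    return {key: _normalize_value(key, value) for key, value in metadata.items()}


example = normalize_metadata(
    {
        "category": "contracts",           # string
        "version": 3,                      # numeric
        "reviewed": True,                  # boolean
        "published_at": date(2025, 7, 15), # date/time
        "tags": ["legal", "2025"],         # array
    }
)
```

Validating on the client side like this surfaces an unsupported value (for example, a nested dictionary) as a `TypeError` before anything is uploaded.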

For details on supported metadata types, see the metadata documentation.

Learn more about how to search with metadata filtering.

Parsing Strategy

Control how your files are processed by choosing a parsing strategy.

Available Parsing Strategies:

  • fast (default): Optimized for speed, good quality for most documents
  • high_quality: Enhanced processing for complex layouts and formatting
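The two strategies above could be selected when assembling upload options. The sketch below is only an illustration: the `build_upload_options` helper and the `parsing_strategy` and `metadata` field names are assumptions for this example, not confirmed request parameters.

```python
from typing import Any, Dict, Optional

# The two strategy names listed above; "fast" is the default.
PARSING_STRATEGIES = ("fast", "high_quality")


def build_upload_options(
    metadata: Optional[Dict[str, Any]] = None,
    parsing_strategy: str = "fast",
) -> Dict[str, Any]:
    """Assemble an options dictionary, rejecting unknown strategies early."""
    if parsing_strategy not in PARSING_STRATEGIES:
        raise ValueError(
            f"Unknown parsing strategy {parsing_strategy!r}; "
            f"expected one of {PARSING_STRATEGIES}"
        )
    # Field names are hypothetical; adapt them to the actual API.
    options: Dict[str, Any] = {"parsing_strategy": parsing_strategy}
    if metadata:
        options["metadata"] = dict(metadata)
    return options


default_options = build_upload_options()
scanned_pdf_options = build_upload_options(
    metadata={"category": "invoices"},
    parsing_strategy="high_quality",
)
```

A reasonable rule of thumb under these descriptions is to keep `fast` for plain text and simple documents, and switch to `high_quality` for files with complex layouts and formatting.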

Vector Store File Status

  • pending: File uploaded and queued for processing
  • in_progress: Active processing (extraction, chunking, embedding, indexing)
  • completed: Successfully processed and searchable
  • failed: Processing error (check last_error for details)
  • cancelled: Processing stopped manually or automatically

You can retrieve the status of any file through the API.
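Because processing is asynchronous, a common pattern is to poll the file until it reaches a terminal status. The Python sketch below shows one way to do that. The `wait_for_file` helper and its `fetch_file` callback are hypothetical: the callback stands in for whatever call retrieves the file object, and the sketch assumes the object is a dictionary with `status` and `last_error` keys.

```python
import time
from typing import Any, Callable, Dict

# Statuses after which no further processing happens.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def wait_for_file(
    fetch_file: Callable[[], Dict[str, Any]],
    poll_interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll fetch_file() until the file reaches a terminal status."""
    deadline = time.monotonic() + timeout
    while True:
        file = fetch_file()
        status = file["status"]
        if status == "failed":
            # last_error carries the failure details.
            raise RuntimeError(f"Ingestion failed: {file.get('last_error')}")
        if status in TERMINAL_STATUSES:
            return file
        if time.monotonic() >= deadline:
            raise TimeoutError(f"File still {status!r} after {timeout} seconds")
        sleep(poll_interval)


# Simulated status sequence so the sketch runs without network access.
_statuses = iter(["pending", "in_progress", "completed"])
result = wait_for_file(
    lambda: {"status": next(_statuses)},
    poll_interval=0.0,
    sleep=lambda seconds: None,
)
```

The function returns on `completed` or `cancelled` and raises on `failed`, so callers that need to distinguish a cancelled file should check `result["status"]` before searching.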

Last updated: July 15, 2025