File Ingestion
Upload files to your Stores to make them searchable. Stores support a variety of file formats and languages: you upload your data and the Store handles the processing. Let's start with the upload method.
Basic File Ingestion
Upload a single file to your Store:
For complete details on the file object structure including all properties and status values, see Data Models.
Ingestion Options
External ID
You can optionally assign a unique external ID to the file. This identifier lets you reference the file later and makes it easy to replace or update the same file by using the same external ID. The ID can be any string value, though we recommend using a relative file path to mirror your repository's structure.
Overwrite Behavior: By default, uploading a file with an existing external_id will overwrite the previous file. You can control this behavior with the overwrite parameter:
- overwrite: true (default): Replaces the existing file with the same external_id
- overwrite: false: Prevents overwriting and returns an error if the external_id already exists
This makes it easy to update files in your Store while preventing accidental overwrites when needed.
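The overwrite rule above can be modeled in a few lines. This is a minimal sketch, not the SDK: `InMemoryStore`, `ExternalIdConflict`, and `external_id_for` are hypothetical names invented for illustration, and the real service may report conflicts differently. It also shows one way to derive an external ID from a relative file path, as recommended above.

```python
from pathlib import PurePath, PurePosixPath


class ExternalIdConflict(Exception):
    """Raised when overwrite is disabled and the external_id already exists."""


class InMemoryStore:
    """Toy stand-in for a Store (hypothetical); it only models the overwrite rule."""

    def __init__(self) -> None:
        self.files: dict[str, bytes] = {}

    def upload(self, content: bytes, external_id: str, overwrite: bool = True) -> str:
        exists = external_id in self.files
        if exists and not overwrite:
            # overwrite=False: refuse to replace an existing file.
            raise ExternalIdConflict(f"external_id already exists: {external_id}")
        # overwrite=True (the default): create the file or replace the old one.
        self.files[external_id] = content
        return "replaced" if exists else "created"


def external_id_for(path: PurePath, repo_root: PurePath) -> str:
    """Derive an external ID from the path relative to the repository root."""
    return path.relative_to(repo_root).as_posix()


store = InMemoryStore()
eid = external_id_for(PurePosixPath("/repo/docs/guide.md"), PurePosixPath("/repo"))
first = store.upload(b"version 1", eid)
second = store.upload(b"version 2", eid)
```

Using the relative path as the ID means re-running the same upload script after a file changes replaces the stored copy instead of creating a duplicate.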
File Types
Stores understand a wide range of data formats:
- Documents: PDFs (including scanned), Word docs, text files, Markdown
- Presentations: PowerPoint slides with images and text
- Code: Python, JavaScript, TypeScript, Java, C#
- Images: Photos, diagrams, charts with OCR and visual understanding
- Specialized: Pre-structured JSON/JSONL formats
Each file type is automatically processed to understand both text and visual content, preserving context and meaning.
For the complete list of supported formats and processing details, see Supported File Types.
Metadata
Enhance your files with structured metadata for better organization and filtering:
Metadata is structured information about your files that supports filtering, organization, and search. It is inherited by all chunks created from the file, so you can filter precisely during search operations.
Supported Metadata Types:
- String values: Categories, names, tags, status values
- Numeric values: Scores, versions, counts, measurements
- Boolean values: Flags, permissions, feature toggles
- Date/time values: Timestamps, deadlines, publishing dates
- Array values: Multiple values, collections, tags
For details on each type, see Supported Metadata Types.
To search using metadata, see Metadata Filtering.
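Before uploading, it can help to check that metadata values fall within the types listed above. The following is a rough client-side sketch under stated assumptions: `validate_metadata` is a hypothetical helper, not part of the SDK, and converting dates to ISO 8601 strings is an assumption about serialization rather than a documented requirement.

```python
from datetime import date, datetime
from typing import Any, Mapping

# Scalar types matching the list above: strings, numbers, booleans, dates.
SCALAR_TYPES = (str, int, float, bool, date, datetime)


def _normalize(value: Any) -> Any:
    # datetime is a subclass of date, so one check covers both.
    # ISO 8601 output is an assumption, not a documented wire format.
    if isinstance(value, date):
        return value.isoformat()
    return value


def validate_metadata(metadata: Mapping[str, Any]) -> dict[str, Any]:
    """Reject values outside the listed types and normalize dates."""
    cleaned: dict[str, Any] = {}
    for key, value in metadata.items():
        if isinstance(value, (list, tuple)):
            # Arrays are accepted when every element is a supported scalar.
            if not all(isinstance(item, SCALAR_TYPES) for item in value):
                raise TypeError(f"unsupported element type in array {key!r}")
            cleaned[key] = [_normalize(item) for item in value]
        elif isinstance(value, SCALAR_TYPES):
            cleaned[key] = _normalize(value)
        else:
            raise TypeError(f"unsupported type for {key!r}: {type(value).__name__}")
    return cleaned


metadata = validate_metadata({
    "category": "guide",          # string
    "version": 3,                 # numeric
    "published": True,            # boolean
    "updated_at": date(2024, 5, 1),  # date
    "tags": ["onboarding", "api"],   # array
})
```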
Parsing Strategy
Control how your files are processed with parsing strategy options (see examples above):
Available Parsing Strategies:
- fast (default): Optimized for speed, good quality for most documents
- high_quality: Enhanced processing for complex layouts and formatting
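One way to keep the ingestion options together is a small options object that validates the parsing strategy before any request is made. This is an illustrative sketch only: `IngestionOptions` and `to_payload` are hypothetical, and the field names `metadata` and `parsing_strategy` are assumptions (only `external_id` and `overwrite` are named on this page). Check the SDK reference for the actual parameter names.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Optional

# The two strategies described above.
PARSING_STRATEGIES = ("fast", "high_quality")


@dataclass
class IngestionOptions:
    """Hypothetical container for per-file ingestion options."""

    external_id: Optional[str] = None
    overwrite: bool = True  # documented default
    metadata: dict[str, Any] = field(default_factory=dict)
    parsing_strategy: str = "fast"  # documented default

    def __post_init__(self) -> None:
        if self.parsing_strategy not in PARSING_STRATEGIES:
            raise ValueError(
                f"parsing_strategy must be one of {PARSING_STRATEGIES}, "
                f"got {self.parsing_strategy!r}"
            )

    def to_payload(self) -> dict[str, Any]:
        # Omit options that were never set (currently only external_id).
        return {k: v for k, v in asdict(self).items() if v is not None}


# A scanned report with a complex layout: opt in to high_quality.
options = IngestionOptions(
    external_id="reports/q3.pdf",
    parsing_strategy="high_quality",
)
payload = options.to_payload()
```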
Upload Helpers
The SDK provides two convenient helpers for file ingestion:
- upload: Uploads the file and immediately returns the created Store File. The file's status begins as pending and progresses to in_progress and then completed in the background. Use this when you want to upload files asynchronously and as fast as possible.
- uploadAndPoll / upload_and_poll: Uploads the file and polls until processing completes. Use this when you want to proceed only after processing has finished, for example when the upload is directly followed by a search.
To monitor progress without blocking, retrieve the file by ID and check its status using Retrieve Store File.
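The polling behavior can be sketched generically. This is not the SDK's implementation of upload_and_poll: `wait_until_processed` is a hypothetical helper, and `retrieve` stands in for whatever call fetches the Store File (for example, a wrapper around Retrieve Store File). The interval and timeout defaults are arbitrary choices for the sketch; the status names are the ones listed under Store File Status.

```python
import time
from typing import Callable, Mapping

# Statuses after which processing will not progress further.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def wait_until_processed(
    retrieve: Callable[[], Mapping[str, object]],
    interval_s: float = 1.0,
    timeout_s: float = 300.0,
) -> Mapping[str, object]:
    """Call retrieve() until the file reaches a terminal status or time runs out."""
    deadline = time.monotonic() + timeout_s
    while True:
        file = retrieve()
        if file["status"] in TERMINAL_STATUSES:
            # The caller decides how to handle failed or cancelled files.
            return file
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still {file['status']} after {timeout_s} seconds")
        time.sleep(interval_s)


# Simulated responses standing in for successive retrieve calls.
responses = iter([
    {"status": "pending"},
    {"status": "in_progress"},
    {"status": "completed"},
])
result = wait_until_processed(lambda: next(responses), interval_s=0.0, timeout_s=5.0)
```

Returning on failed and cancelled as well as completed keeps the loop from waiting out the full timeout on a file that will never become searchable.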
Store File Status
- pending: File uploaded and queued for processing
- in_progress: Active processing (extraction, chunking, embedding, indexing)
- completed: Successfully processed and searchable
- failed: Processing error (check last_error for details)
- cancelled: Processing stopped manually or automatically
You can retrieve the status of any file using Retrieve Store File.