Manage Files

Once files are uploaded to your Store, you can inspect and manage them using these core operations. All operations work with either the file ID or the file's external ID (if provided during upload).

Retrieve Store File

Get detailed information about a specific file using either its ID or external ID:

Retrieve Store File Metadata
from mixedbread import Mixedbread

mxbai = Mixedbread(api_key="YOUR_API_KEY")

file = mxbai.stores.files.retrieve(
    store_identifier="my-knowledge-base",
    file_identifier="f47ac10b-58cc-4372-a567-0e02b2c3d479",
)

print(file)

By default, retrieve returns the file object and its file-level metadata. The chunks field is null unless you explicitly request chunks with return_chunks.

The response includes processing status, metadata, usage statistics, and error details if applicable.

For complete details on file object properties, see .
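As a rough illustration of reading the processing status and error details from a retrieved file, the sketch below condenses them into one line. The helper name describe_file is hypothetical, and the attribute names status and last_error are assumed from the status list in this guide rather than confirmed against the SDK types. Stand-in objects are used so the snippet runs without an API key.

```python
from types import SimpleNamespace


def describe_file(file) -> str:
    """Return a one-line summary of a store file's processing state.

    Assumes the file object exposes `status` and `last_error` attributes.
    """
    status = getattr(file, "status", None)
    if status == "failed":
        # Failed files are expected to carry details in last_error.
        return f"failed: {getattr(file, 'last_error', None)}"
    return str(status)


# Stand-in objects (not real API responses) so the sketch runs offline.
ok = SimpleNamespace(status="completed", last_error=None)
bad = SimpleNamespace(status="failed", last_error="unsupported file type")

print(describe_file(ok))
print(describe_file(bad))
```

With a real client, the object returned by mxbai.stores.files.retrieve(...) would be passed in place of the stand-ins.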

Retrieve File Chunks

Use return_chunks when you want the parsed, searchable representation of a file instead of only the file-level metadata.

Retrieve All Chunks from a File
from mixedbread import Mixedbread

mxbai = Mixedbread(api_key="YOUR_API_KEY")

file = mxbai.stores.files.retrieve(
    store_identifier="my-knowledge-base",
    file_identifier="f47ac10b-58cc-4372-a567-0e02b2c3d479",
    return_chunks=True,
)

for chunk in file.chunks or []:
    print(chunk.chunk_index, chunk.type)

Use this when you want to inspect parsed text, OCR output, transcriptions, or chunk-level generated_metadata.

For complete details on chunk fields and chunk types, see .
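When a file mixes several kinds of content, a quick tally of chunk types can show how it was parsed. The following is a minimal sketch: count_chunk_types is a hypothetical helper, and the type values in the stand-in data ("text", "image_url") are placeholders, not a list of the types the service actually emits.

```python
from collections import Counter
from types import SimpleNamespace


def count_chunk_types(file) -> Counter:
    """Count chunks per type; `chunks` is None when chunks were not requested."""
    return Counter(chunk.type for chunk in (file.chunks or []))


# Stand-in file object; the type values here are illustrative placeholders.
fake_file = SimpleNamespace(
    chunks=[
        SimpleNamespace(chunk_index=0, type="text"),
        SimpleNamespace(chunk_index=1, type="text"),
        SimpleNamespace(chunk_index=2, type="image_url"),
    ]
)

print(count_chunk_types(fake_file))
```

With a real client, pass the file returned by retrieve with return_chunks=True.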

Retrieve Specific Chunks by Index

return_chunks also accepts a list of chunk indices. This is useful when you want a small, exact slice of a file instead of every chunk.

Retrieve Specific Chunks by Index
from mixedbread import Mixedbread

mxbai = Mixedbread(api_key="YOUR_API_KEY")

file = mxbai.stores.files.retrieve(
    store_identifier="my-knowledge-base",
    file_identifier="f47ac10b-58cc-4372-a567-0e02b2c3d479",
    return_chunks=[0, 3, 7],
)

for chunk in file.chunks or []:
    print(chunk.chunk_index)

Chunk indices are zero-based and correspond to the file's parsed chunk order.
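Because indices are zero-based and follow the parsed order, neighbouring chunks can be requested by computing a small window around an index. The sketch below is one possible way to build such a list; chunk_window is a hypothetical helper, and how the service treats indices beyond the last chunk is not covered here, so the optional total parameter lets you clamp the range when the chunk count is known.

```python
from typing import List, Optional


def chunk_window(center: int, radius: int = 1, total: Optional[int] = None) -> List[int]:
    """Return zero-based chunk indices within `radius` of `center`."""
    start = max(0, center - radius)  # indices never go below zero
    stop = center + radius + 1
    if total is not None:
        stop = min(stop, total)  # clamp when the chunk count is known
    return list(range(start, stop))


print(chunk_window(0))            # [0, 1]
print(chunk_window(5, radius=2))  # [3, 4, 5, 6, 7]
```

The resulting list can be passed directly as return_chunks.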

Search results include both file_id and chunk_index, so you can use them to load the exact source chunk that matched a query.

Hydrate a Search Result Back to Its Source Chunk
from mixedbread import Mixedbread

mxbai = Mixedbread(api_key="YOUR_API_KEY")

results = mxbai.stores.search(
    query="authentication timeout",
    store_identifiers=["my-knowledge-base"],
    top_k=1,
)

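# Stop early if the query matched nothing.
if not results.data:
    raise SystemExit("No results for this query")

match = results.data[0]
file = mxbai.stores.files.retrieve(
    store_identifier="my-knowledge-base",
    file_identifier=match.file_id,
    return_chunks=[match.chunk_index],
)

# chunks can be None, so guard before indexing.
chunks = file.chunks or []
print(chunks[0] if chunks else "Chunk not available")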

This is the easiest way to go from a semantic search hit back to the precise chunk in the original file.
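When several hits come from the same file, one retrieve call per file may be enough if the indices are grouped first. The following is a sketch under that assumption, not an official pattern: hydrate_hits is a hypothetical helper, and it relies only on the file_id and chunk_index fields and on return_chunks accepting a list of indices, as described above.

```python
from collections import defaultdict


def hydrate_hits(client, store_identifier, hits):
    """Fetch the source chunk for each hit, using one retrieve call per file.

    Returns a dict keyed by (file_id, chunk_index).
    """
    # Group the requested chunk indices by the file they belong to.
    indices_by_file = defaultdict(set)
    for hit in hits:
        indices_by_file[hit.file_id].add(hit.chunk_index)

    chunks = {}
    for file_id, indices in indices_by_file.items():
        file = client.stores.files.retrieve(
            store_identifier=store_identifier,
            file_identifier=file_id,
            return_chunks=sorted(indices),
        )
        # chunks may be None, so fall back to an empty list.
        for chunk in file.chunks or []:
            chunks[(file_id, chunk.chunk_index)] = chunk
    return chunks
```

A call such as hydrate_hits(mxbai, "my-knowledge-base", results.data) would then return the chunks for every hit in a search response.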

File Status and Availability

To reliably inspect chunks, wait until the file reaches completed status.

  • pending: The file was accepted and queued for processing
  • in_progress: Parsing, chunking, embedding, and indexing are still running
  • completed: Chunks are ready to inspect and search
  • failed: Processing failed; inspect last_error for details
  • cancelled: Processing stopped before completion

If you need a file to be ready before continuing, use uploadAndPoll / upload_and_poll during ingestion or poll retrieve until the status becomes completed.
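If you poll retrieve yourself, the loop might look like the sketch below. wait_for_file is a hypothetical helper, the interval and timeout defaults are arbitrary, and the status attribute is assumed to hold the plain string values listed above. The loop stops on any terminal status, so the caller still needs to check whether the file completed or failed.

```python
import time

# Statuses after which processing will not progress any further.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def wait_for_file(client, store_identifier, file_identifier,
                  interval=2.0, timeout=300.0, sleep=time.sleep):
    """Poll retrieve until the file reaches a terminal status or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        file = client.stores.files.retrieve(
            store_identifier=store_identifier,
            file_identifier=file_identifier,
        )
        if file.status in TERMINAL_STATUSES:
            return file
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"{file_identifier} still {file.status} after {timeout}s"
            )
        sleep(interval)  # injectable so tests can skip real waiting
```

For example, file = wait_for_file(mxbai, "my-knowledge-base", file_id) followed by a check that file.status == "completed" before requesting chunks.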

List Store Files

View all files in your Store. The list operation uses cursor-based pagination:

List Store Files
from mixedbread import Mixedbread

mxbai = Mixedbread(api_key="YOUR_API_KEY")

response = mxbai.stores.files.list(
    store_identifier="my-knowledge-base",
    limit=20,
)

for file in response.data:
    print(file)

Metadata Filtering

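Pass metadata_filter to the list operation to return only the files whose metadata matches a condition. The example below keeps files whose category equals documentation.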

List Files with Metadata Filter
from mixedbread import Mixedbread

mxbai = Mixedbread(api_key="YOUR_API_KEY")

response = mxbai.stores.files.list(
    store_identifier="my-knowledge-base",
    limit=10,
    metadata_filter={
        "key": "category",
        "value": "documentation",
        "operator": "eq",
    },
)

for file in response.data:
    print(file)

Paginate and Filter Files

List all available files and filter them by status. This operation combines cursor-based pagination with the status filter to retrieve only the subsets of files you care about. For a complete explanation of cursor-based pagination options, see the .

Paginate Files with Status Filter
from mixedbread import Mixedbread

mxbai = Mixedbread(api_key="YOUR_API_KEY")

all_files = []

files = mxbai.stores.files.list(
    store_identifier="my-knowledge-base",
    limit=100,
    # change status to get different subsets
    statuses=["failed"],
)

all_files += files.data

while files.pagination.has_more:
    files = mxbai.stores.files.list(
        store_identifier="my-knowledge-base",
        limit=100,
        after=files.pagination.last_cursor,
        statuses=["failed"],
    )
    all_files += files.data

print(len(all_files), "files matched")
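The same loop can be wrapped in a generator so callers do not have to manage cursors. This is a minimal sketch: iter_store_files and its page_size parameter are hypothetical names, and it relies only on the list parameters and pagination fields used in the example above (limit, after, has_more, last_cursor).

```python
def iter_store_files(client, store_identifier, page_size=100, **filters):
    """Yield every file in a store, following cursors until has_more is False.

    Extra keyword arguments (for example statuses=["failed"]) are passed
    unchanged to each list call.
    """
    cursor = None
    while True:
        kwargs = dict(store_identifier=store_identifier, limit=page_size, **filters)
        if cursor is not None:
            kwargs["after"] = cursor  # only sent from the second page onward
        page = client.stores.files.list(**kwargs)
        yield from page.data
        if not page.pagination.has_more:
            break
        cursor = page.pagination.last_cursor
```

Usage would then reduce to something like: for file in iter_store_files(mxbai, "my-knowledge-base", statuses=["failed"]): print(file).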

Delete Store File

Remove a file from your Store:

Delete Store File
from mixedbread import Mixedbread

mxbai = Mixedbread(api_key="YOUR_API_KEY")

response = mxbai.stores.files.delete(
    store_identifier="my-knowledge-base",
    file_identifier="f47ac10b-58cc-4372-a567-0e02b2c3d479",
)

print(response)

Important: Deleting a file permanently removes:

  • The original file from storage
  • All generated chunks and embeddings
  • Associated metadata and search indexes
  • Processing history and logs
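Because deletion is permanent, a cleanup script may benefit from a dry-run mode that lists candidates before removing anything. The sketch below combines the list and delete operations shown in this guide; delete_files_by_status is a hypothetical helper, and it assumes each listed file exposes its identifier as an id attribute.

```python
def delete_files_by_status(client, store_identifier, status="failed", dry_run=True):
    """Collect files with the given status and delete them unless dry_run is set.

    Returns the list of matching file IDs (assumes each file has an `id`).
    """
    # Collect IDs first so deletions cannot disturb cursor-based pagination.
    ids = []
    kwargs = dict(store_identifier=store_identifier, limit=100, statuses=[status])
    page = client.stores.files.list(**kwargs)
    while True:
        ids.extend(file.id for file in page.data)
        if not page.pagination.has_more:
            break
        page = client.stores.files.list(after=page.pagination.last_cursor, **kwargs)

    if not dry_run:
        for file_id in ids:
            client.stores.files.delete(
                store_identifier=store_identifier,
                file_identifier=file_id,
            )
    return ids
```

Running it once with the default dry_run=True and reviewing the returned IDs before calling it again with dry_run=False keeps the irreversible step explicit.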

If you want to replace a file instead of deleting it first, upload new content with the same external_id. By default, uploads with the same external_id overwrite the previous version. For details, see .

Last updated: April 24, 2026