Mixedbread

Data Models

Knowing the core data structures in Mixedbread Vector Stores helps you work effectively with the API and see how your content is organized and retrieved.

Vector Store

A Vector Store is the primary container for your searchable content. It holds your files, manages access permissions, and provides the foundation for semantic search operations.

Vector Store Properties

| Property | Type | Description |
| --- | --- | --- |
| id | string | Unique identifier for the Vector Store |
| name | string | User-defined name that serves as an identifier |
| description | string | Optional description of the Vector Store's purpose |
| is_public | boolean | Whether the Vector Store is publicly accessible |
| metadata | object | Additional metadata associated with the Vector Store |
| file_counts | object | Counts of files in different processing states |
| expires_after | object | Expiration configuration based on activity |
| status | enum | Current status: expired, in_progress, completed |
| created_at | string | ISO timestamp when the Vector Store was created |
| updated_at | string | ISO timestamp when the Vector Store was last updated |
| last_active_at | string | ISO timestamp of the last activity |
| usage_bytes | integer | Total storage space used by indexed content |
| expires_at | string | Computed expiration timestamp (if expires_after is set) |
| object | string | Always "vector_store" |

File Counts Object

The file_counts object provides a detailed breakdown of file processing states:

| Property | Type | Description |
| --- | --- | --- |
| pending | integer | Number of files waiting to be processed |
| in_progress | integer | Number of files currently being processed |
| cancelled | integer | Number of files whose processing was cancelled |
| completed | integer | Number of successfully processed files |
| failed | integer | Number of files that failed processing |
| total | integer | Total number of files |
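
As a rough illustration of how these counts might be used on the client side, the Python sketch below derives a progress summary from a file_counts object. The helper name summarize_file_counts and the grouping of completed, failed, and cancelled as "finished" are assumptions made for this example, not part of the Mixedbread API; the sample values mirror the Vector Store Example in this section.

```python
from typing import TypedDict


class FileCounts(TypedDict):
    """Shape of the file_counts object as described in the table above."""
    pending: int
    in_progress: int
    cancelled: int
    completed: int
    failed: int
    total: int


def summarize_file_counts(counts: FileCounts) -> dict:
    """Hypothetical helper: summarize processing progress for a Vector Store."""
    # Assumption: files in these three states need no further processing.
    finished = counts["completed"] + counts["failed"] + counts["cancelled"]
    total = counts["total"]
    return {
        "finished": finished,
        "remaining": counts["pending"] + counts["in_progress"],
        # Guard against division by zero for an empty Vector Store.
        "percent_completed": 100.0 * counts["completed"] / total if total else 0.0,
    }


counts: FileCounts = {
    "pending": 2,
    "in_progress": 1,
    "cancelled": 0,
    "completed": 10,
    "failed": 0,
    "total": 13,
}
summary = summarize_file_counts(counts)
print(summary["remaining"])  # 3
```

A summary like this can drive a progress indicator while files are still being processed.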

For detailed configuration options including expiration policies and public access, see .

Vector Store Example

{
  "id": "vs_xyz789",
  "name": "product-documentation",
  "description": "Complete product documentation and API reference",
  "is_public": false,
  "metadata": {
    "category": "documentation",
    "language": "en"
  },
  "file_counts": {
    "pending": 2,
    "in_progress": 1,
    "cancelled": 0,
    "completed": 10,
    "failed": 0,
    "total": 13
  },
  "expires_after": {
    "anchor": "last_active_at",
    "days": 30
  },
  "status": "in_progress",
  "created_at": "2024-01-15T10:00:00Z",
  "updated_at": "2024-01-20T14:30:00Z",
  "last_active_at": "2024-01-20T14:30:00Z",
  "usage_bytes": 1048576,
  "expires_at": "2024-02-19T14:30:00Z",
  "object": "vector_store"
}
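
In the example above, expires_at falls exactly 30 days after last_active_at, which matches the expires_after configuration. The Python sketch below reproduces that arithmetic. It assumes that expires_at is the anchor timestamp plus the configured number of days; the source only shows one example of this relationship, and the helper names parse_timestamp and estimate_expires_at are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO timestamp such as 2024-01-20T14:30:00Z."""
    # Older Python versions reject a trailing "Z" in fromisoformat.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def estimate_expires_at(store: dict) -> Optional[datetime]:
    """Hypothetical helper: anchor timestamp + days (an assumed rule)."""
    policy = store.get("expires_after")
    if not policy:
        return None  # No expiration configured.
    anchor_value = store.get(policy["anchor"])
    if anchor_value is None:
        return None
    return parse_timestamp(anchor_value) + timedelta(days=policy["days"])


store = {
    "last_active_at": "2024-01-20T14:30:00Z",
    "expires_after": {"anchor": "last_active_at", "days": 30},
    "expires_at": "2024-02-19T14:30:00Z",
}
estimated = estimate_expires_at(store)
print(estimated == parse_timestamp(store["expires_at"]))  # True
```

The value returned by the API remains authoritative; a local estimate like this is only useful for display or sanity checks.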

Vector Store File

A Vector Store File represents a complete file that you've uploaded to a Vector Store. It tracks the file's processing status, metadata, and relationship to the searchable chunks created from its content.

File Properties

| Property | Type | Description |
| --- | --- | --- |
| id | string | Unique identifier for the file within the Vector Store |
| filename | string | Original name of the uploaded file |
| metadata | object | Custom key-value pairs you've attached to the file |
| status | enum | Current processing status of the file |
| last_error | object | Details about any processing errors that occurred |
| vector_store_id | string | ID of the Vector Store containing this file |
| created_at | string | ISO timestamp when the file was added to the Vector Store |
| version | integer | Version number of the file within the Vector Store |
| usage_bytes | integer | Storage space used by the file's indexed data |
| object | string | Always "vector_store.file" |

For detailed information on file processing lifecycle and status meanings, see .

For guidance on metadata structure and types, see .

File Example

{
  "id": "file_abc123",
  "filename": "product-documentation.pdf",
  "metadata": {
    "category": "documentation",
    "department": "product",
    "version": "2.1",
    "last_updated": "2024-01-15"
  },
  "status": "completed",
  "last_error": null,
  "vector_store_id": "vs_xyz789",
  "created_at": "2024-01-15T10:30:00Z",
  "version": 1,
  "usage_bytes": 245760,
  "object": "vector_store.file"
}
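
One way to work with a file record in client code is to load it into a typed structure. The Python sketch below is a minimal example of that idea. The VectorStoreFile class and its is_ready method are hypothetical and not part of any Mixedbread SDK. The readiness check assumes that "completed" with no last_error means the file is searchable, since "completed" is the only file status value shown here.

```python
from dataclasses import dataclass, field, fields
from typing import Any, Dict, Optional


@dataclass
class VectorStoreFile:
    """Hypothetical client-side model of a Vector Store File record."""
    id: str
    filename: str
    status: str
    vector_store_id: str
    created_at: str
    version: int
    usage_bytes: int
    metadata: Dict[str, Any] = field(default_factory=dict)
    last_error: Optional[Dict[str, Any]] = None

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "VectorStoreFile":
        # Keep only the fields modeled above; the constant "object" field
        # and any unrecognized keys are dropped.
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in data.items() if k in known})

    def is_ready(self) -> bool:
        # Assumption: a completed file without an error is searchable.
        return self.status == "completed" and self.last_error is None


record = {
    "id": "file_abc123",
    "filename": "product-documentation.pdf",
    "metadata": {"category": "documentation", "department": "product"},
    "status": "completed",
    "last_error": None,
    "vector_store_id": "vs_xyz789",
    "created_at": "2024-01-15T10:30:00Z",
    "version": 1,
    "usage_bytes": 245760,
    "object": "vector_store.file",
}
file = VectorStoreFile.from_dict(record)
print(file.is_ready(), file.usage_bytes / 1024)  # True 240.0
```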

Vector Store Chunk

A Vector Store Chunk represents a searchable segment of content created from a Vector Store File. When you search, you get back chunks that contain the most relevant portions of your files.

Chunk Properties

| Property | Type | Description |
| --- | --- | --- |
| chunk_index | integer | Position of this chunk within the source file |
| mime_type | string | Content type of the chunk (text/plain, image/png, etc.) |
| model | string | Model used to generate the chunk's vector |
| score | number | Relevance score for this chunk (in search results) |
| file_id | string | ID of the file this chunk came from |
| filename | string | Name of the source file |
| vector_store_id | string | ID of the Vector Store containing this chunk |
| metadata | object | Metadata inherited from the source file |
| type | enum | Type of content: text, image_url, audio_url, video_url |

Content-Specific Properties

Text Chunks

| Property | Type | Description |
| --- | --- | --- |
| text | string | Text content of the chunk |

Image Chunks

| Property | Type | Description |
| --- | --- | --- |
| image_url | object | Image URL and format information |
| ocr_text | string | Text extracted from images via OCR |
| summary | string | AI-generated summary of the image content |

Audio Chunks

| Property | Type | Description |
| --- | --- | --- |
| audio_url | object | Audio URL and format information |
| transcription | string | Speech-to-text transcription of the audio |
| summary | string | AI-generated summary of the audio content |

Video Chunks

| Property | Type | Description |
| --- | --- | --- |
| video_url | object | Video URL and format information |
| transcription | string | Speech-to-text transcription of the video |
| summary | string | AI-generated summary of the video content |

Chunk Types

Text Chunks

{
  "type": "text",
  "text": "User authentication in our API requires a valid API key...",
  "chunk_index": 2,
  "mime_type": "text/plain",
  "score": 0.89
}

Image Chunks

{
  "type": "image_url",
  "image_url": {
    "url": "https://signed-url-to-image.com/chunk_img_123",
    "format": "png"
  },
  "ocr_text": "Figure 1: Authentication Flow Diagram",
  "summary": "A diagram showing the authentication flow process",
  "chunk_index": 5,
  "mime_type": "image/png",
  "score": 0.76
}

Audio Chunks

{
  "type": "audio_url",
  "audio_url": {
    "url": "https://signed-url-to-audio.com/chunk_audio_456"
  },
  "transcription": "Welcome to our product overview. In this section, we'll cover...",
  "summary": "Product overview introduction discussing key features",
  "chunk_index": 3,
  "mime_type": "audio/mpeg",
  "score": 0.82
}

Video Chunks

{
  "type": "video_url",
  "video_url": {
    "url": "https://signed-url-to-video.com/chunk_video_789"
  },
  "transcription": "Hello everyone, today we're going to demonstrate...",
  "summary": "Product demonstration video showing core functionality",
  "chunk_index": 1,
  "mime_type": "video/mp4",
  "score": 0.88
}
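
Because each chunk type carries its content in different fields, client code usually branches on the type value. The Python sketch below shows one possible way to do that. The function names readable_text and media_url are hypothetical, and the fallback order (OCR text or transcription first, then summary) is a choice made for this example rather than documented behavior.

```python
from typing import Optional

MEDIA_TYPES = ("image_url", "audio_url", "video_url")


def readable_text(chunk: dict) -> str:
    """Hypothetical helper: return the most direct textual content of a chunk."""
    kind = chunk.get("type")
    if kind == "text":
        return chunk.get("text", "")
    if kind == "image_url":
        # Prefer extracted OCR text, fall back to the generated summary.
        return chunk.get("ocr_text") or chunk.get("summary") or ""
    if kind in ("audio_url", "video_url"):
        # Prefer the transcription, fall back to the generated summary.
        return chunk.get("transcription") or chunk.get("summary") or ""
    raise ValueError(f"Unknown chunk type: {kind!r}")


def media_url(chunk: dict) -> Optional[str]:
    """Hypothetical helper: return the media URL, or None for text chunks."""
    kind = chunk.get("type")
    if kind in MEDIA_TYPES:
        # The URL object is stored under a key named after the chunk type.
        return chunk[kind]["url"]
    return None


image_chunk = {
    "type": "image_url",
    "image_url": {"url": "https://signed-url-to-image.com/chunk_img_123", "format": "png"},
    "ocr_text": "Figure 1: Authentication Flow Diagram",
    "summary": "A diagram showing the authentication flow process",
}
print(readable_text(image_chunk))  # Figure 1: Authentication Flow Diagram
print(media_url(image_chunk))      # https://signed-url-to-image.com/chunk_img_123
```

Raising on an unknown type keeps the code from silently ignoring any chunk types it does not handle.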

Complete Chunk Example

{
  "chunk_index": 3,
  "mime_type": "text/plain",
  "model": "mixedbread-ai/mxbai-omni-v1",
  "score": 0.92,
  "file_id": "file_abc123",
  "filename": "product-documentation.pdf",
  "vector_store_id": "vs_xyz789",
  "metadata": {
    "category": "documentation",
    "department": "product"
  },
  "type": "text",
  "text": "To authenticate API requests, include your API key in the Authorization header: Authorization: Bearer YOUR_API_KEY. The API key identifies your account and provides access to your organization's resources."
}
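
Since every chunk carries its file_id and score, search results can be regrouped by source file on the client. The Python sketch below keeps the highest-scoring chunk per file and orders files by that score. It assumes that a higher score means a more relevant chunk. The function name best_chunk_per_file, the second file ID, and the extra scores are made up for illustration.

```python
from typing import Dict, List


def best_chunk_per_file(chunks: List[dict]) -> List[dict]:
    """Hypothetical helper: keep the top-scoring chunk for each source file."""
    best: Dict[str, dict] = {}
    for chunk in chunks:
        current = best.get(chunk["file_id"])
        # Assumption: a higher score indicates a more relevant chunk.
        if current is None or chunk["score"] > current["score"]:
            best[chunk["file_id"]] = chunk
    # Order files by their best chunk, most relevant first.
    return sorted(best.values(), key=lambda c: c["score"], reverse=True)


# Illustrative search results; only the first entry comes from the example above.
results = [
    {"file_id": "file_abc123", "chunk_index": 3, "score": 0.92},
    {"file_id": "file_def456", "chunk_index": 0, "score": 0.85},
    {"file_id": "file_abc123", "chunk_index": 7, "score": 0.71},
]
top = best_chunk_per_file(results)
print([(c["file_id"], c["chunk_index"]) for c in top])
# [('file_abc123', 3), ('file_def456', 0)]
```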

Last updated: July 14, 2025