Metadata Filtering
Metadata filtering narrows results to the files whose attached metadata matches the conditions you specify.
Need to understand metadata types? This page covers filtering syntax and operations. For supported metadata types and structure, see Metadata Types.
Quick Example
Here's a simple example of filtering files by category:
from mixedbread import Mixedbread

mxbai = Mixedbread(api_key="YOUR_API_KEY")

response = mxbai.stores.files.list(
    store_identifier="my-knowledge-base",
    limit=10,
    metadata_filter={"key": "category", "value": "documentation", "operator": "eq"},
)

for file in response.data:
    print(file)

Filter Structure
Filters can be structured in two ways depending on your needs:
Single Field Filter (Direct Condition)
For simple single-field filtering, you can use a direct condition:
Example:
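As a minimal sketch, a direct condition can be written as a single dictionary with the same three fields the Quick Example uses ("key", "value", "operator"). The variable name is illustrative only.

```python
# A direct condition: one dictionary naming the field, the value to compare
# against, and the comparison operator. Field and value mirror the Quick Example.
single_filter = {
    "key": "category",
    "value": "documentation",
    "operator": "eq",
}

# This dictionary is what would be passed as the metadata_filter argument.
print(single_filter)
```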
Multiple Field Filter (Logical Operators)
For complex filtering with multiple conditions, use logical operators:
Example filter structure:
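A sketch of the multi-field shape, based on the "all" and "any" keys that appear in the Advanced Filtering Example on this page: a logical operator key maps to a list of conditions. The variable names are illustrative.

```python
# Two direct conditions, written exactly like single-field filters.
status_is_published = {"key": "status", "value": "published", "operator": "eq"}
priority_at_least_3 = {"key": "priority", "value": 3, "operator": "gte"}

# A logical operator key wraps a list of conditions (or nested groups).
multi_filter = {"all": [status_is_published, priority_at_least_3]}

print(multi_filter)
```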
Generated Metadata Fields
You can target auto-generated chunk metadata by prefixing the key with "generated_metadata." (including the trailing dot).
This works the same way as regular metadata filters and is especially useful for filtering on
values described in Generated Metadata.
Use dot notation to drill into nested structures, e.g. generated_metadata.chunk_headings.level.
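The prefixing rule can be sketched as follows. The helper generated_key is hypothetical (it is not part of the SDK), and the numeric value used for the heading level is an assumption about how that field is stored.

```python
GENERATED_PREFIX = "generated_metadata."


def generated_key(path: str) -> str:
    """Hypothetical helper: build a filter key that targets generated metadata."""
    return GENERATED_PREFIX + path


# Dot notation reaches into the nested structure named in the text above.
# Assumption: the heading level is stored as a number.
heading_filter = {
    "key": generated_key("chunk_headings.level"),
    "value": 1,
    "operator": "eq",
}

print(heading_filter["key"])
```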
Logical Operators
Combine multiple conditions using logical operators to create sophisticated filters:
All (AND Operation)
All conditions must be true:
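For instance, a filter that should match only files that are both in the documentation category and marked as reviewed might look like this. The field names are taken from the examples on this page; the combination itself is illustrative.

```python
# "all": every condition in the list has to match.
all_filter = {
    "all": [
        {"key": "category", "value": "documentation", "operator": "eq"},
        {"key": "reviewed", "value": True, "operator": "eq"},
    ]
}

print(all_filter)
```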
Any (OR Operation)
At least one condition must be true:
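A sketch of an OR filter that accepts either of two categories. The value "tutorial" is a made-up category used only for illustration.

```python
# "any": a file matches when at least one condition in the list matches.
any_filter = {
    "any": [
        {"key": "category", "value": "documentation", "operator": "eq"},
        {"key": "category", "value": "tutorial", "operator": "eq"},  # hypothetical value
    ]
}

print(any_filter)
```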
None (NOT Operation)
None of the conditions should be true:
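A sketch of an exclusion filter. It assumes the logical key is spelled "none", by analogy with the "all" and "any" keys shown in the Advanced Filtering Example; the status values are hypothetical.

```python
# Assumption: the NOT group uses the key "none", mirroring "all" and "any".
# A file matches only when no condition in the list matches.
none_filter = {
    "none": [
        {"key": "status", "value": "draft", "operator": "eq"},     # hypothetical value
        {"key": "status", "value": "archived", "operator": "eq"},  # hypothetical value
    ]
}

print(none_filter)
```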
Comparison Operators
Equality and Comparison Operators
Data Type Filtering
String Values
String comparisons are case-sensitive by default, so keep casing consistent in your metadata.
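One way to keep casing consistent is to normalize values before building the condition. This is a sketch, not part of the SDK: normalized_condition is a hypothetical helper, and it assumes your metadata was stored in lowercase.

```python
def normalized_condition(key: str, value: str) -> dict:
    """Hypothetical helper: build an "eq" condition with a trimmed, lowercased value."""
    return {"key": key, "value": value.strip().lower(), "operator": "eq"}


# User input with stray whitespace and capitals becomes the stored form.
condition = normalized_condition("category", "  Documentation ")
print(condition)
```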
Numeric Values
Integer and float comparisons are supported.
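A brief sketch with one integer and one float condition, both using the "gte" operator that appears elsewhere on this page. The "score" field is hypothetical.

```python
# Integer comparison: priority of 3 or higher.
int_condition = {"key": "priority", "value": 3, "operator": "gte"}

# Float comparison on a hypothetical "score" field.
float_condition = {"key": "score", "value": 0.75, "operator": "gte"}

numeric_filter = {"all": [int_condition, float_condition]}
print(numeric_filter)
```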
Boolean Values
Boolean fields can be matched against true or false.
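In Python the boolean is written as True or False. The sketch below uses the standard json module only to show how such a value serializes; how the SDK encodes requests internally is not covered here.

```python
import json

# Match files whose "reviewed" metadata field is true.
bool_condition = {"key": "reviewed", "value": True, "operator": "eq"}

# Python's True becomes JSON true when the dictionary is serialized.
payload = json.dumps(bool_condition)
print(payload)
```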
Date Values
ISO 8601 format is recommended for date values.
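A sketch of producing an ISO 8601 string with the standard library. The "published_at" field is hypothetical, and the example assumes the service compares ISO 8601 values chronologically with "gte".

```python
from datetime import datetime, timezone

# Build a timezone-aware cutoff and render it as an ISO 8601 string.
cutoff = datetime(2024, 1, 1, tzinfo=timezone.utc)

# Hypothetical field name; "gte" is the operator used elsewhere on this page.
date_condition = {
    "key": "published_at",
    "value": cutoff.isoformat(),
    "operator": "gte",
}

print(date_condition["value"])
```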
Array/List Values
Membership filtering is supported for array and list values.
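This page does not name the membership operator, so the sketch below makes two assumptions: that it is called "in", and that it matches when the field's value is one of the listed values. Check the operator reference before relying on either. The "tags" field and its values are hypothetical.

```python
# Assumption: "in" is the membership operator and takes a list as its value.
membership_condition = {
    "key": "tags",                 # hypothetical field
    "value": ["api", "tutorial"],  # hypothetical values
    "operator": "in",              # assumed operator name
}

print(membership_condition)
```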
Combined Logical Operations
Nested Conditions
Complex multi-level filtering example:
Advanced Filtering Example
Here's a practical example demonstrating complex nested filters:
from mixedbread import Mixedbread

mxbai = Mixedbread(api_key="YOUR_API_KEY")

# Match published files that are either high priority,
# or both in the "important" category and reviewed.
metadata_filter = {
    "all": [
        {"key": "status", "value": "published", "operator": "eq"},
        {
            "any": [
                {"key": "priority", "value": 3, "operator": "gte"},
                {
                    "all": [
                        {"key": "category", "value": "important", "operator": "eq"},
                        {"key": "reviewed", "value": True, "operator": "eq"},
                    ]
                },
            ]
        },
    ]
}

response = mxbai.stores.files.list(
    store_identifier="my-knowledge-base",
    limit=10,
    metadata_filter=metadata_filter,
)

for file in response.data:
    print(file)