ChromaDB is an open-source vector database designed for AI applications. It stores embeddings locally, runs without a server, and integrates with any Python project in minutes. This tutorial covers everything from a first collection to metadata filtering and LLM integration.
What Is ChromaDB?
A vector database stores data as high-dimensional vectors (embeddings) and lets you search by semantic similarity — not exact keyword match. You can ask “what documents are about machine learning?” and get relevant results even if none of them contain that phrase.
ChromaDB stands out because:
- No server needed — runs in-process or as a local server
- Persistent storage — data survives restarts
- Built-in embeddings — uses sentence-transformers out of the box
- Simple API — add, query, update, delete in a few lines
Installation
pip install chromadb
For embedding support (optional, used by the default embedding function):
pip install sentence-transformers
Your First Collection
A collection is a named group of vectors — like a table in a relational database. Here’s the full create → add → query cycle:
import chromadb
# In-memory client (data lost on restart)
client = chromadb.Client()
# Create a collection
collection = client.create_collection("my_docs")
# Add documents
collection.add(
    documents=[
        "Python is a high-level programming language.",
        "ChromaDB is an open-source vector database.",
        "Machine learning models require large datasets.",
        "Docker containers package apps with their dependencies.",
        "Claude is an AI assistant built by Anthropic.",
    ],
    ids=["doc1", "doc2", "doc3", "doc4", "doc5"],
)
# Query by semantic similarity
results = collection.query(
    query_texts=["What is a vector database?"],
    n_results=2,
)
for doc, dist in zip(results["documents"][0], results["distances"][0]):
    print(f"[{dist:.3f}] {doc}")
Output:
[0.312] ChromaDB is an open-source vector database.
[0.578] Python is a high-level programming language.
Persistent Storage
Use PersistentClient to keep data between runs. ChromaDB writes to disk automatically:
import chromadb
client = chromadb.PersistentClient(path="./chroma_db")
# Data persists across restarts
collection = client.get_or_create_collection("my_docs")
collection.add(
    documents=["This data will survive a restart."],
    ids=["persistent_doc1"],
)
print(f"Collection has {collection.count()} documents")
Embeddings: Default vs Custom
By default, ChromaDB uses all-MiniLM-L6-v2 from sentence-transformers. You can override it with any embedding function:
Sentence Transformers (local, free)
from chromadb.utils import embedding_functions
embed_fn = embedding_functions.SentenceTransformerEmbeddingFunction(
    model_name="all-MiniLM-L6-v2"  # fast, 384-dim
)
collection = client.get_or_create_collection(
    "docs",
    embedding_function=embed_fn,
)
OpenAI Embeddings
from chromadb.utils import embedding_functions
embed_fn = embedding_functions.OpenAIEmbeddingFunction(
    api_key="sk-...",
    model_name="text-embedding-3-small",
)
collection = client.get_or_create_collection(
    "docs_openai",
    embedding_function=embed_fn,
)
Use the same embedding function every time you open the same collection — mixing models breaks similarity search.
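Beyond the built-ins, you can wrap any embedding source yourself. A minimal sketch following Chroma's EmbeddingFunction interface; the vector logic here is a toy stand-in that you would replace with a real model call:
from chromadb import Documents, EmbeddingFunction, Embeddings
class MyEmbeddingFunction(EmbeddingFunction):
    def __call__(self, input: Documents) -> Embeddings:
        # Toy stand-in: a real implementation would call your model's
        # encode method here and return one vector per document
        return [[float(len(text)), float(text.count(" "))] for text in input]
collection = client.get_or_create_collection(
    "docs_custom",
    embedding_function=MyEmbeddingFunction(),
)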
Metadata and Filtering
Each document can carry a metadata dict. Use it to filter results without scanning every vector:
collection.add(
    documents=[
        "GPT-4 is a large language model by OpenAI.",
        "Claude Sonnet is fast and affordable.",
        "Llama 3 is an open-source model by Meta.",
        "Gemini Pro is Google's flagship model.",
    ],
    ids=["gpt4", "claude", "llama3", "gemini"],
    metadatas=[
        {"company": "OpenAI", "open_source": False},
        {"company": "Anthropic", "open_source": False},
        {"company": "Meta", "open_source": True},
        {"company": "Google", "open_source": False},
    ],
)
# Only search open-source models
results = collection.query(
    query_texts=["Which models are available for free?"],
    n_results=2,
    where={"open_source": True},
)
print(results["documents"][0])
# ['Llama 3 is an open-source model by Meta.']
Filter Operators
# Exact match
where={"company": "Anthropic"}
# Not equal
where={"company": {"$ne": "OpenAI"}}
# In a list
where={"company": {"$in": ["Anthropic", "Meta"]}}
# Combine with $and / $or
where={"$and": [{"open_source": True}, {"company": {"$ne": "Meta"}}]}Getting, Updating, and Deleting
Getting, Updating, and Deleting
# Get by ID
result = collection.get(ids=["doc1", "doc2"])
print(result["documents"])
# Update a document (replaces embedding + metadata)
collection.update(
    ids=["doc1"],
    documents=["Python is a versatile, high-level programming language."],
    metadatas=[{"updated": True}],
)
# Delete by ID
collection.delete(ids=["doc4"])
# Delete by metadata filter
collection.delete(where={"company": "OpenAI"})
# Count remaining
print(f"Documents: {collection.count()}")Full-Text Search on Metadata
Full-Text Filtering on Documents
ChromaDB also supports substring filtering on document text via where_document:
results = collection.query(
    query_texts=["AI assistant"],
    n_results=3,
    where_document={"$contains": "Anthropic"},
)
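The two filter types compose in a single query: where narrows by metadata, where_document by text. A quick sketch against the models collection from earlier:
results = collection.query(
    query_texts=["open models"],
    n_results=2,
    where={"open_source": True},
    where_document={"$contains": "Meta"},
)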
Working with Real Documents
A practical loader that chunks text files before indexing (a PDF variant is sketched after this example):
from pathlib import Path
import chromadb
from chromadb.utils import embedding_functions
def chunk_text(text: str, size: int = 400, overlap: int = 40) -> list[str]:
    words = text.split()
    chunks, i = [], 0
    while i < len(words):
        chunks.append(" ".join(words[i : i + size]))
        i += size - overlap
    return chunks
def index_file(path: str, collection) -> int:
    text = Path(path).read_text(encoding="utf-8")
    chunks = chunk_text(text)
    name = Path(path).name
    collection.add(
        documents=chunks,
        ids=[f"{name}_chunk_{j}" for j in range(len(chunks))],
        metadatas=[{"source": name, "chunk": j} for j in range(len(chunks))],
    )
    return len(chunks)
client = chromadb.PersistentClient(path="./chroma_db")
embed_fn = embedding_functions.SentenceTransformerEmbeddingFunction(
    model_name="all-MiniLM-L6-v2"
)
col = client.get_or_create_collection("knowledge_base", embedding_function=embed_fn)
n = index_file("docs/readme.txt", col)
print(f"Indexed {n} chunks")Integration with Claude
Integration with Claude
Combine ChromaDB retrieval with Claude to answer questions grounded in your documents:
import anthropic
import chromadb
from chromadb.utils import embedding_functions
client = chromadb.PersistentClient(path="./chroma_db")
embed_fn = embedding_functions.SentenceTransformerEmbeddingFunction(
    model_name="all-MiniLM-L6-v2"
)
col = client.get_collection("knowledge_base", embedding_function=embed_fn)
claude = anthropic.Anthropic()
def answer(question: str, n_results: int = 4) -> str:
    chunks = col.query(query_texts=[question], n_results=n_results)["documents"][0]
    context = "\n\n".join(chunks)
    response = claude.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        system="Answer using ONLY the context. If not found, say so.",
        messages=[
            {
                "role": "user",
                "content": f"Context:\n{context}\n\nQuestion: {question}",
            }
        ],
    )
    return response.content[0].text
print(answer("What are the main features of this product?"))
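A variant worth considering is returning the source files alongside the answer so users can verify claims. A sketch that asks the query for metadatas too (the source field comes from the indexing example above):
def answer_with_sources(question: str, n_results: int = 4) -> str:
    hits = col.query(
        query_texts=[question],
        n_results=n_results,
        include=["documents", "metadatas"],
    )
    context = "\n\n".join(hits["documents"][0])
    sources = sorted({m["source"] for m in hits["metadatas"][0]})
    response = claude.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        system="Answer using ONLY the context. If not found, say so.",
        messages=[
            {
                "role": "user",
                "content": f"Context:\n{context}\n\nQuestion: {question}",
            }
        ],
    )
    return f"{response.content[0].text}\n\nSources: {', '.join(sources)}"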
Collections at Scale: Tips
- Choose chunk size carefully: 300–500 words works for most text; too small loses context, too large dilutes similarity
- Always persist: use PersistentClient in production — in-memory data disappears on restart
- Consistent embedding function: changing models means re-indexing everything
- Use metadata aggressively: filter by date, source, category before running similarity search to reduce noise
- Batch your adds: collection.add() handles lists — don't call it in a loop for thousands of docs (see the sketch after this list)
- Monitor collection size: collection.count() — ChromaDB handles millions of vectors comfortably on local disk
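A minimal batching sketch for large imports; the batch size of 1000 is an arbitrary illustration, not a ChromaDB limit:
def add_in_batches(collection, documents, ids, metadatas=None, batch_size=1000):
    # One add() call per slice instead of one call per document
    for start in range(0, len(documents), batch_size):
        end = start + batch_size
        collection.add(
            documents=documents[start:end],
            ids=ids[start:end],
            metadatas=metadatas[start:end] if metadatas else None,
        )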
Running ChromaDB as a Server
For multi-process access or production deployments, run ChromaDB as an HTTP server:
chroma run --path ./chroma_db --port 8000
Connect from Python:
import chromadb
client = chromadb.HttpClient(host="localhost", port=8000)
collection = client.get_or_create_collection("my_docs")
Summary
ChromaDB gives you a full vector database in a single pip install:
- Create collections, add documents, query by semantic similarity
- Persist data to disk with PersistentClient
- Filter results with metadata using where
- Plug in any embedding model — local or API-based
- Combine with Claude to build RAG applications
Next steps: read the RAG Tutorial with Python to build a full retrieval pipeline, or the RAG Chatbot guide to add a conversational interface on top.