ChromaDB Tutorial: Build Your First Vector Database with Python (2026)

ChromaDB is an open-source vector database designed for AI applications. It stores embeddings locally, runs without a server, and integrates with any Python project in minutes. This tutorial covers everything from a first collection to metadata filtering and LLM integration.


What Is ChromaDB?

A vector database stores data as high-dimensional vectors (embeddings) and lets you search by semantic similarity — not exact keyword match. You can ask “what documents are about machine learning?” and get relevant results even if none of them contain that phrase.

ChromaDB stands out because:

  • No server needed — runs in-process by default; an optional client/server mode is available
  • Persistent storage — data survives restarts
  • Built-in embeddings — ships with a default embedding model (all-MiniLM-L6-v2) out of the box
  • Simple API — add, query, update, delete in a few lines

Installation

pip install chromadb

The SentenceTransformerEmbeddingFunction used later in this tutorial needs one extra package:

pip install sentence-transformers
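
A quick sanity check that the install worked:

python -c "import chromadb; print(chromadb.__version__)"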

Your First Collection

A collection is a named group of vectors — like a table in a relational database. Here’s the full create → add → query cycle:

import chromadb

# In-memory client (data lost on restart)
client = chromadb.Client()

# Create a collection
collection = client.create_collection("my_docs")

# Add documents
collection.add(
    documents=[
        "Python is a high-level programming language.",
        "ChromaDB is an open-source vector database.",
        "Machine learning models require large datasets.",
        "Docker containers package apps with their dependencies.",
        "Claude is an AI assistant built by Anthropic.",
    ],
    ids=["doc1", "doc2", "doc3", "doc4", "doc5"],
)

# Query by semantic similarity
results = collection.query(
    query_texts=["What is a vector database?"],
    n_results=2,
)

for doc, dist in zip(results["documents"][0], results["distances"][0]):
    print(f"[{dist:.3f}] {doc}")

Output:

[0.312] ChromaDB is an open-source vector database.
[0.578] Python is a high-level programming language.
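
Lower distance means more similar. The metric depends on how the collection was configured: Chroma's HNSW index defaults to squared L2, and you can opt into cosine distance at creation time (defaults can vary between versions, so check yours):

# Cosine distance often suits normalized text embeddings better than L2
collection = client.create_collection(
    "my_docs_cosine",
    metadata={"hnsw:space": "cosine"},  # default space is "l2"
)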

Persistent Storage

Use PersistentClient to keep data between runs. ChromaDB writes to disk automatically:

import chromadb

client = chromadb.PersistentClient(path="./chroma_db")

# Data persists across restarts
collection = client.get_or_create_collection("my_docs")
collection.add(
    documents=["This data will survive a restart."],
    ids=["persistent_doc1"],
)
print(f"Collection has {collection.count()} documents")

Embeddings: Default vs Custom

By default, ChromaDB embeds documents with all-MiniLM-L6-v2, a model from the sentence-transformers family. You can override it with any embedding function:

Sentence Transformers (local, free)

from chromadb.utils import embedding_functions

embed_fn = embedding_functions.SentenceTransformerEmbeddingFunction(
    model_name="all-MiniLM-L6-v2"  # fast, 384-dim
)

collection = client.get_or_create_collection(
    "docs",
    embedding_function=embed_fn,
)

OpenAI Embeddings

from chromadb.utils import embedding_functions

embed_fn = embedding_functions.OpenAIEmbeddingFunction(
    api_key="sk-...",
    model_name="text-embedding-3-small",
)

collection = client.get_or_create_collection(
    "docs_openai",
    embedding_function=embed_fn,
)

Use the same embedding function every time you open the same collection — mixing models breaks similarity search.
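
Embedding functions are plain callables, so you can also compute vectors yourself and pass them in directly, which is useful when migrating from another store. A minimal sketch, assuming collection and embed_fn are a matching pair from either example above:

# Embed outside the collection, then hand the vectors over explicitly.
# The vectors must come from the same model the collection normally uses.
texts = ["A pre-embedded document."]
collection.add(
    documents=texts,
    embeddings=embed_fn(texts),
    ids=["pre_embedded_1"],
)

# Querying with your own vectors works the same way
results = collection.query(query_embeddings=embed_fn(["vector database"]), n_results=1)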


Metadata and Filtering

Each document can carry a metadata dict. Use it to filter results without scanning every vector:

collection.add(
    documents=[
        "GPT-4 is a large language model by OpenAI.",
        "Claude Sonnet is fast and affordable.",
        "Llama 3 is an open-source model by Meta.",
        "Gemini Pro is Google's flagship model.",
    ],
    ids=["gpt4", "claude", "llama3", "gemini"],
    metadatas=[
        {"company": "OpenAI",    "open_source": False},
        {"company": "Anthropic", "open_source": False},
        {"company": "Meta",      "open_source": True},
        {"company": "Google",    "open_source": False},
    ],
)

# Only search open-source models
results = collection.query(
    query_texts=["Which models are available for free?"],
    n_results=2,
    where={"open_source": True},
)
print(results["documents"][0])
# ['Llama 3 is an open-source model by Meta.']

Filter Operators

# Exact match
where={"company": "Anthropic"}

# Not equal
where={"company": {"$ne": "OpenAI"}}

# In a list
where={"company": {"$in": ["Anthropic", "Meta"]}}

# Combine with $and / $or
where={"$and": [{"open_source": True}, {"company": {"$ne": "Meta"}}]}

Getting, Updating, and Deleting

# Get by ID
result = collection.get(ids=["doc1", "doc2"])
print(result["documents"])

# Update a document (replaces embedding + metadata)
collection.update(
    ids=["doc1"],
    documents=["Python is a versatile, high-level programming language."],
    metadatas=[{"updated": True}],
)

# Delete by ID
collection.delete(ids=["doc4"])

# Delete by metadata filter
collection.delete(where={"company": "OpenAI"})

# Count remaining
print(f"Documents: {collection.count()}")

Full-Text Filtering on Documents

ChromaDB also supports substring filtering on the raw document text via where_document:

results = collection.query(
    query_texts=["AI assistant"],
    n_results=3,
    where_document={"$contains": "Anthropic"},
)
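
Recent versions also support the inverse operator, $not_contains, to exclude documents containing a substring (check your version's docs if this errors):

results = collection.query(
    query_texts=["AI assistant"],
    n_results=3,
    where_document={"$not_contains": "OpenAI"},
)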

Working with Real Documents

A practical loader that chunks plain-text files before indexing (a PDF variant follows the listing):

from pathlib import Path
import chromadb
from chromadb.utils import embedding_functions


def chunk_text(text: str, size: int = 400, overlap: int = 40) -> list[str]:
    words = text.split()
    chunks, i = [], 0
    while i < len(words):
        chunks.append(" ".join(words[i : i + size]))
        i += size - overlap
    return chunks


def index_file(path: str, collection) -> int:
    text = Path(path).read_text(encoding="utf-8")
    chunks = chunk_text(text)
    name = Path(path).name
    collection.add(
        documents=chunks,
        ids=[f"{name}_chunk_{j}" for j in range(len(chunks))],
        metadatas=[{"source": name, "chunk": j} for j in range(len(chunks))],
    )
    return len(chunks)


client = chromadb.PersistentClient(path="./chroma_db")
embed_fn = embedding_functions.SentenceTransformerEmbeddingFunction(
    model_name="all-MiniLM-L6-v2"
)
col = client.get_or_create_collection("knowledge_base", embedding_function=embed_fn)

n = index_file("docs/readme.txt", col)
print(f"Indexed {n} chunks")

Integration with Claude

Combine ChromaDB retrieval with Claude to answer questions grounded in your documents:

import anthropic
import chromadb
from chromadb.utils import embedding_functions

client = chromadb.PersistentClient(path="./chroma_db")
embed_fn = embedding_functions.SentenceTransformerEmbeddingFunction(
    model_name="all-MiniLM-L6-v2"
)
col = client.get_collection("knowledge_base", embedding_function=embed_fn)
claude = anthropic.Anthropic()


def answer(question: str, n_results: int = 4) -> str:
    chunks = col.query(query_texts=[question], n_results=n_results)["documents"][0]
    context = "\n\n".join(chunks)

    response = claude.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        system="Answer using ONLY the context. If not found, say so.",
        messages=[
            {
                "role": "user",
                "content": f"Context:\n{context}\n\nQuestion: {question}",
            }
        ],
    )
    return response.content[0].text


print(answer("What are the main features of this product?"))

Collections at Scale: Tips

  • Choose chunk size carefully: 300–500 words works for most text; too small loses context, too large dilutes similarity
  • Always persist: use PersistentClient in production — in-memory data disappears on restart
  • Consistent embedding function: changing models means re-indexing everything
  • Use metadata aggressively: filter by date, source, category before running similarity search to reduce noise
  • Batch your adds: collection.add() handles lists — don't call it once per document for thousands of docs; see the batching sketch after this list
  • Monitor collection size: collection.count() — ChromaDB handles millions of vectors comfortably on local disk
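
A minimal batching helper for that last batching point. It assumes documents, ids, and optional metadatas are parallel lists; the batch size of 1000 is a guess, and some Chroma versions enforce their own per-call maximum:

def add_in_batches(collection, documents, ids, metadatas=None, batch_size=1000):
    # One add() per batch instead of one per document
    for start in range(0, len(ids), batch_size):
        end = start + batch_size
        collection.add(
            documents=documents[start:end],
            ids=ids[start:end],
            metadatas=metadatas[start:end] if metadatas else None,
        )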

Running ChromaDB as a Server

For multi-process access or production deployments, run ChromaDB as an HTTP server:

chroma run --path ./chroma_db --port 8000

Connect from Python:

import chromadb

client = chromadb.HttpClient(host="localhost", port=8000)
collection = client.get_or_create_collection("my_docs")

Summary

ChromaDB gives you a full vector database in a single pip install:

  • Create collections, add documents, query by semantic similarity
  • Persist data to disk with PersistentClient
  • Filter results with metadata using where
  • Plug in any embedding model — local or API-based
  • Combine with Claude to build RAG applications

Next steps: read the RAG Tutorial with Python to build a full retrieval pipeline, or the RAG Chatbot guide to add a conversational interface on top.