Choosing a vector database is one of the first decisions in any RAG project. Three options dominate developer usage: Pinecone, Chroma, and Weaviate. Here’s a practical comparison.
## Quick Overview
| | Pinecone | Chroma | Weaviate |
|---|---|---|---|
| Hosting | Managed cloud only | Local or cloud | Self-hosted or cloud |
| Setup time | 5 minutes | 1 minute | 10–20 minutes |
| Free tier | Yes (1 index) | Yes (local) | Yes (self-hosted) |
| Best for | Production scale | Development, prototyping | Hybrid search, complex queries |
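All three rank documents by a vector similarity metric, most commonly cosine similarity (the Pinecone example below sets `metric="cosine"` explicitly). As a quick refresher on what that score means, here is the underlying math in plain Python; the three-dimensional vectors are made-up toy embeddings, not real model output:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 = identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings" (real models emit hundreds of dimensions)
query = [0.9, 0.1, 0.3]
doc_close = [0.8, 0.2, 0.25]   # points in nearly the same direction as query
doc_far = [-0.1, 0.9, -0.4]    # points in a very different direction

print(round(cosine_similarity(query, doc_close), 3))  # close to 1.0
print(round(cosine_similarity(query, doc_far), 3))    # much lower
```

The databases differ in *how* they find the nearest vectors at scale (approximate-nearest-neighbor indexes), but the score they return is this kind of similarity.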
## Chroma: Start Here
Chroma runs in-process — no server needed. Perfect for development and small projects.
```bash
pip install chromadb sentence-transformers
```

```python
import chromadb
from chromadb.utils import embedding_functions

client = chromadb.PersistentClient("./chroma_db")

ef = embedding_functions.SentenceTransformerEmbeddingFunction(
    model_name="all-MiniLM-L6-v2"
)

collection = client.get_or_create_collection("docs", embedding_function=ef)

collection.add(
    documents=[
        "RAG stands for Retrieval-Augmented Generation.",
        "Fine-tuning updates model weights on new data.",
        "Vector search finds semantically similar content.",
    ],
    ids=["doc1", "doc2", "doc3"],
)

results = collection.query(query_texts=["how does RAG work?"], n_results=2)
print(results["documents"])
```
**Pros:** zero setup, runs locally, great for prototyping.
**Cons:** not designed for multi-node scale, no built-in auth.
## Pinecone: Managed Scale
Pinecone is a fully managed vector database. No infrastructure to run.
```bash
pip install pinecone sentence-transformers  # the pinecone-client package was renamed to pinecone
```

```python
from pinecone import Pinecone, ServerlessSpec
from sentence_transformers import SentenceTransformer

pc = Pinecone(api_key="YOUR_PINECONE_API_KEY")

pc.create_index(
    name="kalyna-docs",
    dimension=384,  # must match the embedding model's output size
    metric="cosine",
    spec=ServerlessSpec(cloud="aws", region="us-east-1"),
)

index = pc.Index("kalyna-docs")
model = SentenceTransformer("all-MiniLM-L6-v2")

texts = ["RAG is a retrieval technique.", "Fine-tuning changes model weights."]
embeddings = model.encode(texts).tolist()

index.upsert(vectors=[
    {"id": "v1", "values": embeddings[0], "metadata": {"text": texts[0]}},
    {"id": "v2", "values": embeddings[1], "metadata": {"text": texts[1]}},
])

query_vec = model.encode(["how to retrieve documents?"]).tolist()[0]
results = index.query(vector=query_vec, top_k=2, include_metadata=True)
for match in results.matches:
    print(match.metadata["text"], "| score:", round(match.score, 3))
```
**Pros:** zero ops, scales automatically, fast at large scale.
**Cons:** vendor lock-in, can get expensive, free tier limits.
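In production you will upsert far more than two vectors, and Pinecone's docs recommend sending them in batches rather than one giant request (around 100 vectors per batch is a common choice; treat that number as an assumption to tune). A small client-agnostic helper for chunking:

```python
from typing import Iterator, TypeVar

T = TypeVar("T")

def batched(items: list[T], batch_size: int = 100) -> Iterator[list[T]]:
    """Yield successive fixed-size batches from a list."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

# Usage sketch with the hypothetical `vectors` list and `index` from above:
# for batch in batched(vectors, 100):
#     index.upsert(vectors=batch)

print([len(b) for b in batched(list(range(250)), 100)])  # [100, 100, 50]
```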
## Weaviate: Hybrid Search
Weaviate supports both vector search and keyword (BM25) search — called hybrid search.
```bash
pip install weaviate-client
# Note: the text2vec-transformers vectorizer used below also requires that
# module to be enabled in your Weaviate deployment.
docker run -p 8080:8080 cr.weaviate.io/semitechnologies/weaviate:latest
```

```python
import weaviate
from weaviate.classes.config import Configure, Property, DataType

client = weaviate.connect_to_local()

client.collections.create(
    "Document",
    vectorizer_config=Configure.Vectorizer.text2vec_transformers(),
    properties=[
        Property(name="content", data_type=DataType.TEXT),
        Property(name="source", data_type=DataType.TEXT),
    ],
)

collection = client.collections.get("Document")
collection.data.insert({
    "content": "RAG retrieves documents at inference time.",
    "source": "guide",
})

results = collection.query.hybrid(
    query="retrieval augmented generation",
    alpha=0.5,  # 0 = pure keyword, 1 = pure vector
    limit=2,
)
for obj in results.objects:
    print(obj.properties["content"])

client.close()
```
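To build intuition for what `alpha` controls: hybrid search produces a keyword (BM25) ranking and a vector ranking, then fuses them into one score. Weaviate's actual fusion algorithms are more involved, but a simplified alpha-weighted blend over made-up normalized scores illustrates the idea:

```python
def hybrid_score(vector_score: float, keyword_score: float, alpha: float) -> float:
    """Blend normalized vector and keyword (BM25) scores; alpha weights the vector side."""
    return alpha * vector_score + (1 - alpha) * keyword_score

# Made-up normalized scores for one document
vector_score, keyword_score = 0.9, 0.3

print(hybrid_score(vector_score, keyword_score, alpha=1.0))  # pure vector
print(hybrid_score(vector_score, keyword_score, alpha=0.0))  # pure keyword
print(hybrid_score(vector_score, keyword_score, alpha=0.5))  # equal blend
```

Lower `alpha` when exact terms (product codes, names) matter; raise it when paraphrased queries should still match.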
**Pros:** hybrid search, strong filtering, active community.
**Cons:** more complex setup, steeper learning curve.
## Which One to Pick?

- **Chroma** — prototyping, demos, local development. You'll be up and running in 10 minutes.
- **Pinecone** — production workloads where you don't want to manage infrastructure. You pay for the convenience.
- **Weaviate** — hybrid search (semantic + keyword), complex filters, multi-tenant systems.
## With Claude (RAG Example)
```python
import anthropic

def rag_answer(question: str) -> str:
    # Retrieve the top matching chunks from the vector store (Chroma here)
    results = collection.query(query_texts=[question], n_results=3)
    context = "\n".join(results["documents"][0])

    client = anthropic.Anthropic()
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=512,
        messages=[{
            "role": "user",
            "content": f"Answer based on this context only:\n{context}\n\nQuestion: {question}"
        }]
    )
    return response.content[0].text
```
This pattern works identically with Pinecone or Weaviate — just swap the retrieval step.
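One way to keep that swap trivial is to pass the retrieval step in as a function, so the prompt-building logic never knows which database is behind it. A minimal sketch in plain Python; `fake_retrieve` is a hypothetical stand-in for the Chroma, Pinecone, or Weaviate queries shown above:

```python
from typing import Callable

def rag_prompt(question: str, retrieve: Callable[[str], list[str]]) -> str:
    """Build the grounded prompt; `retrieve` hides which vector DB is in use."""
    context = "\n".join(retrieve(question))
    return f"Answer based on this context only:\n{context}\n\nQuestion: {question}"

# Stand-in retriever; in practice this wraps collection.query / index.query /
# collection.query.hybrid and returns the matched document texts.
def fake_retrieve(question: str) -> list[str]:
    return ["RAG retrieves documents at inference time."]

print(rag_prompt("What is RAG?", fake_retrieve))
```

Swapping databases then means writing one new `retrieve` function and changing nothing else.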
Originally published at kalyna.pro