Skip to content

Chroma Service

Chroma is an open-source vector database for AI applications.

Overview

Property Value
Type Vector Database
Default Port 8000
GPU Required No
Container docker://chromadb/chroma:latest

Quick Start

# Start Chroma service
python main.py --recipe recipes/services/chroma.yaml

# Run benchmark
python main.py --recipe recipes/clients/chroma_benchmark.yaml --target-service chroma_xxx

Recipe Configuration

# recipes/services/chroma.yaml
service:
  name: chroma
  description: "Chroma vector database"

  container:
    docker_source: docker://chromadb/chroma:latest
    image_path: $HOME/containers/chroma_latest.sif

  resources:
    nodes: 1
    ntasks: 1
    cpus_per_task: 4
    mem: "16G"
    time: "02:00:00"
    partition: cpu
    qos: default

  environment:
    CHROMA_SERVER_HOST: "0.0.0.0"
    CHROMA_SERVER_PORT: "8000"
    PERSIST_DIRECTORY: "/data"

  ports:
    - 8000

Benchmark Client

# recipes/clients/chroma_benchmark.yaml
client:
  name: chroma_benchmark
  type: chroma_benchmark

  parameters:
    collection_name: "benchmark_collection"
    num_documents: 10000
    embedding_dimension: 384
    num_queries: 1000
    top_k: 10
    output_file: "$HOME/results/chroma_benchmark.json"

Benchmark Operations

Operation Description
insert Add documents with embeddings
query Similarity search
update Modify existing documents
delete Remove documents

Metrics

Metric Description
insert_throughput Documents inserted per second
query_latency_avg Average query time
query_latency_p99 99th percentile query time
recall@k Search accuracy

API Examples

import chromadb

# Connect
client = chromadb.HttpClient(host="mel0058", port=8000)

# Create collection
collection = client.create_collection("my_collection")

# Add documents
collection.add(
    documents=["doc1", "doc2"],
    embeddings=[[0.1, 0.2, ...], [0.3, 0.4, ...]],
    ids=["id1", "id2"]
)

# Query
results = collection.query(
    query_embeddings=[[0.1, 0.2, ...]],
    n_results=5
)

See also: Services Overview