# HPC AI Benchmarking Orchestrator

Welcome to the HPC AI Benchmarking Orchestrator documentation.

## Overview

A modular Python orchestrator for running containerized AI benchmarking workloads on HPC clusters via SLURM. This framework enables automated deployment, benchmarking, and monitoring of AI services including LLM inference servers, vector databases, in-memory stores, and relational databases on the MeluXina supercomputer.

## Architecture

```mermaid
flowchart TB
    subgraph Local["Local Machine"]
        direction LR
        CLI[main.py] ~~~ Config[config.yaml]
    end

    subgraph Framework["Orchestrator Framework"]
        direction TB
        Orch[Orchestrator]
        subgraph Modules["Modules"]
            direction LR
            Srv[Servers] ~~~ Cli[Clients] ~~~ Mon[Monitors]
        end
        Orch --> Modules
        Modules --> SSH[SSHClient]
    end

    subgraph HPC["MeluXina HPC"]
        direction TB
        SLURM[SLURM] --> Compute[Compute] --> Containers[Apptainer]
    end

    subgraph Services["Deployed Services"]
        direction LR
        Ollama[Ollama] ~~~ Redis[Redis] ~~~ Chroma[Chroma] ~~~ MySQL[MySQL]
    end

    subgraph Monitoring["Monitoring Stack"]
        direction LR
        cAdvisor[cAdvisor] --> Prometheus[Prometheus] --> Grafana[Grafana]
    end

    Local --> Framework
    Framework --> HPC
    HPC --> Services
    HPC --> Monitoring

    style CLI fill:#1976D2,color:#fff
    style Config fill:#1976D2,color:#fff
    style Orch fill:#388E3C,color:#fff
    style Srv fill:#388E3C,color:#fff
    style Cli fill:#388E3C,color:#fff
    style Mon fill:#388E3C,color:#fff
    style SSH fill:#455A64,color:#fff
    style SLURM fill:#F57C00,color:#fff
    style Compute fill:#F57C00,color:#fff
    style Containers fill:#F57C00,color:#fff
    style Ollama fill:#0288D1,color:#fff
    style Redis fill:#D32F2F,color:#fff
    style Chroma fill:#689F38,color:#fff
    style MySQL fill:#1565C0,color:#fff
    style cAdvisor fill:#7B1FA2,color:#fff
    style Prometheus fill:#E64A19,color:#fff
    style Grafana fill:#F57C00,color:#fff
```
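
In the flow above, the orchestrator ultimately hands SLURM a batch script that launches a service inside an Apptainer container. A minimal sketch of that rendering step (the function name, fields, and defaults here are illustrative assumptions, not the framework's actual API):

```python
# Hypothetical sketch of the Orchestrator -> SLURM path: render an
# sbatch script that runs a containerized service via Apptainer.
# Function name, parameters, and defaults are illustrative only.
from textwrap import dedent

def render_sbatch(job_name: str, image: str, time_limit: str = "01:00:00",
                  partition: str = "cpu") -> str:
    """Render a minimal SLURM batch script that runs an Apptainer image."""
    return dedent(f"""\
        #!/bin/bash -l
        #SBATCH --job-name={job_name}
        #SBATCH --partition={partition}
        #SBATCH --time={time_limit}
        #SBATCH --nodes=1

        apptainer run {image}
        """)

script = render_sbatch("ollama-service", "docker://ollama/ollama")
print(script)
```

In the real framework this script would then be copied to MeluXina and submitted with `sbatch` over the SSHClient connection.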

## Supported Services

| Service    | Type          | Port  | Description                           |
|------------|---------------|-------|---------------------------------------|
| Ollama     | LLM Inference | 11434 | High-performance LLM inference server |
| Redis      | In-Memory DB  | 6379  | Key-value store with persistence      |
| Chroma     | Vector DB     | 8000  | Vector similarity search              |
| MySQL      | RDBMS         | 3306  | Relational database                   |
| Prometheus | Monitoring    | 9090  | Metrics collection                    |
| Grafana    | Visualization | 3000  | Real-time dashboards                  |
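
Once a service is deployed and an SSH tunnel is in place, the ports in the table can be probed from the local machine to confirm the service is accepting connections. A small sketch of such a probe (this helper is ours for illustration, not part of the framework):

```python
# Minimal TCP port probe, e.g. to check that a tunnelled service from
# the table above (Redis on 6379, Grafana on 3000, ...) is reachable.
# The helper is illustrative, not part of the framework.
import socket

def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example: probe Grafana through a local SSH tunnel
# port_is_open("localhost", 3000)
```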

## Quick Start

```bash
# Install dependencies
pip install -r requirements.txt

# Start an Ollama service
python main.py --recipe recipes/services/ollama.yaml

# Check status
python main.py --status

# Run benchmark client
python main.py --recipe recipes/clients/ollama_benchmark.yaml --target-service <SERVICE_ID>

# View results in Grafana (after SSH tunnel)
open http://localhost:3000
```

## Documentation

- 🚀 **Getting Started**: Setup, installation, and your first benchmark
- 🏗 **Architecture**: System design, components, and data flow
- ⚙ **Services**: Ollama, Redis, Chroma, MySQL configuration
- 💻 **CLI Reference**: Complete command-line interface documentation
- 📄 **Recipes**: YAML configuration for services and clients
- 📈 **Monitoring**: Grafana dashboards and Prometheus metrics

## Key Features

- **YAML-based Configuration**: Define services and benchmarks declaratively
- **Multi-Service Support**: Ollama, Redis, Chroma, MySQL, and more
- **Automated SLURM Integration**: Seamless job submission and management
- **Real-time Monitoring**: Grafana dashboards with cAdvisor metrics
- **Parametric Benchmarking**: Sweep across multiple configurations
- **SSH Tunneling**: Secure access to HPC services
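
The parametric benchmarking feature above amounts to expanding a grid of parameter values into one benchmark run per combination. A sketch of that expansion (the parameter names are made up for illustration):

```python
# Expand a parameter grid into one benchmark configuration per
# combination, as a parametric sweep would. Parameter names are
# illustrative only, not the framework's recipe schema.
from itertools import product

def expand_sweep(grid: dict) -> list[dict]:
    """Cartesian product of the grid's value lists, as config dicts."""
    keys = list(grid)
    return [dict(zip(keys, combo))
            for combo in product(*(grid[k] for k in keys))]

configs = expand_sweep({
    "batch_size": [1, 8, 32],
    "concurrency": [2, 4],
})
print(len(configs))  # 3 x 2 = 6 combinations
```

Each resulting dict would drive one benchmark client run, so results across the grid can be compared in Grafana.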

## Project Status

- **Version:** 1.0.0
- **Last Updated:** January 2026
- **Project:** EUMaster4HPC Challenge 2025-2026
- **Supervisor:** Dr. Farouk Mansouri
- **Platform:** MeluXina Supercomputer


Built for the Software Atelier course in collaboration with EUMaster4HPC and LuxProvide.