llmd run_guidellm_benchmark

Runs a GuideLLM benchmark job against an LLM inference service.

Parameters

endpoint_url

  • Endpoint URL for the LLM inference service to benchmark

name

  • Name of the benchmark job

  • default value: guidellm-benchmark

namespace

  • Namespace in which to run the benchmark job (an empty string auto-detects the current namespace)

image

  • Container image for the benchmark

  • default value: ghcr.io/vllm-project/guidellm

version

  • Version tag for the benchmark image

  • default value: v0.5.3
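
The image and version parameters presumably combine into a single container image reference. The sketch below assumes the usual "<image>:<version>" form; the helper name image_reference is hypothetical and is not part of the command itself.

```python
def image_reference(image: str = "ghcr.io/vllm-project/guidellm",
                    version: str = "v0.5.3") -> str:
    """Build a container image reference from a repository and a tag."""
    # Assumption: the command joins the two parameters as "<image>:<version>",
    # which is the conventional way to write a tagged image reference.
    return f"{image}:{version}"


# With the documented defaults:
print(image_reference())
```

Overriding only version keeps the default repository and changes the tag.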

timeout

  • Timeout in seconds to wait for job completion

  • default value: 900

profile

  • GuideLLM profile to use

  • default value: sweep

max_seconds

  • Maximum number of seconds to run the benchmark

  • default value: 30
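
The timeout and max_seconds parameters interact: the job has to finish before timeout expires. The sketch below is a rough sanity check under two assumptions that this reference does not state: that max_seconds bounds each individual benchmark run, and that a profile such as sweep performs several runs. The names runs and overhead_seconds are hypothetical, and the values passed in the example are illustrative only.

```python
def min_timeout_seconds(max_seconds: int, runs: int,
                        overhead_seconds: int = 0) -> int:
    """Lower bound on the job timeout for a given number of benchmark runs."""
    # Assumption: each run lasts at most max_seconds, and the runs execute
    # one after another. overhead_seconds covers image pull, startup, etc.
    return max_seconds * runs + overhead_seconds


def timeout_is_sufficient(timeout: int, max_seconds: int, runs: int,
                          overhead_seconds: int = 0) -> bool:
    """Return True if timeout leaves room for all runs plus overhead."""
    return timeout >= min_timeout_seconds(max_seconds, runs, overhead_seconds)


# Defaults from this page (timeout=900, max_seconds=30) with an
# illustrative 10 runs and 120 seconds of startup overhead.
print(timeout_is_sufficient(900, 30, runs=10, overhead_seconds=120))
```

If max_seconds is raised well above its default, a check like this suggests raising timeout as well.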

processor

  • Name of the model processor (tokenizer) to use

  • default value: RedHatAI/Meta-Llama-3.1-8B-Instruct-FP8-dynamic

data

  • Data configuration for the benchmark, given as comma-separated key=value pairs

  • default value: prompt_tokens=256,output_tokens=128
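
The default data value is a comma-separated list of key=value pairs. As a minimal sketch of how such a string can be read, the hypothetical helper below (parse_data_config is not part of the command) splits it into a dictionary, assuming every value is an integer as in the default.

```python
def parse_data_config(data: str) -> dict[str, int]:
    """Parse a string like 'prompt_tokens=256,output_tokens=128' into a dict."""
    config: dict[str, int] = {}
    for pair in data.split(","):
        key, sep, value = pair.strip().partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {pair!r}")
        # Assumption: values are integers, as in the documented default.
        config[key] = int(value)
    return config


print(parse_data_config("prompt_tokens=256,output_tokens=128"))
# → {'prompt_tokens': 256, 'output_tokens': 128}
```

A malformed pair such as "prompt_tokens" without a value raises ValueError rather than being silently ignored.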